This is a remote position.
Position Type: Contract (1 FTE)
Compensation: Daily rate available
Location: Remote, with occasional onsite visits in Germany
Language Requirement: Fluent English and German
CI/CD Support & Operational Readiness: Validate deployment artifacts from an operational perspective, enforce strict quality assurance measures, and ensure robust rollback strategies and observability are in place for production deployments.
Platform Operations & Incident Management: Monitor system health, performance metrics, and service availability across multi-tenant environments to ensure stability and responsiveness.
Problem Resolution: Identify, analyze, and resolve incidents swiftly to minimize disruption, triggering root cause analyses and implementing preventive actions.
Automation & SRE Practices: Reduce operational toil by automating recurring standard processes and validating procedures through a structured software development lifecycle (staging, testing, and review).
Security & Compliance: Implement comprehensive logging and monitoring strategies to support audit requirements, perform routine security scans, and remediate identified vulnerabilities.
Mid-Level Platform Operations Experience: Proven track record in operating private cloud solutions and managing containerised environments.
Kubernetes Expertise: At least 3 years of hands-on operational experience with self-managed clusters and production applications in on-premise environments.
CI/CD & GitOps Proficiency: In-depth knowledge of, and hands-on implementation experience with, continuous integration and delivery processes, workflows, and associated quality/security assurance tools.
Networking Concepts: Deep understanding of networking concepts, including enterprise protocols, load balancing, and network security.
Core Operations & SRE Knowledge: Fundamental understanding of IT Service Management processes (incident, change, and problem management) alongside core Site Reliability Engineering concepts.
Observability: Practical experience gathering insights from monitoring, logging, and observability tools, including the management and tracking of SLIs, SLAs, and SLOs.
Documentation: Proven ability to structure operational topics, properly document technical procedures, and enforce clear runbooks or playbooks.
Language Skills: Professional proficiency in both spoken and written English and German (at least C1 level for both).
Eligibility: Residency in the EU, EEA, UK, or Switzerland.