Role Overview:
On behalf of one of our clients, we are hiring a RunOps Support – Platform and Infra expert on a contract basis. This individual will support the development of next-generation AI systems by monitoring, troubleshooting, and maintaining production infrastructure. By contributing to the reliability and scalability of these systems, the role directly affects the AI industry's ability to deliver high-quality, real-world input to next-generation AI models. The ideal candidate will have a strong understanding of Linux, Docker, Kubernetes, Jenkins, AWS, Azure, and GCP.
Key Responsibilities:
• Monitor, troubleshoot, and support production infrastructure systems, including servers, containers, storage, and networks, to ensure optimal system performance and minimize downtime.
• Respond promptly to operational incidents and alerts, and lead the resolution of platform-related issues to maintain high system availability.
• Manage and optimize CI/CD pipelines using tools such as Jenkins, and oversee container orchestration with Kubernetes and Helm.
• Collaborate closely with DevOps and SRE stakeholders to drive automation and enhance system reliability, ensuring a seamless and efficient experience for users.
• Maintain, support, and enhance internal tooling and platform services to keep improving system performance and user experience.
Required Skills & Qualifications:
• Strong understanding of Linux, Docker, Jenkins, and container orchestration tools such as Kubernetes.
• Experience with cloud-based infrastructure, including AWS, Azure, and GCP, with a strong understanding of their respective services and tools.
• Excellent troubleshooting and problem-solving skills, with the ability to respond promptly to operational incidents and alerts.
• Strong collaboration and communication skills, with the ability to work effectively with cross-functional teams, including DevOps and SRE stakeholders.
• Experience with CI/CD pipeline management and optimization, including tools such as Jenkins.
More About the Opportunity:
This role offers the chance to work with a global leader in the AI industry and to apply your expertise to the reliability and scalability of next-generation AI systems. As a key member of the team, you will work with a global network of experts and make a meaningful impact on the AI industry's ability to deliver high-quality, real-world input to next-generation AI models.
Equal Opportunity Employer:
We hire based on skills and expertise. All qualified candidates are welcome regardless of background, experience, or prior employment history. Applications are reviewed solely on demonstrated technical ability and qualifications.