Seeking an Experienced Site Reliability Engineering (SRE) / Lead Engineer for Exciting Projects

Remote in Guadalajara, Jalisco

We are looking for a skilled Site Reliability Engineering (SRE) / Lead Engineer with a minimum of 8 years of experience to join a dynamic team within a leading organization. This role requires deep expertise in Application Performance Monitoring (APM), Infrastructure as Code (IaC), automation, and distributed tracing using OpenTelemetry. As SRE lead, you will guide the design, implementation, and continuous improvement of observability solutions, ensuring system reliability, performance, and scalability while fostering best practices in SRE and DevOps.

Key Responsibilities:
- Lead the strategic development and management of observability and reliability frameworks across the organization, ensuring alignment with business goals and technical requirements.
- Design and implement monitoring and observability solutions, collaborating with engineering teams to define standards and best practices.
- Manage Infrastructure as Code (IaC) initiatives using Terraform, coordinating with cloud and infrastructure teams to ensure scalable and secure deployments.
- Drive automation strategies for monitoring, alerting, and logging pipelines, focusing on process improvements and operational efficiency.
- Develop and maintain comprehensive observability roadmaps, including distributed tracing, logging, and metrics collection strategies.
- Collaborate with product management, sales, and pre-sales teams to provide technical expertise and support during solution design and customer engagements.
- Lead cross-functional teams to enhance CI/CD pipelines and deployment reliability, ensuring smooth integration of observability tools and practices.
- Engage with vendors and strategic partners to evaluate, select, and integrate observability and monitoring solutions, ensuring alignment with organizational needs and fostering strong collaborative relationships.
- Mentor and develop junior engineers and analysts, fostering a culture of reliability, observability, and operational excellence.

Technical Skills Required:
- 8-10+ years of experience in SRE, Observability, or DevOps roles, with leadership responsibilities.
- Hands-on experience with OpenTelemetry for distributed tracing and observability instrumentation.
- Proven expertise with Application Performance Monitoring (APM) tools such as New Relic, Datadog, AppDynamics, or Dynatrace.
- Strong proficiency in Infrastructure as Code (IaC) using Terraform.
- Solid understanding of cloud platforms including AWS, GCP, or Azure.
- Experience with automation/configuration management tools like Ansible, Chef, or Puppet.
- Deep knowledge of CI/CD pipelines and tools such as GitHub Actions, Jenkins, or Azure DevOps.
- Experience managing Kubernetes and containerized environments (Docker, Helm).
- Familiarity with log aggregation and analysis platforms like ELK Stack or Splunk.
- Excellent leadership, communication, and collaboration skills.

Location & Schedule:
- Remote work in Guadalajara, Jalisco
- Work hours: Monday to Friday, 09:00 to 18:00
- Advanced English skills are mandatory

Benefits:
- Attractive salary + premium benefits
- Performance bonuses, grocery coupons, and savings fund
- Aguinaldo, vacation premium, and paid vacation
- SGMM medical insurance (including family) and life insurance

Candidates must include their compensation expectations in their applications and submit their resumes in English.