As the Technical Lead for DevOps & Site Reliability Engineers (SREs), you will lead and mentor a multidisciplinary team of DevOps and SRE engineers, overseeing infrastructure reliability, automation, monitoring, and incident response.
You will also lead a cloud migration initiative while enhancing system resilience, optimizing CI/CD pipelines, and fostering a culture of continuous improvement.
Responsibilities
Lead, mentor, and grow a high-performing team of DevOps and SRE engineers, promoting best practices in automation, reliability, and operational excellence.
Design and implement strategies for infrastructure reliability, scalability, and security across cloud and on-premises environments.
Plan and execute cloud migration projects, ensuring a smooth transition with minimal downtime and well-managed risk.
Oversee and improve CI/CD pipelines, infrastructure as code (IaC), and deployment automation to accelerate software delivery.
Develop and maintain monitoring, alerting, and observability solutions to proactively detect and resolve issues.
Collaborate with development, QA, security, and product teams to align operational goals with business objectives.
Lead incident management, root cause analysis, and post-incident reviews to continuously improve system stability.
Manage cloud resource utilization and cost optimization efforts.
Drive adoption of DevSecOps practices, embedding security automation into workflows.
Advocate for and implement GitOps and cloud-native best practices.
Communicate technical strategies and progress effectively to stakeholders and senior management.
Requirements
Bachelor’s or Master’s Degree in Computer Science, Engineering, or a related field.
5+ years of experience in DevOps, SRE, or infrastructure engineering roles, with at least 2 years in a leadership capacity.
Proven experience leading and mentoring DevOps and SRE teams.
Hands-on expertise with cloud platforms (AWS, Azure, GCP, etc.) and experience leading cloud migration projects.
Strong knowledge of Infrastructure as Code tools (e.g., Terraform, CloudFormation, ARM templates).
Proficiency with containerization and orchestration technologies (Docker, Kubernetes).