Site Reliability Engineer at Renmoney

Job Overview

Location
Lagos, Lagos
Job Type
Full Time
Date Posted
3 years ago

Additional Details

Job ID
20201

Job Description



The Position



  • We are seeking a Site Reliability Engineer who will maintain services once they are live by measuring and monitoring availability, latency, and overall system health, with a focus on business activities, while continuously evaluating cost and waste.


Responsibilities



  • Ensure the availability of UAT and production applications, drive capacity planning for production infrastructure, and monitor existing systems and applications using monitoring tools.

  • Engage in and improve the whole lifecycle of services, from inception and design through deployment and operations.

  • Troubleshoot problems that span systems, databases, storage, network, and code, while suggesting and implementing security measures to protect systems, networks, and information.

  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.

  • Minimize and mitigate the risk of reliability-related failures pertaining to systems availability, performance, and correctness.

  • Investigate warnings and alerts from monitoring systems, and handle incident response, diagnosis, and follow-up on system outages.

  • Document processes and maintain procedure manuals.


Academic Qualification(s)



  • B.Sc. / HND in Computer Science or an Information Technology-related course

  • ITIL Foundation will be an added advantage.


Requirements:



  • Minimum of 3 years’ experience in a similar role

  • Working knowledge of databases and SQL

  • Comfortable with open-source configuration management and orchestration tools (Chef, Puppet, Ansible, Terraform, etc.)

  • Knowledge of Docker, Docker Swarm, Fargate, and Kubernetes

  • Experience with messaging and caching systems such as Kafka and Redis

  • Working experience building monitoring tools and defining measurement metrics

  • Proficiency with shell scripting and a programming language used in an SRE/Operations engineering context (Python, Go, Ruby, etc.) will be an added advantage

  • Experience operating in a high-availability environment

  • Excellent communication skills with a high level of emotional intelligence

  • Experience working with remote teams

  • Server administration skills (Red Hat, Windows, CentOS, Ubuntu)


Preferred:



  • Hands-on experience working with AWS Fargate.

  • Experience with cloud infrastructure services like AWS, Azure, and GCP


We are the place for you, if:



  • You are excited about technology and the future, and you are looking for a place to learn and grow.

  • You care a lot about detail and pride yourself on impeccable execution.

  • You can collect and analyze lots of data and feed in just the right amount of intuition to make sound decisions.

  • You are ready to work extremely hard, at a fast pace, to achieve audacious goals.

  • You love to speak up, ask questions, and are comfortable challenging anyone or any idea.


This job is perfect for you if you:



  • Are creative and an out-of-the-box thinker

  • Have excellent execution skills and are passionate about achieving excellence

  • Enjoy analytical thinking and have problem-solving capabilities

  • Enjoy collaborating with others and building relationships

  • Have a high level of emotional intelligence

  • Have excellent communication and delivery skills.


You will not enjoy this job if you:



  • Work best in structured, hierarchical settings

  • Require clear, pre-set deliverables and constant direction

  • Are not used to working in/with a large team.


What is in it for you



  • You will receive competitive compensation and work with amazing people.

  • You will work in a beautiful environment with a flat structure and solve complex real-world challenges.

