Technology · Infrastructure & Ops · Abu Dhabi
Site Reliability Engineer (SRE)
Job Responsibilities:
- Automate routine operational tasks using Shell scripting, ensuring efficiency in log analysis, batch management, and system optimization.
- Maintain and optimize middleware components supporting infrastructure operations, ensuring stability and performance.
- Administer and optimize Kubernetes clusters, ensuring scalability, security, and performance.
- Maintain and optimize monitoring and alerting systems based on Prometheus, ensuring high availability of services.
- Contribute to the development of CI/CD pipelines.
- Manage cloud resources efficiently, implementing cost optimization strategies to reduce cloud expenditure.
- Improve operational processes, develop automation tools, troubleshoot incidents, and enhance system stability and reliability.
Job Requirements:
- Proficiency in Shell scripting for automating operational workflows and system management tasks.
- Experience in Python or Go, preferably for system automation, tooling, or backend services.
- At least 5 years of experience in operations and maintenance (O&M) roles. At least 2 years of hands-on Kubernetes administration experience, including expertise in CSI and CNI and in managing production clusters of 20+ nodes.
- Experience with Prometheus for monitoring and alerting in an enterprise environment.
- Familiarity with CI/CD deployment processes, with knowledge of GitOps principles. Hands-on experience with GitOps is a plus.
- Experience managing cloud platforms using Infrastructure as Code (IaC) tools like Terraform/OpenTofu. Azure experience is a plus.
- Strong problem-solving skills, a proactive approach to troubleshooting, and a commitment to improving operational efficiency and system reliability.
Bonus Points:
- Experience managing large-scale distributed systems and microservices architecture.
- Background in Site Reliability Engineering (SRE) best practices.
- Ability to speak and write Chinese.
Already working at Astra Tech?
Let’s recruit together and find your next colleague.