  • Posted 2 hours ago

Job Description

Job responsibilities:

Serve as the SRE for assigned cloud services, owning the KPIs for service reliability, issue-to-resolution, service deployment, business continuity management, security policy planning, capacity planning, automation, etc.

  • Automation: Automate routine and manual operations tasks to reduce toil and improve efficiency.
  • Monitoring & Alerting: Implement and use monitoring systems to track system health, set up alerting, and create dashboards.
  • Incident Management: Respond to and manage incidents to minimize downtime and resolve issues quickly, including on-call support.
  • System Performance: Measure, analyze, and tune system performance to ensure efficiency and stability.
  • Infrastructure Management: Provision and manage cloud infrastructure, sometimes using Infrastructure as Code (IaC), and assist in platform management and capacity planning.
  • Reliability & Resilience: Build sustainable and reliable systems through software engineering practices, which can include resilience testing and chaos engineering.

Job Requirements:

Full-time Bachelor's degree or above (or equivalent) in computer science or a related discipline.

Be familiar with Linux, networking, and databases. Be able to program in one or more high-level languages, such as Python, Java, C/C++, or JavaScript.

Be familiar with containerization technologies like Docker and orchestration tools like Kubernetes.

Be familiar with configuration management and automation tools such as Ansible and Terraform, and with monitoring, logging, and alerting tools such as Splunk, Grafana, or Prometheus.

More Info

Job ID: 137387747
