Job Summary:
We are seeking a motivated Junior to Mid-Level Cloud Site Reliability Engineer (SRE) to join our team. This role is an excellent opportunity to grow your skills in cloud infrastructure and operations, ensuring the stability and performance of our critical systems.
Key Responsibilities:
- Assist in the creation and deployment of core infrastructure components.
- Participate in 24/7 incident management and troubleshooting for infrastructure components to maintain environment stability.
- Support 24/7 operational duties for business systems, assisting with the response to and escalation of urgent and major incidents.
- Take part in a 24/7 monitoring shift roster (covering applications and infrastructure), handling system alerts and abnormal events.
- Work with business teams, using Google Cloud Platform (GCP) and cloud-native technologies, to help implement and maintain component solutions.
Qualifications & Requirements:
- Minimum of 3 years of relevant industry experience.
- Understanding of, and practical experience with, Google Cloud Platform (GCP) products and services, including but not limited to Compute Engine, GKE, VPC, Cloud Load Balancing, Cloud CDN, Cloud Storage, and Cloud SQL.
- Familiarity with big data products such as BigQuery, including data ingestion, querying, and analysis.
- Ability to provision resources, deploy applications, tune configurations, and handle monitoring and operational tasks.
- Knowledge of ITIL and DevOps processes and frameworks.
- Professional proficiency in English is required. Chinese communication skills (verbal and written) are a strong advantage.
- Strong ability to work under pressure and willingness to work a 24/7 shift schedule.
- GCP Associate-level certification, or relevant experience with AWS or Alibaba Cloud, is a plus.