Company Overview
We are a software technology company that develops management and marketing software for regulated and licensed financial marketing companies worldwide.
As part of our continuous global expansion, we are building scalable, secure, and high-performance infrastructure to support international operations and fast-growing business demands. We are looking for a passionate and experienced Senior DevOps Engineer (SRE / GCP) to join our infrastructure team and help drive operational excellence, automation, and cloud reliability across our global platforms.
Role Overview
As a Senior DevOps Engineer (SRE / GCP), you will play a key role in designing, maintaining, and optimizing our cloud infrastructure and production environments. You will be responsible for ensuring high system availability, scalability, security, and operational efficiency across global services.
This role requires strong expertise in Google Cloud Platform (GCP), Kubernetes, Infrastructure as Code (IaC), automation, and modern DevOps/SRE practices. You will collaborate closely with engineering and product teams to build resilient systems, improve deployment pipelines, and enhance platform reliability.
Key Responsibilities
- Design, optimize, and maintain highly available Google Cloud Platform (GCP) production environments
- Manage and support Linux-based infrastructure and middleware systems across global operations
- Lead Site Reliability Engineering (SRE) initiatives, including observability, monitoring, alerting, and incident management
- Build scalable automation solutions and self-healing infrastructure to improve operational efficiency
- Implement and manage Infrastructure as Code (IaC) using Terraform, Ansible, or related technologies
- Develop and improve CI/CD pipelines to support reliable and efficient software deployment processes
- Collaborate with development teams to improve system reliability, deployment workflows, and cloud-native architecture
- Monitor system performance, troubleshoot production issues, and optimize infrastructure utilization
- Enforce cloud security best practices, maintain backup strategies, and ensure disaster recovery (DR) readiness
- Conduct root cause analysis and lead blameless post-incident reviews for continuous improvement
- Continuously improve DevOps processes, platform scalability, and engineering standards
Requirements
- Minimum 5 years of experience in DevOps, Site Reliability Engineering (SRE), or Cloud Infrastructure roles
- Strong hands-on experience with Google Cloud Platform (GCP)
- Solid experience with Kubernetes, container orchestration, and cloud-native infrastructure
- Strong understanding of CI/CD pipelines and DevOps best practices
- Experience with Infrastructure as Code (IaC) tools such as Terraform and Ansible
- Strong Linux system administration and troubleshooting skills
- Good understanding of networking fundamentals, including TCP/IP, DNS, load balancing, routing, and firewall concepts
- Experience with monitoring, logging, and observability tools
- Strong problem-solving mindset with the ability to troubleshoot complex production issues
- Fluency in Chinese and English (spoken and written)
- CKA (Certified Kubernetes Administrator) or CKS (Certified Kubernetes Security Specialist) certification is an advantage
Why Join Us
- Opportunity to work on large-scale global infrastructure projects
- Exposure to modern cloud technologies and high-traffic systems
- Fast-paced, collaborative, and innovation-driven environment
- Competitive salary package with performance-based quarterly bonuses
- Strong career growth opportunities with high ownership and impact
- Work alongside experienced international engineering teams
- Opportunity to shape and improve next-generation cloud infrastructure systems