Senior DevOps Engineer (SRE / GCP)

Confidential

Malaysia, Kuala Lumpur

5-7 Years

Posted 18 hours ago

Job Description

Company Overview

We are a software technology company that provides management and marketing software development services to regulated and licensed financial marketing companies worldwide.

As part of our continuous global expansion, we are building scalable, secure, and high-performance infrastructure to support international operations and fast-growing business demands. We are looking for a passionate and experienced Senior DevOps Engineer (SRE / GCP) to join our infrastructure team and help drive operational excellence, automation, and cloud reliability across our global platforms.

Role Overview

As a Senior DevOps Engineer (SRE / GCP), you will play a key role in designing, maintaining, and optimizing our cloud infrastructure and production environments. You will be responsible for ensuring high system availability, scalability, security, and operational efficiency across global services.

This role requires strong expertise in Google Cloud Platform (GCP), Kubernetes, Infrastructure as Code (IaC), automation, and modern DevOps/SRE practices. You will collaborate closely with engineering and product teams to build resilient systems, improve deployment pipelines, and enhance platform reliability.

Key Responsibilities

Design, optimize, and maintain highly available Google Cloud Platform (GCP) production environments
Manage and support Linux-based infrastructure and middleware systems across global operations
Lead Site Reliability Engineering (SRE) initiatives, including observability, monitoring, alerting, and incident management
Build scalable automation solutions and self-healing infrastructure to improve operational efficiency
Implement and manage Infrastructure as Code (IaC) using Terraform, Ansible, or related technologies
Develop and improve CI/CD pipelines to support reliable and efficient software deployment processes
Collaborate with development teams to improve system reliability, deployment workflows, and cloud-native architecture
Monitor system performance, troubleshoot production issues, and optimize infrastructure utilization
Enforce cloud security best practices, maintain backup strategies, and ensure disaster recovery (DR) readiness
Conduct root cause analysis and lead blameless post-incident reviews for continuous improvement
Continuously improve DevOps processes, platform scalability, and engineering standards

Requirements

Minimum 5 years of experience in DevOps, Site Reliability Engineering (SRE), or Cloud Infrastructure roles
Strong hands-on experience with Google Cloud Platform (GCP)
Solid experience with Kubernetes, container orchestration, and cloud-native infrastructure
Strong understanding of CI/CD pipelines and DevOps best practices
Experience with Infrastructure as Code (IaC) tools such as Terraform and Ansible
Strong Linux system administration and troubleshooting skills
Good understanding of networking fundamentals, including TCP/IP, DNS, load balancing, routing, and firewall concepts
Experience with monitoring, logging, and observability tools
Strong problem-solving mindset with the ability to troubleshoot complex production issues
Fluent in Chinese and English (spoken and written)
CKA / CKS certifications are an added advantage

Why Join Us