SRKay Consulting Group Sdn Bhd

Lead Platform Engineer

7-12 Years
MYR 10,000 - 17,000 per month
  • Posted 6 hours ago

Job Description

Position Title: Lead Platform Engineer (SRE Lead)

Location: Kuala Lumpur, Malaysia (near Bukit Bintang)

Industry: Insurance

Open to: Malaysian citizens only


About the Role:

We are seeking an experienced and driven Lead Site Reliability Engineer to join our technology organization. In this role, you will be responsible for ensuring the reliability, scalability, and performance of critical systems and applications. You will lead SRE initiatives, champion automation, and collaborate closely with development and operations teams to build resilient, high-performing platforms that support our business and customers.

Key Responsibilities:

  • Lead SRE efforts to maintain and improve system reliability, availability, and performance across production and non-production environments.
  • Design, implement, and maintain monitoring, alerting, and observability frameworks to proactively detect and resolve incidents.
  • Drive incident management processes, including root cause analysis, post-incident reviews, and implementation of preventive measures.
  • Champion automation across infrastructure provisioning, deployment, and operational tasks to reduce manual effort and improve consistency.
  • Collaborate with engineering teams to define and enforce service level objectives (SLOs), service level indicators (SLIs), and error budgets.
  • Lead capacity planning, performance tuning, and scalability assessments to ensure systems meet growing business demands.
  • Manage and optimize cloud infrastructure (Azure, AWS) and containerized environments (Docker, Kubernetes).
  • Establish and promote SRE best practices, including chaos engineering, disaster recovery planning, and resilience testing.
  • Mentor and guide junior SRE team members, fostering a culture of operational excellence and continuous improvement.
  • Work closely with development teams to embed reliability considerations into the software development lifecycle.

Required Skills & Experience:

  • Strong knowledge of Linux/Unix systems and networking fundamentals.
  • Proficiency in programming, scripting, and automation tools such as Python, PowerShell, .NET, Java, or Ansible.
  • Hands-on experience with cloud platforms (e.g., Azure, AWS).
  • Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes).
  • Expertise in monitoring and observability tools such as AppDynamics, Application Insights, Dynatrace, Grafana, or the ELK Stack.
  • Strong understanding of CI/CD pipelines and automation frameworks.
  • Proven problem-solving skills and ability to perform root cause analysis.
  • Excellent communication and collaboration skills.
  • Analytical mindset with a focus on reliability, scalability, and performance.
  • Passion for automation and reducing manual toil.
  • Ability to work under pressure and handle critical incidents effectively.
  • Commitment to continuous learning and staying updated on industry trends.

Desired Qualifications:

  • Experience with distributed systems and microservices architecture.
  • Knowledge of database systems (both SQL and NoSQL).
  • Familiarity with incident management frameworks (e.g., ITIL, SRE best practices).
  • Certifications in cloud technologies or DevOps tools.

Why Join Us:

  • Lead SRE strategy for a major organization within the insurance industry.
  • Work with modern cloud technologies, containerization, and observability tools.
  • Collaborate with cross-functional teams to drive reliability and operational excellence.
  • Be part of a culture that values automation, innovation, and continuous learning.
  • Play a key role in shaping resilient systems that directly impact business and customer outcomes.

More Info

Open to candidates from: Malaysian

About Company

SRKAY Consulting Group Sdn Bhd (d.b.a. SCIKEY): SCIKEY Talent Commerce (tCommerce) is an online marketplace for everything talent, supported by a high degree of automation and an engagement ecosystem. The venture redefines the way companies hire, contract, and engage talent globally, moving from completely physical to significantly digital processes and giving them better choices, convenience, and cost advantages that are not possible through traditional models.

Job ID: 144973015