Lead Specialist, SRE

  • Posted 8 hours ago

Job Description

At Nadi Tech, part of TNG Digital Group, we build and run secure, high-performance digital infrastructure that powers essential services across Malaysia.

Our platforms support large-scale B2B and public sector initiatives used by millions of people every day. That means reliability, security, and performance are not nice-to-haves. They are built into everything we do from day one.

We are looking for people who enjoy solving complex, real-world problems. People who think in systems, understand how things connect, and take pride in building infrastructure that is stable, scalable, and built to last.

Here, you will take ownership of systems that matter and see your work used in real, everyday scenarios.

What You'll Do:

1. Service Reliability and Availability

  • Ensure uptime/availability targets of 99.99% are consistently met
  • Reduce Mean Time to Detect (MTTD) and Mean Time to Recover (MTTR) during incidents
  • Drive capacity planning and prevent reliability risks

2. Drive Automation and Operational Excellence

  • Deliver consistent and repeatable deployments with zero critical failures by maintaining and updating deployment scripts/templates
  • Reduce manual toil across the team by measurable percentages
  • Standardize and harden container images, CI/CD pipelines, and cloud infrastructure

3. Release and Disaster Recovery

  • Reduce deployment incidents through adherence to best practices in release management
  • Reduce deployment duration via automation
  • Plan and execute disaster recovery, and ensure Recovery Time Objective (RTO) and Recovery Point Objective (RPO) targets are met for cloud/multi-cloud environments

4. Incident Response and Troubleshooting

  • Reduce the frequency of recurring issues through problem management and root cause analysis
  • Establish and enforce incident response processes

5. Security and Compliance

  • Ensure 100% compliance with regulatory and audit requirements for infrastructure security
  • Achieve zero critical security incidents by optimizing infrastructure and adhering to industry standards
  • Ensure infrastructure, container, and code security standards are enforced
  • Successfully implement secure architectures for all new deployments in collaboration with development teams

6. Team Leadership & Strategic Alignment

  • Lead, mentor, and grow the SRE team's technical and operational capabilities
  • Establish on-call rotations, knowledge sharing sessions, and training programs
  • Foster a culture of blameless accountability, learning, and continuous improvement
  • Partner with product and engineering teams to embed reliability into the SDLC
  • Influence architectural decisions with an SRE mindset

Role Requirements:

Qualification:

  • Bachelor's degree in Computer Science, Engineering, Networking, or a related field
  • Professional cloud certification

Experience:

  • 8 years of proven experience in a DevOps or SRE role

Skills:

  • Strong knowledge of scripting and programming languages (e.g. Bash, Python, Go) and experience with configuration management tools (e.g. Ansible, Chef)
  • Sound understanding of CI/CD tools and release engineering, with hands-on implementation experience
  • Experience with cloud platforms (e.g. AWS, Azure) and infrastructure as code (IaC) tools (e.g., Terraform, CloudFormation).
  • Advanced cloud certification and project management experience are a plus.
  • Strong understanding of site reliability engineering, infrastructure engineering, and cloud architecture and services, with an SRE mindset.
  • Experience with containerization technologies like Docker and container orchestration platforms such as Kubernetes.
  • Knowledge of networking principles and protocols, demonstrated with concrete examples.
  • Strong knowledge of cloud architecture and services
  • Strong problem-solving skills and the ability to handle high-pressure situations calmly and effectively.
  • Strong attention to detail and a commitment to delivering high-quality results.

Personality:

  • Passionate, agile, and flexible, with a positive attitude
  • Assertive, driven individual with a strong sense of urgency
  • Self-starter with a continuous-improvement mindset

What you get

Work your way

  • Flexible working hours

Your wellbeing matters

  • Medical coverage, with option to include dependants
  • Extra leave for family and caregiving needs

Rewards that grow with you

  • Monthly lifestyle allowance via TNG eWallet
  • Long-term rewards for your contributions

Everyday support

  • Mobile and broadband reimbursement
  • Discounts and wellness perks

What it's like to work here

We value people who take responsibility, think clearly, and care about doing things right. You will be working with a team that is practical, collaborative, and focused on building things that hold up under real pressure.

Note: Only shortlisted candidates will be contacted.

Job ID: 145713617
