
At AIA we've started an exciting movement to create a healthier, more sustainable future for everyone.
If you believe in developing a better tomorrow, read on.
About the Role
We are looking for a System / Site Reliability Engineer (SRE) to help ensure the reliability, scalability, and performance of our enterprise systems and services. In this role, you will apply software engineering principles to operations, partner closely with development and infrastructure teams, and build automation that strengthens system stability and efficiency. You will play a pivotal role in bridging the gap between software development and IT operations, driving a culture of resilience, observability, automation, and proactive problem-solving.
1. Ensure System Reliability & Availability
Monitor and report on application performance, and highlight any deviations or issues.
Collaborate with application engineers and developers to identify root causes and implement durable fixes.
2. Incident Management & Root Cause Analysis
Participate as a Subject Matter Advisor during production incidents and outages.
Provide insights backed by system monitoring, code review, and database analysis.
Support postmortem reviews and drive follow-up actions.
3. Automation & Tooling
Automate operational tasks such as monitoring, alerts, and recovery processes.
Build scripts and internal tools to eliminate manual toil and improve operational efficiency.
4. Monitoring & Observability
Implement telemetry and observability practices to track system health, latency, and error rates.
Manage the Dynatrace platform and its integrations with application services.
Support teams in designing dashboards and visualization setups.
5. Security & Compliance
Work with Security teams to ensure systems comply with regulatory and industry standards (e.g., PCI DSS, GDPR).
Implement necessary access controls, encryption, and audit capabilities within SRE scope.
6. Capacity Planning & Performance Optimization
Analyze usage trends to forecast demand and support scaling decisions.
Contribute to cost-performance optimization efforts across infrastructure and applications.
Collaborate closely with development, QA, and infrastructure teams to embed reliability into the SDLC.
7. Documentation & Knowledge Sharing
Maintain clear and up-to-date operational documentation, runbooks, and architecture diagrams.
Champion SRE principles across the organization to foster resilience and accountability.
Education
Bachelor's degree in Computer Science, Software Engineering, IT, or related fields.
Experience
3-5 years of experience in SRE, DevOps, or Software Engineering roles.
Experience supporting frontend applications in production environments, ideally within financial services or other regulated industries.
Technical Skills
Strong understanding of frontend performance monitoring and instrumentation.
Hands-on experience with Real User Monitoring (RUM), Synthetic Monitoring, and APM tools (e.g., Dynatrace, New Relic, Datadog).
Proficiency in building dashboards and alerts using Dynatrace, Grafana, Prometheus, Elastic Stack, or Splunk.
Familiarity with OpenTelemetry for distributed tracing.
Scripting skills in Python, Bash, or JavaScript.
Experience with CI/CD pipelines (e.g., GitHub Flow).
Practical experience with cloud technologies (AWS or Azure).
Knowledge of Docker and Kubernetes.
Understanding of secure coding practices for frontend applications.
Awareness of financial compliance standards such as PCI DSS.
Be part of a high-impact team shaping system resilience across the enterprise.
Work with modern observability and automation technologies.
Influence engineering culture through SRE best practices.
Opportunities to innovate and drive real improvements in system reliability.
AIA Group Limited, often known as AIA, is a Hong Kong-based American multinational insurance and finance corporation. It is the largest publicly listed life insurance and securities group in Asia-Pacific. It writes life insurance for individuals and businesses, as well as accident and health insurance, and offers retirement planning, wealth management services, variable contracts, investments and securities.
Job ID: 144086533