Senior System Engineer

  • Posted 7 days ago

Job Description

Join EPAM Malaysia as a Senior Application Support Engineer, where you will own the performance, reliability, and scalability of mission-critical platforms. You will manage containerized workloads, optimize database operations, and troubleshoot complex production issues, while influencing platform-level decisions and improvements. You will partner closely with engineering, operations, and product teams to drive seamless service delivery, implement long-term solutions, and mentor junior engineers. This role is ideal for senior engineers who thrive in high-ownership, hands-on operational environments and want to make a strategic impact within a modern, cloud-native ecosystem.

Responsibilities

  • Lead incident response and resolution for high-severity production issues, ensuring minimal downtime and timely communication with stakeholders
  • Monitor and optimize application performance, identifying systemic patterns and driving long-term improvements
  • Design, implement and maintain automation workflows and operational tooling to reduce manual intervention
  • Conduct and lead root cause analysis (RCA) and postmortems, implementing durable corrective measures and preventive strategies
  • Maintain and enhance project documentation, runbooks and operational guidelines to ensure knowledge continuity and platform reliability
  • Mentor and guide junior engineers, sharing best practices in application support, DevOps and cloud-native operations
  • Collaborate with cross-functional teams to influence platform architecture, deployment strategies, and operational excellence initiatives
  • Drive continuous improvement initiatives for scalability, availability and operational efficiency across the platform

Requirements

  • 7+ years of hands-on experience in software application support, platform operations, or production engineering
  • Proven experience operating PostgreSQL in production, including installation, configuration, performance tuning and troubleshooting complex SQL workloads
  • Demonstrated expertise with Docker and Kubernetes, including workload management, deployment troubleshooting, Helm chart management and optimization of containerized applications
  • Strong proficiency in Linux administration, including logs, system services, permissions, and performance diagnostics
  • Hands-on experience with monitoring and observability tools (Prometheus, Grafana, ELK, Loki, or similar)
  • Proven ability to own end-to-end troubleshooting, lead RCA and implement durable solutions in high-availability production environments
  • Excellent communication and collaboration skills, capable of articulating technical concepts to both technical and non-technical stakeholders

Nice to have

  • Experience with scripting (Shell, Python, or similar) for automation and operational efficiency
  • Exposure to cloud-native ecosystems and modern DevOps practices
  • Familiarity with CI/CD pipelines, automation tools and infrastructure-as-code frameworks

We offer

  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn

Job ID: 134808129