Application Support Engineer (Day 2 Operations)

axrail.ai

Kuala Lumpur, Malaysia

2-4 Years

Job Description

We are seeking a high-performing Application Support Engineer to manage post-production (Day 2) support operations for business-critical applications. This role ensures application stability, incident resolution, and continuous service improvement in a production environment.

The ideal candidate is analytical, detail-oriented, and customer-focused, with strong experience in incident management, troubleshooting, and stakeholder communication, and the ability to perform under pressure with high service standards.

Key Responsibilities

Application Support & Incident Management

Provide L1 / L2 support for production systems, ensuring timely resolution of incidents and service requests
Own the end-to-end incident lifecycle: Triage → Investigation → Resolution → Closure → Post-incident review
Troubleshoot application, API, and system-level issues, including log analysis and debugging
Ensure adherence to SLA / SLO commitments, with proper prioritization and escalation
Perform root cause analysis (RCA) and document findings to prevent recurrence

Ticketing & Service Operations (Jira-driven)

Manage and track incidents, requests, and defects using Jira or equivalent tools
Maintain accurate ticket documentation, updates, and resolution details
Drive structured workflows for categorization, prioritization, and escalation
Collaborate with engineering teams by providing clear reproduction steps, logs, and impact analysis

Customer Engagement & Communication

Act as a primary point of contact for customers during incidents and requests
Provide clear, professional, and timely communication to technical and non-technical stakeholders
Manage expectations during outages with transparent updates and timelines
Conduct incident reviews and follow-ups where required

Stability & Continuous Improvement

Monitor application health and support production stability
Perform log analysis and diagnostics to identify root causes and recurring issues
Identify patterns and propose preventive or permanent fixes
Contribute to improvements in system reliability and in support processes
Maintain runbooks, troubleshooting guides, and knowledge base articles

Collaboration & Release Support

Work with development, QA, and internal teams for deployment and validation
Support environment configuration across dev, UAT, and production
Assist in release monitoring, rollback decisions, and hotfix coordination
Support change management processes to minimize disruption

Qualifications

Required

2–4 years of experience in Application Support, Production Support, or IT Operations
Experience in L1 / L2 support, handling production incidents and service requests
Strong hands-on experience with Jira (or equivalent)
Solid understanding of application architecture: APIs, backend services, databases, and integrations
Hands-on experience supporting and troubleshooting Python-based applications
Strong problem-solving and analytical skills, including log analysis and root cause identification
Experience in SLA-driven environments with prioritization and escalation processes
Strong communication skills, with confidence in customer-facing interactions

Preferred

Experience supporting applications on cloud platforms (AWS preferred)
Familiarity with AWS services (EC2, S3, Lambda, RDS, IAM)
Basic understanding of SQL / NoSQL databases, REST APIs, and JSON
Exposure to incident management frameworks (ITIL)
Scripting knowledge in Python, Bash, or similar

What We Look For

Strong ownership mindset in resolving issues end-to-end
Ability to perform effectively during high-pressure or critical incidents
Good balance of technical troubleshooting and customer communication
High attention to detail, documentation, and operational discipline
Proactive approach to problem prevention and continuous improvement

Candidate Expectations

Demonstrated experience in troubleshooting real production issues, beyond academic exposure
Ability to perform hands-on investigation, including log analysis, root cause identification, and clear documentation of findings
Confidence in managing incidents in Jira or equivalent systems
Ability to communicate clearly with stakeholders during incidents
Ability to explain past incident handling and resolution scenarios in a structured manner

Important Note

This role is part of Day 2 operations and may require: