We are seeking a high-potential Cloud Support Engineer to support, operate, and optimize cloud infrastructure environments primarily on AWS. This role is critical in ensuring high availability, performance, and security of production workloads while delivering excellent technical support to customers.
This position combines hands-on engineering, operational excellence, and customer-facing support, making it ideal for candidates who are passionate about cloud technologies, problem-solving, and continuous learning.
We welcome both strong fresh graduates with exceptional logical thinking and early-career engineers with relevant cloud experience.
Key Responsibilities
Cloud Infrastructure Management
- Manage and maintain AWS cloud infrastructure to ensure high availability, reliability, and performance
- Implement and maintain Infrastructure as Code (IaC) using tools such as CloudFormation, CDK, or Terraform
- Monitor and optimize cloud resources for scalability, performance, and cost efficiency
- Apply best practices for resource provisioning, tagging, and lifecycle management
Cloud Support & Operations
- Provide end-to-end technical support by diagnosing and resolving complex AWS infrastructure issues across multiple services
- Own and manage the incident response lifecycle, including troubleshooting, escalation, and resolution during critical outages
- Proactively monitor cloud environments to detect, prevent, and resolve issues before they affect production workloads
- Collaborate with internal teams and customers to troubleshoot and resolve infrastructure-related challenges
- Utilize tools such as AWS Systems Manager (Patch Manager) to automate OS-level patching for Windows and Linux environments
- Participate in on-call rotations or off-hours support for critical incidents when required
- Develop and maintain runbooks, documentation, and knowledge base articles to improve operational efficiency
Performance Optimization & Reliability
- Analyze system metrics, logs, and performance data to identify bottlenecks and inefficiencies
- Implement tuning strategies across compute, storage, and networking layers
- Drive improvements in system reliability, observability, and operational excellence
Security & Compliance
- Enforce cloud security best practices, including:
  - IAM policies (least privilege access)
  - Network segmentation (VPC, subnets, security groups)
  - Encryption (at rest and in transit)
- Monitor and respond to security alerts and vulnerabilities
- Support compliance with relevant standards such as PDPA, GDPR, and HIPAA (where applicable)
- Ensure responsible and secure usage of cloud resources
Continuous Improvement & Automation
- Identify opportunities to automate repetitive operational tasks using scripting and cloud-native tools
- Contribute to improving support processes, patching strategies, and operational workflows
- Support adoption of emerging AWS capabilities, including exposure to Generative AI tools and services
Qualifications
Required
- 1–3 years of experience in cloud engineering, system administration, or infrastructure support (strong fresh graduates with relevant skills are encouraged to apply)
- Solid understanding of AWS core services, including:
  - EC2, S3, VPC, IAM, RDS, CloudWatch
- Basic understanding of networking concepts:
  - VPC, subnets, routing, DNS (Route 53)
- Familiarity with Infrastructure as Code (IaC):
  - CloudFormation, CDK, or Terraform
- Proficiency in at least one scripting language:
  - Python, Bash, or similar
- Understanding of cloud security fundamentals:
  - Access control, encryption, network isolation
- Strong analytical thinking and problem-solving skills, with the ability to troubleshoot systematically
- Good communication skills and the ability to work collaboratively in a team environment
- Willingness to learn, adapt, and operate in a fast-paced environment
Preferred
- AWS certifications:
  - AWS Certified Solutions Architect (Associate) or AWS Certified SysOps Administrator (Associate)
- Hands-on experience with:
  - Monitoring tools (CloudWatch, logging systems)
  - AWS Systems Manager (SSM), Patch Manager
- Exposure to multi-cloud environments (Azure or GCP)
- Familiarity with automation and DevOps practices
- Exposure to AWS Generative AI services or frameworks is an added advantage
What We Look For
- Strong ownership mindset in resolving issues end-to-end
- Ability to operate under pressure during production incidents
- Passion for cloud technologies, automation, and continuous improvement
- Attention to detail in documentation, troubleshooting, and execution
- Growth mindset with a commitment to learning and skill development
Important Note
This role may require on-call support and handling of critical production incidents outside standard working hours.