Core Responsibilities
1. Technical Support & Troubleshooting
Diagnose and resolve issues across AWS services, including:
- Compute: EC2, Auto Scaling, Elastic Beanstalk, Lambda
- Storage: S3, EBS, EFS
- Networking: VPC, Route 53, Elastic Load Balancing (ELB), CloudFront
- Databases: RDS, DynamoDB, Aurora
Additional responsibilities include:
- Investigating performance issues, service outages, and configuration errors
- Analyzing logs, metrics, and error messages to identify root causes
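As an illustration of the log analysis described above, the following is a minimal sketch that counts ERROR entries per service to show where failures concentrate. The log format (`<timestamp> <LEVEL> <service> <message>`), the function name `summarize_errors`, and the sample lines are all hypothetical; real AWS service logs differ in format and would need their own parsing.

```python
import re
from collections import Counter

# Hypothetical log format: "<timestamp> <LEVEL> <service> <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<service>\S+)\s+(?P<msg>.*)$"
)


def summarize_errors(lines):
    """Count ERROR entries per service, most frequent first."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        # Lines that do not fit the assumed format are skipped.
        if match and match.group("level") == "ERROR":
            counts[match.group("service")] += 1
    return counts.most_common()


# Made-up sample data for illustration only.
sample = [
    "2024-01-01T10:00:00Z INFO api request completed",
    "2024-01-01T10:00:01Z ERROR db connection timed out",
    "2024-01-01T10:00:02Z ERROR db connection timed out",
    "2024-01-01T10:00:03Z ERROR api upstream returned 502",
    "not a log line",
]
print(summarize_errors(sample))
```

A summary like this only narrows the search; the root cause still has to be confirmed against metrics and the full error messages.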
2. Incident Management
- Respond to customer-reported incidents according to severity level (P1 through P4)
- Provide real-time support during production outages
- Escalate complex issues to AWS service teams when required
- Communicate effectively with customers regarding status, impact, and workarounds
3. Architecture & Best Practices Guidance
Advise customers on:
- High availability and fault-tolerant architectures
- Scalability and performance optimization
- Cost optimization strategies
- Security best practices, including IAM, encryption, and least-privilege access
Support customers in aligning workloads with the AWS Well-Architected Framework.
4. Monitoring & Optimization
Use AWS monitoring and advisory tools such as:
- Amazon CloudWatch (metrics, alarms, logs)
- AWS X-Ray
- AWS Trusted Advisor
Additional responsibilities include:
- Identifying risks related to performance, security, and cost
- Recommending proactive improvements and optimizations
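To make the alarm concept concrete, here is a simplified sketch of "M out of N" threshold evaluation, in the spirit of CloudWatch's evaluation periods and datapoints-to-alarm settings. It is not CloudWatch's actual implementation: it assumes a greater-than comparison and ignores missing-data handling, and the function name `is_in_alarm` and the sample values are hypothetical.

```python
def is_in_alarm(datapoints, threshold, evaluation_periods, datapoints_to_alarm):
    """Return True if enough of the most recent datapoints exceed the threshold.

    Simplified model: looks at the last `evaluation_periods` values and
    requires at least `datapoints_to_alarm` of them to be above `threshold`.
    """
    if evaluation_periods < 1 or datapoints_to_alarm < 1:
        raise ValueError("periods and datapoints_to_alarm must be at least 1")
    recent = datapoints[-evaluation_periods:]
    breaching = sum(1 for value in recent if value > threshold)
    return breaching >= datapoints_to_alarm


# Made-up CPU utilization percentages, oldest first.
cpu = [40, 55, 91, 95, 60, 93]
print(is_in_alarm(cpu, threshold=90, evaluation_periods=5, datapoints_to_alarm=3))
```

Requiring several breaching datapoints rather than one is a common way to avoid alarming on a single transient spike.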
5. Automation & Scripting
- Write, review, and troubleshoot scripts using Python, Bash, and PowerShell
- Use AWS SDKs (e.g., Boto3) for automation
- Assist with automating deployments and operational tasks
- Troubleshoot CI/CD pipelines using tools such as CodePipeline, GitHub Actions, and Jenkins
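A typical small automation task with Boto3 might look like the sketch below, which lists the IDs of stopped EC2 instances using the `describe_instances` paginator. The function name `find_stopped_instances` is hypothetical, and the client is passed in as a parameter (an assumption made here so the logic can be exercised without AWS credentials).

```python
def find_stopped_instances(ec2_client):
    """Return the IDs of all stopped EC2 instances visible to the client.

    `ec2_client` is expected to behave like boto3.client("ec2").
    """
    # The paginator follows NextToken automatically across result pages.
    paginator = ec2_client.get_paginator("describe_instances")
    pages = paginator.paginate(
        Filters=[{"Name": "instance-state-name", "Values": ["stopped"]}]
    )
    instance_ids = []
    for page in pages:
        for reservation in page.get("Reservations", []):
            for instance in reservation.get("Instances", []):
                instance_ids.append(instance["InstanceId"])
    return instance_ids


# Usage (requires boto3, AWS credentials, and a configured region):
#   import boto3
#   print(find_stopped_instances(boto3.client("ec2")))
```

Passing the client in rather than creating it inside the function also makes it straightforward to target a specific region or profile.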
6. Documentation & Knowledge Sharing
- Create and maintain internal knowledge base articles
- Document Root Cause Analyses (RCAs) for major incidents
- Share learnings and best practices with peers to improve overall support quality
Skills Required
Technical Skills
- Strong understanding of AWS core services
- Linux and Windows system administration
- Networking fundamentals (DNS, TCP/IP, VPNs, firewalls)
- Scripting and automation experience
- Basic security principles and IAM concepts
Soft Skills
- Excellent communication skills with the ability to explain complex issues clearly
- Strong customer empathy and professionalism
- Ability to work under pressure, especially during critical incidents
- Analytical mindset with strong problem-solving capabilities
Typical Day-to-Day Activities
- Handle customer support cases through a ticketing system
- Participate in live calls to resolve critical production issues
- Review logs and metrics to diagnose and troubleshoot problems
- Recommend architectural and operational improvements
- Collaborate with AWS service teams and internal peers