About Horizontal: Established in the US in 2003, Horizontal solves complex challenges across two distinct businesses: Horizontal Digital and Horizontal Talent. We are consistently recognized as a top workplace and one of the fastest-growing private companies. Horizontal Talent specializes in staffing for the IT, Digital & Creative, and Business & Strategy markets. We have offices in the US, the UAE, India, and Malaysia.
Description
- Design, implement, and maintain highly available Kubernetes clusters for mission-critical applications in both public cloud and on-premises infrastructure.
- Develop and enhance cloud-native platforms that support cybersecurity software products.
- Architect scalable, resilient, and secure containerized environments.
- Conduct performance tuning, resource optimization, and cost analysis across Kubernetes workloads.
- Troubleshoot cluster and container runtime issues; perform root cause analysis and implement long-term fixes.
- Collaborate with software engineers to containerize applications and ensure smooth CI/CD workflows.
- Implement monitoring, logging, and alerting solutions for proactive cluster health management.
- Mentor engineers on Kubernetes, DevOps practices, and cloud-native architecture.
Requirements
- Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field.
- 3-7 years of professional software development and/or DevOps engineering experience.
- Strong expertise with Kubernetes (production deployments, scaling, upgrades, troubleshooting).
- Experience with container technologies such as Docker and containerd.
- Proficiency in at least one programming language (e.g., Python, Go, or C++).
- Familiarity with cloud platforms (AWS, GCP, Azure) and Kubernetes distributions (Rancher, EKS, GKE, AKS, OpenShift).
- Hands-on experience with infrastructure as code (Terraform, Helm).
- Hands-on experience running Rancher on Proxmox for on-premises Kubernetes clusters.
- Knowledge of Kubernetes networking (e.g., Calico), service meshes (e.g., Istio), and Kubernetes security best practices.
- Experience with monitoring and logging tools (Prometheus, Grafana, ELK, OpenTelemetry).
- Strong problem-solving skills, with the ability to optimize system reliability and performance.
- Excellent communication and teamwork skills.
- Prior experience in cybersecurity, observability, or large-scale distributed systems is a plus.
- Must be a Malaysian citizen or Malaysian Permanent Resident.