  • Posted 4 days ago

Job Description

We're looking for a DevOps Engineer with 3-5 years of experience to help us build highly observable, resilient, and testable systems across our microservices ecosystem. This is a hands-on role for someone who enjoys working at the boundary of development and operations.

You'll be diving into a system with hundreds of microservices, helping us identify weak spots, build self-healing mechanisms, and level up our observability and quality assurance efforts.

Responsibilities

  • Manage the Kubernetes systems that host our Postgres database
  • Enhance our observability systems using tools like Grafana and CloudWatch to enable real-time monitoring, alerting, and diagnosis
  • Analyze our microservices landscape to identify and implement self-healing strategies
  • Improve early vulnerability detection by adding security gates to the code pipeline
  • Design and enforce robust health checks for services and background jobs
  • Improve and manage logging, tracing, and alerting pipelines
  • Build infrastructure using Terraform and deploy containerized services with Docker on AWS ECS
  • Use core AWS services (ECS, EKS, EC2, IAM, S3, SQS, CloudTrail) to manage and scale cloud workloads
  • Improve CI/CD pipelines to include observability hooks and automated test gates
  • Contribute to Node.js-based backend application development

Qualifications

  • Bachelor's Degree in Computer Science or a related field (or equivalent practical experience)

Certifications

  • Certified Kubernetes Security Specialist (or similar) and/or
  • AWS Solutions Architect Professional and/or
  • AWS DevOps Engineer Professional

Skills & Experience

  • STRONG Kubernetes experience (beyond basic setup and management)
  • 3-5 years in full-stack or backend engineering (Node.js, Express, React, TypeScript)
  • Strong experience with AWS services: ECS, EKS, EC2, S3, IAM, CloudWatch, SQS, CloudTrail
  • Solid knowledge of integration testing, health checks, and service readiness probes
  • Proficient in building and using Grafana dashboards and integrating observability tools
  • Hands-on with Terraform, Docker, and cloud-native infrastructure

Nice to Have

  • Familiarity with OpenTelemetry or Prometheus
  • Familiarity with building observable AI infrastructure
  • Understanding of basic e-commerce concepts: products, orders, offers, categories
  • Power user of AI tools like Claude Code, Cursor, Windsurf, etc.

Why Join Us

  • Tackle the challenge of managing and stabilizing a complex system of 100+ microservices
  • Own critical infrastructure and reliability features from observability to automated recovery
  • Collaborate with an ambitious, remote-first team and help build production-grade platforms at scale


Job ID: 135192833