Horizontal Talent

Senior Site Reliability Engineer (SRE)

  • Posted 21 days ago

Job Description

About Horizontal: Established in the US in 2003, Horizontal solves complex challenges across two distinct businesses: Horizontal Digital and Horizontal Talent. We are consistently recognized as a top workplace and as one of the fastest-growing private companies. Horizontal Talent specializes in staffing for the IT, Digital & Creative, and Business & Strategy markets. We have offices in the US, the UAE, India, Malaysia, and Australia.

About The Role

As a Senior SRE, you'll drive the development and execution of strategies for our DevSecOps practices and platform. Your work will ensure seamless collaboration between technology teams, enabling fast, reliable, high-quality software delivery.

You'll work with a team responsible for implementing and managing Infrastructure as Code (IaC), CI/CD pipelines, cloud-native and microservices architectures, automation frameworks, and release management processes, ensuring they align with organizational objectives.

What You'll Do

  • Lead the design and implementation of highly available, secure, and scalable banking infrastructure using infrastructure as code (IaC) principles
  • Establish and maintain SLOs/SLIs that define our reliability standards and drive accountability across engineering teams
  • Serve as an incident commander during critical service disruptions, leading cross-functional response teams with calm expertise
  • Build and enhance our observability platform, enabling real-time monitoring of our golden signals (uptime, latency, saturation, error rate)
  • Develop automation solutions for incident response, disaster recovery, and business continuity
  • Drive our DevSecOps platform to enable safe, rapid deployments through CI/CD, GitOps, and self-service capabilities
  • Lead FinOps initiatives that give tech teams visibility into, and ownership of, their infrastructure usage, optimizing utilization while maintaining performance and reliability
  • Mentor junior engineers and contribute to a culture of operational excellence

What We're Seeking

  • At least 5 years of demonstrated experience in Site Reliability Engineering, DevOps, or equivalent roles.
  • Strong understanding of cloud technologies (AWS, Azure, GCP, Alibaba Cloud)
  • Experience implementing CI/CD pipelines and GitOps workflows
  • Deep expertise with infrastructure as code tools (HashiCorp Terraform, OpenTofu, CloudFormation, or similar)
  • Proven ability to design and implement observability solutions using modern monitoring stacks
  • Experience leading incident response and building post-mortem processes
  • Strong understanding of Java or another object-oriented programming (OOP) language.
  • Strong understanding of containerization & orchestration.
  • Experience with messaging systems such as Kafka is an added advantage.
  • Familiarity with relational and non-relational databases is a plus.
  • Ability to balance hands-on technical expertise with strategic decision-making.
  • Strong problem-solving skills and the ability to make sound decisions under pressure.
  • A passion for continuous learning, innovation, and professional development.
  • High ownership of responsibilities, with a focus on delivering results and meeting deadlines.
  • Financial services experience is a plus but not required

Job ID: 136624569