FeedMe

Site Reliability Engineer Lead

  • Posted 5 hours ago

Job Description

About Us

FeedMe's Software Engineering team develops next-generation technologies that change lifestyles for millions of users. Our products handle transactions at a massive scale and extend well into the offline world.

We need a technical visionary who speaks Product. We are looking for a Lead Engineer who can bridge the gap between ambitious business goals and engineering reality. You won't just oversee code; you will oversee the technical direction of our products, ensuring that what we build today scales for the millions of users we'll have tomorrow.

The Role

We are seeking a Site Reliability Engineering Lead to ensure the reliability, availability, and performance of our Point-of-Sale (POS) platform, supporting both cloud services and in-store edge devices. This role is critical in maintaining seamless transaction processing across retail environments, even under intermittent connectivity and high transaction volumes.

Your Day-to-Day:

Leadership & Strategy

  • Lead and mentor a team of SRE/DevOps engineers supporting POS infrastructure
  • Define reliability strategy across cloud backend + store-level POS systems
  • Establish and enforce SLOs/SLIs for transaction latency, uptime, and payment success rates
  • Manage error budgets aligned with business-critical retail operations

System Reliability (POS-Specific)

  • Ensure high availability of transaction processing systems (payments, receipts, inventory sync)
  • Design systems resilient to network instability in retail stores
  • Implement offline-first capabilities and reliable sync mechanisms
  • Minimize downtime during peak retail hours (e.g., weekends, holiday sales)

Incident Management

  • Own incident response for payment failures, POS outages, and sync issues
  • Lead blameless postmortems, especially for revenue-impacting incidents
  • Establish escalation paths for store-level vs platform-level issues
  • Optimize MTTR for distributed environments (cloud + edge devices)

Infrastructure & Automation

  • Drive automation for:
      • POS software deployment and updates (remote device management)
      • Infrastructure provisioning (IaC)
  • Manage hybrid infrastructure (cloud + on-premise/store devices)
  • Improve CI/CD pipelines for frequent, low-risk POS releases

Monitoring & Observability

  • Build observability across:
      • Cloud services (APIs, databases)
      • POS terminals (device health, connectivity, app crashes)
  • Implement real-time monitoring for:
      • Transaction success rates
      • Payment gateway latency
      • Store connectivity status
  • Reduce alert fatigue while ensuring critical retail incidents are detected instantly

Collaboration

  • Work with product and engineering teams to design fault-tolerant POS features
  • Partner with payment providers and third-party integrations
  • Collaborate with customer support teams to improve store-level issue resolution

What You Bring to the Table

  • 7-10+ years in SRE, DevOps, or backend engineering
  • 2-4+ years leading technical teams
  • Experience with high-availability transactional systems (e.g., payments, e-commerce, fintech, or POS)
  • Strong knowledge of distributed systems and eventual consistency models
  • Experience with cloud platforms (AWS, GCP, or Azure)
  • Proficiency in at least one programming/scripting language (Python, Go, Java, etc.)
  • Experience with containerization (Docker, Kubernetes)

POS / Retail-Specific Experience (Highly Preferred)

  • Experience with POS platforms (e.g., in-store retail systems, F&B ordering systems)
  • Knowledge of payment processing flows (card present, QR, e-wallets)
  • Familiarity with offline transaction handling and sync reconciliation
  • Experience supporting edge devices or IoT environments
  • Understanding of retail peak cycles (e.g., holiday traffic, flash sales)

What We Have For You

  • Impact: Direct influence on product roadmap and engineering culture.
  • Growth: A clear path for career advancement in management or technical leadership.
  • Flexibility: Hybrid work arrangement & flexible hours.
  • Culture: A young, fun, and energetic team with a casual dress code.
  • Compensation: Competitive salary package and benefits.

Job ID: 145269201