About Us
FeedMe's Software Engineering team develops next-generation technologies that change lifestyles for millions of users. Our products handle transactions at a massive scale and extend well into the offline world.
We need a technical visionary who speaks Product. We are looking for a Lead Engineer who can bridge the gap between ambitious business goals and engineering reality. You won't just oversee code; you will oversee the technical direction of our products, ensuring that what we build today scales for the millions of users we'll have tomorrow.
The Role
We are seeking a Site Reliability Engineering Lead to ensure the reliability, availability, and performance of our Point-of-Sale (POS) platform, supporting both cloud services and in-store edge devices. This role is critical in maintaining seamless transaction processing across retail environments, even under intermittent connectivity and high transaction volumes.
Your Day-to-Day:
Leadership & Strategy
- Lead and mentor a team of SRE/DevOps engineers supporting POS infrastructure
- Define reliability strategy across the cloud backend and store-level POS systems
- Establish and enforce SLOs/SLIs for transaction latency, uptime, and payment success rates
- Manage error budgets aligned with business-critical retail operations
System Reliability (POS-Specific)
- Ensure high availability of transaction processing systems (payments, receipts, inventory sync)
- Design systems resilient to network instability in retail stores
- Implement offline-first capabilities and reliable sync mechanisms
- Minimize downtime during peak retail hours (e.g., weekends, holiday sales)
Incident Management
- Own incident response for payment failures, POS outages, and sync issues
- Lead blameless postmortems, especially for revenue-impacting incidents
- Establish escalation paths for store-level vs platform-level issues
- Optimize MTTR for distributed environments (cloud + edge devices)
Infrastructure & Automation
- Drive automation for:
  - POS software deployment and updates (remote device management)
  - Infrastructure provisioning (IaC)
- Manage hybrid infrastructure (cloud + on-premise/store devices)
- Improve CI/CD pipelines for frequent, low-risk POS releases
Monitoring & Observability
- Build observability across:
  - Cloud services (APIs, databases)
  - POS terminals (device health, connectivity, app crashes)
- Implement real-time monitoring for:
  - Transaction success rates
  - Payment gateway latency
  - Store connectivity status
- Reduce alert fatigue while ensuring critical retail incidents are detected instantly
Collaboration
- Work with product and engineering teams to design fault-tolerant POS features
- Partner with payment providers and third-party integrations
- Collaborate with customer support teams to improve store-level issue resolution
What You Bring to the Table
- 7-10+ years in SRE, DevOps, or backend engineering
- 2-4+ years leading technical teams
- Experience with high-availability transactional systems (e.g., payments, e-commerce, fintech, or POS)
- Strong knowledge of distributed systems and eventual consistency models
- Experience with cloud platforms (AWS, GCP, or Azure)
- Proficiency in at least one programming/scripting language (Python, Go, Java, etc.)
- Experience with containerization (Docker, Kubernetes)
POS / Retail-Specific Experience (Highly Preferred)
- Experience with POS platforms (e.g., in-store retail systems, F&B ordering systems)
- Knowledge of payment processing flows (card present, QR, e-wallets)
- Familiarity with offline transaction handling and sync reconciliation
- Experience supporting edge devices or IoT environments
- Understanding of retail peak cycles (e.g., holiday traffic, flash sales)
What We Have For You
- Impact: Direct influence on product roadmap and engineering culture.
- Growth: A clear path for career advancement in management or technical leadership.
- Flexibility: Hybrid work arrangement & flexible hours.
- Culture: A young, fun, and energetic team with a casual dress code.
- Compensation: Competitive salary package and benefits.