Monitor all systems in production and react to alerts;
Manage the incident life cycle until incidents are fully resolved or a workaround is provided, escalating to third- or fourth-level support where required;
Support incident management after deployment;
Perform log-level analysis;
Take end-to-end ownership of customer technical issues, including initial troubleshooting, identification of root cause, issue resolution, and communication;
Reproduce reported issues in an appropriate customer environment;
Gather the information needed so that all details required for root cause analysis are available;
Provide a robust service for monitoring of products deployed onto the platform.
Requirements:
1+ years of experience in IT;
Experience working with logging, monitoring, and alerting tools (e.g. ELK Stack, Grafana, PagerDuty, Datadog, Prometheus, Coralogix);
Ability to perform log-level analysis;
Strong troubleshooting skills;
Willingness to work on a shift schedule (8/5);
Experience with bug and issue tracking systems (Jira preferred);
Ambition to learn new systems, procedures, and techniques in a short period of time;
Structured, process-oriented, and business-oriented approach;
Strong communication and reporting skills;
Self-motivated team player with the ability to learn independently;
Proficiency in English (must) and Russian;
Experience working with iGaming industry products (nice to have).