Job Summary:
We are hiring a core partner to co-own our next-gen anomaly detection & attribution (ADA) platform that powers real-time monitoring, automated diagnosis, and decision support across multiple games/regions. You will lead end-to-end work (architecture, development, maintenance, and analytical impact) while advancing our LLM-Agent capabilities to turn noisy signals into clear, actionable narratives for business teams.
What You'll Do:
- Own the ADA platform lifecycle: design, implement, and maintain robust pipelines for T+1/near-real-time anomaly detection, multi-baseline benchmarking (global/region/country), and multi-source attribution (holidays, versions, events, migrations, user behavior).
- Advance detection & inference: productionize change-point/outlier methods, time-series features, causal/ablation checks, and automated storyline generation that explains what happened, why, and what to do next.
- Build AI Agents around the system: design tool-use and reasoning flows (ReAct/LangGraph or similar) to enable conversational drill-downs for ops/PMs.
- Productize insights: ship dashboards/alerts (email/Chat/WeCom) and concise decision memos; iterate with stakeholders in publishing, ops, marketing, and analytics.
- Collaboration: partner with DS/Eng/PM to scope, roadmap, and ship; document architecture, APIs, and runbooks.
Job Requirements:
Must-Have Qualifications
- Bachelor's degree or higher in Computer Science, Mathematics, Artificial Intelligence, or a related field, with at least 1 year of relevant work experience.
- Proficiency in Mandarin Chinese and English (both written and spoken) to communicate effectively with non-English-speaking counterparts based in China.
- Strong Python engineering (clean code, testing, packaging); solid SQL for large analytical workloads.
- Hands-on with time-series/anomaly detection (change-point, robust stats, seasonality/holiday adjustment, multivariate signals) and attribution logic.
- Practical exposure to LLM application patterns (tool calling/function calling, retrieval/RAG, agent planning) and experience with at least one framework/API (OpenAI/Claude/DeepSeek, LangChain/LangGraph, etc.).
- Systems/product mindset: ability to translate business pain points into measurable detection/attribution logic and ship reliable features on short cycles.
- Ownership & reliability: you build guardrails, monitors, and docs; you debug in production and prevent regressions.
Nice-to-Have / Preferred
- Model eval & prompt engineering: rubric design, offline eval sets, golden tasks, prompt/test versioning, data flywheels.
- Causal & experimentation: diff-in-diff, CUPED, synthetic controls, online A/B testing at scale.
- Dashboards & alerts: Superset/Tableau/Looker/Metabase; alerting via Slack/WeCom/Email with noise-reduction heuristics.
- Gaming analytics domain: retention funnels, reactivation, event/version rollout attribution, fraud/smurf detection.
- Infra & ops: Docker/K8s, CI/CD, IaC; cloud stacks (GCP/AWS/Azure); cost/perf tuning.