Job Overview
We are seeking a skilled and passionate AI Engineer to join our team in developing and optimizing AI solutions, with a focus on Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) systems. The role spans both on-device and cloud environments and covers end-to-end model development, deployment, and performance optimization.
Job Description
- Training and tuning LLM and RAG models (fine-tuning, LoRA, etc.)
- Optimizing and deploying models for on-device and cloud environments (ONNX, Qualcomm NPU, etc.)
- Building a RAG pipeline based on document embedding and vector retrieval
- Data preprocessing and pipeline automation
- Monitoring and improving model performance
Job Requirements
- Candidate should possess a Bachelor's Degree or equivalent in Science & Technology, Mathematics, Computer Science / Information Technology, or Engineering (Computer / Telecommunication)
- Preferably an executive-level candidate specializing in Information Technology or equivalent
- Fresh graduates are encouraged to apply
- Good communication skills in English and Malay (both spoken and written)
- 2 permanent positions available
Technical Skills
- Model training: Fine-tuning, LoRA, quantization, training data preparation
- RAG configuration: Document preprocessing, chunk embedding, vector DB, LLM integration
- Deployment/Optimization: ONNX Runtime, NPU optimization
Optional (Added Advantage)
- Experience optimizing on-device AI models (mobile/embedded, Qualcomm AI)
- Experience with large-scale data training and distributed training
- Experience using cloud AI platforms (Vertex AI, Azure OpenAI, AWS)