You will be part of a talented team of engineers and AI researchers focused on making our lives better and safer. As a Senior AI Engineer, you will be responsible for building computer vision models and vision-language models that create tangible value for our clients.
This covers all associated areas, including machine learning, deep learning, computer vision, and image processing.
JOB RESPONSIBILITIES
- Research and improve existing computer vision algorithms, incorporate novel techniques, or create new algorithms from the ground up to solve target use cases.
- Develop and adapt state-of-the-art computer vision and vision-language techniques for face recognition, object segmentation/detection/classification, OCR and multimodal understanding.
- Train and fine-tune Vision-Language Models (VLMs) (e.g., CLIP, BLIP, LLaVA, Qwen-VL, Kosmos, Florence) for tasks such as cross-modal retrieval, image captioning, and visual question answering.
- Prototype, benchmark and implement new algorithms in production-level code.
- Optimize models for on-device, cloud, or edge deployment, balancing accuracy, latency, and resource usage.
- Assist business teams in translating research into real product value.
- Assist in testing and evaluating the developed solutions (unit tests, stress tests, benchmark comparisons, etc.).
- Write professional documentation for developed solutions and R&D findings.
JOB REQUIREMENTS
- Solid understanding of machine learning and deep learning is a must.
- At least 3 years of R&D experience in the field of deep learning or vision-language models.
- Experience with deep learning frameworks (TensorFlow, PyTorch).
- Hands-on experience in computer vision algorithms such as segmentation, object classification, object detection, OCR, or 3D reconstruction.
- Proficiency in Python and/or C++, with practical use of libraries such as OpenCV, Hugging Face Transformers, or timm.
- Familiarity with multimodal learning, transformer-based architectures, and large-scale pretraining.
- Strong willingness to learn and grow.
- Self-directed learner who is passionate, open-minded, a team player, a good communicator, and able to work with minimal supervision.
- Ability to research, prototype, benchmark and implement new algorithms in production-level code.
- Ability to leverage AI tools to improve coding productivity and quality.
- Bachelor's degree in Computer Science / Information Technology / Engineering / Mathematics or a related field.
BONUS POINTS
- Proficiency in data analysis to facilitate decision-making.
- Experience with Vision-Language Models (VLMs), multimodal training pipelines, or cross-modal alignment.
- Real-world CV/VLM projects involving algorithm implementation.
- Strong background in computer vision, image processing, or multimodal AI.
- Experience in model optimization, knowledge distillation, and deployment on edge and mobile devices.
- Experience with Docker, Kubernetes, and model serving frameworks (e.g., Triton, TorchServe, FastAPI, LitServe).
- Experience with Linux architecture and familiarity with Linux commands.
- Experience using source control and project tracking systems such as GitHub, Jira, etc.
- M.S. or Ph.D. in computer vision or a related field.
- Contributions to open-source projects.
- Active participant in technical communities.