Skills:
ML, JAX, PyTorch, Python, RLAIF, SFT, DPO, distributed training, AI, PPO, preference data curation, reward modeling, large language models, synthetic data generation, RLHF