We are seeking a highly skilled Data Scientist with expertise in fine-tuning language models on proprietary company data. The ideal candidate will have a strong background in data preparation, model fine-tuning, and benchmarking, and will stay current with the latest advances in AI.
Key Responsibilities:
- Develop and prepare datasets for language model fine-tuning using proprietary data.
- Fine-tune language models using techniques such as LoRA, QLoRA, and GRPO.
- Benchmark model performance and analyze results to ensure optimal outcomes.
- Deploy fine-tuned models for inference.
- Stay current with the latest research and innovations in language modeling, including relevant arXiv papers.
- Collaborate with cross-functional teams to integrate fine-tuned models into products and services.
Requirements:
- Proven experience in fine-tuning language models with proprietary datasets.
- Proficiency in advanced fine-tuning techniques such as LoRA, QLoRA, and GRPO.
- Strong analytical skills for benchmarking model performance.
- Up-to-date knowledge of recent research papers and developments in the field.
- Excellent communication skills and ability to collaborate with technical and non-technical stakeholders.
- Strong background in ML, NLP, and deep learning.
Required Tools and Technologies:
- Proficiency in Python programming.
- Experience with machine learning frameworks such as TensorFlow or PyTorch.
- Familiarity with Hugging Face Transformers for loading, fine-tuning, and running language models.
- Knowledge of data processing and manipulation libraries, such as Pandas and NumPy.
- Experience with version control systems like Git.
Preferred Qualifications:
- Experience with Retrieval-Augmented Generation (RAG) and GraphRAG frameworks.
- Experience with reasoning models and algorithm development.
- Familiarity with additional machine learning tools and libraries.
- Advanced degree in Computer Science, Data Science, or related field.
- Prior experience publishing research papers in the AI field.
Start date: ASAP
Remote vs Onsite: Remote
US hours overlap needed: minimum 2-6pm CET, preferred 2-7pm CET