About the Company
We are a Swedish AI startup building a next-generation creation layer that converts natural-language intent into governed, portable systems, producing APIs, UIs, tests, and deployment artifacts automatically and efficiently.
Overview:
We are seeking a skilled AI Data Scientist to own the end-to-end lifecycle of large and domain-specialized language models. The role focuses on delivering production-ready, fine-tuned models optimized for performance and reliability in domain-specific applications.
Key Responsibilities:
- Fine-tune foundation models (e.g., Llama 3, Mistral) using PyTorch, Hugging Face Transformers, Unsloth, or TRL with parameter-efficient fine-tuning (PEFT) techniques; see the fine-tuning sketch after this list.
- Optimize inference latency and throughput via quantization (AWQ, GPTQ) and serving engines (vLLM, TensorRT); see the serving sketch after this list.
- Build and manage distributed training pipelines on AWS SageMaker or on-premises multi-GPU clusters.
- Design evaluation frameworks to benchmark model performance, reasoning, and hallucination rates.
- Collaborate with engineering and product teams to deploy scalable, production-ready AI solutions.
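To make the fine-tuning responsibility concrete, below is a minimal sketch of LoRA-based PEFT fine-tuning using Hugging Face TRL and PEFT. The model name, dataset, and hyperparameters are illustrative assumptions rather than project specifics, and exact argument names can vary between TRL versions.

```python
# Minimal LoRA fine-tuning sketch (Hugging Face TRL + PEFT).
# Model, dataset, and hyperparameters are illustrative placeholders.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

train_data = load_dataset("timdettmers/openassistant-guanaco", split="train")

lora = LoraConfig(
    r=16,                                  # adapter rank
    lora_alpha=32,                         # adapter scaling
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model="mistralai/Mistral-7B-v0.1",     # base model loaded by name
    train_dataset=train_data,
    peft_config=lora,                      # only adapter weights are updated
    args=SFTConfig(
        output_dir="mistral-7b-lora-sft",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-4,
    ),
)
trainer.train()
trainer.save_model("mistral-7b-lora-sft")  # saves the LoRA adapter
```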
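For the serving side, a minimal offline-inference sketch with vLLM loading an AWQ-quantized checkpoint might look like the following; the model name and sampling settings are assumptions for illustration. In production the same engine is typically exposed through vLLM's OpenAI-compatible server rather than called offline.

```python
# Minimal vLLM inference sketch with an AWQ-quantized model (illustrative model name).
from vllm import LLM, SamplingParams

llm = LLM(model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ", quantization="awq")
params = SamplingParams(temperature=0.7, max_tokens=256)

# vLLM batches requests internally (continuous batching + paged attention),
# so throughput scales well when many prompts are submitted at once.
outputs = llm.generate(["Explain parameter-efficient fine-tuning in two sentences."], params)
print(outputs[0].outputs[0].text)
```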
Qualifications:
- Experience fine-tuning and deploying large language models.
- Strong skills in PyTorch and Hugging Face Transformers or similar frameworks.
- Knowledge of PEFT, quantization, and distributed training workflows.
- Familiarity with model evaluation metrics and inference optimization.
- Strong problem-solving skills and ability to work independently and collaboratively.
Preferred:
- Experience with vLLM, TensorRT, and MLOps workflows.
- Expertise in domain-specific model adaptation.