About The Role
We’re expanding our AI capabilities and are looking for an Associate Data Scientist to join our growing team in Estonia. In this role, you’ll design and deploy advanced AI solutions, including retrieval-augmented generation (RAG), agent-based models, and multi-agent systems. You’ll also help scale these models in production through robust MLOps practices, ensuring performance, compliance, and maintainability.
Key Responsibilities
- Design, implement, and experiment with cutting-edge Generative AI models and architectures, including but not limited to advanced LLM techniques, transformer variants, and potentially other generative paradigms (e.g., diffusion models, GANs where applicable to the domain).
- Develop and refine sophisticated prompt engineering strategies and fine-tuning techniques for domain-specific and complex generative tasks.
- Architect and implement advanced RAG systems, exploring techniques for improved retrieval, generation, and knowledge integration.
- Design, build, and orchestrate complex agent-based and multi-agent systems leveraging large language models for autonomous decision-making and task completion.
- Work extensively with embeddings, advanced vector database techniques, and seamless integration of diverse external APIs and knowledge sources into GenAI workflows.
- Establish and implement rigorous model evaluation frameworks and benchmarks specifically tailored for generative models, focusing on metrics beyond traditional performance, such as creativity, factuality, safety, and bias.
- Build and maintain scalable and production-ready MLOps pipelines capable of supporting the entire lifecycle of advanced GenAI models, from continuous training and experimentation to deployment, monitoring, and A/B testing of different model versions and approaches.
- Leverage containerization and orchestration tools (Docker, Kubernetes, etc.) to manage complex and distributed GenAI model environments.
- Develop and manage robust model versioning, data lineage tracking, and automated testing frameworks to ensure reliable and continuous delivery of GenAI applications.
- Champion and implement ethical AI, privacy, and security best practices throughout the GenAI development and deployment lifecycle, with a focus on mitigating risks associated with generative models (e.g., bias, toxicity, misinformation).
Required Experience & Skills
- At least 2 years of experience in data science, applied ML, or AI-focused roles, with a demonstrable focus on Generative AI projects.
- Strong command of Large Language Models (LLMs), with in-depth understanding of their architectures, capabilities, and limitations.
- Proven experience in designing and implementing advanced RAG and agent-based systems.
- High proficiency in advanced prompt engineering, fine-tuning techniques for large models, and working with various vector stores and embedding techniques.
- Hands-on experience with a variety of open-source and closed-source models, such as Llama, Mistral, and DeepSeek.
- Advanced knowledge of Python and deep expertise in relevant NLP/ML frameworks (Hugging Face Transformers, TensorFlow, PyTorch) for building and manipulating generative models.
- Solid understanding of AI ethics, privacy, security, and the specific challenges and considerations within the context of Generative AI.
- Hands-on experience with CI/CD for ML, specifically in the context of deploying and managing complex AI models.
- Extensive experience with Docker, Kubernetes, and designing scalable deployment practices for AI applications.
- Familiarity with advanced monitoring, logging, and model observability techniques tailored for production AI systems.
- Proficiency in scripting and automation for building and managing end-to-end GenAI pipelines.
Preferred Skills
- Experience with cutting-edge cloud-based GenAI platforms and tools (e.g., AWS Bedrock, Google Cloud Vertex AI with Generative AI features, Azure OpenAI Service).
- Exposure to advanced model compression, distributed training of large models, or specialized AI/ML security techniques for generative systems.
- Experience with knowledge graphs, semantic search, or applying reinforcement learning (especially RLHF) to improve generative model outputs.
- Familiarity with AI compliance frameworks, advanced A/B testing methodologies for AI products, and leading responsible AI initiatives in a GenAI context.
- Experience contributing to open-source GenAI projects or publishing research in the field.
- Proven ability to explore and quickly learn new GenAI models, techniques, and tools as the field rapidly evolves.
- Exposure to the Model Context Protocol (MCP).
Tools and Technology
- Programming Languages: Python (essential).
- GenAI Specific Libraries/Tools: LangChain, LlamaIndex, LangGraph, MCP Servers, Google A2A, Hugging Face Transformers, PyTorch, TensorFlow, Ollama, vLLM.
- Vector Databases: Pinecone, Chroma, or similar.
- Cloud Platforms: Experience with at least one major cloud provider (AWS, Google Cloud Platform, Azure), including their AI/ML and GenAI-specific services (e.g., AWS Bedrock, SageMaker, Google Cloud Vertex AI, Azure OpenAI Service).
- MLOps Tools: MLflow, Kubeflow, or similar for model tracking, serving, and application development.
- Containerization & Orchestration: Docker, Kubernetes.
Equal Opportunity Employer
We are an equal opportunity employer and value diversity in our teams. All qualified applicants will be considered for employment regardless of gender, ethnicity, disability, age, or other characteristics protected by law.
Please Note: This is a full-time role based in Estonia.
Applicants currently residing in Estonia or holding valid authorization to work in Estonia will be considered. We do not offer visa sponsorship for this role, so please apply only if you meet these eligibility criteria.