Aderant is a global, industry-leading software company providing comprehensive business management solutions for law firms and other professional services organizations, with a mission to help them run a better business. We are motivated by a collective desire to drive the legal industry to the forefront of innovation. With over 2,500 clients around the world, including 95 of the top AmLaw 100 firms, we are changing the outside perception of the legal sphere; where there was once resistance to modernization, we are creating a culture that embraces new ideas and technology.
At Aderant, the “A” is more than just a letter. It is a representation of how we fulfill our foundational purpose, serving our clients. It embodies our core values and reminds us that to achieve success, every day must start with the “A”. We bring the “A” to life by fostering a culture of innovation, collaboration, and personal growth. We encourage our diverse teams to bring their whole selves to work – ideas, experience, and passion – to drive our mission forward.
Our people are our strength.
About the Role
We’re looking for a Data Scientist with deep, hands-on experience in large language models (LLMs). You’ll design prompts and evals, fine-tune open-source models, generate high-quality training data, and deploy model services with FastAPI and Docker. If you love turning cutting-edge research into reliable, fast, and cost-effective products, this is for you.
What you’ll do
- Own LLM product experiments end-to-end: problem framing, prompt design, data generation, model selection/fine-tuning, offline/online evaluation, and iteration.
- Work with open-source models (e.g., Llama, Mistral/Mixtral, Qwen, T5/Flan) using Hugging Face Transformers, PEFT (LoRA/QLoRA), TRL, and Accelerate/DeepSpeed.
- Build efficient inference stacks: quantization (8-bit/4-bit), batching, KV-cache, speculative decoding, vLLM/TGI/TensorRT-LLM, and model/endpoint autoscaling.
- Design robust evaluation: task accuracy and generation quality (exact match, ROUGE/BLEU/BERTScore), safety/toxicity, hallucination rate, latency/throughput, cost per request, and win-rate from human review; set up an eval harness and dashboards.
- Generate and curate training data: synthetic data, augmentation, preference data (for DPO/RLHF), labeling guidelines, data quality checks, and dataset versioning.
- Implement RAG pipelines where useful: embeddings, retrieval, and context construction with vector stores (FAISS/Milvus/pgvector/Qdrant).
- Ship production services: expose models via FastAPI, containerize with Docker, write tests, add logging/metrics/tracing, and collaborate on CI/CD.
- Integrate third-party LLM APIs (OpenAI/Azure OpenAI, Anthropic, Google, Cohere) alongside open-source models; choose the right tool for quality, speed, and cost.
- Champion safety & compliance: prompt/response guardrails, PII handling, rate limiting, and abuse monitoring.
- Document and share findings, best practices, and reusable components.
What you’ll bring
- Strong Python and PyTorch skills; comfortable with data tooling (pandas, NumPy) and experiment tracking (MLflow/W&B).
- Proven experience building with LLM APIs and open-source LLMs, including prompt engineering and LLM evaluation.
- Hands-on fine-tuning (LoRA/QLoRA, other adapter-based methods, or full fine-tuning) and data generation for supervised or preference-based training.
- Production experience deploying model services with FastAPI and Docker; familiarity with monitoring and alerting.
- Solid understanding of experimental design and statistics; experience with A/B testing and human-in-the-loop evaluation.
- Clear communication and a product mindset; you iterate quickly and measure impact.
Nice to have
- Kubernetes, Ray, or distributed training/inference.
- Vector databases and embedding workflows; LangChain or LlamaIndex.
- Cloud experience (AWS/GCP/Azure) and GPU provisioning/optimization.
- Observability (Prometheus/Grafana), OpenTelemetry, and structured logging.
- Security/governance for AI systems (guardrails, secrets management).
- Familiarity with speech/vision add-ons to LLM systems (e.g., Whisper, CLIP) where relevant.