We’re looking for a mid-level Data Scientist to join our Research Team focused on personal data identification using NLP. This is a hands-on, full-cycle role—from preparing data and fine-tuning NLP models to deploying them in production environments and analyzing feedback to drive continuous improvements.
- Location: Remote or Hybrid
- Contract: Full-time
- Team: Research & AI Innovation
What You’ll Do:
- Work on cutting-edge NLP solutions for detecting personal data in various formats
- Prepare and annotate datasets for model training
- Evaluate and apply modern NLP libraries and frameworks (e.g., spaCy, Flair, Hugging Face)
- Deploy trained models to client environments (on-prem/cloud) using Docker
- Collaborate closely with product teams to align technical work with client needs
- Analyze post-deployment results and contribute to ongoing model improvements
- Continuously explore new technologies in NLP and privacy-aware AI
Must-Haves:
- Solid knowledge of Python
- Experience with Docker and model containerization
- Comfortable working in Linux environments
- Proficiency in Git, including feature branching and workflow strategies
Nice-to-Haves:
- Familiarity with privacy-preserving ML techniques (e.g., differential privacy, federated learning)
- Basic understanding of data privacy legislation (e.g., GDPR)
You’ll be part of a tight-knit group with a high degree of autonomy, and your work will directly impact product innovation and client success.
If you have the skills and passion to drive impact, we want to hear from you!