Data Scientist (Masters) — AI Data Trainer
About The Role
What if your expertise in machine learning, statistical inference, and data engineering could directly shape how the world's most advanced AI systems reason and respond? We're looking for Data Scientists with graduate-level training to challenge, evaluate, and sharpen cutting-edge AI models — identifying where they fail, why they fail, and how to make them better.
This is a fully remote, flexible contract role. No prior AI industry experience required — just deep, battle-tested knowledge of data science and the ability to communicate it with precision.
- Organization: Alignerr
- Type: Hourly Contract
- Location: Remote
- Commitment: 10–40 hours/week
What You'll Do
- Design Complex Challenges: Create advanced, domain-specific data science problems spanning hyperparameter optimization, Bayesian inference, cross-validation strategies, dimensionality reduction, and more — problems that genuinely push AI to its limits
- Author Ground-Truth Solutions: Develop rigorous, step-by-step technical solutions — including Python/R scripts, SQL queries, and mathematical derivations — that serve as the definitive "golden" benchmark responses
- Audit AI-Generated Code: Critically evaluate AI-generated code that uses libraries such as scikit-learn, PyTorch, and TensorFlow, checking it for correctness, efficiency, and technical soundness
- Refine AI Reasoning: Diagnose logical failures in AI outputs — data leakage, overfitting, improper handling of imbalanced datasets — and deliver structured, actionable feedback that improves how models think
- Document Failure Modes: Systematically capture how and where AI reasoning breaks down across machine learning theory, statistical inference, neural network architectures, and data engineering pipelines
Who You Are
- Currently pursuing or holding a Master's or PhD in Data Science, Statistics, Computer Science, or a quantitative field with strong emphasis on data analysis
- Deep foundational knowledge in one or more of the following: supervised and unsupervised learning, deep learning, big data technologies (Spark/Hadoop), or NLP
- Able to communicate complex algorithmic concepts and statistical results clearly and precisely in writing
- Exceptionally detail-oriented — you catch errors in code syntax, mathematical notation, and statistical logic that others miss
- Comfortable working independently and asynchronously on technically demanding tasks
- No prior AI industry or AI training experience required
Nice to Have
- Prior experience with data annotation, data quality evaluation, or AI evaluation systems
- Proficiency in production-level data science workflows — MLOps, CI/CD for models, or similar
- Familiarity with prompt engineering or language model evaluation
- Experience writing technical documentation or educational content for technical audiences
Why Join Us
- Work directly with industry-leading AI research labs on genuinely frontier problems
- Fully remote and flexible — structure your hours around your life, not the other way around
- Freelance autonomy with consistent, meaningful technical work
- Engage with the most advanced language models in existence and see your contributions shape their capabilities
- Potential for ongoing contract renewals and expanded project involvement as new work launches