The Mission
Turn messy public-record & web data into trustworthy mass-tort signals - via rapid EDA, rigorous modeling, and production-grade APIs - so lawyers act on facts, not hunches.
What You’ll Own
- EDA Autopilot - Grab raw JSON/CSV/HTML, profile it, spot outliers, and surface “aha” patterns - without waiting for a PM to ask.
- Model Builder - Train and tune classification / ranking models (typical classifiers, light ML, LLM-based RAG) that lift recall & precision week-over-week.
- API Integrator - Package models behind FastAPI endpoints, validate / marshal schemas with Pydantic, and push to GitHub Actions CI in a day.
- MLOps Wrangler - Monitor drift, schedule batch recall, and write lightweight tests.
- Insight Storyteller - Ship clear notebooks / dashboards & concise Loom walk-throughs that legal SMEs grok in minutes.
- Startup Swiss-Army Knife - Spot gaps (data gaps, labeling gaps, infra gaps) and patch them before anyone asks. Ambiguity is the default.
What Success Looks Like
- Week 4 - First exploratory notebook flags a recall-worthy defect the founders didn’t see.
- Week 6 - A FastAPI route serving the trained model hits < 300 ms P95 latency in prod.
- Quarter 1 - Recall ↑ 15 pp and false-positive rate ↓ 10 pp on live web-scrape feed; zero pager-alerts.
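For a sense of how the targets above might be measured, here is a minimal sketch in plain Python. The function names, the nearest-rank percentile method, and all counts are illustrative assumptions, not the team's actual evaluation code.

```python
def p95(samples_ms):
    """Nearest-rank 95th percentile of latency samples, in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    # Integer ceiling of 0.95 * n gives the 1-based nearest-rank index.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def recall(tp, fn):
    """Share of true cases the model caught: TP / (TP + FN)."""
    return tp / (tp + fn)


def false_positive_rate(fp, tn):
    """Share of negatives wrongly flagged: FP / (FP + TN)."""
    return fp / (fp + tn)


def pp_change(before, after):
    """Change between two rates (0..1), in percentage points."""
    return round((after - before) * 100, 2)


# Hypothetical confusion-matrix counts, before and after a quarter of work.
recall_delta = pp_change(recall(tp=60, fn=40), recall(tp=75, fn=25))
fpr_delta = pp_change(
    false_positive_rate(fp=30, tn=70),
    false_positive_rate(fp=20, tn=80),
)

# Hypothetical request latencies: 19 fast calls and one slow outlier.
latency_p95 = p95([100] * 19 + [500])
meets_latency_target = latency_p95 < 300
```

Note that "pp" means percentage points (an absolute difference between two rates), not a relative percent change, and that P95 tolerates a small tail of slow requests as long as 95% of calls come in under the threshold.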
Your Toolkit
- 2-4 yrs Python data science: pandas/Polars, Astral stack, PyTorch/TF.
- Comfort spinning up FastAPI + Pydantic micro-services.
- Familiarity with using rate-limited LLMs to augment and clean existing datasets.
- Solid SQL & object-storage chops (Postgres, DuckDB, S3).
- CI/CD familiarity (GitHub Actions or similar); basic IaC a plus.
- You document & demo your work proactively - no babysitting required.
Nice-to-Haves
- Prior scraping work (Scrapy, Playwright) or PACER/NHTSA/FDA datasets.
- Experience with vector DBs (Qdrant, pgvector) & prompt-engineering.
- Exposure to SOC 2 or other regulated-data environments.
Interview Process
- 15-minute initial overview call.
- ~2-3 hour take-home assessment with a dataset provided. EDA and written communication expected.
- 1-hour pair-programming assessment with the CTO, with screenshare and agent use.
- 30-minute Q&A with the founding team.
Why Join
Green-field ML canvas, instant customer feedback, and exposure to leadership in a venture-backed startup already generating revenue and real impact in legal tech. Shape how mass-tort intelligence is built in the AI era.
Ready to build? Apply today.