Lead Data Scientist / ML Engineer 🚀
Remote Role
What you’ll be working on:
- Designing and building hybrid ML models that combine supervised learning, time-series forecasting, and NLP to extract insights from unstructured data like PDFs, fund memos, and regulatory filings
- Adding explainability to models using feature-attribution techniques like SHAP and LIME so outputs are transparent and human-readable
- Building scalable data pipelines across off-chain fundamentals, on-chain activity, and macro benchmarks
- Integrating data from sources like FRED, PitchBook LCD, Securitize, Centrifuge, Maple, and TrueFi, with strong data lineage and freshness guarantees
- Developing anomaly detection and reconciliation tools across issuer, administrator, and blockchain datasets
- Creating evaluation frameworks to measure accuracy, confidence-interval coverage, latency, and data quality
- Backtesting model outputs against historical NAVs, secondary-market trades, and redemptions
- Researching and incorporating credit-risk signals (CDS spreads, recovery rates, default data, etc.)
- Building continuous learning loops using live market data and partner feedback
- Working closely with Product and Engineering to ship models via APIs, SDKs, and dashboards used by traders, curators, and risk teams
- Collaborating with data providers, protocol teams, and fund administrators to improve coverage and signal quality
- Partnering with the CTO on long-term model governance, transparency, and AI ethics
What I’m looking for:
- 5+ years of experience in applied ML, quantitative finance, or credit-risk modeling
- Strong Python and SQL skills, plus experience with ML frameworks like PyTorch, TensorFlow, scikit-learn, or XGBoost
- Solid understanding of time-series forecasting, regression/classification, and probabilistic modeling
- Hands-on experience with financial data (fixed income, private credit, or structured products)
- Familiarity with blockchain and DeFi data, including smart contracts, token metadata, and on-chain events
- Experience deploying ML models into production (APIs, orchestration, or streaming systems)
Nice to have:
- Background in credit analytics, NAV valuation, or structured credit
- Experience in quant research, fintech data science, or tokenized asset analytics
- Experience with NLP, vector databases, and LLMs / GenAI tools (OpenAI APIs, GPT-4, LangChain, Hugging Face, etc.)