Job Title: Lead Data Scientist - RL
Location: Fully Remote
Compensation: $180,000 – $200,000 base, plus bonus and equity
Role: Lead Data Scientist (Reinforcement Learning – Pricing Optimization)
A well-funded, fast-growing B2B SaaS company is rebuilding its core decision engine using reinforcement learning. The mission: replace static pricing logic with adaptive, real-time models that respond to evolving customer behavior, competitive signals, and demand trends. You’ll lead machine learning efforts that directly shape pricing across thousands of business units, driving measurable revenue outcomes at scale.
Why This Role Matters
Pricing is a critical part of this platform, and it's being reimagined from the ground up. This is a rare opportunity to apply reinforcement learning in a real-world setting with high data velocity, business complexity, and direct customer impact.
You’ll operate in a greenfield environment with a modern MLOps foundation, full stakeholder support, and strong technical partners across data, engineering, and product.
What You’ll Do
- Design and train reinforcement learning agents (bandits, DDPG, PPO, etc.) for dynamic pricing decisions
- Build and evaluate offline simulation environments to test policies before live deployment
- Explore policy gradient and value-based methods, as well as model-based RL techniques
- Integrate real-world pricing constraints (e.g., caps, floors, competitor benchmarks) into model design
- Collaborate with MLOps and engineering to deploy agents into production using tools like SageMaker and MLflow
- Own the experimentation layer, including uplift modeling, causal inference, and A/B testing
- Work cross-functionally with product and engineering to drive strategy and roadmap decisions
Tech Stack
- Languages/Frameworks: Python, PyTorch, TensorFlow, SQL, scikit-learn
- Infra: AWS SageMaker, MLflow
- Modeling: Reinforcement learning simulators, custom optimization tools, demand elasticity models
- Deployment: Batch or real-time ML APIs powering customer-facing decisions
Ideal Candidate
- Experienced data scientist or ML practitioner with a background in pricing, optimization, or control systems
- Hands-on experience with RL in production, including bandits and deep RL (DDPG, PPO, AlphaZero-style methods)
- Strong understanding of optimization theory, simulations, and causal inference
- Collaborative and pragmatic, skilled at aligning modeling with product and business goals
- Comfortable in greenfield environments where ownership and iteration are critical
About Us
People In AI partners with fast-growing AI and ML organizations to bring clarity and precision to technical hiring. We help candidates connect with mission-driven teams through a transparent and streamlined process.