Dice is the leading career destination for tech experts at every stage of their careers. Our client, InfoObjects Inc, is seeking the following. Apply via Dice today!
Senior Data Scientist (Reinforcement Learning must have)
Location: 100% Remote
Duration: 6-month contract
Reinforcement Learning (RL) Skill Set:
- Understanding of Sequential Decision Making
  - RL focuses on agents making a series of decisions to maximize cumulative reward.
  - Requires knowledge of Markov Decision Processes (MDPs), policy/value functions, and Bellman equations.
- Algorithmic Expertise in RL
  - Familiarity with algorithms like:
    - Q-learning, SARSA
    - Deep Q-Networks (DQN)
    - Policy Gradient methods (REINFORCE, PPO, A3C, DDPG, SAC)
  - Experience tuning exploration vs. exploitation strategies.
- Simulation and Environment Design
  - Ability to create or work with simulated environments (e.g., OpenAI Gym, Unity ML-Agents).
  - Often involves custom reward shaping and environment dynamics modeling.
- Long Training Times and Instability
  - RL models are often unstable and require careful tuning.
  - Experience with training stability techniques, reward normalization, and experience replay.
- Tooling and Frameworks
  - Familiarity with RL-specific libraries like:
    - Stable Baselines3
    - Ray RLlib
    - TensorFlow Agents
    - OpenAI Baselines
Team Focus:
- The role is part of a newly formed Global Intelligence Function, supporting cross-business initiatives across three business units: Points, Loyalty, and Hospitality.
- The team is in the early stages of building capabilities and seeks someone who can help shape and grow this function.
Project Scope & Technical Focus:
- Strong emphasis on Reinforcement Learning (RL), especially for dynamic pricing use cases.
- There is currently a skill gap in RL on the Points side, so candidates with RL experience would be especially valuable.
- The approach will be to start simple and evolve iteratively over time.
Role Requirements:
- Ideal candidate is a senior-level, polished professional with a strategic mindset and hands-on experience.
- RL algorithms are open-ended, so candidates who can design and iterate based on business needs are highly desirable.
- Strong domain knowledge is required, particularly around rewards, pricing, and loyalty systems.
- Business acumen and the ability to translate technical insights into actionable outcomes are key.
Focus Areas:
- Understanding of RL concepts and practical applications
- Approach to identifying and defining the "reward" function
- Prior experience implementing RL or similar solutions in a production/business setting
- Ability to align technical efforts with business goals and drive measurable outcomes
Growth Potential:
- The organization is evolving toward a more enterprise-scale operation, offering growth opportunities and cross-functional collaboration.
Ideal Candidate Profile:
- 5-10+ years of experience in Data Science, Machine Learning, or Applied Statistics
- Strong coding skills in Python; experience with AWS tools (especially SageMaker) is essential
- Able to handle incomplete data or evolving infrastructure (startup mindset preferred)
- Comfortable working independently and engaging cross-functional stakeholders
RL experience is a must-have.