Better Engineers. Better Results. BetterEngineer connects accomplished Software Engineers across the Americas with our portfolio of high-growth and newsworthy technology companies in the United States. Senior Engineers in the SalsaMobi network work remotely with some of the most interesting tech companies in the world. Join us today and experience a life where talent has no borders.
Job Description
We are seeking an expert Senior Data Scientist for a high-impact, 4-month contract to build a sophisticated Minimum Viable Product (MVP) for forecasting sales at the SKU and regional level for a major US retailer. You will be tasked with transforming a unique and complex combination of large-scale credit card, consumer panel, and demographic data into an automated forecasting system.
This is a remote, full-time position open to candidates based in the U.S. or Latin America. The initial engagement is for 4 months, with the possibility of extension.
Responsibilities
- End-to-End Model Development: Lead the development of the forecasting system from an initial Proof-of-Concept (PoC) through a full Minimum Viable Product (MVP).
- Complex Data Integration: Design and implement pipelines to ingest, link, and validate multiple large-scale data sources, including credit card transactions and consumer panel data, using privacy-preserving techniques.
- Advanced Data Correction: Develop and apply robust statistical techniques such as weighting and calibration to correct for non-representative consumer panels and address the high-risk challenge of partial SKU visibility.
- Probabilistic Forecasting: Build and train an ensemble of sophisticated models (e.g., Bayesian time-series models, gradient boosting machines, hierarchical models) to generate granular, SKU-level forecasts. The output must include comprehensive uncertainty estimates (e.g., prediction intervals, quantiles) to support risk-based inventory decisions.
- MLOps Foundation: Establish the foundational MLOps architecture for the project. This includes building automated and resilient data and training pipelines (DAGs) using tools like Airflow and implementing a model registry with MLflow to manage the model lifecycle.
- Deliver First Production Forecast: Orchestrate and deliver the first official forecast on the required 4-week cadence, with the entire system operational, monitored, and validated.
- Handoff and Documentation: Create comprehensive documentation and work closely with the client team to ensure a smooth transition and handoff of the MVP at the conclusion of the contract.
Required Qualifications
- Proven experience leading complex, end-to-end time-series forecasting projects, preferably within the retail or CPG domain.
- Deep, hands-on expertise in probabilistic forecasting and quantifying uncertainty, using methods like Bayesian modeling (e.g., PyMC, Stan), quantile regression with GBMs (LightGBM/XGBoost), or deep learning models (e.g., DeepAR).
- Demonstrable experience correcting for sampling bias in datasets using statistical weighting techniques (e.g., raking, post-stratification, propensity score weighting).
- Expertise in handling sparse data challenges in forecasting, utilizing techniques like hierarchical modeling or transfer learning to borrow information from related products or categories.
- Practical experience building and deploying automated ML workflows and orchestration pipelines using tools like Apache Airflow or Prefect.
- Strong understanding of data privacy concepts and experience handling sensitive consumer data.