Position Summary
The Data Scientist delivers predictive insights and builds decision-support tools from operational and performance data, combining hands-on data science (statistical modeling and machine learning) with ownership of the industrial analytics stack. The Data Scientist serves as a PI System Administrator, Seeq Developer, and builder of AI-enabled tools that help engineers and operators act faster and with confidence.
The Data Scientist will work with industrial data systems, especially the PI System (OSIsoft PI / AVEVA PI), to ensure reliable data availability and governed context, and will develop Seeq analyses and Python-based pipelines for time-series modeling, anomaly detection, forecasting, and performance monitoring. Bringing curiosity, rigor, and comfort with large time-series datasets, the Data Scientist takes solutions end-to-end (from PI tag and context configuration, to Seeq development, to Python/AI tool deployment) while collaborating with cross-functional technical teams.
Essential Functions
Data Science, Modeling, and Insights
Build statistical and machine-learning models to detect anomalies, forecast performance, and identify optimization opportunities
Design and evaluate experiments and model validation approaches; translate results into clear recommendations for engineering and operations
Develop dashboards, reports, and model performance metrics to communicate insights and drive data-informed decisions
PI System Administration & Time-Series Data Engineering
Administer and support the PI System (OSIsoft PI / AVEVA PI), including tag strategy, data quality monitoring, and user support
Build and maintain PI AF structure (assets, templates, attributes) and documentation to provide governed context for analytics and reporting
Support PI interfaces/data flows and collaborate with OT/IT and engineers to validate sensors/tags, troubleshoot gaps, and improve reliability and performance
Create curated datasets, features, and labels from PI data (with clear definitions and lineage) to support Seeq analyses and ML modeling
Seeq Development & AI-Enabled Tools
Develop and maintain Seeq Workbooks/Analyses for performance monitoring, anomaly detection, and root-cause investigations
Create reusable Seeq templates, calculation standards, and best practices; enable users through documentation and training
Build AI-enabled tools (e.g., copilots, guided diagnostics, automated summaries) that leverage governed PI/Seeq context to accelerate engineering workflows
Evaluate, monitor, and improve AI tool quality (accuracy, drift, user feedback), and implement practical guardrails for safe, reliable use
Python, Analytics Engineering & Deployment
Develop and maintain Python-based pipelines for data extraction, preprocessing, modeling, and automation
Prototype and productionize analytical applications that support performance monitoring, anomaly detection, and forecasting
Automate recurring model runs, evaluations, and reporting workflows with attention to reproducibility and reliability
Improve existing analytics codebases; contribute to model monitoring, documentation, and maintainable data science practices
Project & Engineering Partnership
Collaborate with engineers and subject matter experts to translate operational problems into measurable data science objectives
Provide analytical support for initiatives including data validation, statistical analysis, modeling, and performance reporting
Help standardize modeling approaches, feature definitions, and evaluation metrics across projects
Data Quality, Governance & Monitoring
Ensure accuracy and reliability of datasets used for analysis and modeling (validation checks, outlier handling, sensor sanity checks)
Perform data cleaning, validation, and documentation, including assumptions, feature definitions, and dataset lineage
Maintain organized analytical workflows and pipelines to support repeatable modeling and ongoing monitoring
Other Responsibilities
Education, Experience, and Skills Required
Bachelor’s degree in Data Science, Computer Science, Engineering, Statistics, Applied Math, or a related field
2–5 years of experience in data science, applied analytics, or technical modeling roles
Strong Python skills for data science (e.g., pandas, numpy, scikit-learn; visualization libraries)
Strong skills in SQL and Excel for analysis, validation, and stakeholder-ready outputs
Experience with data visualization and reporting tools (Power BI, Tableau, or similar)
Strong statistical reasoning, analytical problem-solving skills, and attention to data quality
Ability to communicate technical findings clearly to both technical and non-technical stakeholders
Demonstrated experience using applied statistics, machine learning, and time-series modeling
Ability to use Python for data science and AI tooling (data wrangling, modeling, visualization; building assistants/automation)
Ability to apply PI System administration fundamentals and data engineering practices for high-frequency time-series data (tags, quality checks, contextualization)
Ability to develop in Seeq (shared analyses, calculations, templates) and present stakeholder-ready data stories
Ability to partner cross-functionally and drive tool and team enablement (requirements, training, documentation, adoption)
Ability to lead and support deployment-minded practices (reproducibility, versioning, testing, monitoring) for analytics, models, and AI tools
Must have strong verbal and written communication skills
Must be able to read, write, and speak English at a level that permits the employee to accurately understand and communicate information in order to perform the job duties safely and efficiently
Ability to prioritize and plan work activities so time is used efficiently and effectively
Must demonstrate accuracy and thoroughness to ensure quality performance
Ability to identify and resolve problems in a timely manner
Preferred Qualifications
Experience working with OSIsoft PI / AVEVA PI System (PI Data Archive, PI AF) and industrial time-series data
Experience developing in Seeq (Workbench/Organizer), including building shared analyses and calculations
Experience with operational/engineering datasets (e.g., power generation, rotating equipment, process systems)
Familiarity with time-series methods (e.g., resampling, lag features, seasonality, change-point detection)
Experience developing reusable analytics packages, APIs, or scheduled jobs for model execution
Knowledge of predictive modeling (forecasting, classification/regression), anomaly detection, and model evaluation
Physical Requirements
Nothing in this job description restricts management’s right to assign or reassign duties and responsibilities to this job at any time.