Security Clearance Requirements: Public Trust background investigation required. U.S. citizenship is required and must be stated on your resume. Candidates must be eligible for adjudication.
Location: Remote (Telework Authorized). Occasional travel to the Washington, DC area may be required for onsite meetings.
Overview
The Senior Data Scientist will provide advanced analytical and machine learning expertise to support fraud detection, improper payment identification, and criminal investigative analytics in direct support of a federal oversight mission. Work is performed remotely, with occasional onsite collaboration.
Required Qualifications
- Master's degree, Ph.D., or equivalent advanced degree in data science, machine learning, computer science, mathematics, or a related quantitative field — OR a minimum of 10 years of applied hands-on work experience in the same disciplines in lieu of a degree.
- 5 or more years designing, implementing, and maintaining advanced AI/ML predictive models, including both supervised and unsupervised learning methods.
- 5 or more years developing fraud analytic rules and models using leading-edge tools and methods — this must be documented professional experience building fraud detection systems, not adjacent analytics work.
- 5 or more years developing regression, classification, and statistical models specifically for anomaly detection and pattern recognition.
- 3 or more years providing direct data analytics support for criminal investigations into fraud or abuse — experience working with investigators, prosecutors, or within law enforcement or inspector general data environments is required. This is a hard requirement; adjacent fraud analytics experience without direct investigative support does not satisfy this qualification.
- Demonstrated ability to tell a story from data — taking model outputs, anomaly findings, or investigative leads and presenting them in a way that drives decisions, not just reports results. Experience presenting to non-technical stakeholders in a federal or law enforcement environment is strongly preferred.
- 3 or more years of hands-on data manipulation in Python. Pandas experience is specifically required; NumPy and scikit-learn experience is strongly expected.
- 3 or more years operating in modern cloud environments — Azure, AWS, or GCP; multi-cloud experience preferred; cloud certifications preferred.
- 2 or more years of advanced SQL data analysis. Demonstrated experience in both SQL Server and PostgreSQL is required.
- 2 or more years designing and scaling Natural Language Processing solutions in production environments.
- 2 or more years presenting analytical methods and findings to mixed audiences including technical teams, investigators, program leadership, and senior non-technical stakeholders in both written and oral formats.
- U.S. citizenship is required and must be stated on your resume. Must be eligible for and able to obtain a federal Public Trust background investigation.
Key Responsibilities
- Design, develop, implement, and maintain advanced supervised and unsupervised machine learning models targeting fraud, improper payments, and non-compliance within federal program portfolios.
- Provide direct analytical support to criminal investigators, collaborating to determine investigative strategies and adapting analysis to shifting case needs.
- Adhere to the Federal Rules of Criminal Procedure governing protected information, including Rule 6(e) requirements for grand jury materials.
- Integrate and scale NLP methods including OCR, semantic similarity algorithms, and large language models to parse and analyze large unstructured text corpora.
- Perform data quality analysis on source tables to identify abnormalities and inconsistencies.
- Develop repeatable processes for efficiently combining and analyzing large relational and unstructured datasets.
- Develop and maintain fraud indicators and predictive models; document all methodology in compliance with criminal evidentiary requirements.
- Develop visualizations, dashboards, network graphs, and interactive maps to communicate findings to both investigative staff and program leadership.
- Build automations using Python, SharePoint, Power BI, and Microsoft Excel to improve task efficiency.
- Coordinate with the Data Engineering team to ensure architecture supports machine learning workloads.
- Present analytical methods and findings to technical and non-technical stakeholders in written and oral formats.
Additional Preferred Qualifications
- Experience operating within an Office of Inspector General, Department of Justice, FinCEN, FBI, or equivalent law enforcement or federal oversight data environment.
- NLP or ML-specific certifications.
- Experience applying LLMs to investigative or compliance analytics use cases.
- Familiarity with Rule 6(e) grand jury confidentiality requirements or other federal criminal procedure data handling standards.
- Experience developing network graphs or geospatial visualizations for investigative analysis.
Education and Certifications
- Master's or doctoral degree in a quantitative field required — OR 10 years of equivalent applied experience.
- Cloud certifications in Azure, AWS, or GCP preferred.
- NLP or ML-specific certifications a plus.