Jobgether

Data Scientist, AI Data Foundations

Location

Remote restrictions apply

Salary Estimate

N/A

Seniority

N/A

Tech stacks

Data
AI
Azure

Visa

U.S. visa required

Permanent role
3 days ago

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Data Scientist, AI Data Foundations in the United States.

This role sits at the intersection of data science, data engineering, and applied AI infrastructure, focusing on building the foundational data systems that power modern AI and machine learning applications. You will design and maintain the curated datasets, vector stores, feature stores, and graph data models that enable retrieval-augmented generation (RAG), predictive modeling, and intelligent product experiences. Rather than training models directly, your impact comes from ensuring that all downstream AI systems are fueled by high-quality, well-structured, and reliable data. You will work closely with ML engineers, product teams, and analysts to translate raw enterprise data into AI-ready assets. The role also includes deep data discovery work across lending and financial datasets to uncover patterns, anomalies, and actionable insights. Operating in a modern cloud and lakehouse environment, you will help define how data is structured, governed, and consumed across AI use cases.

Accountabilities

  • Design, build, and maintain vector stores supporting retrieval-augmented generation systems, including embedding pipelines, chunking strategies, indexing approaches, and retrieval evaluation frameworks
  • Develop and operate feature store architectures ensuring consistency between offline training and online inference, with strong attention to lineage, freshness, and reuse
  • Create and manage graph data models representing relationships across customers, applications, financial products, and outcomes for both AI and analytical use cases
  • Conduct advanced data discovery and exploratory analysis on lending, deposit, and behavioral datasets to identify trends, anomalies, and model-driving features
  • Build and maintain AI-ready curated datasets with strong governance, documentation, and quality controls to support downstream ML and application teams
  • Define and execute evaluation methodologies for vector retrieval quality, embedding performance, feature drift, and graph completeness
  • Collaborate closely with ML engineers and applied scientists to ensure data infrastructure aligns with modeling and product needs
  • Ensure responsible data usage by partnering with governance and compliance teams to enforce data privacy, security, and regulatory standards
  • Communicate insights from data discovery through dashboards, notebooks, and structured narratives for technical and non-technical stakeholders

Requirements

  • 4-7 years of experience in data science, ML engineering, or applied data roles with emphasis on building production data assets for AI or analytics consumption
  • Strong experience building vector stores for RAG or semantic search, including embeddings, indexing, chunking, and retrieval evaluation
  • Experience designing or operating feature stores, including offline/online consistency and point-in-time correctness
  • Hands-on experience with graph databases such as Neo4j, TigerGraph, or Azure Cosmos DB Gremlin, including graph modeling and querying
  • Strong programming skills in Python (pandas, NumPy, scikit-learn, PySpark) and SQL, with experience in Databricks environments
  • Familiarity with LLM and embedding tooling such as Hugging Face, OpenAI/Azure OpenAI APIs, and LangChain or similar frameworks
  • Strong analytical mindset with proven ability to explore complex datasets, identify patterns, and validate insights statistically
  • Solid understanding of core machine learning concepts including evaluation metrics, leakage, and train/test discipline
  • Excellent communication skills with the ability to translate technical findings into business-relevant insights
  • Preferred experience in FinTech or SaaS environments, especially with lending, credit, fraud, or KYC/AML datasets
  • Familiarity with the Databricks AI ecosystem, Azure services, or modern vector/graph database technologies is a plus

Benefits

  • Competitive compensation package with potential bonus and equity components depending on role level
  • Comprehensive health coverage including medical, dental, and vision plans
  • Retirement savings plan with employer contribution options
  • Flexible remote-first work environment across eligible U.S. states
  • Paid time off, company holidays, and flexible scheduling policies
  • Professional development support, including training, certifications, and learning resources
  • Access to modern AI and data tooling within a cutting-edge cloud and lakehouse ecosystem
  • Opportunity to work on high-impact AI infrastructure powering real-world financial and lending systems

How Jobgether Works

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Why Apply Through Jobgether?

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

About Jobgether

👥11-50
📍Brussels

© Copyright 2026 Arc