Straive

Data Scientist (5+ Years)

Location

Remote restrictions apply

Salary Estimate

N/A

Seniority

N/A

Tech stacks

Python
Data
Machine learning

Permanent role
4 days ago

Data Scientist with Strong Python Expertise

Role Overview

We are seeking an experienced and innovative Data Scientist with 5-10 years of hands-on expertise to drive our advanced Artificial Intelligence and Machine Learning initiatives, with a specialized focus on Large Language Models (LLMs), Natural Language Processing (NLP), and Retrieval-Augmented Generation (RAG) systems.

The ideal candidate possesses deep technical skills in Python programming, extensive experience in text processing, and advanced SQL proficiency. This role is critical for transforming unstructured data into strategic assets and building production-ready generative AI applications.

Key Responsibilities

  • Generative AI & LLM Development:
    • Design, develop, and implement end-to-end solutions utilizing pre-trained and custom Large Language Models (LLMs) for tasks such as summarization, question-answering, and content generation.
    • Apply techniques like fine-tuning, prompt engineering, and model distillation to optimize LLM performance and efficiency for domain-specific use cases.
  • RAG System Architecture & Deployment:
    • Architect and build robust Retrieval-Augmented Generation (RAG) pipelines, integrating vector databases (e.g., Pinecone, ChromaDB, Milvus) and embedding models to ground LLM outputs in proprietary data, thereby mitigating hallucinations and improving accuracy.
    • Develop and manage the entire lifecycle of RAG systems, from document ingestion and chunking strategies to retrieval and re-ranking optimization.
  • NLP & Text Processing:
    • Lead the development of advanced Natural Language Processing (NLP) models for core tasks including Named Entity Recognition (NER), sentiment analysis, topic modelling, and text classification.
    • Implement efficient text processing and feature engineering techniques on large, unstructured datasets.
  • Programming & Data Management:
    • Demonstrate expert-level proficiency in Python and its data science ecosystem (e.g., PyTorch/TensorFlow, Hugging Face Transformers, NumPy, Pandas, Scikit-learn).
    • Write and optimize complex, performant SQL queries for data extraction, manipulation, and analysis from diverse data sources, including traditional data warehouses and NoSQL stores.
  • MLOps & Deployment:
    • Collaborate with MLOps and Engineering teams to transition LLM/NLP models from proof-of-concept to scalable, high-performance production systems.
    • Develop model monitoring frameworks to track performance, drift, and user feedback in production.

Required Qualifications

  • Experience: 5 to 10 years of progressive experience in Data Science, Machine Learning, or a related field.
  • Specialized Expertise: 3+ years of hands-on experience developing and deploying solutions involving LLMs, RAG, and advanced NLP techniques.
  • Technical Stack:
    • Expert Python programming skills for ML model development and production-level code.
    • Advanced SQL proficiency and experience working with large relational databases.
    • Hands-on experience with deep learning frameworks like PyTorch or TensorFlow.
    • In-depth familiarity with the Hugging Face ecosystem (Transformers, Datasets).
  • Education: Master’s or Ph.D. in Computer Science, Computational Linguistics, AI, or a related quantitative field.

Preferred Qualifications

  • Experience with cloud AI services (e.g., Azure OpenAI, Google Vertex AI, AWS Bedrock).
  • Knowledge of distributed computing frameworks (e.g., Spark, Dask) for large-scale text processing.
  • Experience with containerization (Docker, Kubernetes) and MLOps tools (e.g., MLflow).
  • Track record of research publications or contributions to open-source NLP/Gen AI projects.

About Straive

👥501-1000
📍Singapore

© Copyright 2025 Arc