Jobs via Dice

Data Scientist Vision-Language Models (VLMs)

Location

Remote restrictions apply

Salary Estimate

N/A

Seniority

N/A

Tech stacks

Amazon
Data
Cloud
+29

Visa

U.S. visa required

Permanent role
5 days ago

Dice is the leading career destination for tech experts at every stage of their careers. Our client, Cardinal Integrated Technologies Inc, is seeking the following. Apply via Dice today!

Position: Data Scientist Vision-Language Models (VLMs)

Location: San Ramon, CA or Milwaukee, WI

Duration: Full-time

Key Responsibilities

VLM Development, Pose Estimation & Deployment:

  • Design, train, and deploy efficient Vision-Language Models (e.g., VILA, Isaac Sim) for multimodal applications including image captioning, visual search, document understanding, pose understanding, and pose comparison.
  • Develop and manage Digital Twin frameworks using AWS IoT TwinMaker, SiteWise, and Greengrass to simulate and optimize real-world systems.
  • Develop Digital Avatars using AWS services integrated with 3D rendering engines, animation pipelines, and real-time data feeds.
  • Explore cost-effective methods such as knowledge distillation, modal-adaptive pruning, and LoRA fine-tuning to optimize training and inference.
  • Implement scalable pipelines for training and testing VLMs on cloud platforms, using AWS services such as SageMaker, Bedrock, Rekognition, Comprehend, and Textract.

NVIDIA Platforms:

  • Develop a blend of technical expertise, tool proficiency, and domain-specific knowledge on the following NVIDIA platforms:
  • NIM (NVIDIA Inference Microservices): Containerized VLM deployment.
  • NeMo Framework: Training and scaling VLMs across thousands of GPUs.
  • Supported Models: LLaVA, LLaMA 3.2, Nemotron Nano VL, Qwen2-VL, Gemma 3.
  • DeepStream SDK: Integration of pose models such as TRTPose and OpenPose; real-time video analytics and multi-stream processing.

Multimodal AI Solutions:

  • Develop solutions that integrate vision and language capabilities for applications like image-text matching, visual question answering (VQA), and document data extraction.
  • Leverage interleaved image-text datasets and advanced techniques (e.g., cross-attention layers) to enhance model performance.

Image Processing and Computer Vision:

  • Develop solutions that integrate vision-based deep learning models for applications such as live video stream integration and processing, object detection, image segmentation, pose estimation, object tracking, image classification, and defect detection on medical X-ray images.
  • Knowledge of real-time video analytics, multi-camera tracking, and object detection.
  • Train and test deep learning models on custom data.

Efficiency Optimization:

  • Evaluate trade-offs between model size, performance, and cost using techniques like elastic visual encoders or lightweight architectures.
  • Benchmark different VLMs (e.g., GPT-4V, Claude 3.5, Nova Lite) for accuracy, speed, and cost-effectiveness on specific tasks.
  • Benchmark model performance on GPU versus CPU.

Collaboration & Leadership:

  • Collaborate with cross-functional teams including engineers and domain experts to define project requirements.
  • Mentor junior team members and provide technical leadership on complex projects.

Experience:

  • 10+ years

Location:

  • San Ramon, CA or Milwaukee, WI (Onsite)

Qualifications

  • Education: Master's or Ph.D. in Computer Science, Data Science, Machine Learning, or a related field.

Experience:

  • Minimum of 10 years of experience in Machine Learning or Data Science roles with a focus on Vision-Language Models.
  • Proven expertise in deploying production-grade multimodal AI solutions.
  • Experience with self-driving cars and self-navigating robots.

Technical Skills:

  • Proficiency in Python and ML frameworks (e.g., PyTorch, TensorFlow).
  • Hands-on experience with VLMs such as VILA, Isaac Sim, or VSS.
  • Familiarity with cloud platforms like AWS SageMaker or Azure ML Studio for scalable AI deployment.
  • OpenCV, PIL, scikit-image
  • Frameworks: PyTorch, TensorFlow, Keras
  • CUDA, cuDNN
  • 3D vision: point clouds, depth estimation, LiDAR

Soft Skills:

  • Strong problem-solving skills with the ability to optimize models for real-world constraints.
  • Excellent communication skills to explain technical concepts to diverse stakeholders.

Preferred Technologies

  • Vision-Language Models: VILA, Isaac Sim, EfficientVLM
  • Cloud Platforms: AWS SageMaker, Bedrock
  • Optimization Techniques: LoRA fine-tuning, modal-adaptive pruning
  • Multimodal Techniques: Cross-attention layers, interleaved image-text datasets
  • MLOps Tools: Docker, MLflow
