Real-Time ASR / Audio ML Developer - PT Freelance - North America

Location

Remote restrictions apply

Hourly rate

Min. experience

5+ years

Hours per week

20 hours

Duration

12 weeks

Required skills

Python, Linux, Machine learning, Speech Recognition, Low latency

Freelance job

Posted 14 hours ago

Actively recruiting / 9 applicants

Recruiter: Cynthia

Cynthia is in direct contact with the company and can answer any questions you may have.

Role Overview

Join an innovative team building an ultra-low-latency, broadcast-to-transcription pipeline on Linux. We are looking for an expert ASR (Automatic Speech Recognition) Developer to own the critical layer where raw PCM audio signals are converted into real-time text streams.

This is not a standard AI implementation role. You will be working at the intersection of hardware and software, integrating clean audio output from an FPGA pipeline and optimizing the entire stack for deterministic, low-latency performance.

Core Responsibilities

Pipeline Architecture: Build and optimize the end-to-end PCM-to-real-time-text streaming layer.
Performance Tuning: Optimize aggressively for latency and memory usage. You will be responsible for tuning buffering behavior and model inference to deliver near-instantaneous results.
GPU Optimization: Manage and optimize GPU acceleration to ensure high-speed processing without bottlenecks.
Hardware Integration: Work with raw, uncompressed PCM data coming directly from a specialized FPGA hardware pipeline.
Hardening: Test, debug, and harden the system to ensure it meets the rigorous demands of 24/7 real-time broadcast operations.

Nice-to-Haves (The "Standout" Candidate)

Network Engineering: Background in high-performance networking or broadcast distribution.
Hardware Expertise: Experience working with GPU acceleration (CUDA) and interfacing with FPGA/DSP hardware.
Broadcast Knowledge: Understanding of contribution feeds and broadcast signal topology.