Senior AI Engineer WW-FT

Location

Remote anywhere

Hourly rate

Min. experience

5+ years

Hours per week

20 hours

Duration

24 weeks

Required skills

Freelance job

Posted 22 days ago

Actively recruiting / 305 applicants

We’re here to help you

Jane Cervantes is in direct contact with the company and can answer any questions you may have.

Jane Cervantes, Recruiter

Role Overview

Join Tally as a Senior AI Engineer and contribute to an AI-first accounting platform designed for SMEs. Our mission is to automate financial workflows with extreme precision and reliability. Unlike typical LLM features, our system demands end-to-end deterministic reliability: in the financial domain, "almost right" is never acceptable.

Responsibilities

Design and implement systems that produce deterministic behavior from non-deterministic models and remain reliable over time.
Bridge the gap from "demo to production" by developing agents capable of handling real-world complexities and edge cases.
Prevent state drift in complex, multi-step workflows to maintain consistency.
Convert non-deterministic LLM outputs into predictable, measurable behavior.
Develop frameworks for evaluating AI system performance with a focus on rigor and statefulness.
Enhance determinism and reliability within agent workflows.
Instrument systems so that outcomes are measurable and tracked with metrics.

Required Skills

Expertise in building stateful AI systems that manage real-world workflows.
Strong software engineering fundamentals and demonstrated engineering rigor.
Proficiency in managing agent failure modes and developing corrective loops.
Experience with evaluations, automated benchmarking, context management, memory architectures, and advanced agent design patterns.
A metrics-first approach to problem-solving: what isn't measured isn't solved.

Nice to Have

Familiarity with Python, TypeScript, and JavaScript.
Experience with frameworks and tools such as PyTorch, TensorFlow, LangChain, and OpenAI/Anthropic APIs.
Knowledge of data storage solutions like PostgreSQL, Supabase, and Redis.
Experience with infrastructure tools like Docker, Kubernetes, and AWS/GCP.
Familiarity with evaluation and monitoring tools including Weights & Biases, MLflow, and custom benchmarking frameworks.
Experience with workflow and state management tools such as Airflow and Temporal.