Role Overview
Join Tally as a Senior AI Engineer and help build an AI-first accounting platform designed for SMEs. Our mission is to automate financial workflows with precision and reliability. Unlike typical LLM features, our system demands end-to-end deterministic reliability: in the financial domain, "almost right" is never acceptable.
Responsibilities
- Design and implement systems that produce deterministic behavior from non-deterministic models and stay reliable over time.
- Bridge the gap from "demo to production" by developing agents capable of handling real-world complexities and edge cases.
- Prevent state drift in complex, multi-step workflows to maintain consistency.
- Convert non-deterministic LLM outputs into predictable, measurable behavior.
- Develop frameworks for evaluating AI system performance with a focus on rigor and statefulness.
- Enhance determinism and reliability within agent workflows.
- Instrument systems to ensure outcomes are measurable and metrics-driven.
Required Skills
- Expertise in building stateful AI systems that manage real-world workflows.
- Strong software engineering fundamentals and demonstrated engineering rigor.
- Proficiency in managing agent failure modes and developing corrective loops.
- Experience with evaluations, automated benchmarking, context management, memory architectures, and advanced agent design patterns.
- A metrics-first approach to problem-solving: what isn't measured isn't solved.
Nice to Have
- Familiarity with Python, TypeScript, and JavaScript.
- Experience with frameworks and tools such as PyTorch, TensorFlow, LangChain, and OpenAI/Anthropic APIs.
- Knowledge of data storage solutions like PostgreSQL, Supabase, and Redis.
- Experience with infrastructure tools like Docker, Kubernetes, and AWS/GCP.
- Familiarity with evaluation and monitoring tools including Weights & Biases, MLflow, and custom benchmarking frameworks.
- Experience with workflow and state management tools such as Airflow and Temporal.