WHO WE ARE: MagicSchool is the premier generative AI platform for teachers. We're just over 2 years old, and more than 6 million teachers from all over the world have joined our platform. Join a top team at a fast-growing company that is working towards real social impact. Make an account and try us out on our website, and connect with our passionate community on our Wall of Love.
As a Senior Backend Engineer, Evaluations, you will work with the Trust, Safety and Quality team as a strong individual contributor to expand our Generative AI Evaluations capabilities. This work involves both application and data engineering. You will feel a sense of ownership over the space far beyond just taking tickets. You will obsess over safety, quality, speed, and user impact. You will know when to build and when to buy and integrate. You will be passionate about building resilient, scalable systems to ensure our users have the best experience possible.
What You Will Do:
Ensure we're building the safest product MagicSchool can build.
Build tools and processes for evaluating MagicSchool output at scale
Stay up to date on the latest tools and literature
Bring a passion for solving education problems with technology
Work closely with the Evaluations product manager, team lead, data scientist, and engineer to deliver high-quality features
Design, architect, and write high-quality code to expand our Evaluations framework's capabilities
Debug complex code and applications in a cloud environment
Build software that is easy for others to understand and easy to maintain
Model data in relational databases, data warehouses, and object stores
Assess and help the team choose appropriate third-party solutions to problems when building doesn't make sense
Work cross-functionally to integrate MagicSchool's evaluations platform with the larger application
Qualifications/Competencies/Skills:
Working knowledge of SQL (PostgreSQL) and database design / data modeling
Expert knowledge of Python
Modern Python features like type hints and asyncio
Python multiprocessing to support evaluation runs of tens to hundreds of thousands of jobs each
Familiarity with Pydantic
Strong communication skills: team-first mindset, highly collaborative, can articulate decisions within the team's context
Experience working with and deploying Docker applications
Builds relationships easily: emotionally intelligent, communicative, warm
Gets a lot done: works hard, resourceful, does whatever it takes
Adaptable: Smart, learns fast, curious
Nice to Have:
Experience with evaluating generative AI systems
TypeScript and React (useful for internal tooling and cross-team contributions)
AWS experience (Bedrock, ECS or EKS)
Experience:
Why Join Us?
Our Values:
Compensation Range: $160K - $205K