Job Title: AI Applied Scientist/AI Data Scientist
Job Location: Mexico (Remote)
● Innovate with State-of-the-Art AI: Implement cutting-edge AI solutions for key Document Understanding tasks such as OCR/HTR, transcription, Named Entity Recognition (NER), Relation Extraction (RE), Coreference Resolution, Summarization, and Knowledge Graphs, working with diverse genealogical and historical collections spanning newspapers, city directories, family history books, and vital records (i.e., birth, marriage, & death records).
● Architect Agentic Systems: Design and implement multi-agent workflows using frameworks like LangChain, LangGraph, CrewAI, AutoGen, AgentCore, Strands, Google ADK, A2A, etc., to automate complex multi-step reasoning tasks in historical document analysis and information extraction.
● Analyze and Optimize Multi-Modal Models: Evaluate the performance of multi-modal models such as Gemini, Claude, GPT, and Qwen for zero-shot and few-shot scenarios in comprehensive document understanding.
● Natural Language Processing (NLP): NER, Relation Extraction, Coreference Resolution, Entity Resolution, Knowledge Graphs (Neo4j), spaCy, NLTK, BERT.
● Computer Vision (CV): Apply expertise using models like YOLO, Nougat, DONUT, OpenCV, etc., to perform layout analysis, identifying text blocks, headers, tables, and deeply nested lists.
● Evaluation & Observability: Establish ensemble models and "LLM-as-a-Judge" frameworks, and use tools like Arize Phoenix, DeepEval, or RAGAS to monitor for hallucination, drift, and bias.
● Development Productivity: Familiarity with "AI coding" workflows and usage of AI coding assistants such as Amazon Q, Cursor, Claude Code, and Kiro to accelerate development cycles.
● Collaborate on Cloud Deployment: Partner with ML Ops to deploy datasets, models, and pipelines in cloud environments like AWS (S3, SageMaker, Bedrock, ECS, EKS) and GCP (Vertex AI, Gemini API).
Who You Are:
● Experienced (MS or PhD preferred) in Computer Science, Data Science, Statistics, Mathematics, Linguistics, Engineering, or a related quantitative field with a strong data focus.
● Specialization in AI & LLMs, including familiarity with foundational models such as GPT, Gemini, Qwen, Llama, Claude, etc.
● Experience with inference optimization, vLLM, LoRA, QLoRA, quantization, etc.
● Familiar with embeddings, vector databases, and transformer models, with software development experience.
● Strong proficiency in Python and relevant tools and libraries, including those for transformer and multi-modal models.
● Familiarity with cloud platforms and related AI/ML services such as Google Cloud Platform (GCP), Gemini API, Vertex AI, AWS EC2, S3, SageMaker, Model Registry, and Bedrock is a plus.
● ML/LLM/system experience, particularly in content information extraction.
● Ability to clearly present complex technical solutions to both technical and non-technical stakeholders.