Objective
Create a production-ready Discovery Agent that interviews prospects, understands answers in context, asks clarifying questions when inputs are incomplete, ingests public company data (website + LinkedIn) and uploaded docs, and then outputs a structured discovery report and solution recommendations. The agent must support multiple LLM providers (OpenAI, Anthropic, Llama via AWS Bedrock, etc.).
Target Users
Logistics SMBs (10–100 employees). Agent should be industry-aware but generalizable to other verticals later.
Preferred Stack
- AWS: API Gateway or ALB + Lambda (or Fargate), S3, CloudFront, OpenSearch Serverless (or pgvector), SQS/EventBridge, Secrets Manager, CloudWatch.
- Languages: TypeScript (Node) or Python (FastAPI).
- LLMs: OpenAI, Anthropic, Llama (Bedrock) via a pluggable adapter.
- RAG: URL/PDF ingestion, chunking/embeddings, retrieval with citations.
- IaC & CI/CD: Terraform + GitHub Actions.
Scope & Deliverables
- **Conversation API & Store**
Endpoints: POST /sessions, POST /sessions/{id}/messages, GET /sessions/{id}/summary, GET /sessions/{id}/report (see the route sketch after this list).
Session persistence, consent flags, artifact refs, rate limits.
- **Intelligent Clarification**
Detect gaps/contradictions vs. a Discovery JSON schema; ask targeted follow-ups with configurable limits/timeouts (see the gap-check sketch after this list).
- **Ingestion & Retrieval**
Crawl the provided website (1–3 levels deep, robots.txt-aware), accept LinkedIn URL(s), handle PDF/DOCX uploads.
Embeddings + vector store (OpenSearch Serverless preferred).
Retrieval tuned for concise answers + evidence snippets (see the chunking sketch after this list).
- **Multi-LLM Adapter**
Providers: OpenAI, Anthropic, Bedrock (Llama, etc.).
Simple routing by task/cost/latency; streaming responses (SSE). See the adapter interface sketch after this list.
- **Outputs**
Discovery JSON: company profile, systems/data sources, workflows, pain points, volumes/SLAs, compliance, integration priorities (see the type sketch after this list).
Human-readable summary (Markdown/PDF) and a recommendation bundle (candidate solutions with pros/cons and a T-shirt-size effort estimate).
- **Admin Insights (MVP)**
Metrics: completion rate, number of clarifications, retrieval hit rate, model spend estimate; simple ROI stub.
- **Security & Guardrails**
Keys in Secrets Manager, PII redaction toggle, domain allowlist for crawlers, prompt-injection filters, redacted logs.
- **Infrastructure & DevEx**
Terraform modules for all resources; GitHub Actions pipeline; CloudWatch logs/metrics.
- **Docs & Handoff**
README, runbooks, architecture diagram, threat-model checklist, test plan; admin how-to for prompts/router policies.
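Reference Sketches (illustrative, not prescriptive)
To make the Conversation API deliverable concrete, here is a minimal Express/TypeScript route skeleton. The in-memory store, field names, and handler bodies are placeholders assumed for illustration; the real service would run on Lambda/Fargate behind API Gateway or an ALB with durable session storage.

```ts
// Minimal route skeleton for the Conversation API. Persistence is in-memory here
// purely for illustration; production would use a durable store (e.g., DynamoDB).
import express from "express";
import { randomUUID } from "node:crypto";

interface Session {
  id: string;
  consent: boolean;                                        // consent flag captured at creation
  messages: { role: "user" | "assistant"; content: string }[];
  artifacts: string[];                                     // refs to uploaded docs / crawled URLs
}

const sessions = new Map<string, Session>();
const app = express();
app.use(express.json());

app.post("/sessions", (req, res) => {
  const session: Session = { id: randomUUID(), consent: !!req.body?.consent, messages: [], artifacts: [] };
  sessions.set(session.id, session);
  res.status(201).json({ sessionId: session.id });
});

app.post("/sessions/:id/messages", (req, res) => {
  const session = sessions.get(req.params.id);
  if (!session) { res.status(404).json({ error: "session not found" }); return; }
  session.messages.push({ role: "user", content: String(req.body?.content ?? "") });
  // TODO: run the agent turn (retrieval + LLM call) and stream the reply via SSE.
  res.status(202).json({ accepted: true });
});

app.get("/sessions/:id/summary", (req, res) => {
  const session = sessions.get(req.params.id);
  if (!session) { res.status(404).json({ error: "session not found" }); return; }
  // GET /sessions/:id/report would return the full Discovery JSON / PDF instead.
  res.json({ turns: session.messages.length, artifacts: session.artifacts });
});

app.listen(3000);
```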
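One possible shape for the Discovery JSON output, expressed as a TypeScript type. The field names are assumptions drawn from the Outputs bullet above; the final schema is for the engineer to propose and version.

```ts
// Illustrative Discovery JSON shape; field names are assumptions, not a fixed schema.
export interface DiscoveryReport {
  companyProfile: { name: string; industry: string; headcount?: number; locations?: string[] };
  systemsAndDataSources: string[];                         // e.g. TMS, WMS, ERP, spreadsheets
  workflows: { name: string; description: string; owner?: string }[];
  painPoints: { description: string; impact?: string }[];
  volumesAndSlas: { metric: string; value: string }[];     // e.g. { metric: "shipments/day", value: "400" }
  compliance: string[];
  integrationPriorities: { system: string; priority: "high" | "medium" | "low" }[];
  evidence: { claim: string; source: string; snippet: string }[];  // retrieval citations
}
```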
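A naive sketch of the clarification loop's gap detection, assuming the DiscoveryReport type above lives in a ./discovery module. A production version would more likely validate against a JSON Schema and/or ask an LLM to judge completeness; this only shows the intended control flow and the configurable cap.

```ts
// Find required Discovery sections that are still empty and turn them into targeted
// follow-up questions, respecting a configurable clarification cap.
import type { DiscoveryReport } from "./discovery"; // the type sketched above (assumed path)

const followUps: Record<string, string> = {
  systemsAndDataSources: "Which systems do you run today (TMS, WMS, ERP, spreadsheets)?",
  painPoints: "What are the top two or three pain points in your current operation?",
  volumesAndSlas: "Roughly what volumes do you handle, and what SLAs do you commit to?",
  integrationPriorities: "Which systems should an automation integrate with first?",
};

export function nextClarifications(
  draft: Partial<DiscoveryReport>,
  askedSoFar: number,
  maxClarifications = 3,
): string[] {
  if (askedSoFar >= maxClarifications) return [];
  const missing = Object.keys(followUps).filter((key) => {
    const value = draft[key as keyof DiscoveryReport];
    return value == null || (Array.isArray(value) && value.length === 0);
  });
  return missing.slice(0, maxClarifications - askedSoFar).map((key) => followUps[key]);
}
```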
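For the multi-LLM requirement, a pluggable adapter could be as small as one interface per provider plus a policy router. The provider keys and routing rules below are placeholders; real adapters would wrap the OpenAI and Anthropic SDKs and the Bedrock runtime client, and the router could also weigh cost and observed latency.

```ts
// Pluggable LLM adapter interface plus a trivial task-based router (illustrative).
export interface ChatMessage { role: "system" | "user" | "assistant"; content: string }

export interface LlmProvider {
  name: string;
  complete(messages: ChatMessage[], opts?: { maxTokens?: number }): Promise<string>;
  stream?(messages: ChatMessage[]): AsyncIterable<string>;  // token chunks for SSE
}

export type TaskKind = "chat" | "clarification" | "report";

export function pickProvider(task: TaskKind, providers: Record<string, LlmProvider>): LlmProvider {
  // Cheap, high-volume turns go to a smaller model; report generation to a stronger one.
  const chosen =
    task === "report"
      ? providers["anthropic"] ?? providers["openai"]
      : providers["bedrock-llama"] ?? providers["openai"];
  if (!chosen) throw new Error(`no provider configured for task: ${task}`);
  return chosen;
}
```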
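Finally, for ingestion, a minimal chunking helper showing the kind of pre-embedding step expected; chunk size and overlap are placeholder values to be tuned against the chosen embedding model and the OpenSearch index.

```ts
// Split extracted page/PDF text into overlapping chunks before embedding (illustrative defaults).
export interface Chunk { sourceUrl: string; text: string; index: number }

export function chunkText(sourceUrl: string, text: string, chunkSize = 1200, overlap = 200): Chunk[] {
  const chunks: Chunk[] = [];
  for (let start = 0, i = 0; start < text.length; start += chunkSize - overlap, i++) {
    chunks.push({ sourceUrl, text: text.slice(start, start + chunkSize), index: i });
  }
  return chunks;
}
```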
Non-Functional Requirements
- Performance: P95 chat-turn latency < 3 s (with retrieval); ingestion jobs typically complete in < 5 min.
- Cost: Serverless first; surface per-session inference/infra spend.
- Reliability: Timeouts/retries; dead-letter queue (DLQ) for failed ingestions.
- Privacy: No training on client data; deploy to us-east-1 unless otherwise specified.
Milestones (example)
- M0 (1 wk): Architecture + Terraform skeleton + Hello-World API.
- M1 (1–2 wks): Chat + clarification loop + multi-LLM adapter.
- M2 (1–2 wks): Ingestion (web/LinkedIn/PDF) + embeddings + retrieval.
- M3 (1 wk): Discovery JSON + summary/PDF + admin metrics stub.
- M4 (0.5–1 wk): Guardrails, tests, docs, final demo.
Acceptance (MVP)
- Full session flow returns valid Discovery JSON and downloadable PDF summary.
- Evidence snippets demonstrate that ingestion informs answers.
- Metrics endpoints return non-zero values after test runs.
- Terraform can deploy all required resources into our AWS account.
Candidate Requirements
- Proven delivery of agentic or RAG systems in production.
- Strong AWS (Lambda/Fargate, API Gateway/ALB, S3/CloudFront, Secrets Manager, CloudWatch).
- TypeScript/Node or Python expertise.
- Experience with OpenAI/Anthropic/Bedrock; embeddings/vector DBs.
- Terraform + GitHub Actions.
Nice-to-have: OpenSearch tuning; prompt-injection defenses; LinkedIn/site ingestion; VAPI/Voice/Twilio.