Objective
Create a production-ready Discovery Agent that interviews prospects, understands answers in context, asks clarifying questions when inputs are incomplete, ingests public company data (website + LinkedIn) and uploaded docs, and then outputs a structured discovery report and solution recommendations. The agent must support multiple LLM providers (OpenAI, Anthropic, Llama via AWS Bedrock, etc.).
Target Users
Logistics SMBs (10–100 employees). Agent should be industry-aware but generalizable to other verticals later.
Preferred Stack
- AWS: API Gateway or ALB + Lambda (or Fargate), S3, CloudFront, OpenSearch Serverless (or pgvector), SQS/EventBridge, Secrets Manager, CloudWatch.
- Languages: TypeScript (Node) or Python (FastAPI).
- LLMs: OpenAI, Anthropic, Llama (Bedrock) via a pluggable adapter.
- RAG: URL/PDF ingestion, chunking/embeddings, retrieval with citations.
- IaC & CI/CD: Terraform + GitHub Actions.
Scope & Deliverables
- **Conversation API & Store**
Endpoints: POST /sessions, POST /sessions/{id}/messages, GET /sessions/{id}/summary, GET /sessions/{id}/report (see the route sketch after this list).
Session persistence, consent flags, artifact refs, rate limits.
- **Intelligent Clarification**
Detect gaps/contradictions vs. a Discovery JSON schema; ask targeted follow-ups with configurable limits/timeouts (see the gap-check sketch after this list).
- **Ingestion & Retrieval**
Crawl the provided website (1–3 levels deep, robots.txt-aware), accept LinkedIn URL(s), handle PDF/DOCX uploads.
Embeddings + vector store (OpenSearch Serverless preferred).
Retrieval tuned for concise answers + evidence snippets (see the chunking sketch after this list).
- **Multi-LLM Adapter**
Providers: OpenAI, Anthropic, Bedrock (Llama, etc.).
Simple routing by task/cost/latency; streaming responses (SSE). See the adapter interface sketch after this list.
- **Outputs**
Discovery JSON: company profile, systems/data sources, workflows, pain points, volumes/SLAs, compliance, integration priorities (see the type sketch after this list).
Human-readable summary (Markdown/PDF) and a recommendation bundle (candidate solutions with pros/cons and a T-shirt-size effort estimate).
- **Admin Insights (MVP)**
Metrics: completion rate, number of clarifications, retrieval hit rate, model spend estimate; simple ROI stub.
- **Security & Guardrails**
Keys in Secrets Manager, PII redaction toggle, domain allowlist for crawlers, prompt-injection filters, redacted logs.
- **Infrastructure & DevEx**
Terraform modules for all resources; GitHub Actions pipeline; CloudWatch logs/metrics.
- **Docs & Handoff**
README, runbooks, architecture diagram, threat-model checklist, test plan; admin how-to for prompts/router policies.
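Reference Sketches (illustrative, not prescriptive)
To make the Conversation API deliverable concrete, here is a minimal Express/TypeScript route skeleton. The in-memory store, field names, and handler bodies are placeholders assumed for illustration; the real service would run on Lambda/Fargate behind API Gateway or an ALB with durable session storage.

```ts
// Minimal route skeleton for the Conversation API. Persistence is in-memory here
// purely for illustration; production would use a durable store (e.g., DynamoDB).
import express from "express";
import { randomUUID } from "node:crypto";

interface Session {
  id: string;
  consent: boolean;                                        // consent flag captured at creation
  messages: { role: "user" | "assistant"; content: string }[];
  artifacts: string[];                                     // refs to uploaded docs / crawled URLs
}

const sessions = new Map<string, Session>();
const app = express();
app.use(express.json());

app.post("/sessions", (req, res) => {
  const session: Session = { id: randomUUID(), consent: !!req.body?.consent, messages: [], artifacts: [] };
  sessions.set(session.id, session);
  res.status(201).json({ sessionId: session.id });
});

app.post("/sessions/:id/messages", (req, res) => {
  const session = sessions.get(req.params.id);
  if (!session) { res.status(404).json({ error: "session not found" }); return; }
  session.messages.push({ role: "user", content: String(req.body?.content ?? "") });
  // TODO: run the agent turn (retrieval + LLM call) and stream the reply via SSE.
  res.status(202).json({ accepted: true });
});

app.get("/sessions/:id/summary", (req, res) => {
  const session = sessions.get(req.params.id);
  if (!session) { res.status(404).json({ error: "session not found" }); return; }
  // GET /sessions/:id/report would return the full Discovery JSON / PDF instead.
  res.json({ turns: session.messages.length, artifacts: session.artifacts });
});

app.listen(3000);
```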
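One possible shape for the Discovery JSON output, expressed as a TypeScript type. The field names are assumptions drawn from the Outputs bullet above; the final schema is for the engineer to propose and version.

```ts
// Illustrative Discovery JSON shape; field names are assumptions, not a fixed schema.
export interface DiscoveryReport {
  companyProfile: { name: string; industry: string; headcount?: number; locations?: string[] };
  systemsAndDataSources: string[];                         // e.g. TMS, WMS, ERP, spreadsheets
  workflows: { name: string; description: string; owner?: string }[];
  painPoints: { description: string; impact?: string }[];
  volumesAndSlas: { metric: string; value: string }[];     // e.g. { metric: "shipments/day", value: "400" }
  compliance: string[];
  integrationPriorities: { system: string; priority: "high" | "medium" | "low" }[];
  evidence: { claim: string; source: string; snippet: string }[];  // retrieval citations
}
```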
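A naive sketch of the clarification loop's gap detection, assuming the DiscoveryReport type above lives in a ./discovery module. A production version would more likely validate against a JSON Schema and/or ask an LLM to judge completeness; this only shows the intended control flow and the configurable cap.

```ts
// Find required Discovery sections that are still empty and turn them into targeted
// follow-up questions, respecting a configurable clarification cap.
import type { DiscoveryReport } from "./discovery"; // the type sketched above (assumed path)

const followUps: Record<string, string> = {
  systemsAndDataSources: "Which systems do you run today (TMS, WMS, ERP, spreadsheets)?",
  painPoints: "What are the top two or three pain points in your current operation?",
  volumesAndSlas: "Roughly what volumes do you handle, and what SLAs do you commit to?",
  integrationPriorities: "Which systems should an automation integrate with first?",
};

export function nextClarifications(
  draft: Partial<DiscoveryReport>,
  askedSoFar: number,
  maxClarifications = 3,
): string[] {
  if (askedSoFar >= maxClarifications) return [];
  const missing = Object.keys(followUps).filter((key) => {
    const value = draft[key as keyof DiscoveryReport];
    return value == null || (Array.isArray(value) && value.length === 0);
  });
  return missing.slice(0, maxClarifications - askedSoFar).map((key) => followUps[key]);
}
```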
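For the multi-LLM requirement, a pluggable adapter could be as small as one interface per provider plus a policy router. The provider keys and routing rules below are placeholders; real adapters would wrap the OpenAI and Anthropic SDKs and the Bedrock runtime client, and the router could also weigh cost and observed latency.

```ts
// Pluggable LLM adapter interface plus a trivial task-based router (illustrative).
export interface ChatMessage { role: "system" | "user" | "assistant"; content: string }

export interface LlmProvider {
  name: string;
  complete(messages: ChatMessage[], opts?: { maxTokens?: number }): Promise<string>;
  stream?(messages: ChatMessage[]): AsyncIterable<string>;  // token chunks for SSE
}

export type TaskKind = "chat" | "clarification" | "report";

export function pickProvider(task: TaskKind, providers: Record<string, LlmProvider>): LlmProvider {
  // Cheap, high-volume turns go to a smaller model; report generation to a stronger one.
  const chosen =
    task === "report"
      ? providers["anthropic"] ?? providers["openai"]
      : providers["bedrock-llama"] ?? providers["openai"];
  if (!chosen) throw new Error(`no provider configured for task: ${task}`);
  return chosen;
}
```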
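Finally, for ingestion, a minimal chunking helper showing the kind of pre-embedding step expected; chunk size and overlap are placeholder values to be tuned against the chosen embedding model and the OpenSearch index.

```ts
// Split extracted page/PDF text into overlapping chunks before embedding (illustrative defaults).
export interface Chunk { sourceUrl: string; text: string; index: number }

export function chunkText(sourceUrl: string, text: string, chunkSize = 1200, overlap = 200): Chunk[] {
  const chunks: Chunk[] = [];
  for (let start = 0, i = 0; start < text.length; start += chunkSize - overlap, i++) {
    chunks.push({ sourceUrl, text: text.slice(start, start + chunkSize), index: i });
  }
  return chunks;
}
```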
Non-Functional Requirements
- Performance: P95 chat-turn latency < 3 s (with retrieval); ingestion jobs typically complete in < 5 min.
- Cost: Serverless first; surface per-session inference/infra spend.
- Reliability: Timeouts/retries; dead-letter queue (DLQ) for failed ingestions.
- Privacy: No training on client data; deploy to us-east-1 unless otherwise specified.
Milestones (example)
- M0 (1 wk): Architecture + Terraform skeleton + Hello-World API.
- M1 (1–2 wks): Chat + clarification loop + multi-LLM adapter.
- M2 (1–2 wks): Ingestion (web/LinkedIn/PDF) + embeddings + retrieval.
- M3 (1 wk): Discovery JSON + summary/PDF + admin metrics stub.
- M4 (0.5–1 wk): Guardrails, tests, docs, final demo.
Acceptance (MVP)
- Full session flow returns valid Discovery JSON and downloadable PDF summary.
- Evidence snippets demonstrate that ingestion informs answers.
- Metrics endpoints return non-zero values after test runs.
- Terraform can deploy all required resources into our AWS account.
Candidate Requirements
- Proven delivery of agentic or RAG systems in production.
- Strong AWS (Lambda/Fargate, API Gateway/ALB, S3/CloudFront, Secrets Manager, CloudWatch).
- TypeScript/Node or Python expertise.
- Experience with OpenAI/Anthropic/Bedrock; embeddings/vector DBs.
- Terraform + GitHub Actions.
Nice-to-have: OpenSearch tuning; prompt-injection defenses; LinkedIn/site ingestion; VAPI/Voice/Twilio.