I am a collaborative, fun, gregarious, low-ego, hands-on full-stack software engineer (front end, back end, DevOps, and observability). I am an empathetic, friendly, creative, and pragmatic problem-solver with a proven ability to thrive in fast-paced, technically challenging environments. My strong business background lets me bring mature analysis to the problems at hand.
Designed, architected, and implemented a concurrent, distributed, highly scalable DNS collector, web scraper, and HTML-to-LLM processing service that reduces sales reps' cognitive load by generating talking points for conversations with potential customers, processing 700,000 web pages and producing 10,000+ talking points daily.
The primary external technologies I used were Java, Python, Snowflake, Pinecone, MySQL (Vitess), Kafka, SQS, Mistral, Llama, and LangChain.
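To give a flavor of the core step of that pipeline, here is a minimal, illustrative sketch of turning scraped page text into sales talking points with an LLM. It is not the production code: the queue, model client, and storage integrations (Kafka, SQS, Snowflake, Pinecone) are omitted, and the `Page` type, prompt, and `llm` callable are hypothetical stand-ins.

```python
# Illustrative sketch only: a simplified scrape -> LLM -> talking-points step.
# The Page type, prompt, and `llm` callable are hypothetical stand-ins, not the
# production Kafka/SQS/Snowflake/Pinecone integrations.
import json
from dataclasses import dataclass
from typing import Iterable

@dataclass
class Page:
    domain: str
    text: str  # page HTML already stripped to plain text upstream

PROMPT = (
    "Summarize this company's website into three short talking points "
    "a sales rep could use on a first call:\n\n{text}"
)

def generate_talking_points(pages: Iterable[Page], llm) -> list[dict]:
    """Call the LLM once per page and collect structured talking points.

    `llm` is any callable that takes a prompt string and returns text
    (e.g., a LangChain-wrapped model or a thin HTTP client around Mistral/Llama).
    """
    results = []
    for page in pages:
        completion = llm(PROMPT.format(text=page.text[:8000]))  # truncate very long pages
        results.append({"domain": page.domain, "talking_points": completion.strip()})
    return results

if __name__ == "__main__":
    fake_llm = lambda prompt: "1. ...\n2. ...\n3. ..."  # stub in place of a real model
    pages = [Page("example.com", "We sell widgets to enterprise customers.")]
    print(json.dumps(generate_talking_points(pages, fake_llm), indent=2))
```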
Researched and tested several inference servers and optimization strategies for an internal LLM service running on GPU-enabled AWS EC2 instances.
The primary technologies I researched and tested were NVIDIA's Triton Inference Server, Ollama, TensorRT, TensorRT-LLM, vLLM, and the ONNX standard.
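As a rough idea of what such testing can look like, below is a small latency-benchmark sketch against an OpenAI-compatible completions endpoint (vLLM, for example, can serve one). The endpoint URL, model name, and request counts are placeholders, and this is far simpler than a full evaluation harness.

```python
# Rough benchmarking sketch, not the evaluation harness used on the job: times N
# sequential requests against an OpenAI-compatible completions endpoint.
# ENDPOINT and MODEL are placeholders.
import statistics
import time

import requests

ENDPOINT = "http://localhost:8000/v1/completions"  # placeholder URL
MODEL = "mistral-7b"                                # placeholder model name

def time_requests(prompt: str, n: int = 20) -> None:
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        resp = requests.post(
            ENDPOINT,
            json={"model": MODEL, "prompt": prompt, "max_tokens": 128},
            timeout=60,
        )
        resp.raise_for_status()
        latencies.append(time.perf_counter() - start)
    p95 = sorted(latencies)[max(int(0.95 * n) - 1, 0)]
    print(f"p50={statistics.median(latencies):.2f}s  p95={p95:.2f}s")

if __name__ == "__main__":
    time_requests("Write one sentence about DNS.")
```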
Reviewed feature code for a front-end React/Redux (TypeScript) application.