Job Description
REQUIREMENTS:
- Total experience: 10+ years.
- Strong experience with LLMs (LLaMA, DeepSeek, etc.) and understanding of RAG pipelines.
- Hands-on experience with Python, Linux, and shell scripting.
- Experience with frameworks such as OpenCV, PyTorch, YOLO, or TensorFlow.
- Familiarity with LLM inference engines such as Ollama, vLLM, and llama.cpp.
- Solid knowledge of model conversion and deployment.
- Experience working with AI agents, LangChain, and retrieval-augmented generation (RAG).
- Hands-on experience with Docker, Docker Compose, and integration into DevOps pipelines.
- Understanding of embedded platforms (Jetson, NXP, Qualcomm) and Yocto builds.
- Experience with model optimization techniques (quantization, pruning, etc.).
- Good grasp of CUDA kernels and GPU computing for acceleration.
- Excellent communication skills and the ability to collaborate effectively with cross-functional teams.
RESPONSIBILITIES:
- Understanding functional requirements thoroughly and analyzing the client’s needs in the context of the project.
- Envisioning the overall solution for defined functional and non-functional requirements, and defining the technologies, patterns, and frameworks to realize it.
- Determining and implementing design methodologies and tool sets.
- Enabling application development by coordinating requirements, schedules, and activities.
- Leading or supporting UAT and production rollouts.
- Creating, understanding, and validating the WBS and estimated effort for a given module/task, and being able to justify it.
- Addressing issues promptly and responding positively to setbacks and challenges with a mindset of continuous improvement.
- Giving constructive feedback to the team members and setting clear expectations.
Qualifications
Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.