We are seeking experienced LLaMA Software Engineers to join a high-impact team driving innovation in Large Language Model (LLM) solutions. You will play a key role in the implementation, fine-tuning, deployment, and integration of open-source LLaMA (Large Language Model Meta AI) models into real-world production environments.
This is a unique opportunity to collaborate with leading AI/ML researchers, product teams, and infrastructure engineers to develop scalable, safe, and responsible generative AI applications across a range of use cases.
Key Responsibilities
- Design, develop, and deploy applications using LLaMA and other open-source LLM architectures.
- Fine-tune and optimize large models for specific use cases using techniques such as LoRA, QLoRA, and reinforcement learning, alongside prompt engineering.
- Work with stakeholders to integrate LLMs into real-world applications such as search, recommendations, support automation, and content generation.
- Build robust APIs, tools, and pipelines for LLM inference, performance monitoring, and scalability.
- Mitigate bias and toxicity in LLM outputs and reduce latency and cost inefficiencies, ensuring alignment with responsible AI practices.
- Contribute to open-source initiatives or internal innovation projects focused on benchmarking, performance tuning, and model enhancements.
Required Qualifications
- Bachelor’s or Master’s degree in Computer Science, Machine Learning, Artificial Intelligence, or a related field.
- 3–7 years of experience in software engineering, with 1–2 years of hands-on work with LLMs, transformers, or generative AI technologies.
- Proficiency in Python and libraries such as Hugging Face Transformers, PyTorch, DeepSpeed, Ray, or Accelerate.
- Experience with LLaMA, GPT, PaLM, Mistral, or similar models.
- Knowledge of fine-tuning, distributed training, and inference optimization (e.g., quantization, model pruning).
- Strong engineering fundamentals including version control, CI/CD, and testing best practices.
Preferred Skills
- Experience deploying LLMs in production at scale.
- Familiarity with MLOps platforms such as MLflow, SageMaker, or Weights & Biases.
- Exposure to multimodal models, agent frameworks (e.g., AutoGPT, LangChain, Open Agents), or retrieval-augmented generation (RAG) pipelines.
- Background in privacy-preserving machine learning, RLHF (Reinforcement Learning from Human Feedback), or embedding-based search.
Join us to work at the forefront of LLM innovation and help build the next generation of AI-driven applications.
Apply now to be part of an exciting and rapidly evolving space in artificial intelligence.