We are seeking experienced LLaMA Software Engineers to join a high-impact team driving innovation in Large Language Model (LLM) solutions. You will play a key role in the implementation, fine-tuning, deployment, and integration of open-source LLaMA (Large Language Model Meta AI) models into real-world production environments.
This is a unique opportunity to collaborate with leading AI/ML researchers, product teams, and infrastructure engineers to develop scalable, safe, and responsible generative AI applications across a range of use cases.
Key Responsibilities
- Design, develop, and deploy applications using LLaMA and other open-source LLM architectures.
- Fine-tune and optimize large models for specific use cases using techniques such as LoRA, QLoRA, and reinforcement learning, alongside prompt engineering.
- Work with stakeholders to integrate LLMs into real-world applications such as search, recommendations, support automation, and content generation.
- Build robust APIs, tools, and pipelines for LLM inference, performance monitoring, and scalability.
- Mitigate bias and toxicity in LLM outputs and reduce latency and cost inefficiencies, ensuring alignment with responsible AI practices.
- Contribute to open-source initiatives or internal innovation projects focused on benchmarking, performance tuning, and model enhancements.
Required Qualifications
- Bachelor’s or Master’s degree in Computer Science, Machine Learning, Artificial Intelligence, or a related field.
- 3–7 years of experience in software engineering, with 1–2 years of hands-on work with LLMs, transformers, or generative AI technologies.
- Proficiency in Python and libraries such as Hugging Face Transformers, PyTorch, DeepSpeed, Ray, or Accelerate.
- Experience with LLaMA, GPT, PaLM, Mistral, or similar models.
- Knowledge of fine-tuning, distributed training, and inference optimization (e.g., quantization, model pruning).
- Strong engineering fundamentals including version control, CI/CD, and testing best practices.
Preferred Skills
- Experience deploying LLMs in production at scale.
- Familiarity with MLOps platforms such as MLflow, SageMaker, or Weights & Biases.
- Exposure to multimodal models, agent frameworks (e.g., AutoGPT, LangChain, Open Agents), or retrieval-augmented generation (RAG) pipelines.
- Background in privacy-preserving machine learning, RLHF (Reinforcement Learning from Human Feedback), or embedding-based search.
Join us to work at the forefront of LLM innovation and help build the next generation of AI-driven applications.
Apply now to be part of an exciting and rapidly evolving space in artificial intelligence.