AI Research Engineer

Remote / Noida
Full-time
AI Research

About the Role

Work on the cutting edge of Agentic AI and Large Language Models. You will design, fine-tune, and deploy state-of-the-art LLMs for enterprise applications and autonomous workflows. This role combines research and engineering, requiring you to stay current with the latest AI breakthroughs while building production-ready systems that serve millions of users. Join us in shaping the future of AI-powered automation.

Responsibilities

  • Fine-tune and deploy Large Language Models (LLMs) for domain-specific applications using techniques like LoRA, QLoRA, and full fine-tuning.
  • Develop advanced Graph RAG (Retrieval-Augmented Generation) systems for complex knowledge retrieval and reasoning.
  • Research and implement novel agentic architectures including ReAct, AutoGPT, and custom multi-agent systems.
  • Design and optimize prompt engineering strategies, including few-shot, chain-of-thought, and tree-of-thoughts prompting.
  • Build evaluation frameworks and benchmarks to measure model performance, accuracy, and safety.
  • Collaborate with product teams to integrate AI capabilities into production applications with proper monitoring and safeguards.
  • Optimize model inference performance through quantization, distillation, and efficient serving infrastructure.
  • Contribute to the AI research community through publications, open-source projects, and conference presentations.

Requirements

  • MS or PhD in Computer Science, Machine Learning, AI, or a related field, with a focus on NLP or deep learning.
  • Strong experience with PyTorch, TensorFlow, and the Hugging Face Transformers library for training and deploying models.
  • Deep understanding of Transformer architectures, attention mechanisms, and modern LLM families (GPT, BERT, T5, LLaMA).
  • Hands-on experience with fine-tuning techniques including LoRA, prompt tuning, and instruction tuning.
  • Proficiency in Python and experience with ML infrastructure tools (MLflow, Weights & Biases, Ray, Kubeflow).
  • Knowledge of vector databases (Pinecone, Weaviate, Qdrant) and semantic search technologies.
  • Research papers published at top-tier conferences (NeurIPS, ICML, ACL, EMNLP), or significant open-source contributions.
  • Strong mathematical foundation in statistics, linear algebra, and optimization theory.

Apply for this position

Send your resume and portfolio to our engineering team.

Apply Now

Or email us at careers@cognoflux.com