Job Summary:
We are looking for an experienced AI Engineer specializing in Retrieval-Augmented Generation (RAG) to build and optimize hybrid AI solutions leveraging Large Language Models (LLMs). This role involves working with cutting-edge language models and retrieval systems to deliver highly accurate, context-aware, and responsive AI applications. You'll collaborate with cross-functional teams to develop scalable solutions that enhance information retrieval, comprehension, and generation capabilities in real-world applications.

Key Responsibilities:
- Design, develop, and deploy hybrid RAG architectures that integrate LLMs with retrieval-based systems for improved relevance and contextual responses.
- Fine-tune and optimize large language models, enhancing their performance and adaptability to domain-specific requirements.
- Implement and manage RAG pipelines that effectively combine retrieval mechanisms with generative capabilities, ensuring high accuracy and efficiency.
- Develop custom plugins, adapters, or APIs to integrate retrieval systems (e.g., Elasticsearch, FAISS) with generative models, facilitating seamless information retrieval.
- Monitor and troubleshoot issues within RAG pipelines, tuning retrieval parameters and model hyperparameters to optimize performance.
- Work closely with data engineers to manage and preprocess large datasets for training, ensuring high-quality and diverse data coverage.
- Evaluate and benchmark the performance of RAG solutions using metrics such as response accuracy, latency, and user satisfaction.
- Stay up to date with advancements in NLP, LLMs, and RAG methodologies, continually improving existing architectures and recommending new techniques.

Qualifications:
- Bachelor's or Master's degree in Computer Science, Artificial Intelligence, or a related field, or equivalent practical experience.
- 3+ years of experience in AI/NLP, with a focus on LLMs, transformer-based architectures, and retrieval systems.
- Proven experience building and deploying RAG solutions or other hybrid AI architectures.
- Strong understanding of information retrieval methods, including dense retrieval, sparse retrieval, and embedding-based techniques.
- Proficiency in Python and TensorFlow or PyTorch, and experience with LLM libraries and tools such as Hugging Face Transformers.
- Familiarity with retrieval frameworks such as Elasticsearch, FAISS, or OpenSearch.
- Knowledge of prompt engineering, fine-tuning, and deployment of language models in production environments.
- Strong analytical skills, with experience optimizing LLM and retrieval model performance.
- Professional proficiency in English.

Preferred Skills:
- Experience with cloud services and infrastructure (AWS, GCP, Azure) and MLOps tools for model deployment and monitoring.
- Contributions to open-source RAG projects, or experience working with OpenAI, LangChain, or similar frameworks.
- Knowledge of vector databases, memory-augmented networks, and distributed systems.