We are looking for a Machine Learning Engineer to help us deploy, configure, and optimize Mistral 7B, integrate it into our services, and implement Retrieval-Augmented Generation (RAG) for database interactions.

Key Responsibilities:

Configure and optimize Mistral 7B for business tasks (fine-tuning, quantization, performance optimization).
Assess required computational resources, select the optimal infrastructure for model deployment (on-premise or cloud), and analyze cost efficiency.
Implement RAG to integrate models with vector databases.
Orchestrate interactions between multiple ML services (e.g., one model generates tags while another validates task descriptions).
Develop a service for interacting with the model (API for predictions, model management, integration with our application).
Optimize inference performance (latency, throughput, memory use) for production workloads.

Requirements:

Experience with LLMs (Mistral 7B, GPT-3/4, LLaMA, Claude, Falcon, BLOOM, etc.).
Understanding of Retrieval-Augmented Generation (RAG) and model integration with databases.
Hands-on experience with fine-tuning and dataset preparation/annotation.
Experience with vector databases and similarity-search libraries (Pinecone, Weaviate, FAISS).
Strong proficiency in Python and libraries such as PyTorch, TensorFlow, Hugging Face Transformers, and LangChain.
Experience in evaluating and optimizing infrastructure for AI deployments.
Experience in developing APIs for integrating AI models into business processes.