Responsibilities:

– LLM configuration and optimization (Mistral, LLaMA, Qwen, or others): fine-tuning, quantization, performance tuning;
– Assessing required computational resources, selecting the optimal infrastructure for model deployment (on-premise or cloud), and analyzing cost efficiency;
– Implementing RAG to integrate models with vector databases;
– Orchestrating interactions between multiple ML services (e.g., one model generates tags, and another validates task descriptions);
– Developing a service for interacting with the model (API for predictions, model management, integration with our application);
– Optimizing model performance for real-world usage.

Requirements:

– 1+ years of experience with LLMs (Mistral 7B, GPT-3/4, LLaMA, Claude, Falcon, Bloom, etc.);
– Understanding of Retrieval-Augmented Generation (RAG) and model integration with databases;
– Proficiency in Python and libraries like PyTorch, TensorFlow, Hugging Face, and LangChain;
– Experience in evaluating and optimizing infrastructure for AI deployments;
– Experience in developing APIs for integrating AI models into business processes;
– English at A2-B1 level or higher.

Nice to have:

– Hands-on experience with fine-tuning and dataset preparation/annotation;
– Experience with vector databases (Pinecone, Weaviate, FAISS);
– Experience with Java.

We offer:

– Regular result-based salary reviews;
– Comfortable working hours (10:00-19:00, Kyiv time);
– Bonus system;
– Established product-focused environment;
– A range of tasks, from quick and simple ones to challenging investigations;
– Cheerful & dynamic environment;
– Friendly and open-minded team;
– Virtual workspace with the prospect of moving into one of the offices;
– Mentorship;
– Attractive social package (unlimited paid sick days, fully paid vacation, birthday day off, etc.);
– Discounts on sports and English classes.

Hiring steps:

– First interview with the Recruiter;
– Technical interview with the Team Lead;
– Job Offer.
