Job Description:
Lead the design, development, and deployment of applications and services powered by large language models.
Architect scalable and efficient Python-based systems for real-time and batch inference.
Fine-tune, evaluate, and optimize LLMs for specific business needs using tools such as Hugging Face, LangChain, or the OpenAI APIs.
Collaborate with cross-functional teams to integrate LLM solutions into existing platforms or build new AI-driven products.
Mentor and guide a team of developers and researchers on best practices in Python, LLM integration, and deployment.
Monitor model performance, drift, and user feedback to continuously improve system accuracy and safety.
Stay up to date with the latest advancements in generative AI, LLMs, and NLP research.
Ensure code quality, reliability, and maintainability through reviews and engineering standards.
Qualifications:
Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, or a related field.
5+ years of Python development experience, including system architecture and large-scale deployments.
2+ years of hands-on experience with LLMs and NLP tooling (e.g., OpenAI APIs, Hugging Face Transformers, LangChain, LlamaIndex).
Experience deploying models in production on cloud platforms such as AWS, GCP, or Azure.
Strong understanding of transformer-based architectures, embeddings, vector databases (e.g., FAISS, Weaviate, Pinecone), and prompt engineering.
Familiarity with Docker, CI/CD pipelines, and microservices architecture.
Any Graduate