Description

Key Skills: Generative AI, Solutions Architect experience, Implementation of GenAI/ Client solutions, Compliance/Security requirements for GenAI/ Client, LLM Architecture, Frameworks: LangChain/ LlamaIndex, Algorithms: FlashAttention/ PagedAttention, Model Compression/ Distillation/ Quantization, Performance Optimization (LLMOps) of GenAI Models, Azure Services: ADF/ OpenAI/ AI Search, Python/ Java, TensorFlow/ PyTorch

 

Job Description:

  • We are looking for a Generative AI Solutions Architect who will be the Subject Matter Expert (SME) for helping various stakeholders in designing solutions that leverage our Generative AI services.
  • You will interact directly with business teams to understand their problems, help them implement generative AI solutions, deliver briefings and deep-dive sessions to customers, and guide customers on adoption patterns and paths for generative AI.
  • You will work closely with other Solution Architects from various geographies to enable large-scale customer use cases and drive the adoption of Azure/Databricks services for GenAI.
  • You will interact with other Data Scientists and Solution Architects in the field, providing guidance on their customer engagements.
  • You drive effective feedback gathering from customers, and you distill and translate that feedback into clear business and technical requirements for product and engineering teams to review.
  • You must have deep technical experience working with technologies related to large language models, including LLM architectures, model evaluation, and fine-tuning techniques. You must have experience with embedding model fine-tuning and retrieval method evaluation approaches.
  • You should be proficient with the design, deployment, and evaluation of LLM-powered agents, tools, and orchestration approaches.
  • You will design the network architecture to ensure optimal performance of generative AI applications.
  • You should understand the security and compliance requirements for Client/GenAI implementations.
  • You must have experience with LangChain, LlamaIndex, Data Augmentation, Responsible AI, and Performance Evaluation frameworks. You should have experience architecting end-to-end Client/GenAI applications for customers using Azure services and the Well-Architected Framework.
  • Candidates must have great communication skills and be very technical, with the ability to impress the client's customers at every level, from executive to developer. You will get the opportunity to work directly with senior Client engineers and Data Scientists at customers, partners, and Azure service teams, influencing their roadmaps and driving innovation.

BASIC QUALIFICATIONS:

  • 3+ years of experience in end-to-end technical architecture, design, deployment, and operations for Generative AI/Client platforms and applications.
  • 1+ years of experience working with technologies related to large language models, including LLM architectures, model evaluation, adapters, and model customization including pre-training and fine-tuning techniques.
  • Proficient with the design, deployment, and evaluation of LLM-powered agents, tools, and orchestration approaches.
  • Proficient with prompt engineering, embedding model fine-tuning, and retrieval method evaluation and optimization approaches.
  • 5+ years of experience in the design and implementation of Machine Learning/AI/Deep Learning solutions using one or more Deep Learning frameworks such as TensorFlow or PyTorch.
  • 5+ years of professional experience in software development in languages related to Client, such as Python or Java.
  • Experience working with RESTful APIs and general service-oriented architecture.
  • 8+ years of experience in specific technology domain areas (e.g. software development, cloud computing, systems engineering, infrastructure, security, networking, data & analytics).
  • Master's degree in a quantitative field such as statistics, mathematics, data science, business analytics, engineering, or computer science.

PREFERRED QUALIFICATIONS:

  • Experience with optimizing Client workloads using model compression, distillation, pruning, sparsification, quantization, Transformer-based algorithms like FlashAttention and PagedAttention, speculative decoding, distributed training/inference optimization, and hardware-informed efficient model architecture.
  • Experience with distributed training and optimizing performance versus costs.
  • Experience with open-source frameworks for building applications powered by language models, such as LangChain and LlamaIndex. Design, develop, and optimize high-quality prompts and templates that guide the behavior and responses of LLMs.
  • Experience with the design, deployment, and evaluation of LLM-powered agents, tools, and orchestration approaches.
  • Work experience in deployment, versioning, monitoring, scaling, and performance optimization (LLMOps) of Generative AI models.
  • Experience with Azure technologies such as ADF, Azure AI Search, Azure OpenAI, and Document Intelligence Services.
  • Demonstrated ability to think strategically about business, product, and technical challenges in an enterprise environment.
  • Track record of thought leadership and innovation around Machine Learning.
  • Experience with performance benchmarking and developing prescriptive guidance on optimally building, deploying, and monitoring Client models on Azure to drive actions at scale that provide low prices and increased selection for customers.


 

Education

Any Graduate