Job Description:
- Strong foundation in machine learning and deep learning:
- The candidate should have a solid understanding of machine learning algorithms, including supervised and unsupervised learning, regression, classification, clustering, and neural networks.
- They should also be familiar with deep learning architectures such as CNNs, RNNs, and transformers.
- They should have strong knowledge of NLP.
- Experience with GenAI and large language models:
- The candidate should have hands-on experience with GenAI models such as language generators, language translators, and text summarizers.
- They should be familiar with large language models such as BERT and RoBERTa, and with other transformer-based architectures.
- Tech stack expertise:
- The candidate should be proficient in the following tech stack:
- Programming and configuration languages: Python, YAML, Terraform
- Machine learning frameworks: TensorFlow, PyTorch, scikit-learn
- Deep learning and computer vision libraries: Keras, OpenCV
- Data manipulation, analysis, and visualization: Pandas, NumPy, Matplotlib, Seaborn
- Cloud platforms: GCP
- Data analysis and problem-solving skills:
- The candidate should be able to collect, analyze, and interpret large datasets to identify patterns, trends, and insights.
- They should be able to formulate problems, design experiments, and develop solutions using machine learning and GenAI techniques.
- Communication and collaboration skills:
- The candidate should be able to communicate complex technical concepts to non-technical stakeholders, including data insights, model performance, and project progress.
- They should be able to collaborate with cross-functional teams, including data engineers, product managers, and software developers, to integrate GenAI and machine learning solutions into larger projects.
Required:
- Master’s degree or PhD in Computer Science, Statistics, Applied Mathematics, or a related field, with 5 to 7 years of experience in data science or a similar role.
- Ability to translate business needs into analytics and reporting requirements that support data-driven decisions with the required information and explainability.
- Proficiency in at least one analytical programming language relevant to data science (Python ecosystem preferred; R is acceptable) and in machine learning libraries and frameworks (e.g., TensorFlow, PyTorch, scikit-learn), plus familiarity with data processing and visualization tools (e.g., SQL, Tableau, Power BI).
- Good knowledge of Natural Language Processing (NLP).
- Expertise in advanced analytical techniques (e.g., descriptive statistics, machine learning, optimization, pattern recognition, and cluster analysis).
- Experience with cloud computing environments (GCP) and Data/ML platforms (Databricks, Spark).
- Ability to leverage ML and LLM technologies to draw insights from data.
- Strong understanding of the machine learning lifecycle: feature engineering, training, validation, scaling, deployment, monitoring, and the feedback loop.
- Experience in supervised and unsupervised machine learning, including classification, forecasting, anomaly detection, and pattern recognition, using a variety of techniques such as decision trees, regression, ensemble methods, and boosting algorithms.