Job Summary
We are looking for an experienced MLOps Engineer with a strong foundation in data engineering, deep hands-on experience with AWS CDK, and expertise in Amazon SageMaker. This role will be instrumental in bridging the gap between data science and production, enabling scalable, secure, and automated ML solutions in cloud environments.
Key Responsibilities
- Design, build, and maintain scalable MLOps pipelines for training, validation, deployment, and monitoring of machine learning models.
- Use AWS CDK (Cloud Development Kit) to build and manage infrastructure-as-code for ML and data engineering components.
- Collaborate with data scientists to integrate models into production using Amazon SageMaker, including SageMaker Pipelines, Training Jobs, and Endpoints.
- Build robust and scalable data pipelines to support ML lifecycle stages, leveraging AWS services like Glue, S3, Athena, and Redshift.
- Implement CI/CD automation for model retraining and deployment using CodePipeline, CodeBuild, and related DevOps tools.
- Set up monitoring and logging systems for ML model drift, performance, and system health using CloudWatch, SageMaker Model Monitor, and custom tools.
- Ensure compliance with security, governance, and cost optimization practices across the ML infrastructure.
Required Skills
- 5+ years of experience in data engineering and cloud-based ML operations
- Strong expertise in AWS CDK (Python or TypeScript) to provision and manage infrastructure
- Hands-on experience with Amazon SageMaker for building, training, and deploying ML models
- Proficiency in Python, SQL, and version control systems (e.g., Git)
- Experience with CI/CD pipelines and DevOps best practices for ML (MLOps)
- Familiarity with containerization (Docker) and container orchestration on AWS (ECS/EKS)
- Solid understanding of cloud data architecture, data lakes, and distributed data processing
Nice to Have
- Experience with feature stores, ML monitoring tools, or model registries
- Familiarity with Delta Lake, Databricks, or Spark for ML-ready data workflows
- AWS certifications (e.g., AWS Certified Machine Learning - Specialty, AWS Certified Solutions Architect - Associate)