Description

Responsibility: 

  • Help the product owner and development team achieve project outcomes; build and prioritize stories according to requirements.
  • Analyze data platform requirements and design and document solutions.
  • Build key infrastructure, frameworks and applications to support the needs of Data Engineers, Data Scientists and the Business.
  • Improve SDLC processes with the team of engineers and DevOps.
  • Build effective, efficient and reusable data pipeline frameworks for various data source types, refresh patterns and transformations.
  • Build data pipelines and jobs using Spark and Databricks to ingest data into the Data Lake/Delta Lake on AWS.
  • Support teams using the Data and AI/ML platform: design and validate their use case architectures, provide best practices for solutions, and troubleshoot development issues with platform features and frameworks.
  • Partner with team members, the product owner and other stakeholders to ideate, review and align on the data validation approach.
  • Maintain end-to-end data security (both at rest and in transit) and data sharing mechanisms.
  • Identify and build automation processes to support various data platform scenarios.
  • Design and build Dev/Data/MLOps processes using cloud services.


Qualifications: 

  • Strong technical skills and experience working with and supporting multiple engineering teams.
  • Experience in building Big Data/ML/AI applications and optimizing data pipelines, architectures and data sets.
  • 5+ years of technical experience with big data technologies, including experience with the following software/tools:
    • Experience with big data tools: Apache Spark, Databricks, Parquet/Delta, PySpark, SparkSQL, Spark Streaming, Kafka/Kinesis, S3, Glue
    • Experience with Databricks using Unity Catalog and MLFlow is a big plus
    • Experience with relational/NoSQL databases or MPP databases such as Snowflake and Redshift.
    • Experience with data pipeline and workflow management tools: Step Functions, Airflow
    • Experience with AWS cloud services: EC2, EMR, RDS, Redshift, IAM, Security Groups, VPC, etc.
    • Experience developing in Python and Notebooks/IDE
    • Experience with Automation: Jenkins CI/CD, Terraform, CDK, Boto3

Education

Any Graduate