Description

We are looking for an experienced AWS AI Data Engineer to join our dynamic team, responsible for developing, managing, and optimizing data architectures that support AI and Machine Learning (ML) workflows. The ideal candidate will have extensive experience in integrating large-scale datasets, building scalable and automated data pipelines, and working with advanced ML frameworks and tools. The candidate should also have experience with AWS ETL services (such as AWS Glue, Lambda, and Data Pipeline) to handle data processing and integration tasks effectively.

Overview: This is one of the workstreams of Project Acuity. The PASD Data Platform includes a centralized web application for internal PASD users across the Recruitment Business to support marketing and operational use cases. Building a patient-level database will significantly benefit PASD's future reporting capabilities and its engagement with external stakeholders.

Must Have Skills

 

  • Proficiency in programming languages such as Python, Scala, or similar.
  • Solid understanding of machine learning frameworks such as TensorFlow and PyTorch.
  • Strong experience in data classification, including the identification of PII data entities.
  • Knowledge of and experience with retrieval-augmented generation (RAG) and agent-based workflows.
  • Deep understanding of how to re-rank and improve LLM outputs using index and vector stores.
  • Ability to leverage AWS services (e.g., SageMaker, Comprehend, Entity Resolution) to solve complex data and AI-related challenges.
  • Ability to manage and deploy machine learning models and frameworks at scale using AWS infrastructure.
  • Strong analytical and problem-solving skills, with the ability to innovate and develop new approaches to data engineering and AI/ML.
  • Experience with AWS ETL services (such as AWS Glue, Lambda, and Data Pipeline) to handle data processing and integration tasks effectively.
  • Experience with core AWS services, including IAM, VPC, EC2, S3, RDS, Lambda, CloudWatch, and CloudTrail.

     

Nice To Have Skills

 

  • Experience with data privacy and compliance requirements, especially related to PII data.
  • Familiarity with advanced data indexing techniques, vector databases, and other technologies that improve the quality of AI/ML outputs.