Must-haves:
Proficiency in Python programming
Strong expertise in SQL, Presto, Hive, and Spark
Knowledge of trading and investment data
Experience with big data technologies such as Spark, including developing distributed computing applications with PySpark
Experience with libraries for data manipulation and analysis, such as Pandas, Polars, and NumPy
Understanding of data pipelines, ETL processes, and data warehousing concepts
Strong experience in building and orchestrating data pipelines
Experience in building APIs
Experience writing, maintaining, and executing automated unit tests in Python
Extensive experience with key AWS services, including EMR, Lambda, Glue ETL, Step Functions, S3, ECS, Kinesis, IAM, RDS for PostgreSQL, DynamoDB, time-series databases, CloudWatch Events/EventBridge, Athena, SNS, SQS, and VPC
Any graduate (bachelor's degree in any discipline)