Must-haves:
* Proficiency in Python programming
* Strong expertise in SQL, Presto, Hive, and Spark
* Knowledge of trading and investment data
* Experience with big data technologies such as Spark, including developing distributed computing applications with PySpark
* Experience with libraries for data manipulation and analysis, such as Pandas, Polars, and NumPy
* Understanding of data pipelines, ETL processes, and data warehousing concepts
* Strong experience in building and orchestrating data pipelines
* Experience in building APIs
* Ability to write, maintain, and execute automated unit tests in Python
* Experience following Test-Driven Development (TDD) practices through all stages of software development
* Extensive experience with key AWS services, including EMR, Lambda, Glue ETL, Step Functions, S3, ECS, Kinesis, IAM, RDS PostgreSQL, DynamoDB, time-series databases, CloudWatch Events/EventBridge, Athena, SNS, SQS, and VPC
* Proficiency in developing serverless architectures using AWS services
* Experience with both relational and NoSQL databases
* Skills in designing and implementing data models, including normalization, denormalization, and schema design
* Knowledge of data warehousing solutions like Amazon Redshift
* Strong analytical skills with the ability to troubleshoot data issues
* Good understanding of source control, unit testing, test-driven development, and CI/CD
* Experience with OneTick or kdb+
* Bachelor's degree in Computer Science