Description

PySpark for scalable data processing.

Hive and Impala for data warehousing and query-based analysis.

Linux/Unix for scripting and operational troubleshooting.

Solid understanding of distributed computing concepts, data partitioning and performance tuning on Hadoop.

Proficient in developing and maintaining large-scale data pipelines and ETL workflows (a minimal sketch follows this list).
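To make the expectation concrete, below is a minimal PySpark ETL sketch of the kind of pipeline work described above. The Hive table names, column names, and job name are hypothetical, and partitioning by date is one illustrative tuning technique, not a prescribed design.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("transactions_etl")   # hypothetical job name
    .enableHiveSupport()           # lets the job read from and write to Hive tables
    .getOrCreate()
)

# Read raw transactions from a Hive staging table (table name is hypothetical).
raw = spark.table("staging.transactions_raw")

# Basic cleansing: drop duplicate records and rows missing key fields.
clean = (
    raw.dropDuplicates(["txn_id"])
       .filter(F.col("txn_amount").isNotNull())
       .withColumn("txn_date", F.to_date("txn_ts"))
)

# Repartition by date before writing so downstream queries can prune partitions.
(
    clean.repartition("txn_date")
         .write.mode("overwrite")
         .partitionBy("txn_date")
         .saveAsTable("warehouse.transactions_clean")
)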

Good-to-Have Technical Skills:

Exposure to the ELK stack (Elasticsearch, Logstash, Kibana) for search-driven workloads.

MongoDB for semi-structured data storage and retrieval.

Familiarity with version control systems (Git), CI/CD pipelines, and workflow orchestration tools such as Apache Airflow (a minimal DAG example follows this list).
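As an illustration of the orchestration piece, a minimal Apache Airflow DAG might wrap the PySpark job sketched above as a scheduled task. The DAG id, schedule, and script path are assumptions for the example.

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="aml_transactions_etl",   # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Submit the PySpark ETL job from the earlier sketch (script path is hypothetical).
    run_etl = BashOperator(
        task_id="run_transactions_etl",
        bash_command="spark-submit /opt/jobs/transactions_etl.py",
    )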

Functional Knowledge:

Prior experience in, or exposure to, the banking domain.

Understanding of Anti-Money Laundering (AML) processes, such as transaction monitoring, customer risk rating and case management workflows.

Ability to interpret business rules related to AML (a simplified example follows this list).

Responsible for designing, building and optimizing data pipelines that serve as the backbone for AML data analytics and reporting solutions.

Work closely with data analysts, compliance teams, and other technology partners to ensure data quality, lineage, and timely availability.
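As a simplified illustration of translating an AML business rule into pipeline code, the sketch below flags customers whose same-day cash deposits exceed a threshold across multiple transactions, a structuring-style pattern. The rule, the threshold, and the table and column names are all hypothetical and do not represent any actual regulatory requirement.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("aml_rule_demo").getOrCreate()

# Cleansed transactions produced by the ETL sketch above (table name is hypothetical).
txns = spark.table("warehouse.transactions_clean")

# Hypothetical monitoring rule: flag customers whose total cash deposits on a
# single day exceed a threshold across more than one transaction.
alerts = (
    txns.filter(F.col("txn_type") == "CASH_DEPOSIT")
        .groupBy("customer_id", "txn_date")
        .agg(
            F.sum("txn_amount").alias("daily_total"),
            F.count("*").alias("txn_count"),
        )
        .filter((F.col("daily_total") > 10000) & (F.col("txn_count") > 1))
)

# Append flagged activity to an alerts table for downstream case management.
alerts.write.mode("append").saveAsTable("compliance.aml_alerts")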

Education

Any Graduate