Job Summary:
We are seeking a skilled and experienced Senior Data Engineer to mentor and guide our bench candidates in securing successful placements with our clients. The ideal candidate will have a strong development background, excellent communication skills, and the ability to prepare candidates to excel in technical interviews and projects. This part-time, remote role is ideal for professionals who want to share their expertise while working flexibly.
Responsibilities:
- Design and build scalable data pipelines using Apache Spark, Kafka, Airflow, and cloud-based data services (AWS, Azure, GCP).
- Develop and optimize data lakes, warehouses, and real-time streaming solutions using tools like Snowflake, Databricks, BigQuery, and Redshift.
- Implement ETL/ELT processes to ingest, process, and transform structured & unstructured data from multiple sources.
- Optimize SQL queries, distributed computing frameworks, and data storage systems for performance and cost efficiency.
- Ensure data governance, lineage, and security compliance (GDPR, CCPA, HIPAA).
- Leverage Infrastructure as Code (Terraform, CloudFormation) and CI/CD pipelines to automate deployments.
- Monitor, troubleshoot, and enhance data pipelines, implementing observability with Prometheus, Splunk, or Datadog.
- Collaborate with Data Scientists, Analysts, and Business teams to support machine learning, reporting, and AI-driven insights.
- Mentor junior engineers, review code, and establish best practices for data engineering.
Required Skills & Experience:
- 6 years of experience in Data Engineering, Big Data, and Cloud Technologies.
- Expertise in Python, SQL, Scala, or Java for data processing.
- Strong knowledge of Apache Spark, Hadoop (HDFS, YARN), Kafka, and Airflow.
- Hands-on experience with AWS (Glue, Redshift, EMR), Azure (Synapse, Data Factory), or GCP (BigQuery, Dataflow).
- Expertise in ETL frameworks, data warehousing, and batch/streaming data processing.
- Database management experience with PostgreSQL, MySQL, MongoDB, or Cassandra.
- Experience in containerization & orchestration (Docker, Kubernetes) and DevOps practices (CI/CD, GitHub Actions, Jenkins).
- Strong understanding of data governance, security, and compliance frameworks.
- Experience with MLOps (e.g., MLflow) and feature engineering is a plus.
- Excellent problem-solving skills and the ability to work in a fast-paced environment.