Job Title: GCP / PySpark Data Engineer
Location: Dallas, TX
Duration: Long term
Job Description:
- Bachelor's degree in Computer Science or equivalent, with 10+ years of relevant experience;
- Must have 7+ years of application development experience using one of the core cloud platforms (AWS, Azure, or GCP);
- 1+ years of GCP experience, including GCP-based big data deployments (batch/real-time) leveraging PySpark, BigQuery, Google Cloud Storage, Pub/Sub, Data Fusion, Dataproc, and Airflow;
- 3+ years of coding experience in Python/PySpark and strong proficiency in SQL;
- Experience extracting, loading, transforming, cleaning, and validating data, as well as designing pipelines and architectures for data processing;
- Experience architecting and implementing next-generation data and analytics platforms on GCP;
- Experience working with Agile and Lean methodologies;
- Experience working with either a MapReduce or an MPP system at any size/scale;
- Experience working in a CI/CD model to ensure automated orchestration of pipelines;
- Excellent communication skills, including the ability to communicate effectively with internal and external customers;
- Ability to apply strong industry knowledge to relate to customer needs and resolve customer concerns, with a high level of focus and attention to detail.