Looking for a highly technical, hands-on Data Engineer for the Client's Data Lake Team who can independently lead data engineering projects and proactively improve process efficiency, making recommendations for process and system improvements where applicable. The Data Engineer will be responsible for understanding not only data pipelines but also event-streaming applications, and for building systems that handle massive amounts of data while making it consumable by other application teams, users, and data scientists.
Specific Skills:
Experience or knowledge of relational SQL and NoSQL databases, including Postgres and Cassandra.
Strong understanding of in-memory processing and data formats (Avro, Parquet, JSON, etc.)
Experience or knowledge of AWS cloud services: EC2, MSK, S3, RDS, SNS, SQS
Experience or knowledge of stream-processing systems, e.g., Storm, Spark Structured Streaming, Kafka consumers.
Experience or knowledge of object-oriented/functional scripting languages, e.g., Python, Java, Scala, R, SQL.
Experience or knowledge of data pipeline and workflow management tools, e.g., AWS Data Pipeline, Apache Airflow, Argo.
Experience or knowledge of big data tools, e.g., Hadoop, Spark, Kafka.
Experience or knowledge of software engineering tools/practices, e.g., GitHub, VS Code, CI/CD.
Hands-on experience in designing and maintaining data schema lifecycles
Education: Any Graduate