This role is a mix of infrastructure and data engineering (DataOps). The ideal candidate therefore has experience in both infrastructure and data management, works autonomously, and can prioritize their work with minimal oversight.
Responsibilities
- Develop data processing pipelines in languages such as Java, JavaScript, and Python on Unix server environments to extract, transform, and load (ETL) log data (a minimal sketch follows this list)
- Implement scalable, fault-tolerant solutions for data ingestion, processing, and storage
- Support systems engineering lifecycle activities for data engineering deployments, including requirements gathering, design, testing, implementation, operations, and documentation
- Automate platform management processes using Ansible or other scripting tools and languages
- Troubleshoot incidents impacting the log data platforms
- Collaborate with cross-functional teams to understand data requirements and design scalable solutions that meet business needs
- Develop training and documentation materials
- Support log data platform upgrades, including coordinating upgrade testing with platform users
- Gather and process raw data from disparate sources (e.g., writing scripts, calling APIs, writing SQL queries) into a form suitable for analysis
- Enable batch and real-time analytical processing of log data, leveraging emerging technologies
- Participate in on-call rotations to address critical issues and ensure the reliability of data engineering systems
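
As a rough illustration of the ETL responsibility above, the following is a minimal sketch only: the log format, file name (access.log), and SQLite target are assumptions standing in for the platform's real sources and sinks.

```python
"""Minimal extract-transform-load sketch for line-oriented log data."""
import re
import sqlite3
from pathlib import Path

# Hypothetical log format: ISO timestamp, level, service name, free-text message.
LOG_PATTERN = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<service>\S+)\s+(?P<message>.*)$"
)

def extract(path: Path):
    """Yield non-empty raw lines from a log file."""
    with path.open() as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield line

def transform(lines):
    """Parse lines into structured records, dropping anything unparseable."""
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            yield (match["ts"], match["level"], match["service"], match["message"])

def load(records, db_path="logs.db"):
    """Write parsed records into a SQLite table (stand-in for the real store)."""
    with sqlite3.connect(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS log_events "
            "(ts TEXT, level TEXT, service TEXT, message TEXT)"
        )
        conn.executemany("INSERT INTO log_events VALUES (?, ?, ?, ?)", records)

if __name__ == "__main__":
    load(transform(extract(Path("access.log"))))
```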
Experience
General
- Ability to troubleshoot and diagnose complex issues
- Demonstrated experience supporting technical users and conducting requirements analysis
- Able to work independently with minimal guidance and oversight
- Experience with IT Service Management and familiarity with Incident and Problem management
- Highly skilled in identifying performance bottlenecks and anomalous system behavior and resolving the root causes of service issues
- Demonstrated ability to work effectively across teams and functions to influence the design, testing, operations, and deployment of highly available software
- Knowledge of standard methodologies related to security, performance, and disaster recovery
Required Technical Expertise
- Expertise in AWS and in implementing CI/CD pipelines that support log ingestion
- Expertise with AWS compute environments such as ECS, EKS, EC2, and Lambda
- 3-5 years' experience designing, developing, and deploying data lakes using AWS native services (S3, Firehose, IAM) and Terraform
- Experience with data pipeline orchestration platforms
- Expertise in Ansible/Terraform and Infrastructure as Code scripting
- Experience implementing version control and CI/CD practices for data engineering workflows (e.g., GitLab) to ensure reliable and efficient deployments
- Proficiency in distributed Linux environments
- Proficiency in implementing monitoring, logging, and alerting solutions for data infrastructure (e.g., Prometheus, Grafana)
- Experience writing data pipelines to ingest log data from a variety of sources and platforms (a minimal ingestion sketch follows this list)
- Implementation knowledge of data processing pipelines using languages such as Java, JavaScript, and Python to extract, transform, and load (ETL) data
- Ability to create and maintain data models, ensuring efficient storage, retrieval, and analysis of large datasets
- Ability to troubleshoot and resolve issues related to data processing, storage, and retrieval
- Experience developing systems for extracting, ingesting, and processing large volumes of data
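
As a sketch of the kind of log ingestion referenced above, the example below pushes parsed records into an S3-backed data lake via Kinesis Data Firehose using boto3. The stream name and record shape are assumptions for illustration; a production pipeline would also handle retries, IAM roles, and monitoring.

```python
"""Illustrative sketch: batch log records into a Firehose delivery stream."""
import json

import boto3

# Credentials and region are assumed to come from the environment or instance role.
firehose = boto3.client("firehose")

def ship_records(records, stream_name="log-ingest-stream"):
    """Send records to Firehose in batches of up to 500 (the API's batch limit)."""
    batch = []
    for record in records:
        batch.append({"Data": (json.dumps(record) + "\n").encode("utf-8")})
        if len(batch) == 500:
            # A production pipeline would inspect FailedPutCount in the
            # response and retry any failed records.
            firehose.put_record_batch(DeliveryStreamName=stream_name, Records=batch)
            batch = []
    if batch:
        firehose.put_record_batch(DeliveryStreamName=stream_name, Records=batch)
```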