Sr Data Engineer

Yochana IT Solutions
Bellevue, WA, USA

Description

Job Description

• Lead the architecture, design, and implementation of scalable, modular, and reusable data flow pipelines using Cribl, Apache NiFi, Vector, and other open-source platforms, ensuring consistent ingestion strategies across a complex, multi-source telemetry environment.

• Develop platform-agnostic ingestion frameworks and template-driven architectures to enable reusable ingestion patterns, supporting a variety of input types (e.g., syslog, Kafka, HTTP, Event Hubs, Blob Storage) and output destinations (e.g., Snowflake, Splunk, ADX, Log Analytics, Anvilogic).

• Spearhead the creation and adoption of a schema normalization strategy, leveraging the Open Cybersecurity Schema Framework (OCSF), including field mapping, transformation templates, and schema validation logic—designed to be portable across ingestion platforms.

• Design and implement custom data transformations and enrichments using scripting languages such as Groovy, Python, or JavaScript, while enforcing robust governance and security controls (SSL/TLS, client authentication, input validation, logging).

• Ensure full end-to-end traceability and lineage of data across the ingestion, transformation, and storage lifecycle, including metadata tagging, correlation IDs, and change tracking for forensic and audit readiness.

• Collaborate with observability and platform teams to integrate pipeline-level health monitoring, transformation failure logging, and anomaly detection mechanisms.

• Oversee and validate data integration efforts, ensuring high-fidelity delivery into downstream analytics platforms and data stores, with minimal data loss, duplication, or transformation drift.

• Lead technical working sessions to evaluate and recommend best-fit technologies, tools, and practices for managing structured and unstructured security telemetry data at scale.

• Implement data transformation logic including filtering, enrichment, dynamic routing, and format conversions (e.g., JSON ↔ CSV, XML, Logfmt) to prepare data for downstream analytics platforms. (100 plus sources of data)

• Contribute to and maintain a centralized documentation repository, including ingestion patterns, transformation libraries, naming standards, schema definitions, data governance procedures, and platform-specific integration details.

• Coordinate with security, analytics, and platform teams to understand use cases and ensure pipeline logic supports threat detection, compliance, and data analytics requirements.

Key Skills

Snowflake Splunk Adx Kafka Http Event Hubs Groovy Python Javascript

Education

Any Graduate

Apply Now

Back To Jobs

Posted On: Today
Experience: 10+ years of experience
Openings: 1
Category: Sr Data Engineer
Tenure: Flexible Position