Description

About the Role

We are seeking a highly skilled Databricks Architect with deep expertise in designing and building real-time, high-throughput data pipelines in Databricks across Azure, AWS, and GCP. The ideal candidate will have strong hands-on experience with Scala and Apache Spark, along with deep familiarity with modern data lake architectures. Experience working with data requirements from financial institutions, particularly around security, governance, and performance, is highly preferred.

Key Responsibilities

  • Design and implement real-time streaming and batch data pipelines using Databricks + Spark (Scala preferred; see the illustrative sketch after this list)
  • Architect and optimize data lakehouse platforms on Azure (primary), AWS, and GCP
  • Collaborate with data engineering, analytics, DevOps, and governance teams to ensure secure and performant data architectures
  • Lead end-to-end architecture for ingestion, transformation, enrichment, and publishing of large volumes of structured and unstructured data
  • Design for low-latency processing, auto-scaling, and fault tolerance for high-throughput pipelines
  • Drive infrastructure-as-code adoption (Terraform, CI/CD) to automate deployments and platform provisioning
  • Apply best practices for data quality, lineage, governance (Unity Catalog / Purview), and security (RBAC, encryption, private networking)
  • Interface with financial clients to understand regulatory constraints, auditability, and secure access needs
  • Mentor and guide engineering teams on best practices for Databricks, Spark, and real-time data solutions
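
By way of illustration, the sketch below shows a minimal Scala Structured Streaming job of the kind described in the first responsibility above (Kafka source, Delta sink, checkpointing for fault tolerance). The broker address, topic, event schema, paths, and table name are hypothetical placeholders, and the example assumes a Databricks runtime with the Kafka connector and Delta Lake available.

// Minimal illustrative sketch: Kafka -> Structured Streaming -> Delta table.
// All names (broker, topic, schema, paths, table) are hypothetical placeholders.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

object TransactionStreamSketch {
  def main(args: Array[String]): Unit = {
    // On Databricks an active SparkSession already exists; getOrCreate reuses it
    val spark = SparkSession.builder().appName("transaction-stream").getOrCreate()

    // Hypothetical schema for incoming JSON events
    val eventSchema = new StructType()
      .add("txn_id", StringType)
      .add("amount", DoubleType)
      .add("event_time", TimestampType)

    // Read a stream of JSON events from a Kafka topic (placeholder broker/topic)
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "transactions")
      .load()
      .select(from_json(col("value").cast("string"), eventSchema).as("e"))
      .select("e.*")

    // Append to a Delta table; the checkpoint provides fault-tolerant,
    // exactly-once writes to the sink
    events.writeStream
      .format("delta")
      .option("checkpointLocation", "/mnt/checkpoints/transactions")
      .outputMode("append")
      .toTable("raw.transactions")
      .awaitTermination()
  }
}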

Required Skills and Experience

Core Technical Expertise:

  • Databricks (Azure, AWS, GCP) – deep understanding of cluster management, workspace security, Delta Lake, Unity Catalog
  • Scala (strong proficiency) – for high-performance data processing jobs
  • Apache Spark (Structured Streaming, Core APIs)
  • Real-time Data Pipelines – using Kafka, Delta Live Tables, or Apache Flink
  • Cloud Platforms:
      • Azure: ADF, Synapse, Event Hubs, Azure Data Lake Gen2, Azure Key Vault, Private Link, Azure Monitor
      • AWS: S3, Glue, Kinesis, Redshift, IAM, CloudWatch
      • GCP: GCS, BigQuery, Dataflow, Pub/Sub

Data Engineering & Orchestration:

  • Delta Lake architecture (ACID transactions, schema evolution, time travel; illustrated in the sketch after this list)
  • Data modeling (Star, Snowflake, Data Vault)
  • Orchestration with Azure Data Factory, Airflow, dbt, or Dagster
  • Proficiency in SQL and Python for scripting and notebooks
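
A brief, notebook-style Scala sketch of the Delta Lake capabilities listed above (ACID upserts via MERGE, schema evolution on write, and time travel reads). The table names and paths are hypothetical placeholders, and each snippet is shown independently of the others.

// Illustrative only: ACID upsert (MERGE), schema evolution, and time travel.
// Table names and paths are hypothetical placeholders.
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// ACID upsert: merge a batch of staged updates into an existing Delta table
val target  = DeltaTable.forName(spark, "silver.customers")
val updates = spark.read.format("delta").load("/mnt/staging/customer_updates")

target.as("t")
  .merge(updates.as("u"), "t.customer_id = u.customer_id")
  .whenMatched().updateAll()
  .whenNotMatched().insertAll()
  .execute()

// Schema evolution: new columns in the source are added to the table on append
updates.write
  .format("delta")
  .mode("append")
  .option("mergeSchema", "true")
  .saveAsTable("silver.customers")

// Time travel: read the table as it was at an earlier version
val customersV10 = spark.read
  .format("delta")
  .option("versionAsOf", "10")
  .load("/mnt/delta/silver/customers")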

Security and Governance:

  • Role-based access control (RBAC) and token-based authentication
  • Data encryption (in transit and at rest), network security (Private Link, VNet/Security Groups)
  • Unity Catalog, Azure Purview, or Alation for data governance

DevOps & Infra:

  • CI/CD pipelines using GitHub Actions, Azure DevOps, or Jenkins
  • Infrastructure-as-Code (IaC) using Terraform, Pulumi, or ARM templates
  • Monitoring with Datadog, Azure Monitor, or Prometheus/Grafana

Preferred Qualifications

  • Experience working with financial institutions or in regulated environments
  • Strong understanding of data compliance (GDPR, CCPA, SOX, PCI, etc.)
  • Familiarity with machine learning pipelines on Databricks
  • Knowledge of event-driven architectures and microservices

Soft Skills

  • Excellent communication and stakeholder management skills
  • Leadership experience in mentoring and guiding engineering teams
  • Ability to translate complex requirements into scalable designs
  • Strong documentation skills and a solution-oriented mindset

Education

Bachelor's or Master's degree in Computer Science, Engineering, or a related field

Certifications (Preferred)

  • Databricks Certified Data Engineer Professional
  • Azure Solutions Architect / AWS Certified Solutions Architect / GCP Cloud Architect
  • Terraform Associate / HashiCorp Certified

 
