Data Engineer Lead, Data Engineer Architect
  • VRK IT VISION
2 Days Ago
60-65 per hour
W2, C2C, 1099
Remote
10-20 Years
Required Skills: Data Engineer Lead/Architect with Azure Databricks and Unity Catalog
Job Description

We are looking for an experienced Azure Databricks Engineer with strong expertise in cloud-based data engineering, ETL development, and distributed data processing. The ideal candidate should have solid hands-on experience with PySpark, Delta Lake, Azure Data Factory, and building scalable data pipelines on Azure.

The engineer will work closely with business stakeholders, Data Architects, and cross-functional teams to design, develop, and optimize data pipelines for enterprise-grade analytics and reporting.

Key Responsibilities:

Data Engineering & Pipeline Development

Design, develop, and optimize ETL/ELT pipelines using Azure Databricks (PySpark).

Build scalable data ingestion workflows from various structured and unstructured sources.

Implement transformation logic, data cleansing, enrichment, and validation frameworks.

Work with Delta Lake to build medallion architecture (Bronze/Silver/Gold layers).

Develop reusable Databricks notebooks and jobs for production data workflows.

Azure Cloud & Integration

Build and orchestrate pipelines using Azure Data Factory (ADF).

Integrate Databricks with other Azure services: ADLS, Azure SQL, Event Hub, Key Vault, Synapse.

Optimize compute environments (clusters, pools, autoscaling).

Implement DevOps processes using Git, CI/CD, and Azure DevOps.

Performance, Quality & Governance

Optimize PySpark jobs for performance and cost efficiency.

Implement best practices for data governance, security, and access control.

Troubleshoot production issues and perform root-cause analysis.

Conduct code reviews to enforce coding standards and data quality.

Collaboration & Documentation

Work with Data Architects to define architecture and design patterns.

Prepare technical documents, solution diagrams, and runbooks.

Collaborate with business stakeholders to understand requirements and translate them into technical solutions.

Mandatory Skills:

Azure Databricks – notebooks, jobs, workflows, Delta Lake.

PySpark – DataFrames, Spark SQL, optimization & debugging.

Azure Data Factory (ADF) – triggers, pipelines, integration runtime.

Data Lake Storage (ADLS Gen2) – folder structures, partitioning, security.

CI/CD – Git (branching strategies), Azure DevOps pipelines.

SQL – strong proficiency in writing optimized queries.

Good-to-Have Skills:

Azure Synapse Analytics

Azure Event Hub / Kafka

Azure Functions

Databricks REST APIs

Streaming pipelines (Structured Streaming)

Experience with data modelling

Knowledge of Lakehouse architecture

Behavioral & Soft Skills:

Strong analytical and problem-solving skills.

Ability to work independently and in cross-functional teams.

Good communication skills for stakeholder interaction.

Comfortable working in Agile/Scrum models.
