Required Skills: Python, SQL, Scala; Apache Spark, Hadoop; Airflow, Kafka, dbt; AWS (S3, Redshift, Glue), GCP (BigQuery), Azure (Data Factory); Snowflake, Databricks; ETL/ELT pipelines; Data Lakes & Data Warehouses; Parquet, JSON, Avro; Git, Jenkins, Terraform; Data modeling & data governance; CI/CD & DevOps basics; Strong problem-solving & team leadership
Job Description
Job Title: Senior Data Engineer
Location: CA, TX, NJ, NY, GA, AZ
Contract Type: W2
Experience Required: 9+ years
Job Summary:
We are looking for a highly skilled Senior Data Engineer with 9+ years of experience in designing, building, and maintaining large-scale data pipelines and infrastructure. The ideal candidate will be proficient in both batch and real-time data processing, have strong experience with cloud data platforms, and be capable of mentoring team members and leading end-to-end data projects.
Key Responsibilities:
- Design, develop, and maintain scalable and reliable data pipelines using modern data engineering tools and frameworks.
- Build ETL/ELT pipelines for structured and unstructured data from multiple sources.
- Implement and manage data lake, data warehouse, and real-time streaming solutions.
- Work closely with Data Scientists, Analysts, and Product teams to understand data requirements and deliver solutions.
- Ensure data quality, integrity, and governance across all pipelines.
- Optimize performance and cost-effectiveness of data processes in the cloud.
- Lead architecture and design discussions for new data initiatives.
- Mentor junior data engineers and contribute to best practices and code reviews.
Required Skills:
- Proficiency in Python or Scala for data engineering.
- Strong experience with SQL, Spark, Kafka, and Airflow.
- Deep understanding of ETL/ELT processes, data modeling, and data architecture.
- Hands-on experience with cloud data platforms such as AWS (Redshift, Glue, EMR, S3), GCP (BigQuery, Dataflow), or Azure (Synapse, Data Factory).
- Experience with data lakes, data warehouses, and columnar storage formats (Parquet, ORC).
- Knowledge of DevOps practices, CI/CD for data pipelines, and infrastructure as code.
- Experience with data governance, security, and compliance (GDPR, HIPAA, etc.).
- Excellent problem-solving, communication, and leadership skills.