Lead the design and development of end-to-end batch and streaming data pipelines.
Architect and maintain cloud-based data platforms (e.g., AWS, Azure, GCP).
Develop and optimize ETL/ELT processes to ingest data from various internal and external sources.
Implement data governance practices including data quality, cataloging, and lineage.
Collaborate with business and analytics teams to understand data requirements and deliver high-quality datasets.
Mentor junior engineers and provide technical leadership on data projects.
Drive best practices in data engineering, DevOps, and CI/CD pipelines.
Ensure scalability, performance, and cost-efficiency of data platforms.
8+ years of experience in data engineering or related fields.
3+ years of experience in a lead or senior-level role.
Strong proficiency in Python, SQL, and Spark.
Experience with cloud platforms such as AWS (S3, Redshift, Glue, EMR), Azure (Synapse, Data Factory), or Google Cloud (BigQuery, Dataflow).
Deep knowledge of data modeling, warehousing (Snowflake, Redshift, BigQuery), and performance tuning.
Experience with streaming technologies (Kafka, Kinesis, Flink).
Familiarity with orchestration tools (Airflow, Prefect, Dagster).
Exposure to infrastructure as code (Terraform, CloudFormation) and containerization (Docker, Kubernetes).
Strong understanding of data privacy and compliance (e.g., GDPR, HIPAA).
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
Experience with data mesh or data lakehouse architectures.
Background in leading cross-functional data initiatives.
Experience with MLOps or supporting data science workflows.