Required Skills: Snowflake, Redshift, BigQuery, Azure Synapse, AWS Glue, Athena, Lambda, EMR, Docker, Kubernetes
Job Description
About the Role:
We are looking for a Data Engineer to join our team and help build and maintain scalable data infrastructure. You will design, implement, and optimize data pipelines, ensure data quality, and support analytics and machine learning initiatives. If you are passionate about big data technologies and cloud platforms, we'd love to hear from you!
Key Responsibilities:
Design, develop, and maintain scalable ETL/ELT pipelines to process large datasets
Build and optimize data warehouses and data lakes on platforms like AWS, GCP, or Azure
Work with structured and unstructured data to support analytics and business intelligence
Ensure data quality, governance, and security across data platforms
Collaborate with data scientists, analysts, and software engineers to support data-driven initiatives
Optimize query performance and data storage for efficient data processing
Automate workflows and monitor data pipelines for reliability
Work with batch and real-time data processing frameworks (e.g., Apache Spark, Kafka, Airflow)
Required Qualifications:
Education: Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
Experience: 3+ years of experience in data engineering or a related role
Technical Skills:
🔹 Proficiency in SQL and database management (PostgreSQL, MySQL, Snowflake, Redshift, BigQuery)
🔹 Experience with Python, Scala, or Java for data processing
🔹 Hands-on experience with ETL/ELT tools like Apache Airflow, dbt, or Informatica
🔹 Strong knowledge of big data technologies (Spark, Hadoop, Databricks, Kafka)
🔹 Experience with cloud platforms (AWS, Azure, or GCP) and data services (S3, Glue, BigQuery, Redshift, Synapse)
🔹 Familiarity with CI/CD pipelines, Docker, and Kubernetes for deploying data workloads
🔹 Understanding of data governance, security, and compliance (GDPR, HIPAA, etc.)
Nice-to-Have Skills:
Experience with machine learning pipelines and MLOps
Knowledge of data streaming technologies (Flink, Pulsar)
Exposure to Infrastructure as Code (IaC) tools such as Terraform