Required Skills: Spark, Hadoop, AWS, Azure, Python, Agile, Scrum, Java
Job Description
Key Responsibilities
Design, develop, and maintain scalable data pipelines using Python
Work hands-on with data ingestion, transformation, and loading processes (ETL/ELT)
Collaborate with data architects and business teams to define data solutions
Develop reusable frameworks for data processing and data quality
Optimize data workflows for performance and scalability
Work with cloud platforms such as GCP, AWS, or Azure
Ensure data quality, consistency, and governance across systems
Support production systems and troubleshoot data issues
Write complex SQL queries for data extraction and analysis
Participate in code reviews and enforce engineering best practices
Mandatory Skills
Strong expertise in Python programming (hands-on coding required)
Experience building data pipelines and ETL processes
Proficiency in SQL and relational databases
Experience with cloud platforms (GCP/AWS/Azure)
Knowledge of BigQuery
Hands-on experience with big data technologies (e.g., Spark, Hadoop)
Experience with data integration tools and frameworks
Strong understanding of data modeling and data warehousing concepts
Experience with version control systems such as Git