- Mandatory primary skill: PySpark.
- 5-8 years of experience designing and implementing Big Data pipelines using PySpark, along with database migration, transformation, and integration solutions for data warehousing projects.
- Excellent knowledge of Apache Spark and strong Python programming experience.
- Experience developing data processing tasks in PySpark, such as reading data from external sources, merging and aggregating data, performing data enrichment, and loading results into target destinations.
- Hands-on project experience with IDEs such as Jupyter Notebook and PyCharm.
- Experience in performance tuning and troubleshooting performance issues.
- Experience with source-control systems such as Git, GitHub, and SVN.
- Experience deploying and operationalizing code; knowledge of scheduling tools such as Control-M is preferred.
- Experience with CI/CD tools such as Jenkins, Nexus, and Ansible.
- Excellent communication and interpersonal skills.