Required Skills: QuickETL, AWS Glue-ETL, Data modelling, SQL, Python, Spark, AWS S3, EMR, Athena, Databricks
Job Description
SQL (ANSI, Spark SQL) – Must be highly proficient in writing efficient, complex analytical queries (multi-join, unnest, etc.). Data modelling, query optimization, the ability to handle large data sets, and OLAP experience are also expected.
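For orientation only, the following is a rough sketch of the kind of analytical query meant here, run through Spark SQL from Python. The table and column names (orders, customers, items, and so on) are hypothetical and not taken from this posting, and the tables are assumed to be registered in a metastore such as the Glue Data Catalog.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("analytical-query-sketch").getOrCreate()

    # Multi-way join plus an unnest (explode) over an array-of-struct column.
    result = spark.sql("""
        SELECT c.region,
               i.sku,
               SUM(i.qty * i.unit_price) AS revenue
        FROM orders o
        JOIN customers c
          ON o.customer_id = c.customer_id
        LATERAL VIEW explode(o.items) exploded AS i
        WHERE o.order_date >= DATE '2024-01-01'
        GROUP BY c.region, i.sku
        ORDER BY revenue DESC
    """)
    result.show()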
Must-have skills (mandatory): SQL, Python, Spark, AWS S3, EMR, Athena
Good to have: QuickETL
Need to move/migrate existing pipelines to paved paths.
- Knowledge of columnar data formats such as Parquet and ORC
- AWS Glue - ETL, Workflows (e.g., building a Glue Spark job to transform raw JSON data from S3 into Parquet or ORC; see the sketch after this list), Glue Data Catalog, Hive
- Amazon Kinesis (Data Streams, Data Firehose), Amazon SageMaker, AWS S3, AWS CloudFormation or AWS CDK, Amazon Athena
- Databricks SQL, Notebooks
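As a rough illustration of the Glue ETL task described above (raw JSON on S3 rewritten as Parquet), a minimal sketch of such a job follows. The bucket names and paths are hypothetical, and the script assumes the standard Glue PySpark job environment.

    import sys
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    # Standard Glue job bootstrap; JOB_NAME is supplied by Glue at run time.
    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read raw JSON from S3 into a DynamicFrame (paths are illustrative only).
    raw = glue_context.create_dynamic_frame.from_options(
        connection_type="s3",
        connection_options={"paths": ["s3://example-raw-bucket/events/"]},
        format="json",
    )

    # Write the same records back to S3 as Parquet.
    glue_context.write_dynamic_frame.from_options(
        frame=raw,
        connection_type="s3",
        connection_options={"path": "s3://example-curated-bucket/events_parquet/"},
        format="parquet",
    )

    job.commit()

The Parquet output can then be registered in the Glue Data Catalog (for example via a crawler) and queried from Athena or EMR.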