Required Skills: Python, AWS CloudFormation, ETL, PySpark, UNIX, RESTful APIs
Job Description
Job Summary:
The primary role of the Data Engineer-Developer is to function as a critical member of a data team by designing data integration solutions that deliver business value in line with the company's objectives. They are responsible for the design and development of data/batch processing, data manipulation, data mining, and data extraction/transformation/loading into large data domains using Python/PySpark and AWS tools.
Location: Charlotte, North Carolina
Education: Bachelor's degree
Responsibilities:
- Provide scoping, estimating, planning, design, development, and support services for projects.
- Identify and develop the detailed technical design documentation.
- Work with developers and business areas to design, configure, deploy, and maintain custom ETL infrastructure to support project initiatives.
- Design and develop data/batch processing, data manipulation, data mining, and data extraction/transformation/loading (ETL Pipelines) into large data domains.
- Document solution alternatives that support business processes and objectives, and present them to clients.
- Work with business analysts to understand and prioritize user requirements.
- Design, develop, test, and implement application code.
- Follow proper software development lifecycle processes and standards.
- Perform quality analysis of products; track and classify defects.
- Track progress and intervene as needed to eliminate barriers and ensure delivery.
- Resolve or escalate problems and manage risk for both development and production support.
- Maintain deep knowledge and awareness of industry best practices and trends, especially in technology and methodologies.
Qualifications:
- At least 7 years of development experience focused specifically on data engineering
- Strong hands-on experience in data engineering development using Python and PySpark as ETL tools (an illustrative sketch follows the Qualifications section)
- Hands-on experience with AWS services such as Glue, RDS, S3, Step Functions, EventBridge, Lambda, MSK (Kafka), and EKS
- Hands-on experience with databases such as PostgreSQL, SQL Server, Oracle, and Sybase
- Hands-on experience with SQL database programming, SQL performance tuning, relational model analysis, queries, stored procedures, views, functions, and triggers
- Strong technical experience in design (mapping specifications, HLD, LLD) and development (coding, unit testing)
- Good knowledge of CI/CD and DevOps processes and tools such as Bitbucket, GitHub, and Jenkins
- Strong foundation and experience with data modeling, data warehousing, data mining, data analysis, and data profiling
- Strong experience with Agile/SCRUM methodology
- Good communication and interpersonal skills
Nice-to-Have Skills:
- Knowledge in developing UNIX scripts
- Working knowledge of ERWIN
- Experience with reporting tools such as Tableau or Power BI is a plus
- Experience working with REST APIs
- Experience with other ETL tools (DataStage, Informatica, Pentaho, etc.)
- Experience with workload automation tools such as Control-M and AutoSys
- Working knowledge of Data Science concepts
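
For illustration only, the following is a minimal sketch of the kind of Python/PySpark ETL pipeline this role involves. The S3 paths and column names are hypothetical, and on AWS Glue the Spark session would typically be obtained from a GlueContext instead.

from pyspark.sql import SparkSession, functions as F

# Build a Spark session (on AWS Glue, this would come from the GlueContext).
spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw CSV files from a hypothetical S3 landing bucket.
raw = spark.read.csv("s3://example-landing-bucket/orders/", header=True, inferSchema=True)

# Transform: basic cleansing plus a derived column (illustrative column names only).
cleaned = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_date", F.to_date("order_date"))
       .withColumn("total_amount", F.col("quantity") * F.col("unit_price"))
       .filter(F.col("total_amount") > 0)
)

# Load: write partitioned Parquet to a curated zone for downstream consumption.
cleaned.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-curated-bucket/orders/"
)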
Skills:
- Python
- AWS CloudFormation
- ETL
- PySpark
- UNIX
- RESTful APIs