Required Skills: Amazon Web Services, PySpark, Python, SQL, Scalability, Scripting, Stakeholder Engagement, Supervision, Workflow
Job Description
Role: Data Analyst
Location: Remote
Job Type: Contract
Job Summary
Senior-level Data Engineer / Data Analyst technical lead with experience in data analytics, Databricks, PySpark, and Python.
This is a key role that requires a senior or lead professional with strong communication skills who is proactive in risk and issue management.
Experience and Education Required
10+ years of experience as a Data Analyst, Data Engineer, or Data Scientist, with Databricks on AWS expertise in designing and implementing scalable, secure, and cost-efficient data solutions.
Job Profile:
- Hands-on data analytics experience with Databricks on AWS, PySpark, and Python
- Must have prior experience with migrating a data asset to the cloud using a GenAI automation option
- Experience in migrating data from on-premises to AWS
- Expertise in developing data models and delivering data-driven insights for business solutions
- Experience in pretraining, fine-tuning, augmenting and optimizing large language models (LLMs)
- Experience in designing and implementing database solutions and in developing PySpark applications that extract, transform, and aggregate data to generate insights
- Data Collection & Integration: Identify, gather, and consolidate data from diverse sources, including internal databases and spreadsheets, ensuring data integrity and relevance.
- Data Cleaning & Transformation: Apply thorough data quality checks, cleaning processes, and transformations using Python (Pandas) and SQL to prepare datasets.
- Automation & Scalability: Develop and maintain scripts that automate repetitive data preparation tasks.
- Autonomy & Proactivity: Operate with minimal supervision, demonstrating initiative in problem-solving, prioritizing tasks, and continuously improving the quality and impact of your work.
Technical Skills:
- Minimum of 10 years of experience as a Data Analyst, Data Engineer, or related role, ideally with a bachelor's degree or higher in a relevant field.
- Strong proficiency in Python (Pandas, Scikit-learn, Matplotlib) and SQL, with experience working across various data formats and sources.
- Proven ability to automate data workflows, implement code-based best practices, and maintain documentation to ensure reproducibility and scalability.
Behavioral Skills:
- Ability to perform under pressure and tight constraints, with a proactive approach to risk and issue management.
- Requirement Clarification & Communication: Interact directly with colleagues to clarify objectives and challenge assumptions.
- Documentation & Best Practices: Maintain clear, concise documentation of data workflows, coding standards, and analytical methodologies to support knowledge transfer and scalability.
- Collaboration & Stakeholder Engagement: Work closely with colleagues who provide data, raising questions about data validity, sharing insights, and co-creating solutions that address evolving needs.
- Excellent communication skills for engaging with colleagues, clarifying requirements, and conveying analytical results in a meaningful, non-technical manner.
- Demonstrated critical thinking skills, including the willingness to question assumptions, evaluate data quality, and recommend alternative approaches when necessary.
- A self-directed, resourceful problem-solver who collaborates well with others while confidently managing tasks and priorities independently.