Required Skills: ETL Pipelines, ELT Pipelines, Apache Airflow, DBT, MongoDB, DynamoDB, RDBMS
Job Description
We are seeking a highly motivated and skilled Big Data Architect to join our team. This is an onsite role available at multiple locations.
Technical Expertise:
Performance Optimization:
- Proven track record of fine-tuning large-scale databases, including indexing, partitioning, and query optimization.
- Experience in schema redesign and migration strategies.
Modern Data Solutions:
- Knowledge of multi-modality data handling and NoSQL solutions (e.g., MongoDB, DynamoDB).
- Familiarity with ETL/ELT pipelines and tools like Apache Airflow, DBT, or similar.
Key Responsibilities
Data Architecture Modernization:
- Analyze the current RDBMS-based architecture for virtual assistant conversational data.
- Redesign and modernize data schemas to support scalability, performance, and multi-modality use cases.
- Incorporate emerging data storage technologies such as BigQuery, Snowflake, or other cloud-native platforms.
Optimization and Fine-Tuning:
- Evaluate and improve indexing, partitioning, and sharding strategies to optimize query performance.
- Refactor existing schemas and table structures for efficient data retrieval and storage.
- Implement best practices for data normalization and denormalization as required by the use cases.
Migration Strategy:
- Develop a detailed migration plan for transitioning data from the current RDBMS to modern platforms.
- Ensure data consistency, integrity, and minimal downtime during migration.
- Work with DevOps and engineering teams to automate migration processes and set up monitoring tools.
Support Multi-Modality Data Needs:
- Design data models that can handle multi-modality data (text, images, audio, etc.) effectively.
- Enable seamless integration of new data types into the existing architecture.
Collaboration and Governance:
- Collaborate with engineering, analytics, and AI/ML teams to align the data architecture with their needs.
- Define and enforce data governance, quality standards, and security policies.
- Document architectural decisions and maintain up-to-date diagrams and schemas.
Performance Monitoring and Maintenance:
- Implement tools to monitor database performance and identify bottlenecks.
- Proactively recommend improvements to maintain high availability and reliability.
- Plan for future data growth and evolving business requirements.
- Design and implement scalable and reliable data solutions.
- Develop data architecture blueprints and roadmaps.
- Lead the development of data warehousing and ETL processes.
- Ensure data quality and integrity across all systems.
- Collaborate with cross-functional teams to understand data needs and requirements.
- Provide technical guidance and mentorship to junior team members.
- Evaluate and recommend new data technologies and tools.