Required Skills: Azure, AWS, Kafka, Spark, MLflow, AutoML
Job Description
Core Technical Responsibilities:
• Architect and optimize big data pipelines using Apache Spark, Delta Lake, and Databricks-native tools.
• Design scalable data ingestion and transformation workflows, including batch and streaming (e.g., Kafka, Spark Structured Streaming).
• Create guidelines for configuring and integrating Databricks with existing security tools used for data access control.
• Implement data security and governance using Unity Catalog, access controls, and data classification techniques.
• Support migration of legacy systems to Databricks on cloud platforms like Azure, AWS, or GCP.
• Manage cloud platform operations with a focus on FinOps support, optimizing resource utilization, cost visibility, and governance across multi-cloud environments.
Collaboration & Advisory
• Act as a technical advisor to data engineering and analytics teams, guiding best practices and performance tuning.
• Partner with architects and business stakeholders to align Databricks solutions with enterprise goals.
• Lead proof-of-concept (PoC) initiatives to demonstrate Databricks capabilities for specific use cases.
Strategic & Leadership Contributions
• Mentor junior engineers and promote knowledge sharing across teams.
• Contribute to platform adoption strategies, including training, documentation, and internal evangelism.
• Stay current with Databricks innovations and recommend enhancements to existing architectures.
Specialized Expertise (Optional but Valuable)
• Machine Learning & AI integration using MLflow, AutoML, or custom models.
• Cost optimization and workload sizing for large-scale data processing.
• Compliance and audit readiness for regulated industries.
Qualifications:
• Bachelor’s degree in computer science.
• At least 6 years of experience in IT cloud infrastructure, architecture, and operations, including security, with at least 3 years in a platform admin role.
• Strong understanding of data security principles and best practices.
• Expertise in Databricks platform, security features, Unity Catalog, and data access control mechanisms.
• Experience with data classification and masking techniques.
• Strong understanding of cloud cost management, with hands-on experience in usage analytics, budgeting, and cost optimization strategies across multi-cloud platforms.
• Strong knowledge of cloud architecture, design, and deployment principles and practices, including microservices, serverless, containers, and DevOps.
• Deep expertise in Azure/AWS big data & analytics technologies, including Databricks, real-time data ingestion, data warehouses, serverless ETL, NoSQL databases, DevOps, Kubernetes, virtual machines, web/function apps, and monitoring and security tools.
• Deep expertise in Azure/AWS networking and security fundamentals, including network endpoints & network security groups, firewalls, external/internal DNS, load balancers, and virtual networks and subnets.