Required Skills: Datadog, Prometheus, Grafana, New Relic
Job Description
Job Title: Datadog Engineer / Datadog Monitoring Specialist
Location: Houston
Job Type: Contract
Description:
We are seeking a skilled and detail-oriented Datadog Engineer to join our team. In this role, you will implement, configure, and manage Datadog monitoring across our infrastructure to keep our systems performant and highly available. You will work closely with development, operations, and security teams to build efficient, scalable monitoring that provides actionable insights into the health of our infrastructure and applications.
Responsibilities:
- Datadog Implementation & Configuration:
  - Configure and deploy Datadog across various cloud environments, on-premises servers, and containerized applications.
  - Set up custom dashboards, monitors, and alerts to track application, infrastructure, and network performance.
- Monitoring & Alerting:
  - Design and implement monitoring solutions to ensure the availability, performance, and reliability of systems and applications.
  - Create and fine-tune alerts and thresholds to proactively detect performance degradation, outages, and other critical incidents.
- Collaboration & Troubleshooting:
  - Collaborate with engineering and operations teams to integrate Datadog monitoring with CI/CD pipelines and cloud infrastructure.
  - Troubleshoot and analyze issues identified through Datadog, providing recommendations for resolution.
- Performance Optimization:
  - Work with engineering teams to analyze metrics, logs, and traces to identify performance bottlenecks and suggest improvements.
- Reporting & Documentation:
  - Generate and deliver comprehensive reports to stakeholders on system performance, outages, and trends.
  - Maintain up-to-date documentation on Datadog configurations, integrations, and best practices.
Qualifications:
- Experience & Skills:
  - Strong experience working with Datadog or similar monitoring tools (e.g., Prometheus, Grafana, New Relic).
  - Hands-on experience with cloud platforms (AWS, Azure, GCP) and containerization tools (Docker, Kubernetes).
  - Proficiency in scripting languages like Python, Bash, or Go for automation.
  - Knowledge of infrastructure as code (IaC) tools such as Terraform, Ansible, or CloudFormation.
- Educational Background:
  - A degree in Computer Science, Information Technology, or a related field, or equivalent work experience.
- Additional Skills (Preferred):
  - Experience in distributed systems and microservices architectures.
  - Familiarity with APM tools and log management platforms.
  - Strong problem-solving and analytical skills.