to lead the design and implementation of advanced monitoring and observability solutions. You'll be instrumental in enabling real-time visibility into the performance and health of systems across Linux, Kubernetes, and OpenStack environments, supporting reliability, scalability, and cloud-native architecture excellence.
Key Responsibilities
1. Monitoring System Design & Implementation
Architect and deploy robust monitoring pipelines using Prometheus and Zabbix.
Ensure full visibility across system performance and resource health metrics.
Manage metric collection and alerting with high reliability and precision.
2. Architecture Optimization & Troubleshooting
Scale observability solutions for low-latency and high-availability environments.
Troubleshoot complex issues across cloud-native infrastructure.
Perform root cause analysis of metric gaps or data inconsistencies.
3. Automation & Development
Build automation tools using Golang and Python for observability operations.
Integrate alerts, anomaly detection, and self-healing capabilities.
Embed observability into CI/CD pipelines and GitOps frameworks.
4. Collaboration & Integration
Collaborate with DevOps, SRE, and platform teams to align monitoring with SLAs.
Integrate with AI-driven analytics and log management tools like ELK Stack.
Build intuitive dashboards using Grafana to present actionable insights.
Required Qualifications
Bachelor's degree or higher in Computer Science or related field.
3+ years of experience in cloud monitoring or observability engineering.
Strong hands-on knowledge of Prometheus, Zabbix, and Grafana.
Proven skills in Linux performance monitoring, Kubernetes, and OpenStack.
Programming experience in Golang (for performance) and Python (for automation).
Familiarity with time-series databases, distributed tracing, and CI/CD integration.
Strong analytical and debugging skills; proactive and system-driven mindset.
Job Type: Full-time
Pay: RM12,000.00 - RM17,000.00 per month
Education:
Bachelor's (Preferred)
Experience:
Prometheus, Zabbix, and Grafana: 2 years (Preferred)
Cloud monitoring or observability engineering: 2 years (Required)
Linux performance monitoring, Kubernetes, and OpenStack: 3 years (Preferred)
Python (for automation): 3 years (Preferred)
Golang (for performance): 2 years (Preferred)
Language:
Mandarin (Required)
Work Location: In person