to join our growing team. If building and maintaining scalable cloud systems, solving complex problems, and ensuring reliability in high-availability environments is your passion, this role is for you. You will use your expertise to design, develop, and document cloud infrastructure and monitoring solutions, while collaborating with team members to gather requirements and deploy cloud technologies that scale our global video advertising platform.
Key Responsibilities
Automate operational tasks to streamline processes and reduce manual effort.
Identify and implement improvements to increase system reliability and minimize incidents.
Manage and support proactive monitoring solutions across production environments.
Investigate trends in recurring issues within live applications and facilitate long-term fixes with development teams.
Perform real-time diagnosis on product codebases (backend/frontend) during major incidents.
Troubleshoot and analyze complex, difficult-to-reproduce problems.
Provide analysis and reporting on recurring production problems.
Collaborate with developers to accelerate bug triage and resolution by clarifying details and user scenarios.
Participate in Tier 3 IT support, including on-call rotation, to ensure 24/7 support for production infrastructure and services.
Create and manage automated infrastructure solutions, including builds and configuration management.
Apply best practices in networking, security, redundancy, scalability, monitoring, and performance for large-scale cloud and on-premises systems.
Stay up to date with new technologies, PaaS offerings, and APIs, and integrate them to optimize infrastructure operations.
Requirements
5+ years of hands-on experience in DevOps/Cloud Operations with tools such as Git, CI/CD environments, and system automation.
Strong scripting/programming skills in PowerShell and at least one of the following: Python, Bash/Shell, Java, JavaScript, Node.js.
Experience with automation/orchestration tools: Jenkins, Ansible, Terraform.
Hands-on experience with AWS in large-scale enterprise environments (other cloud platforms a plus).
Strong knowledge of Docker, Kubernetes, and container orchestration.
Familiarity with monitoring and alerting tools such as Grafana, DataDog, or equivalent.
Understanding of security architecture, redundancy, scalability, and system design principles.
Experience with source control and change management practices.
Ability to create and maintain clear documentation, including technical references, diagrams, and checklists.
Strong written and verbal communication skills, capable of engaging both technical and non-technical stakeholders.
Ability to prioritize and multitask in a fast-paced, mission-critical environment.
Creative, resourceful, and adaptable, with a systems-thinking approach to problem-solving.
Job Type: Full-time
Pay: RM15,000.00 - RM16,000.00 per month
Benefits:
Health insurance
Maternity leave
Opportunities for promotion
Professional development
Work Location: In person