• Participate in platform software engineering, writing code to further reduce human intervention in operational tasks and to automate processes.
• Lead in-depth technical and data analysis to gauge service trends and drive improvements.
• Contribute to prioritizing reliability features and to the design, development, and delivery of effective tooling, alerts, and automated responses that identify and address reliability risks.
• Contribute to proactive technical communication with senior business and technology stakeholders on reliability, stability, and efficiency results (based on Service Level Objectives), service health (via dashboards), and key reliability risks and issues, in order to prioritize activity (based on trend analysis) and direct investment and action.
• Automate the installation and maintenance of test/development servers, release builds, and the deployment of existing tools and dependent solutions.
• Design and take ownership of innovations that improve software engineering velocity, infrastructure resiliency, and security.
• Evaluate new application packages and tools, and research best practices.
• Debug and find the root cause of infrastructure-related errors in ongoing operations.
• Review, verify, and validate the software code developed in the DevOps project.
Required Qualifications
• 6+ years of experience in software development and DevSecOps/SRE functions, with at least two years in a senior technical capacity.
• You are either a software engineer with a real interest in systems, networking, monitoring, and automation, or an experienced sysadmin or systems engineer with professional Linux skills, development experience managing distributed systems at scale, and demonstrable interest and experience in using software engineering to solve operational problems.
• Comfortable writing software to automate API-driven tasks at scale. Tooling engineers primarily use Java, C/C++, NodeJS, Python, and Go.
• Experience automating the build and deployment of software products, and an understanding of the related challenges in distributed systems.
• Ability to quickly and clearly communicate incident status via email in business-friendly language.
• Experience with and advanced understanding of observability tools (e.g., ELK, Grafana/Prometheus, Zabbix, Nagios).
• Experience designing and implementing CI/CD and release management solutions.
• Well-rounded, broad knowledge of OS platforms (Linux/UNIX), networking, web systems, and DevSecOps.
• Experience working with large-scale distributed systems, with an understanding of microservices architecture concepts.
• Strong organizational skills and the ability to effectively manage multiple tasks.
• Experience with containers and CD tools (e.g., Pulumi, Docker, Ansible, Puppet).
• Experience with integration and build tools: Jenkins, Groovy, Maven, Atlassian Suite, GitLab CI.
Preferred Qualifications
• Degree in Computer Science, Engineering, or equivalent experience.
• Familiarity with Agile/Lean methodologies, particularly Kanban and Scrum.
• Good understanding and knowledge of distributed systems.
• Experience with programming languages such as C/C++, Java, Golang, and Python.
• Experience with IaC (Infrastructure as Code), e.g., Pulumi/Terraform.
• Good understanding of the Linux kernel and hardware optimizations.
• Experience with networking, storage (SAN/NAS), and virtualization technology.
Job Types: Full-time, Permanent
Pay: RM10,000.00 - RM18,000.00 per month
Benefits:
Health insurance
Professional development
Work Location: In person