Experience in managing Tanzu Application Service, Kubernetes clusters, DevOps practices, Terraform
Programming Languages: C, C++, Java, Python, Go, Perl, or Ruby (any one of them)
Experience: 5 to 7 years
Roles & Responsibilities
Job Purpose
The Platform Reliability Engineer (PRE) is responsible for engineering, operating, and maintaining GEL's internal container platform and its supporting infrastructure, with a strong focus on reliability, resiliency, and security. As a Senior PRE within GEL's Infrastructure team, you will play a pivotal role in designing, building, and operating distributed container hosting solutions using Broadcom's Tanzu product.
The Job
As a Senior Platform Reliability Engineer, you will play a key role in maintaining the stability, reliability, and efficiency of GEL's internal container platform and its supporting infrastructure. Your responsibilities will include core operational tasks such as resource provisioning and management, responding to platform and application outages, capacity planning, monitoring, and driving reliability enhancements.
You will continuously evaluate the platform's technical architecture to ensure it scales effectively with evolving application demands.
This includes proactively identifying and resolving reliability issues, analyzing product dependencies, pinpointing performance bottlenecks, and implementing optimization strategies to enhance platform availability and cost efficiency.
In this role, you will participate in a 24/7 on-call rotation, promptly addressing alerts from the global monitoring team and resolving production incidents to maintain platform and application uptime. Additionally, you will regularly review team workflows to identify manual processes and implement automation solutions that reduce effort and minimize human error.
Regularly review security advisories issued by Broadcom for the Tanzu suite of products and deploy product updates as required to keep the platform free of known vulnerabilities.
Work with open-source technologies, CI/CD and SCM tools as necessary, and source control such as Bitbucket, and implement container technologies (e.g. Docker and Kubernetes) for the organization. Stay current with industry trends and propose new ways for our business to improve.
Takes accountability in considering business and regulatory compliance risks and takes appropriate steps to mitigate the risks.
Maintains awareness of industry trends on regulatory compliance, emerging threats and technologies in order to understand the risk and better safeguard the company.
Highlights any potential concerns/risks and proactively shares best risk management practices.
Our Requirements
Working experience as a Platform Reliability Engineer or strong working experience as a Site Reliability Engineer in a cloud operating environment. Candidates with excellent DevOps experience will be considered.
Strong experience in managing Tanzu Application Service and Kubernetes clusters.
Good working knowledge of DevOps pipeline and automation tools (e.g. Selenium, SoapUI, Bamboo, Jenkins, Ansible, Maven, GitHub, Bitbucket, Nexus, Jira, Confluence).
Strong technical and business acumen with the ability to lead a small technical team.
Experience with infrastructure-as-code, server templating, orchestration, configuration management and provisioning tools (e.g. Terraform, Chef, Docker, Packer, Kubernetes) is advantageous.
Must be able to write, debug and optimize code and automate repetitive tasks.
Systematic problem-solving approach, coupled with effective communication skills and a sense of ownership and drive.
Experienced in one or more of the following: C, C++, Java, Python, Go, Perl or Ruby.
Strong experience in a Continuous Integration/Continuous Delivery (CI/CD) environment, with a strong appreciation of change/version control processes and methodologies.
Strong experience in dealing with platform upgrades, patching and buildpack management.
Strong experience in troubleshooting network-related issues.
Good working knowledge of the NSX-T solution and its integration with products in the Tanzu suite.
Candidate should be open to taking up on-call support on a rotation basis.
Candidate should be willing to work in shifts.
Job Types: Full-time, Contract
Contract length: 12 months
Pay: RM7,000.00 - RM7,500.00 per month
Work Location: In person