Design and implement resilient system architectures for high availability and scalability.
Develop automation tools and scripts to improve operational efficiency.
Define, track, and analyze Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for performance and reliability.
Conduct post-mortem analyses and implement improvements based on findings.
Collaborate on best practices for system reliability and incident management.
Troubleshoot and resolve database, network, and deployment issues.
Ensure issue resolution meets Service Level Agreements (SLAs).
Identify system performance bottlenecks and address them with actionable recommendations.
Maintain documentation for processes and incident responses.