OCBC

Linux Site Reliability Engineer (SRE)

This job is no longer accepting applications

  • Posted 2 months ago

Job Description

ABOUT THE ROLE

The successful candidate will be part of the OCBC Bank Infrastructure as a Service team based in Cyberjaya. The role focuses on process improvement, incident reduction, and service quality improvement.

JOB RESPONSIBILITIES:

  • You partner with various technology teams to design and deliver a reliable, scalable, secure, and performant Red Hat Linux Platform.
  • You stay current on technical trends to suggest innovative tools and approaches to interesting problems in the Bank.
  • You share your expertise with the entire Engineering/Operations organization.
  • You participate in a 24/7 on-call rotation and drive improvements using SRE practices.
  • You actively participate in toil elimination, observability and monitoring improvements, knowledge management, error budget compliance, deployment designs and testing.

JOB REQUIREMENTS:

  • Bachelor's degree and/or equivalent experience in Information Technology, Computer Science or Business Management.
  • More than 6 years of relevant experience on the Red Hat Linux Platform.
  • Install, maintain, upgrade, and patch UNIX servers in the organization.
  • Troubleshoot and fix system and software/hardware issues.
  • Support and maintain high availability of systems using clustering software.
  • Secure the systems by following published hardening guidelines.
  • Assist in audit and compliance tasks.
  • Perform Disaster Recovery activities.
  • A passion for problem solving with strong analytical capabilities.
  • Demonstrable web development experience, including the ability to write APIs.
  • Know at least one of Python, PowerShell, Ruby, Java, C++, C#, Go at an intermediate level.
  • Experience querying relational and NoSQL databases.
  • Experience automating releases and working with continuous integration/delivery systems and related infrastructure tools.
  • Experience with or understanding of Software Defined Data Centre and AWS/Azure-based cloud-native infrastructure and managed services, such as EC2, S3 and other storage options, VPCs, and IAM.
  • Experience with infrastructure as code (Terraform or CloudFormation).
  • Knowledge of configuration management systems such as SCCM, Ansible, Puppet, and Chef.

KNOWLEDGE, SOFT SKILLS AND ABILITIES:

  • Excellent communication skills, with the ability to explain complex technological ideas and concepts in a way that a non-technical audience can understand.
  • Must display excellent teamwork skills, written and oral communication skills, and formal documentation skills.
  • Exceptional information-sharing skills, with a sense of responsibility and attention to detail.
  • Proactive incident management, with situational analysis and decision-making abilities.
  • Ability to work independently, with little guidance, while staying focused.
  • Extensive experience collaborating with vendors, professional services, and providers.

Job ID: 126513797