Site Reliability Engineer (Senior) Id27316

Job details

What you will do

- Manage alerts day to day, check systems, and escalate issues as necessary.
- Be part of a team that provides 24x7 on-call support for critical SaaS events.
- Be available in emergencies when team members are unavailable or need help.
- Document issues and remediation steps.
- Proactively create appropriate monitors in the EKS/K8s ecosystem.
- Deploy to EKS/K8s clusters using Terraform and Helm.
- Learn and maintain existing infrastructure running under Docker Swarm.
- Improve existing infrastructure health by implementing checks and scripts to correct known issues.
- Maintain and develop deployment code.
- Automate tasks that are currently executed manually.
- Implement and integrate new technologies in our cloud infrastructure.
- Collaborate with other teams and departments to provide the highest level of support and assistance.
- Apply a real customer focus when planning deployments and updates, keeping the customer front of mind and considering the impact on them before making changes.
- Work closely on solutions with the Support, Customer Success, Migration, and Professional Services teams to provide best-in-class SaaS service to our customers.
- Perform root cause analysis (RCA) and take the corrective actions needed to prevent issues from recurring.
- Create alert-related actions and assign them to the appropriate team after the investigation.
- Handle support requests for environment-specific actions.
- Identify and provide automation requirements to improve RCA.

Must haves

- Hands-on experience as an AWS Cloud Engineer.
- Working knowledge of EKS, Terraform, and Helm.
- Working experience with Docker and Docker Swarm.
- Good understanding of AWS IAM roles and policies.
- Logging and monitoring of AWS resources using CloudWatch Logs.
- Experience working in Linux environments.
- Proficiency in Bash and/or Python scripting.
- A strong understanding of web technologies such as REST APIs.
- Working experience with monitoring solutions such as Grafana and Prometheus.
- Excellent oral and written communication skills.
- Customer-facing communication skills to explain issues and RCAs effectively to customers.
- Experience in product/application support for SaaS-based products.
- Understanding of APIs, databases, systems architecture, and design.
- Experience designing, implementing, and operating in a DevSecOps environment.
- Upper-intermediate English level.


Nominal salary: Negotiable

Source: Whatjobs_Ppc

