3/9/2025 Incident Management Engineer, AWS Incident Detection and Response
2 weeks ago
DESCRIPTION
ABOUT US
Amazon has built a reputation for excellence with a mission to be the earth's most customer-centric company. Amazon Web Services (AWS) is carrying on that tradition while leading the world in cloud technologies.
The AWS Incident Detection and Response team is part of the Enhanced Support Services (ES2) organization within AWS Support, dedicated to offering eligible AWS Enterprise Support customers proactive engagement and incident management to reduce the potential for failure and to accelerate recovery of critical workloads from disruption. We achieve these objectives by working closely with customers to develop runbooks and response plans customized to the context of each workload onboarded to the service. Onboarded workloads are monitored 24x7 by a team of Incident Management Engineers (IMEs), who detect critical alarms and engage customers on a call bridge within 5 minutes of an alarm.
ABOUT YOU
Incident Management Engineers have a broad skill set with demonstrated career progression and a proven track record of delivering results. The successful candidate will possess strong analytical acumen, solid technology experience, superb business judgment, strategic account ownership, and a propensity to dive deep to solve complex problems. You will also have a passion for providing a world-class experience for our customers. The candidate must understand the competitive and industry landscape and must have the leadership presence and communication skills to work effectively with customers at all levels of their organization. You must be a self-starter, able to execute at both a tactical and a strategic level, with strong attention to detail. This is a global role that requires excellent written and verbal communication skills and a passion for leading the resolution of critical incidents. Your decisions are fundamental to helping protect our most critical customers and will help maintain the health of AWS customers worldwide.
Finally, you are passionate about technology with a desire to learn more and do more with AWS.
ABOUT THE ROLE
AWS Support is looking for a leader with a strong background in Incident Management and customer ownership to be there during the moments that matter for our most critical customers. We are looking for a Major Incident Manager to join our team to provide incident response and account ownership. In this position, you will play a pivotal role in providing communication, emergency response, technical resolver engagement, and incident management for our customers.
Key job responsibilities
1. Drive the resolution of large-scale customer impacting incidents as part of a team rotation.
2. Drive critical, complex customer escalations, some of them technically challenging, in collaboration with Engineering Teams.
3. Provide critical incident response/management (including leading calls with internal/external participants) for customers' critical workloads.
4. Contribute to Problem Records for customers.
5. Conduct continuous real-time proactive monitoring of customer metrics.
6. Prioritize, manage, and own emerging and developing customer issues from start to finish.
7. Monitor and manage communications during high impact events via relevant channels.
8. Collaborate with key stakeholders across AWS to improve the customer experience and develop mechanisms that support operational excellence.
9. Lead projects and virtual teams to drive operational improvements.
10. Create and review documentation; design/influence new standard operating procedures.
11. Identify and troubleshoot recurring platform issues and own projects to drive improvements.
12. Mentor peers in your areas of technical and operational strength.
13. Perform other duties as required by the organization.
BASIC QUALIFICATIONS
1. 1+ year of experience in a similar role.
2. 2+ years of experience with virtualization, orchestration, and cloud computing (e.g., Hypervisors, VMware, Xen).
3. 1+ year of network and operating system support experience.
4. Bachelor's degree in computer science or equivalent, or 3+ years of technical support experience.
PREFERRED QUALIFICATIONS
1. Experience creating or designing cloud application architectures with a focus on high availability and fault tolerance.
2. Experience with data manipulation and/or automation using Python, JavaScript, or shell scripting.
3. Effective prioritization and time management skills and an ability to work in ambiguous environments.
4. Demonstrated critical thinking and logical problem-solving skills.
5. Familiarity with operating or designing distributed architectures, with the ability to correlate system behaviors based on known inter-dependencies.
Amazon is an equal opportunities employer. We believe passionately that employing a diverse workforce is central to our success. We make recruiting decisions based on your experience and skills. We value your passion to discover, invent, simplify and build. Protecting your privacy and the security of your data is a longstanding top priority for Amazon. Please consult our Privacy Notice to know more about how we collect, use, and transfer the personal data of our candidates.