Cloud Reliability Thought Leader
3 days ago
About the Job
We're seeking a talented technical leader to spearhead the development and implementation of next-generation Reliability KPIs across Huawei Cloud.
In this high-impact role, you'll shape how we measure and incentivize engineering efforts around reliability, ensuring alignment between customer experience and cloud system performance.
This position requires deep technical expertise in cloud reliability as well as the ability to influence senior leadership and drive large-scale organizational change.
If you're passionate about aligning cloud infrastructure with business objectives through data-driven decision-making, this is an exciting opportunity for you.
The Cloud Reliability Lab at the Huawei Ireland Research Center aims to bring world-class reliability to Huawei Cloud by solving cross-functional problems that span hardware, software, networking, monitoring, and operations.
Our teams are working on these areas with a diverse mix of people including industry veterans, academic researchers, and Ph.D. student interns.
In your role, you will collaborate with local research teams, other European research centers, and engineering teams spread across the globe.
Responsibilities:
- Lead the definition and evolution of Reliability KPIs across all Huawei Cloud services, ensuring they correlate with real-world customer experiences and organizational objectives.
- Guide the evolution of existing observability platforms in Huawei Cloud to balance trade-offs between observability coverage, system performance, and operational cost.
- Build scalable solutions for high availability in observability systems themselves.
- Integrate Critical User Journeys (CUJs) into observability systems to ensure the reliability of critical customer-facing workflows.
- Evolve incident management practices to align more closely with Reliability KPIs, driving improvements in incident response processes.
Requirements:
- 10+ years of experience in cloud infrastructure, with 5+ years in architect or leadership roles.
- Proven expertise in architecting and defining Reliability KPIs in hyperscale cloud environments, ensuring they map directly to business outcomes (customer experience, service uptime, cost optimization).
- Deep understanding of distributed systems development, maintenance, debugging, and the trade-offs involved in building observability solutions at scale.
- Experience in driving organizational change around observability, reliability practices, and incident management.
- Exceptional communication skills, with the ability to align teams around a shared vision for cloud reliability and observability, while also influencing senior leaders to prioritize reliability in business and operational decision-making.
-
Cloud Reliability Engineering Leader
6 days ago
Dublin, Dublin City, Ireland TN Ireland Full time
Abbott is seeking a Cloud Reliability Engineering Leader to join our team in TN Ireland. As an experienced leader, you will be responsible for driving the reliability of our platform and product operations. This role is ideal for someone who thrives in a fast-paced environment, where collaboration and drive are key to success. You will lead a team of Site...
-
Cloud Operations Leader
16 hours ago
Dublin, Dublin City, Ireland Amazon Full time
About the Role
Amazon is seeking an experienced Cloud Operations Leader to spearhead the launch of our European Sovereign Cloud (ESC) initiative. As part of the AWS Managed Operations team, you will play a pivotal role in building and leading operations and development teams dedicated to delivering high-availability AWS services. You will oversee the...
-
Site Reliability Engineer
12 hours ago
Dublin, Dublin City, Ireland Arista Networks Full time
Arista Networks is an industry leader in data-driven, client-to-cloud networking for large data center, campus and routing environments. What sets us apart is our relentless pursuit of innovation. We leverage the latest advancements in cloud computing, artificial intelligence, and software-defined networking to provide our clients with a competitive edge in...
-
European Cloud Services Leader
2 days ago
Dublin, Dublin City, Ireland Amazon Full time
Amazon is actively seeking a European Cloud Services Leader to join its team. As part of the AWS Managed Operations team, you will be responsible for overseeing the launch of the European Sovereign Cloud (ESC), working closely with global AWS teams, and influencing the evolution of AWS services and technology. Your responsibilities will include collaborating...
-
Cloud Operations Leader
5 days ago
Dublin, Dublin City, Ireland Amazon Full time
About the Role
We are seeking a skilled Systems Engineer to join our AWS Managed Operations team. As part of this team, you will play a crucial role in building and leading operations and development teams dedicated to delivering high-availability AWS services. Your responsibilities will encompass overseeing the launch of our European Sovereign Cloud (ESC)...
-
Cloud Reliability Engineer
2 days ago
Dublin, Dublin City, Ireland Google Inc. Full time
About the Job
Site Reliability Engineering (SRE) at Google combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. We ensure that Google Cloud's services have reliability, uptime appropriate to customer needs, and a fast rate of improvement. SREs keep an ever-watchful eye on our systems capacity...
-
Cloud Reliability Expert
2 days ago
Dublin, Dublin City, Ireland Huawei Ireland Research Center Full time
The Huawei Ireland Research Centre is revolutionising the way cloud systems operate. As a Senior Architect, you will spearhead the research and development of advanced solutions for observability, incident response, system optimisation, and fault prediction for planet-scale cloud infrastructure. In this role, you will collaborate with local research teams,...
-
Site Reliability Engineering Leader
2 days ago
Dublin, Dublin City, Ireland J.P MORGAN S.E Dublin Branch Full time
Job Description
At J.P. Morgan, we are committed to building trusted, long-term partnerships with our clients by providing strategic advice and products that meet their evolving needs. We are seeking a highly skilled Site Reliability Engineering Leader to join our Commercial & Investment Bank's Digital & Platform Services division. This is a...
-
Cloud Operations Leader
2 days ago
Dublin, Dublin City, Ireland Amazon Full time
We're looking for a seasoned Cloud Operations Leader to drive strategic team efforts and lead the creation, revision, and improvement of standard operational procedures (SOPs). As part of the AWS Managed Operations team, you'll play a key role in building and leading operations teams dedicated to delivering high-availability AWS services. Key responsibilities...
-
Cloud Systems Development Leader
6 days ago
Dublin, Dublin City, Ireland Amazon Full time
We are seeking a highly skilled Cloud Systems Development Leader to drive the introduction of the European Sovereign Cloud (ESC) and shape the future of cloud operations. As part of the AWS Managed Operations team, you will lead operations and development teams in delivering high-availability AWS services exclusively for EU customers.
Responsibilities:
Lead...
-
Cloud Database Engineering Leader
2 days ago
Dublin, Dublin City, Ireland Amazon Development Centre Ireland Limited Full time
About the Role
A Software Development Manager at Amazon Development Centre Ireland Limited is responsible for leading a team of passionate engineers to build reliable and highly scalable distributed systems. This is an excellent opportunity to pave the path for a new generation of cloud database services, grow as a technology leader, and work with some...
-
Dublin, Dublin City, Ireland Amazon Full time
Are you a Reliability and Scalability Expert looking for a challenging role? We are seeking a skilled professional to join our Global Cloud Services team. As a member of this team, you will be responsible for defining availability goals for service teams across AWS and strategies to make these goals attainable with minimal effort. You will also contribute to...
-
Cloud Operations Leader
3 days ago
Dublin, Dublin City, Ireland Amazon Full time
Job Description
We are seeking an experienced Systems Engineer to lead the launch of our European Sovereign Cloud (ESC) in 2025. As part of our AWS Managed Operations team, you will play a pivotal role in building and leading operations and development teams dedicated to delivering high-availability AWS services, including EC2, S3, Dynamo, Lambda, and...
-
Cloud Database Leader
6 days ago
Dublin, Dublin City, Ireland Engineeringuk Full time
Company Overview
At Engineeringuk, we are shaping the future of cloud database services that support mission-critical workloads. Our team is responsible for delivering a key-value and document database with single-digit-millisecond performance at any scale. We operate at more than 100 million requests per second, serving some of the world's fastest-growing...
-
Cloud Solution Specialist
2 weeks ago
Dublin, Dublin City, Ireland Taiga Cloud Limited Full time
Cloud Solution Architect Platform (m/f/d)
As the AI Cloud Solution Architect – Platform, you will play a critical role in shaping and implementing the platform architecture that powers our AI cloud solutions. Your role is to secure and win customers' technical decision and empower them to utilize our technology. You will collaborate with clients, early in...
-
Reliability KPI Architect
3 days ago
Dublin, Dublin City, Ireland Huawei Ireland Research Center Full time
About the Role
We're looking for an exceptional technical leader to define, implement, and champion the adoption of next-generation Reliability KPIs across Huawei Cloud. This is a high-impact role where you'll have the opportunity to shape how we measure and incentivize engineering efforts around reliability, ensuring a clear alignment between customer...
-
Cloud Reliability Engineer
5 days ago
Dublin, Dublin City, Ireland Google Full time
About the Role:
Site Reliability Engineering at Google combines software and systems expertise to build and maintain large-scale, fault-tolerant systems. As a Cloud Reliability Engineer, you will ensure that our services have reliability, uptime appropriate to customer needs and a fast rate of improvement. This involves managing project priorities, deadlines,...
-
Network Reliability Specialist
5 days ago
Dublin, Dublin City, Ireland Huawei Ireland Research Center Full time
About the Position
Huawei Ireland Research Center is seeking an experienced Principal Network Architect to lead the development of cutting-edge data center solutions for Huawei Cloud's hybrid optical-electrical network infrastructure. As a member of our team, you will play a critical role in shaping the future of hyperscale cloud infrastructure and delivering...
-
Cloud Leader
2 days ago
Dublin, Dublin City, Ireland Amazon Full time
Job Description:
Company Overview:
As a leader in cloud computing, Amazon is constantly seeking top talent to help drive innovation and growth.
-
Cloud Systems Engineer Leader
4 days ago
Dublin, Dublin City, Ireland Amazon Full time
Overview
AWS is set to introduce the inaugural European Sovereign Cloud (ESC), a significant development in utility computing. To spearhead this initiative, we are actively seeking experienced system engineers with a strong background in automation and operations. This role involves overseeing the launch of the ESC, working closely with global AWS teams, and...