SRE Manager
12 hours ago
The Apple Services Engineering team is one of the most exciting examples of Apple’s long-held passion for combining art and technology. Join the Apple Services Engineering Cloud Service Infrastructure team as a Site Reliability Engineering Manager to help support and scale cloud services for millions of Apple users. We build and support new and existing critical infrastructure systems and frameworks that provide services such as structured and unstructured storage, caching, queueing, search, and much more at hyperscale. These form the platform upon which many iCloud and other backend systems at Apple are built. The team is responsible for the next-generation platform that will power Apple’s infrastructure services. These services operate at extremely large scale and store exabytes of data. The platform will support a variety of services based on open-source software, such as Kubernetes, Cassandra, ZooKeeper, Kafka, and Redis, alongside internally developed services. This is a hands-on role: you will establish SRE practices for a private cloud service and accelerate our ability to reliably and consistently deliver thousands of applications. You will lead a team of Site Reliability Engineers who thrive in a fast-paced workplace, where drive and collaboration are the keys to success.
Description
The Apple Services Engineering Cloud Services SRE organization is looking for a strong, hands-on leader. The leader will lead a platform-focused SRE team and be responsible for the reliability of the platform. The platform serves workloads that provide our organisation and our customers with their favourite applications, services, and tools. We are domain experts in fleet management, systems, and software engineering. We build automation, instrument reliability tools, and respond to alerts and incidents that may pose a risk to the reliability of the platform. The team’s focus is on infrastructure capabilities and processes, improving the reliability and efficiency of the systems at scale.
Responsibilities include:
1. Act as the Service Owner, designing and mapping key performance indicators to achieve the organization’s mission
2. Lead the definition of requirements, priorities and planning of engineering deliverables
3. Implement structured engineering and operations processes
4. Lead the team in daily agile SRE practices, ensuring proper team focus on priorities, achievements, and deliverables
5. Optimise velocity and efficiency of delivery, and drive continuous improvement
Success depends on a strong understanding of SRE principles and practices, combined with a track record of resolving issues in a live production environment and implementing strategies to minimize them while driving clear action plans for the team. The successful candidate will be highly self-motivated, with a passion for excellence, quality, and detail. As a leader, they are responsible for coaching and mentoring their team members, helping them achieve service goals and build aligned career paths. It’s imperative for the leader to empower their team by providing appropriate context and timely feedback. The leader will not only own the service but will also collaborate with other teams within Apple. They will build trust with stakeholders and partners through diplomacy, discussion, and follow-through. This is a broad, high-visibility, cross-organisation role that involves collaborating with multiple teams. They are expected to invest in and build good relations with key partners. Their collaboration with internal customers, product engineering, and development groups is critical to success.
Minimum Qualifications
- Experience with critical, large-scale distributed systems, combining hardware, operating systems, and software
- Experience building and leading engineering teams; ideally SRE or Production Engineering
- Strong emphasis on SRE as an engineering subject area, with proficiency in at least one of the following languages: Golang, Rust, Python, Swift
- Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts, with a keen eye for opportunities to eliminate toil through code and process improvements
- Superb interpersonal skills, capable of working with multi-functional technical and business teams and varying levels of management, influencing decision making
- Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or equivalent experience
Preferred Qualifications
- Experience working with large bare-metal infrastructure and release management
- Experience with large-scale server provisioning, fleet management, and maintenance
- Experience with development within the Kubernetes ecosystem, including operator framework, controllers, and CRDs
- Hardware bootstrap and associated security (PXE, BIOS, TPM, secure boot, trusted computing)
- Automating operations processes via services and tools
- Configuration management and fleet orchestration via Puppet, Chef, Ansible, or others
-
Site Reliability Engineer
2 days ago
Dublin, Ireland September Consulting Ltd Full time. Site Reliability Engineer (SRE) (6 months), €510 a day, remote or hybrid. Want to work for a large banking multinational in Dublin with a diverse technical infrastructure supporting an enterprise platform? Their SRE team are responsible for ensuring that the platform is stable and healthy, empowering developers to build resilient products – working on...
-
Site Reliability Engineer
23 hours ago
Dublin, Ireland Apple Inc. Full time. Dublin, County Dublin, Ireland. Software and Services. People at Apple don’t just build products; they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspire innovation in everything we do. Imagine what you could do here! Join Apple, and help us leave the world better than we...
-
Dublin, Ireland Google Inc. Full time. About the job: Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services, both our internally critical and our externally-visible systems, have reliability, uptime appropriate to customer's needs and a fast rate of...
-
Dublin, Ireland Google Inc. Full time. Software Engineer III, Site Reliability Engineering. Google, Dublin, Ireland. Mid: Experience driving progress, solving problems, and mentoring more junior team members; deeper expertise and applied knowledge within relevant area. Minimum Qualifications: - Bachelor’s degree in Computer Science, a related field, or equivalent practical...
-
Dublin, Ireland Google Inc. Full time. Senior Software Engineer, Site Reliability Engineering. Google, Dublin, Ireland. Minimum Qualifications: - Bachelor’s degree in Computer Science, a related field, or equivalent practical experience. - 5 years of experience with software development in one or more programming languages. - 5 years of experience with data...
-
Dublin, Ireland Apple Inc. Full time. Apple Services Engineering team is one of the most exciting examples of Apple’s long-held passion for combining art and technology. Join Apple Services Engineering Cloud Service Infrastructure team, as a Site Reliability Engineer, to help support and scale cloud services for millions of Apple users. We are building and supporting new and existing critical...
-
Cloud Service Engineer
3 days ago
Dublin, Ireland Adecco Ireland Full time. Are you a junior IT professional with experience in Linux, networking, databases, Cloud and programming? Are you in search of your next career move? Are you fluent in English and Chinese? If you are tech savvy, creative, outgoing, and willing to roll up your sleeves and get things done in a fast-paced, rapidly changing environment, we may have the perfect...
-
Site Reliability Engineer
3 days ago
Dublin, Ireland Apple Inc. Full time. People at Apple don’t just build products; they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspire innovation in everything we do. Imagine what you could do here! Join Apple and help us leave the world better than we found it. The Apple Services Engineering (ASE) System...
-
Senior Site Reliability Engineer
1 day ago
Dublin, Ireland Prove Full time. Title: Senior Site Reliability Engineer. Department: Internal Operations. Reports To: Senior Manager, Site Reliability. FLSA Status: N/A. Location: Ireland. Job Summary: The Senior Site Reliability Engineer is responsible for bringing a software engineering approach to Prove operations. Using software as a tool to manage systems, solve problems, and...
-
Senior Site Reliability Engineer
3 days ago
Dublin, Ireland Tbwa ChiatDay Inc Full time. Reddit is a community of communities. It’s built on shared interests, passion, and trust and is home to the most open and authentic conversations on the internet. Every day, Reddit users submit, vote, and comment on the topics they care most about. With 100,000+ active communities and approximately 97M+ daily active unique visitors, Reddit is one of the...
-
Site Reliability Engineer
3 days ago
Dublin, Ireland Tiger Resourcing Group Full time. Role: Sr SRE (Application Support + Automation). Location: Dublin, Ireland. Experience: 5-7 years. Salary: 55K – 60K EUR/year. About the Role: · Plan, manage, and oversee all aspects of a Production Environment · Define strategies for Application Performance Monitoring, Optimization in Prod environment · Respond to Incidents and improvise platform based on...
-
Major Incident Manager
2 days ago
Dublin, Ireland J.P MORGAN S.E Dublin Branch Full time. Job Description: Join our rapidly expanding Chase brand as a Major Incident Manager, where you'll manage incidents in a high transaction throughput technology context. We're seeking a customer-focused, analytical thinker who thrives on solving complex problems and promoting the right outcomes for our customers. You'll be part of a global team providing...
-
Dublin, Ireland ENGINEERINGUK Full time. Systems Development Manager, Managed Operations. Sector: Operations and Facilities Management. Role: Manager. Contract Type: Permanent. Hours: Full Time. DESCRIPTION: AWS is set to introduce the inaugural European Sovereign Cloud (ESC), marking a significant...
-
Dublin, Ireland Amazon Full time. Software Development Engineer, CloudWatch, Amazon Intelligent Operations. Job ID: 2825515 | Amazon Development Centre Ireland Limited. Come and build the future with us as we change the way the world sees the Cloud! CloudWatch is the observability platform of AWS, built for developers, system operators, site reliability engineers (SRE), and IT managers....
-
Software Engineer III
6 hours ago
Dublin, Ireland LexisNexis Risk Solutions Full time. Do you enjoy being part of a team that works with a diverse range of products/technology? Are you a collaborative team player? About the Business: LexisNexis Risk Solutions is the essential partner in the assessment of risk. Within our Insurance vertical, we provide customers with solutions and decision tools that combine public and industry-specific...
-
System Development Engineer, ESC
3 days ago
Dublin, Ireland ENGINEERINGUK Full time. DESCRIPTION: Would you like to be an Engineer that builds the Cloud, rather than an Engineer that just uses it? At AWS, our Engineers look after the behind-the-scenes software and tools that make the world's largest cloud computing infrastructure possible. We have an amazing opportunity for you to join a world-class network team in a dynamic environment...
-
Dublin, Ireland Amazon Full time. Job ID: 2755380 | Amazon Data Services Ireland Limited. Would you like to be an Engineer that builds the Cloud, rather than an Engineer that just uses it? At AWS, our Engineers look after the behind-the-scenes software and tools that make the world's largest cloud computing infrastructure possible. We have an amazing opportunity for you to join a...
-
Dublin, Ireland Avature Full time. Our mission is to revolutionize digital labor by developing and deploying the latest conversational artificial intelligence (AI) technology in IBM’s industry-leading digital labor platform WatsonX Orchestrate. We are proud of our state-of-the-art, secure, and scalable application infrastructure, where data confidentiality, performance, and security are...
-
Site Reliability Engineer
3 days ago
Dublin, Ireland DocuSign, Inc. Full time. Company Overview: Docusign brings agreements to life. Over 1.5 million customers and more than a billion people in over 180 countries use Docusign solutions to accelerate the process of doing business and simplify people’s lives. With intelligent agreement management, Docusign unleashes business-critical data that is trapped inside of documents. Using...
-
Senior Software Engineer
2 days ago
Dublin, Ireland AdsWizz Full time. Senior Software Engineer, Platform Engineering. Who We Are: SiriusXM and its brands (Pandora, SXM Media, AdsWizz, Simplecast, and SiriusXM Connected Vehicle Services) are leading a new era of audio entertainment and services by delivering the most compelling subscription and ad-supported audio entertainment experience for listeners in the car, at home, and...