Site Reliability Engineer

1 week ago


Dublin, Ireland Elwood Roberts Full time

Job: Site Reliability Engineer
Location: North Dublin
Rate: 450-500 per day
Type: Contract (12 months+)
Working arrangement: Hybrid (1-2 days onsite per week)

We have an excellent opportunity for a combined Site Reliability Engineering (SRE) and Observability Engineering role, ensuring that complex software systems are reliable and scalable, and that system behaviour is transparent and measurable through advanced monitoring and data analysis. The role is based with a leading business within the aviation sector.

Some of the responsibilities will include:

• Work within the technology team to implement the SRE strategy and roadmap of best practices.
• Monitor capacity, availability, and system health of production environments.
• Improve reliability, quality, and time-to-market for all applications.
• Build systems and introduce tooling to manage applications and infrastructure.
• Design, implement, and maintain the observability platform (metrics, logging, tracing, and alerting) to support developers and operators.
• Own and evolve tooling and infrastructure related to Prometheus, Grafana, OpenTelemetry, or similar observability tools.
• Collaborate with SREs, developers, and infrastructure teams to define SLOs, SLAs, and SLIs and ensure proper instrumentation of services.
• Improve system performance and reliability through monitoring insights and proactive detection of anomalies.
• Develop automation and tooling to reduce manual effort and improve response times for incidents.
• Enable development teams with self-service dashboards and alerting configurations.
• Participate in on-call rotations and incident response processes to identify and address observability gaps.
• Offer primary operational support for distributed software applications.
• Work with the development team and build process to ensure tools and automation are in place for reliable, high-quality code deployments.
• Work with development teams to test and improve services.
• Gather and analyse data from operating systems to troubleshoot and fine-tune performance.
• Measure and optimize system performance.
• Understand the metrics and ensure the quality of the systems.
• Contribute to platform management, capacity planning, design consulting, and the establishment of service level objectives (SLOs).
• Help improve the process for ensuring that on-call requests are managed efficiently without compromising the reliability of the system.
• Optimize on-call incident management through automated tools and software.
• Act as the team's repository of information about its processes and systems.
• Direct each issue to the right person so that quick action can be taken and system downtime reduced.
• Use trend analysis to review potential process challenges as well as infrastructure and operations.
• Document all of that knowledge.
• Provide regular updates on health and performance of production systems.
• Support a CI/CD framework through engagement with engineering teams.
• Support delivery and operations where necessary.
• Champion continuous improvement.
• Use automation to create sustainable services.
• Maintain continuity, capacity, and compliance.

Professional Skills

• 5+ years' hands-on experience with cloud technology as an infrastructure or application developer.
• 2-3 years' experience as a Site Reliability Engineer.
• Strong experience with observability tools such as Prometheus, Grafana, Loki, Tempo, Elastic Stack, OpenTelemetry, or commercial solutions like Datadog, New Relic, or Splunk.
• Solid understanding of distributed systems, microservices, and cloud-native architectures.
• Proficiency with infrastructure-as-code tools (e.g., Terraform, Ansible, Helm) and container orchestration systems (e.g., Kubernetes).
• Experience with scripting and programming languages (e.g., Python, Go, Bash).
• Communication Skills: Effectively shares information and asks clarifying questions.
• Collaboration: Works well within a team, open to feedback, and engages cross-functionally.
• Problem-Solving: Analyses issues independently and collaborates on complex solutions.
• Time Management: Manages priorities effectively and adapts to shifting tasks.
• Mentoring & Leadership: Provides guidance to junior staff, delegates work, and promotes team development.
• Customer-Centric Mindset: Balances technical goals with end-user and business needs, translating these effectively.
• Adaptability & Flexibility: Remains agile to address changes in technology, project goals, or priorities.
• Strategic Thinking: Thinks ahead and plans for long-term reliability, scalability, and impact.
• Ownership & Accountability: Takes ownership of work, sees issues through to resolution, and follows up on tasks and action items.
• Demonstrates ability to resolve problems at a management and technical level.
• Highly innovative with a drive for operational excellence.



  • Dublin, Ireland Reperio Human Capital Full time

    Site Reliability Engineer (190995). Desired skills: SRE, Azure, SaaS, Dublin. Site Reliability Engineer - Azure | SaaS | Dublin (Hybrid - 2 Days Onsite). A leading software company is looking for a Site Reliability Engineer (SRE) to join their growing team supporting a complex SaaS platform hosted in Microsoft Azure. This is a traditional SRE role focused on...


  • Dublin, Ireland G Treasury Ss, Llc Full time

    Site Reliability Engineer (Dublin, Hybrid). DevOps - Dublin 2 (Hybrid). The mission, should you choose to accept it, is to pioneer and scale GTreasury's system and application observability efforts, and reduce toil amongst our operational workstreams. You will work across a global set of hard-driving engineering, support, and technical operations teams that care...


  • Dublin, Ireland Crone Corkill Full time

    Crone Corkill have partnered with a technology consultancy that is searching for a Site Reliability Engineer to join a client in their Dublin office on a permanent basis. Expertise with Apache Kafka within a production environment is absolutely key here, with strong knowledge and experience across Kafka architecture, security, clusters, stream processing and...


  • Dublin, Ireland Fis Management Services Llc Full time

    Senior Site Reliability Engineer. Locations: IRL DUBL 11-12. Time type: Full time. Posted 13 days ago. Job requisition ID: JR. Are you curious, motivated and forward-thinking? At FIS you'll have the opportunity to work on some of the most challenging and relevant issues in financial services and...


  • Dublin, Ireland Ebay Full time

    Software Engineer, Site Reliability, Dublin. Client: eBay. Location: Dublin, Ireland. Job Category: Other. EU work permit required: Yes. Job Reference: b284d91f****. Job Description: At eBay, we're more than a global ecommerce leader; we're changing the way the...


  • Dublin, Ireland FIS, Inc. Full time

    Senior Site Reliability Engineer. Are you curious, motivated and forward-thinking? At FIS you'll have the opportunity to work on some of the most challenging and relevant issues in financial services and technology. Our talented people empower us, and we believe in being part of a team that is open, collaborative, entrepreneurial, passionate and above all fun....


  • Dublin, Ireland Susquehanna International Group Full time

    Overview: As a Site Reliability Engineer at Susquehanna, you'll be working alongside experienced engineers to solve real problems, being responsible for designing, supporting, maintaining and improving infrastructure across virtual and physical environments, applying a DevOps approach. This role is aimed at graduates and early career professionals passionate...


  • Dublin, Ireland Jpmorgan Chase & Co. Full time

    As a Lead Site Reliability Engineer at JPMorgan Chase in the Commercial & Investment Bank's Digital & Platform Services division, you hold a leadership role in your team, demonstrate strong knowledge across multiple technical domains, and advise others on the technical and business issues facing them. Take lead and conduct resiliency design reviews, break up...