
Reliable AI Systems Professional
2 days ago
About Anthropic
Anthropic aims to create safe and beneficial artificial intelligence systems. Our mission is to advance the capabilities of AI while ensuring reliability, interpretability, and controllability.
About the Role
We are seeking talented engineers with experience in reliability to join our team. Your primary responsibility will be defining and achieving reliability metrics for our internal and external products and services.
Key Responsibilities:
- Develop Service Level Objectives for large language model serving and training systems.
- Design and implement monitoring systems, including availability, latency, and other salient metrics.
- Assist in designing high-availability language model serving infrastructure capable of handling millions of users.
- Develop automated failover and recovery systems for model serving deployments across multiple regions and cloud providers.
- Lead incident response for critical AI services, ensuring rapid recovery and systematic improvements from each incident.
Requirements:
- Extensive experience with distributed systems observability and monitoring at scale.
- Understanding of the unique challenges of operating AI infrastructure, including model serving, batch inference, and training pipelines.
- Proven experience implementing and maintaining SLO/SLA frameworks for business-critical services.
Benefits:
- Competitive compensation and benefits package.
- Optional equity donation matching.
- Generous vacation and parental leave policies.
- Flexible working hours.
How We Work
We believe that the highest-impact AI research will be big science. At Anthropic, we work as a single cohesive team on just a few large-scale research efforts, and we value overall impact over work on smaller, isolated puzzles. As such, we greatly value communication skills.
Come Work with Us
Anthropic is a public benefit corporation headquartered in San Francisco. We offer a collaborative environment where you can contribute to groundbreaking AI research and development.
-
AI Infrastructure Reliability Expert
19 hours ago
Dublin, Dublin City, Ireland beBeeEngineer Full time €60,000 - €90,000
Senior Site Reliability Engineer
We are building the world's leading AI-first cloud infrastructure company. Our vertically integrated, purpose-built AI infrastructure solutions are trusted by Fortune 500 companies to power their most advanced AI applications. We are redefining AI cloud infrastructure with a mission to align computing with the future of the...
-
AI Infrastructure Reliability Manager
4 days ago
Dublin, Dublin City, Ireland beBeeEngineering Full time €90,000 - €120,000
Reliability Engineering Leader
This position is a unique opportunity to lead a team of engineers focused on defining and achieving reliability metrics for internal and external products and services. As a reliability engineering leader, you will oversee the development of service level objectives that balance availability/latency with development velocity...
-
Senior Site Reliability Engineer
1 week ago
Dublin, Dublin City, Ireland Crusoe Energy Systems LLC Full time
Crusoe is building the World's Favorite AI-first Cloud infrastructure company. We're pioneering vertically integrated, purpose-built AI infrastructure solutions trusted by Fortune 500 companies to power their most advanced AI applications. Crusoe is redefining AI cloud infrastructure, with a mission to align the future of computing with the future of the...
-
Senior System Reliability Specialist
2 days ago
Dublin, Dublin City, Ireland beBeeReliability Full time €235,000 - €355,000
Reliability Engineer Role
About Our Mission
Our mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
Job Overview
We...
-
Senior Site Reliability Engineer
16 hours ago
Dublin, Dublin City, Ireland Crusoe Energy Systems LLC Full time
Crusoe is building the World's Favorite AI-first Cloud infrastructure company. We're pioneering vertically integrated, purpose-built AI infrastructure solutions trusted by Fortune 500 companies to power their most advanced AI applications. Crusoe is redefining AI cloud infrastructure, with a mission to align the future of computing with the future of the...
-
AI Reliability Engineer
7 days ago
Dublin, Dublin City, Ireland beBeeReliability Full time €100,000 - €200,000
Job Description
We are seeking a talented Reliability Engineer to join our team. In this role, you will be responsible for developing and implementing monitoring systems, designing high-availability infrastructure, and leading incident response efforts. The ideal candidate will have extensive experience with distributed systems observability and monitoring at...
-
AI Agent Engineer
1 week ago
Dublin, Dublin City, Ireland Naptha AI Full time
Join to apply for the AI Agent Engineer role at Naptha AI.
About The Role
We are seeking an AI Agent Engineer to join our team at Naptha AI, where you'll help build and test AI agents using our interoperability platform. This role is perfect for developers who are passionate about AI and eager to gain hands-on experience with the latest agent frameworks and...
-
Engineering Manager, AI Reliability Engineering
16 hours ago
Dublin, Dublin City, Ireland Anthropic Full time
Engineering Manager, AI Reliability Engineering
Dublin, IE
About Anthropic
Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working...
-
Leading AI Reliability Engineering Specialist
15 hours ago
Dublin, Dublin City, Ireland beBeeReliability Full time €150,000 - €208,920
Job Title
We are seeking an experienced engineering leader to manage our Reliability Engineering team. This team includes Software Engineers and Systems Engineers focused on defining and achieving reliability metrics for all of our internal and external products and services.
Responsibilities:
Lead and grow a team of reliability engineers responsible for large...