Reliability Engineering: High Availability, Resilience & Observability

Total duration: 16 hours

Location: On site
Start dates and locations: 3 dates

Web Infra Academy (EN)

Start dates and locations

Nieuwegein

29 June 2026 to 30 June 2026

29 June 2026, Nieuwegein, Day 1

30 June 2026, Nieuwegein, Day 2

Nieuwegein

8 October 2026 to 9 October 2026

8 October 2026, Nieuwegein

9 October 2026, Nieuwegein

Nieuwegein

17 December 2026 to 18 December 2026

17 December 2026, Nieuwegein

18 December 2026, Nieuwegein

Description

Modern IT systems are complex, distributed and constantly evolving. Reliability does not happen by accident — it must be actively built, monitored and improved.

In this training, you will learn how to build and operate systems that remain stable under pressure, handle failures in a controlled way and provide continuous insight into their behavior. The focus is not only on infrastructure, but especially on applications and microservices: how software behaves in production and what is required to keep it reliable.

You will work with principles from Site Reliability Engineering (SRE) and learn how development and operations come together in a DevOps way of working. You will see how decisions in application behavior, dependencies and integrations directly impact availability, performance and recovery.

You will learn how to handle failures in practice: from retries and backpressure to circuit breakers and graceful degradation — not as isolated patterns, but as part of systems that continue to function under real-world conditions.
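As an illustration of how two of these patterns fit together, here is a minimal Python sketch of retries with capped exponential backoff and jitter, combined with a simple circuit breaker. It is not taken from the course material: the class and function names, thresholds and delays are all hypothetical, and a production system would normally rely on a maintained resilience library instead.

```python
import random
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and calls are rejected immediately."""


class CircuitBreaker:
    """Hypothetical minimal breaker: opens after `max_failures` consecutive failures."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # cool-down in seconds (assumed value)
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: allow a trial call (half-open state).
            # A single failure from here re-opens the circuit.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # success closes the circuit again
        return result


def retry_with_backoff(func, attempts=4, base_delay=0.1, max_delay=2.0,
                       sleep=time.sleep):
    """Retry `func` with capped exponential backoff and full jitter."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise  # never retry against an open circuit
        except Exception:
            if attempt == attempts - 1:
                raise  # retries exhausted: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

A caller would typically wrap the breaker inside the retry, for example `retry_with_backoff(lambda: breaker.call(fetch))`, where `fetch` stands for any hypothetical remote call. The jitter spreads retries out so that many clients do not hit a recovering dependency at the same moment.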

Observability plays a central role: you will work with metrics, logs and traces, and learn how to use SLIs, SLOs and error budgets to make reliability measurable and to align it with user experience and business impact.
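To make the relationship between an SLI, an SLO and an error budget concrete, the arithmetic can be sketched in a few lines of Python. This is an illustrative example under stated assumptions, not course material: it assumes a request-based availability SLI, and the function names and sample numbers are hypothetical.

```python
def error_budget(slo_target, total_requests, failed_requests):
    """Return (sli, allowed_failures, remaining_budget_fraction).

    Assumes a request-based availability SLI: good requests / all requests.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    sli = (total_requests - failed_requests) / total_requests
    # The budget is the share of requests the SLO allows to fail.
    allowed_failures = (1 - slo_target) * total_requests
    remaining = 1 - failed_requests / allowed_failures
    return sli, allowed_failures, remaining


def allowed_downtime_minutes(slo_target, days):
    """Time-based view of the same budget over a window of `days` days."""
    return (1 - slo_target) * days * 24 * 60
```

For example, with a 99.9% SLO, one million requests and 400 failures, the SLI is about 99.96%, roughly 1,000 failures are allowed, and about 60% of the budget remains. Over a 30-day window the same SLO corresponds to about 43.2 minutes of allowed downtime.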

You will also gain insight into data reliability and distributed systems behavior, including consistency trade-offs (CAP and PACELC), so systems are not only available, but also correct.
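One place where these trade-offs become tangible is quorum-based replication. The sketch below is an illustration only and is not drawn from the course: it assumes a Dynamo-style store with `n` replicas, where a write waits for `w` acknowledgements and a read consults `r` replicas. The function name and result keys are hypothetical.

```python
def quorum_properties(n, w, r):
    """Describe a quorum configuration with n replicas, write quorum w, read quorum r."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must each be between 1 and n")
    return {
        # If r + w > n, every read set shares at least one replica with
        # every write set, so a read can observe the latest acknowledged write.
        "read_write_sets_overlap": r + w > n,
        # Number of replicas that may be unavailable while the operation still succeeds.
        "write_tolerates_failures": n - w,
        "read_tolerates_failures": n - r,
    }
```

With `n=3, w=2, r=2` the read and write sets overlap and each operation tolerates one unavailable replica. With `n=3, w=1, r=1` each operation tolerates two unavailable replicas and waits for fewer nodes, but a read may return stale data. Real systems add further caveats, such as concurrent writes and conflict resolution, that this sketch ignores.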

The training covers the full lifecycle: build, deploy, monitor, validate and improve. You will learn how to test reliability with resilience testing and chaos engineering, and how to continuously improve based on production data.

The course material (slides) is in English and reflects real-world practices in modern IT organizations.

This training is available as classroom training and as e-learning. Classroom sessions can be attended on-site or virtually (via Microsoft Teams or Zoom). The e-learning is fully in English and includes English subtitles.

Our training is also delivered through selected international training partners, allowing participation outside the Netherlands. Contact us for current availability and locations.

Who should attend:

This training is designed for technical professionals involved in building, operating and improving modern IT systems.

Typical participants include:

* DevOps and platform engineers

* Software engineers

* Solution and cloud architects

* IT managers and technical leads

What you will learn:

* How systems and microservices behave under failures and peak load

* High availability and failover in practice (zones, regions and dependencies)

* Resilience strategies such as retries, backpressure, circuit breakers and graceful degradation

* How to use SLIs, SLOs and error budgets to manage reliability

* Observability with metrics, logs and traces, and the move toward system intelligence

* Trade-offs in distributed systems such as CAP, PACELC and consistency vs availability

* How to ensure data reliability (replication, recovery, integrity and consistency)

* How to validate reliability with testing and chaos engineering
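The high availability item in the list above (zones, regions and dependencies) can be illustrated with basic availability arithmetic. The following Python sketch is an illustration, not course content. It assumes that failures are statistically independent, which real zones and dependencies rarely are, so the results are best read as upper bounds. The function names are hypothetical.

```python
import math


def serial_availability(availabilities):
    """Availability of a request path that needs every dependency to be up."""
    return math.prod(availabilities)


def redundant_availability(availability, replicas):
    """Availability when any one of `replicas` identical copies is enough.

    Assumes independent failures: the system is down only if all copies are down.
    """
    return 1 - (1 - availability) ** replicas
```

Under these assumptions, two chained dependencies at 99% each give about 98.01% end to end, while two redundant copies at 99% each give about 99.99%. This is why adding dependencies lowers availability and adding redundancy raises it.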

Results:

After this training, you will be able to:

* Build and operate more reliable systems and microservices in production

* Detect, understand and resolve issues faster using observability

* Make better decisions on how systems handle failures and dependencies

* Align reliability with user experience and business impact

* Collaborate more effectively within DevOps teams

* Continuously improve system reliability instead of only reacting to incidents

Course Agenda

Architecture in Practice
Scope, Mindset & Shared Language
Software Resilience & Designing for Failure
High Availability Architecture
Safe Change & Delivery Reliability
Data Reliability & State Management
Resilience Validation & Chaos Engineering
System Intelligence & Observability
Adoption, Governance & Reliability Maturity

Architecture in Practice: Understand how modern systems evolve in real-world environments, driven by trade-offs, simplicity (KISS) and continuous change. Learn why reliability is influenced by decisions across the entire lifecycle — not just infrastructure.
Scope, Mindset & Shared Language: Build a solid foundation in reliability engineering, including SLI, SLO and error budgets. Learn how SRE principles and reliability economics guide both development and operational decisions.
Software Resilience & Designing for Failure: Build applications and microservices that handle failure gracefully using retries, backoff, circuit breakers and bulkheads. Prevent cascading failures with loose coupling, backpressure and isolation patterns.
High Availability Architecture: Understand how systems stay available in practice using redundancy, failover and multi-zone or multi-region setups. Learn how to control blast radius and design for predictable recovery.
Safe Change & Delivery Reliability: Deliver changes safely using CI/CD, GitOps, Infrastructure as Code and progressive delivery strategies such as canary and blue/green deployments. Use guardrails, feature flags and automated policies to reduce risk.
Data Reliability & State Management: Manage data consistency, replication and recovery in distributed systems. Understand CAP and PACELC trade-offs, eventual consistency and how to prevent data loss or corruption.
Resilience Validation & Chaos Engineering: Validate system behavior under stress using resilience testing and chaos engineering. Define steady state, test realistic failure scenarios and safely experiment in production environments.
System Intelligence & Observability: Go beyond monitoring with observability, tracing and system intelligence. Apply concepts such as the four golden signals and use data-driven insights to detect, understand and prevent issues.
Adoption, Governance & Reliability Maturity: Scale reliability across teams using platform engineering, policy-as-code and governance models. Implement guardrails, maturity models and continuous improvement loops.
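The four golden signals mentioned in the observability module (latency, traffic, errors and saturation) can be computed from a window of request records. The sketch below is a hedged illustration rather than the course's own method: the `Request` type, the use of in-flight requests versus capacity as the saturation measure, and the nearest-rank 95th percentile are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one handled request."""
    duration_ms: float
    status: int


def golden_signals(requests, window_seconds, in_flight, capacity):
    """Summarise latency, traffic, errors and saturation for one time window."""
    if not requests:
        raise ValueError("need at least one request in the window")
    durations = sorted(r.duration_ms for r in requests)
    n = len(durations)
    # Nearest-rank p95 using integer arithmetic: index = ceil(0.95 * n) - 1.
    p95_index = (95 * n + 99) // 100 - 1
    server_errors = sum(1 for r in requests if r.status >= 500)
    return {
        "latency_p95_ms": durations[p95_index],
        "traffic_rps": n / window_seconds,
        "error_rate": server_errors / n,
        # Saturation is approximated here as concurrent load over capacity.
        "saturation": in_flight / capacity,
    }
```

In practice these values usually come from a metrics backend rather than from raw request lists. The sketch only shows what each signal measures.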

This reliability engineering training helps you build, operate and continuously improve systems that remain stable under real-world conditions.
