Brought to you by Data Center Knowledge
An accidental release of fire-suppression agent in a data center in Europe triggered a chain of events that led to an Azure cloud outage for a group of customers on September 29.
Microsoft Azure engineers attributed what they called a “Storage Related Incident” to an accident during routine maintenance of the facility’s fire-suppression system. The incident took systems down for customers hosting virtual infrastructure in the company’s data center in Northern Europe.
From the incident report posted on the Azure status page:
During a routine periodic fire suppression system maintenance, an unexpected release of inert fire suppression agent occurred. When suppression was triggered, it initiated the automatic shutdown of Air Handler Units (AHU) as designed for containment and safety. While conditions in the data center were being reaffirmed and AHUs were being restarted, the ambient temperature in isolated areas of the impacted suppression zone rose above normal operational parameters. Some systems in the impacted zone performed auto shutdowns or reboots triggered by internal thermal health monitoring to prevent overheating of those systems.
Staff brought the air handlers back online 35 minutes after the unexpected gas release, returning the facility temperature to a normal operational level.