While the UK Celebrated Easter, a Multi-Million-Pound Digital Brain Nearly Boiled Alive

ExcalibuR


SpiNNaker has everything — except proper cooling, protection, and an "off" switch.

During the Easter holiday, a serious incident occurred at the University of Manchester: the SpiNNaker supercomputer — designed to mimic the human brain — suffered a critical overheating failure. The event sent chills through data center administrators worldwide.


Professor Steve Furber, who is now retired but still calls SpiNNaker his brainchild, explained that on April 20, the cooling system failed. The temperature kept rising until the servers were manually shut down the following day.


The project aims to replicate the neural connections of a living organism by uniting hundreds of thousands of ARM processor cores into a single architecture. While replicating human consciousness is not yet the goal, Furber believes it is possible to digitally simulate the nervous system of a rodent.


Earlier this month, at an event celebrating the 40th anniversary of the launch of the first ARM processor, Furber shared his team’s ambitions to simulate the behavior of an "entire mouse" with sufficient detail. However, these plans now depend on how well the hardware survived the overheating.


SpiNNaker is housed in the Kilburn Building, constructed in 1972 specifically for computing equipment. The facility uses a chilled water system to cool all central machine rooms.


In 2016, the team set up SpiNNaker in a space that used to be a mechanical workshop. The cooling system works on a recirculation principle: heated air from the back of the server racks flows into plenums leading to chillers at the room’s edges. These chillers use heat exchangers connected to the building’s chilled water system to lower the air temperature.


The failure stemmed from a breakdown in water circulation. According to Furber, once the water is no longer being chilled, the chillers' fans make things worse rather than better.


As a result, the temperature rose unchecked. Because the automatic shutdown mechanisms failed, the servers kept running. Furber assumes some boards have overheat protection, which may have saved the core hardware, but the network switches and power supplies stayed powered throughout the incident. Until those are confirmed working, the boards cannot be fully diagnosed, and experts fear that more damage may be hidden beneath what is already known.


Although SpiNNaker had occasional cooling issues over its nine years of operation, none had previously led to serious consequences. Furber believes the UK's extended Easter weekend, with both Friday and Monday as public holidays, was the decisive factor: the temperature kept rising for too long without staff intervention.


The team is now developing fully automated emergency shutdown procedures. Currently, the supercomputer is operating at 80% capacity and is available to internal users, but system testing is ongoing.
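For anyone wondering what an automated emergency shutdown might look like in its simplest form, here is a minimal Python sketch. It is not the Manchester team's procedure, which has not been published in this post. The class name `ThermalWatchdog`, the threshold, and the `shutdown` hook are all hypothetical. The one design choice worth noting is that a missing sensor reading is treated the same as an over-limit reading, so a dead sensor cannot silently keep the machine running.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class ThermalWatchdog:
    """Trips a shutdown after several consecutive bad readings (hypothetical design)."""
    limit_c: float                 # assumed trip threshold in Celsius, not a real site value
    consecutive_needed: int        # bad readings in a row required before tripping
    shutdown: Callable[[], None]   # hypothetical hook that powers the racks down
    _bad_count: int = 0
    tripped: bool = False

    def feed(self, reading_c: Optional[float]) -> bool:
        """Process one temperature reading; return True once the watchdog has tripped."""
        if self.tripped:
            return True
        # Fail safe: a missing reading (None) counts as bad, same as an over-limit one.
        if reading_c is None or reading_c >= self.limit_c:
            self._bad_count += 1
        else:
            self._bad_count = 0    # one good reading resets the streak
        if self._bad_count >= self.consecutive_needed:
            self.tripped = True
            self.shutdown()
        return self.tripped


def run(readings: Iterable[Optional[float]], watchdog: ThermalWatchdog) -> int:
    """Feed readings until the watchdog trips; return how many were consumed."""
    consumed = 0
    for reading in readings:
        consumed += 1
        if watchdog.feed(reading):
            break
    return consumed
```

Requiring several bad readings in a row avoids shutting down on a single sensor glitch, at the cost of reacting a few samples later. In a real deployment the hook would also need to cut power to switches and power supplies, not only the compute boards.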


The good news is that the software can still function under partial failures. The bad news is that replacing damaged components may require shutting down part or all of the system again.
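To illustrate the general idea of running under partial failures, here is a small Python sketch that hands out only boards marked healthy and refuses a request it cannot satisfy. This is an assumption about how such tolerance can work in principle, not a description of SpiNNaker's actual software. The function name `allocate_boards` and the board IDs are made up for the example.

```python
from typing import Dict, List


def allocate_boards(health: Dict[str, bool], wanted: int) -> List[str]:
    """Pick `wanted` board IDs from those marked healthy, skipping failed ones."""
    healthy = sorted(board for board, ok in health.items() if ok)
    if wanted > len(healthy):
        raise ValueError(f"only {len(healthy)} healthy boards, {wanted} requested")
    return healthy[:wanted]


# Illustrative health map: one of five boards is down, so 80% remains usable.
health = {"b0": True, "b1": False, "b2": True, "b3": True, "b4": True}
job = allocate_boards(health, 3)   # the failed board "b1" is never handed out
```

Jobs that fit within the healthy pool keep running, while a job that needs the whole machine fails with a clear error rather than landing on dead hardware.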
 