
Outages 2010-05-26

Extensive outages were caused by a failing console server, which sent a break signal to all of the attached servers when it crashed.
  • The console server has a fan failure, probably related to heat problems.
  • An attempt to separate the systems from the console server in an easily reversible way interrupted all of the attached systems again.
  • We left the systems connected so that we could still reach the servers remotely, and the console server died a third time around 7:20am.
  • The systems have now been separated to prevent similar outages, but this will make fixing other problems slower until we can come up with an alternate access solution.

Unrelated but simultaneous problems
  • A hard drive in the main RAID array was marked as having problems but tested OK.
  • The mail server's fans and built-in Ethernet failed. The motherboard was replaced.

-- StevenLindsey - 2010-05-27
Topic revision: r1 - 27 May 2010, lindss2@LAB.CS.RPI.EDU
 
