I woke up this morning to a text from my ISP, “There is an outage in your area, we are working to resolve the issue”
I laugh, this is what I live for! Almost all of my services are self hosted, I’m barely going to notice the difference!
Wrong.
When the internet went out, the power also went out for a few seconds. Four small computers host all of my services. Of those, one shutdown, and three rebooted. Of the three that ugly rebooted some services came back online, some didn’t.
30 minutes later, ISP sends out the text that service is back online.
2 hours later I’m still finding down services on my network.
Moral of the story: A UPS has moved to the top of the shopping list! Any suggestions??
I reboot every box monthly to flush out such issues. It’s not perfect, since it won’t catch things like circular dependencies or clusters failing to start if every member is down, but it gets lots of stuff.