Yes, it is. Including a failure of the alerting system, which failed to wake us up until now (currently 5:30am UK time).
Posting this was clearly always going to bring down the wrath of the Uptime Gods, but I didn’t expect it to be this dramatic. Stand by please.
EDIT: And we’re back.
Update: Now that we’ve had time to do a post-mortem, we’ve concluded that the proximate cause was relatively minor (a deadlock bug exposed by scaling pressures, which was dislodged by a global restart and fixed within minutes - on its own it would have meant a couple of minutes of downtime, not a major incident). The main cause of the prolonged outage was the alerting failure. We’ve identified the failure, have applied appropriate quantities of duct tape, and are now doing a ground-up rework of our alerting system. Thank you for your patience!