Summary: Today at around 16:10, the NFS traffic destined for our mail storage system dropped from its usual 1000 Mbit/s to less than 50 Mbit/s. At this point our mail servers began queuing NFS reads/writes, and we started seeing a backlog in logins, email sending, etc. This resulted in very slow performance and, eventually, the inability to connect.
A temporary workaround has been in place since 16:30 and has brought mail services back online. We’re going to open a ticket with our storage vendor to see if there is an issue on the live node and, if necessary, fail over to its hot spare. Further updates will be sent as required.
Update 20:00: We have monitored the system for the past few hours and it has remained stable. No further action is required.