On Jun 1st, we received the following from AWS:
We have important news about your account (AWS Account ID: 866597297522). EC2 has detected degradation of the underlying hardware hosting your Amazon EC2 instance (instance-ID: i-7f8940e1) in the ap-northeast-1 region. Due to this degradation, your instance could already be unreachable. After 2018-06-12 11:00 UTC your instance, which has an EBS volume as the root device, will be stopped.
The solution AWS recommended was:
You can wait for the scheduled retirement date - when the instance is stopped - or stop the instance yourself any time before then. Once the instance has been stopped, you can start the instance again at any time.
We stopped and started the instance on Jun 6th, which should have been enough to avoid any degradation. However, this was not the case. We reacted within 70 minutes and restored the Analytics service.
No data was lost, and every message was sent. Only the Analytics frontend was affected.
We have added extra monitoring so that we can react faster next time.