Elasticsearch server is down
Resolved
Jun 13 at 10:17am WIB
Postmortem:
Our Elasticsearch server instance has its JVM memory setting set to 10GB. The instance runs on an EC2 spot instance and is managed with https://spot.io for better cost optimization.
We configured the ES instance within spot.io so that it can use a variety of EC2 spot instance types, depending on which type offers the best value at the time.
The problem was that some of the configured instance types do not have the memory capacity required for the JVM setting above. When the issue appeared, the instance in use had only 8GB of memory, well below the required 10GB, so the ES process could not start.
Once we identified the problem, we reconfigured the ES server to use only EC2 instance types with adequate resources to run the ES process. After the ES instance was replaced with a correct type, the ES process was able to start and the issue was resolved.
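The check that would have caught this can be sketched as a small pre-deployment filter over the allowed instance types. This is a minimal illustration, not our actual tooling: the instance type names, the memory figures, and the 2 GiB headroom for the operating system are all hypothetical, and the code does not call the spot.io or AWS APIs.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass(frozen=True)
class InstanceType:
    """A candidate instance type (names and sizes here are hypothetical)."""
    name: str
    memory_bytes: int


def split_by_memory(candidates, heap_bytes, headroom_bytes=0):
    """Split candidates into (eligible, rejected) for a given JVM heap size.

    A type is eligible only if its total memory covers the JVM heap plus
    any headroom reserved for the OS and other processes.
    """
    required = heap_bytes + headroom_bytes
    eligible = [c for c in candidates if c.memory_bytes >= required]
    rejected = [c for c in candidates if c.memory_bytes < required]
    return eligible, rejected


# Hypothetical whitelist; "type-a" mirrors the 8GB instance from the incident.
CANDIDATES = [
    InstanceType("type-a", 8 * GIB),
    InstanceType("type-b", 16 * GIB),
    InstanceType("type-c", 32 * GIB),
]

# 10 GiB heap as in the postmortem; the 2 GiB headroom is an assumed value.
ELIGIBLE, REJECTED = split_by_memory(CANDIDATES, 10 * GIB, 2 * GIB)

for t in REJECTED:
    print(f"remove from whitelist: {t.name} ({t.memory_bytes // GIB} GiB)")
```

Running a check like this whenever the JVM setting or the instance type whitelist changes keeps the two in sync, so a spot replacement cannot land on a type that is too small to start the ES process.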
Affected services
Tada Insight
Tada Insight API
Updated
Jun 13 at 09:37am WIB
Incident resolved. We will now start repopulating Elasticsearch with the data from the 9 minutes of downtime.
Affected services
Tada Insight
Tada Insight API
Created
Jun 13 at 09:28am WIB
One of our Elasticsearch server instances is down because it is running on an incorrect instance type that has insufficient resources to run the Elasticsearch server. We are currently provisioning the correct instance type to replace the old one. ETA 30 minutes.
Affected services
Tada Insight
Tada Insight API