We need to upgrade our ElasticSearch servers frequently, and each upgrade requires the ES servers to restart. However, at any given point in time only a subset of the ES servers can restart at once; otherwise the cluster will go down. Requirements:
- Ability to apply an arbitrary Terraform+Ruby script to all the ElasticSearch nodes as a rolling upgrade.
- Each ElasticSearch index is spread across many nodes. Each ElasticSearch node contains two shards of the same index. During a rolling upgrade, no two nodes that contain the same shard should be allowed to upgrade at the same time. Here is a script written to check for the next batch of available nodes to upgrade.
- We need to retain the ability to approve the Ansible script before it’s run, and the ability to approve the Terraform changes before they’re applied.
- The script should check and make sure that the ElasticSearch cluster is in green status before proceeding to the next batch of servers.
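The batching rule in the list above (no two nodes holding the same shard may upgrade together) can be sketched as a greedy selection. This is only an illustration, not the script referred to above: the function name `next_batch`, the `node_shards` mapping, and the shard identifiers are all hypothetical, and in practice the mapping would have to be built from the cluster's actual shard allocation.

```python
from typing import Dict, Iterable, List, Set


def next_batch(
    node_shards: Dict[str, Set[str]],
    upgraded: Iterable[str],
    max_batch: int = 0,
) -> List[str]:
    """Pick nodes that can be upgraded together.

    node_shards maps a node name to the set of shard ids it holds
    (hypothetical input format). A node joins the batch only if none
    of its shards is held by a node already in the batch.
    max_batch = 0 means "no limit on batch size".
    """
    done = set(upgraded)
    batch: List[str] = []
    busy_shards: Set[str] = set()  # shards held by nodes already in the batch

    for node in sorted(node_shards):  # sorted for a deterministic order
        if node in done:
            continue
        shards = node_shards[node]
        if shards & busy_shards:
            continue  # shares a shard with a node already selected
        batch.append(node)
        busy_shards |= shards
        if max_batch and len(batch) >= max_batch:
            break
    return batch


if __name__ == "__main__":
    # Four nodes, each holding two shards, as described in the list above.
    layout = {
        "n1": {"s1", "s2"},
        "n2": {"s2", "s3"},
        "n3": {"s3", "s4"},
        "n4": {"s1", "s4"},
    }
    print(next_batch(layout, upgraded=[]))            # ['n1', 'n3']
    print(next_batch(layout, upgraded=["n1", "n3"]))  # ['n2', 'n4']
```

The greedy pass is simple and predictable but does not guarantee the largest possible batch; that trade-off seems acceptable here because every batch still has to wait for the cluster to return to green before the next one starts.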
We also need to be able to see which nodes the Ansible+Terraform script will be applied to before it’s run.
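A minimal sketch of how the preview, approval, and green-status requirements could fit together for a single batch. It assumes the cluster is reachable over HTTP and relies on the `status` field returned by Elasticsearch's `GET /_cluster/health` endpoint; `run_batch`, `wait_for_green`, and the `apply` and `approve` callbacks are hypothetical names standing in for the real Ansible/Terraform invocation and the approval step.

```python
import json
import time
import urllib.request
from typing import Callable, List


def cluster_status(base_url: str) -> str:
    """Return the cluster health status: 'green', 'yellow' or 'red'."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=10) as resp:
        return json.load(resp)["status"]


def wait_for_green(
    get_status: Callable[[], str], retries: int = 30, delay: float = 10.0
) -> bool:
    """Poll get_status until it reports 'green' or retries run out."""
    for _ in range(retries):
        if get_status() == "green":
            return True
        time.sleep(delay)
    return False


def run_batch(
    nodes: List[str],
    apply: Callable[[str], None],          # hypothetical: runs the script on one node
    get_status: Callable[[], str],
    approve: Callable[[List[str]], bool],  # hypothetical: human approval step
    retries: int = 30,
    delay: float = 10.0,
) -> bool:
    """Show the batch, ask for approval, apply it, then wait for green."""
    print("Nodes in this batch:", ", ".join(nodes))
    if not approve(nodes):
        return False  # nothing is applied without approval
    if not wait_for_green(get_status, retries, delay):
        raise RuntimeError("cluster is not green; refusing to start the batch")
    for node in nodes:
        apply(node)
    # Only report success once the cluster has recovered to green.
    return wait_for_green(get_status, retries, delay)
```

A caller would pass something like `lambda: cluster_status("http://localhost:9200")` as `get_status` (the URL is a placeholder) and move on to the next batch only when `run_batch` returns `True`.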