Coordinating-only node in medium deployment fails to start #276
Description
Bug Report
What sort of bug did you find? Please give us a clear description of the problem.
I attempted to deploy a test instance of Scorestack to GCP and ran into errors related to this GitHub issue on all Elasticsearch nodes. I applied the suggested workaround (commenting out the problem lines in the config file in Ansible), which allowed the Ansible deployment to move further along. However, when Ansible attempted to restart the coordinating-only Elasticsearch node, the service failed to start with an error about the JVM garbage collector. The master nodes started successfully; only the coordinating-only node failed.
Expected behavior
What did you expect to happen, that didn't happen?
Successful deployment of a Scorestack instance.
Actual behavior
What happened that was a problem?
The coordinating-only node fails to start after the configuration change.
Replication steps
For us to be able to fix the bug, it's important that we can replicate it. Please provide as detailed a guide as possible to reliably replicate the issue, if possible.
- Follow steps in medium deployment guide for GCP.
- After first error message, replace these lines in elasticsearch.jvm.options:
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
with either
#-XX:+UseConcMarkSweepGC
#-XX:CMSInitiatingOccupancyFraction=75
#-XX:+UseCMSInitiatingOccupancyOnly
or
8-13:-XX:+UseConcMarkSweepGC
8-13:-XX:CMSInitiatingOccupancyFraction=75
8-13:-XX:+UseCMSInitiatingOccupancyOnly
- Re-run ansible-playbook playbook.yml -i inventory.ini.
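The second variant of the edit above (prefixing the CMS flags with 8-13: so they apply only on JDK 8 through 13, since the CMS collector was removed in JDK 14) could be scripted rather than done by hand. The following is a minimal sketch, not part of the Scorestack repository: the function name gate_cms_flags and the idea of running it over the contents of elasticsearch.jvm.options are assumptions made for illustration.

```python
import re

# Matches an ungated CMS flag at the start of a line, e.g.
# "-XX:+UseConcMarkSweepGC" or "-XX:CMSInitiatingOccupancyFraction=75".
# Lines that are commented out ("#-XX:...") or already gated
# ("8-13:-XX:...") do not match because of the "^-XX:" anchor.
CMS_FLAG = re.compile(
    r"^-XX:[+-]?(UseConcMarkSweepGC"
    r"|CMSInitiatingOccupancyFraction"
    r"|UseCMSInitiatingOccupancyOnly)\b"
)


def gate_cms_flags(text: str, prefix: str = "8-13:") -> str:
    """Return jvm.options text with ungated CMS flags prefixed by `prefix`.

    Hypothetical helper for illustration; all other lines pass through
    unchanged, so running it twice gives the same result as running it once.
    """
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        if CMS_FLAG.match(stripped):
            out.append(prefix + stripped)
        else:
            out.append(line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(out) + ("\n" if text.endswith("\n") else "")


if __name__ == "__main__":
    sample = (
        "-Xms1g\n"
        "-XX:+UseConcMarkSweepGC\n"
        "-XX:CMSInitiatingOccupancyFraction=75\n"
        "-XX:+UseCMSInitiatingOccupancyOnly\n"
    )
    print(gate_cms_flags(sample), end="")
```

Note that this only reproduces the replication step; it does not address the coordinating-only node failure described in this report.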
Screenshots
If applicable, add some screenshots to help explain the problem.
Ansible error message after making change to elasticsearch.jvm.options:

Log entries in elasticsearch4 after error:

systemctl status output on elasticsearch1 showing a successful start:
