How to Set Master Nodes in Elasticsearch

Problem

To scale an Elasticsearch cluster horizontally in the cloud, we need to make sure we don't remove the master nodes when we scale the cluster down during periods of low usage. We do this in Amazon Web Services by keeping two groups of compute instances. The first group has as many instances as there are shards in the cluster. We set a unique name on these instances so they can be found easily in AWS, treat them as masters, and never scale them down. Preserving the master nodes helps prevent data loss. Unfortunately, Elasticsearch doesn't let us set master nodes in this sense, and it doesn't guarantee which nodes in the cluster end up as the master nodes.

Solution

We need to force a set of shards onto the instances of our choice. The following script queries AWS by tag and then uses Elasticsearch's reallocation endpoint to swap shards, so that each chosen instance ends up holding a different shard.

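    fun forceUniqueness(ips: List<String>, numberShards: Int) {
        // Nothing to do if every instance in ips already holds a different shard.
        if (validateUniqueShards(ips)) {
            return
        } else {
            // Otherwise, work out which shards are duplicated and which are missing.
            val fullMap = getShardMap()   // instance IP -> shard number it holds
            val fullList = getShardList() // one JSON entry per shard copy in the cluster
            val duplicateMap = mutableMapOf<String, Int>()
            val duplicatedShards = mutableSetOf<Int>()
            val missingShards = mutableSetOf<Int>()
            val listMoveTo = mutableListOf<String>()

            // Shards already found on one of the chosen instances.
            val tmpSet = mutableSetOf<Int>()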

            // Which chosen instances hold a shard that another chosen instance already holds?
            for (ip in ips) {
                val shard = fullMap.getValue(ip)
                // add() returns false when the shard has already been seen.
                if (!tmpSet.add(shard)) {
                    duplicateMap[ip] = shard
                    duplicatedShards.add(shard)
                    listMoveTo.add(ip)
                }
            }

            // Which shards are missing from the chosen instances?
            for (i in 0 until numberShards) {
                if (!tmpSet.contains(i)) {
                    missingShards.add(i)
                }
            }

            println("FoundShards")
            println(tmpSet)
            println("MissingShards")
            println(missingShards)
            println("DuplicateMap")
            println(duplicateMap)

            // For each missing shard, find a replica elsewhere and swap it with a duplicated shard.
            val tripleList = mutableListOf<Triple<String, String, Int>>()
            for (shard in missingShards) {
                // Stop once no instance with a duplicated shard is left to swap with.
                if (listMoveTo.isEmpty()) {
                    break
                }
                for (json in fullList) {
                    // Skip shard copies on ignored IPs (on-demand instances kept for data integrity).
                    if (ignoreIps.contains(json.getString("ip"))) {
                        continue
                    }
                    if (json.getInt("shard") == shard && json.getString("prirep") == "r") {
                        val moveFrom = json.getString("node")
                        val moveTo = getNodeName(listMoveTo.removeAt(0))
                        // Move the missing shard onto the chosen instance...
                        tripleList.add(Triple(moveFrom, moveTo, shard))

                        // ...and move the shard it duplicated back to the donor node.
                        val tmpShard = getShardByNode(moveTo)
                        tripleList.add(Triple(moveTo, moveFrom, tmpShard))
                        break
                    }
                }
            }
            moveShards(tripleList)
        }
    }
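
The script above hands its list of moves to moveShards, which is not shown. As a rough sketch only, assuming the "reallocation endpoint" is Elasticsearch's cluster reroute API (POST /_cluster/reroute) and its "move" command, the request body could be assembled like this. ShardMove, jsonString, moveCommand, and buildRerouteBody are hypothetical names, and the index name is an assumption, since the original triples carry only the source node, target node, and shard number.

```kotlin
// Hypothetical record of one relocation. The index name is an assumption:
// the original triples hold only (from node, to node, shard number).
data class ShardMove(val index: String, val shard: Int, val fromNode: String, val toNode: String)

// Escape backslashes and double quotes so a name is safe inside a JSON string.
fun jsonString(value: String): String =
    "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

// Render one "move" command of the cluster reroute API.
fun moveCommand(m: ShardMove): String =
    "{\"move\":{" +
        "\"index\":${jsonString(m.index)}," +
        "\"shard\":${m.shard}," +
        "\"from_node\":${jsonString(m.fromNode)}," +
        "\"to_node\":${jsonString(m.toNode)}}}"

// Build the full JSON body: all moves are sent together in one "commands" array.
fun buildRerouteBody(moves: List<ShardMove>): String =
    "{\"commands\":[" + moves.joinToString(",") { moveCommand(it) } + "]}"
```

Sending both halves of a swap in a single request keeps the two moves together, though whether the cluster accepts them still depends on its allocation rules.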

 
