Quickly Deploy a Three Node GridDB Cluster

With GridDB Community Edition 4.1, online node addition and removal have been added; this feature was previously available only in the commercial Standard Edition. This first post of a two-part series will show how to set up a three-node cluster on public cloud infrastructure. The second post will cover recovering from a node failure as well as adding and removing nodes.
We’ll assume you have deployed three CentOS 7 instances on the same virtual network: griddb1 (192.168.1.10), griddb2 (192.168.1.11), and griddb3 (192.168.1.12). Run each step below on all three nodes before moving on to the next step.
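Since every step has to be repeated on each node, it can help to drive the commands from a single admin machine. The sketch below is a minimal helper, assuming passwordless SSH to the three hostnames above; run_on_all and the RUNNER override are hypothetical names for this post, not part of GridDB.

```shell
# Minimal sketch: run the same command on every node in the cluster.
# Assumes passwordless SSH; RUNNER can be overridden (e.g. for dry runs).
NODES="griddb1 griddb2 griddb3"

run_on_all() {
    local cmd="$1" node
    for node in $NODES; do
        ${RUNNER:-ssh} "$node" "$cmd"
    done
}

# Example: run_on_all 'sudo systemctl stop firewalld'
```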

#1 Install GridDB

$ sudo rpm -Uvh https://github.com/griddb/griddb_nosql/releases/download/v4.1.1/griddb_nosql-4.1.1-1.linux.x86_64.rpm

#2 Create and deploy Configuration Files

The first step is to disable iptables and firewalld. In a production environment you should instead create firewall rules that allow traffic on the GridDB ports.

$ sudo su -
# systemctl disable firewalld
# systemctl stop firewalld
# iptables -F INPUT
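For a production setup, rather than disabling the firewall entirely, a sketch like the following would open only the GridDB ports used in this tutorial (the port numbers are taken from the gs_cluster.json and gs_node.json files below):

```shell
# Sketch: allow only the GridDB ports from this tutorial instead of
# disabling firewalld entirely (run on each node as root).
for port in 10001 10010 10020 10040 20001; do
    firewall-cmd --permanent --add-port=${port}/tcp
done
firewall-cmd --reload
```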

All GridDB operations should be performed as the gsadm user.

$ sudo su - gsadm

gs_cluster.json specifies the name of the cluster, the definition of the nodes in the cluster, and the replication factor. With replicationNum set to 2, every piece of data stored in GridDB is held by two nodes.

$ cat > /var/lib/gridstore/conf/gs_cluster.json << EOF
{
        "dataStore":{
                "partitionNum":128,
                "storeBlockSize":"64KB"
        },
        "cluster":{
                "clusterName":"defaultCluster",
                "replicationNum":2,
                "notificationInterval":"5s",
                "heartbeatInterval":"5s",
                "loadbalanceCheckInterval":"180s",
                "notificationMember": [
                        {
                                "cluster": {"address":"192.168.1.10", "port":10010},
                                "sync": {"address":"192.168.1.10", "port":10020},
                                "system": {"address":"192.168.1.10", "port":10080},
                                "transaction": {"address":"192.168.1.10", "port":10001},
                                "sql": {"address":"192.168.1.10", "port":20001}
                        },
                        {
                                "cluster": {"address":"192.168.1.11", "port":10010},
                                "sync": {"address":"192.168.1.11", "port":10020},
                                "system": {"address":"192.168.1.11", "port":10080},
                                "transaction": {"address":"192.168.1.11", "port":10001},
                                "sql": {"address":"192.168.1.11", "port":20001}
                        },
                        {
                                "cluster": {"address":"192.168.1.12", "port":10010},
                                "sync": {"address":"192.168.1.12", "port":10020},
                                "system": {"address":"192.168.1.12", "port":10040},
                                "transaction": {"address":"192.168.1.12", "port":10001},
                                "sql": {"address":"192.168.1.12", "port":20001}
                        }
                ]
        },
        "sync":{
                "timeoutInterval":"30s"
        }
}
EOF
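A malformed config file is a common reason a node fails to join the cluster, so it is worth checking the JSON syntax before continuing. validate_conf below is just a hypothetical convenience wrapper around Python's json.tool, which is available on a stock CentOS 7 install:

```shell
# Hypothetical helper: check that a GridDB config file parses as valid JSON.
PY=$(command -v python3 || command -v python)

validate_conf() {
    if "$PY" -m json.tool "$1" > /dev/null 2>&1; then
        echo "OK: $1"
    else
        echo "INVALID: $1"
    fi
}

validate_conf /var/lib/gridstore/conf/gs_cluster.json
```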

In gs_node.json, you set how much memory GridDB may use and how many threads it runs. The following values are a good starting point for a node with 8 cores and 8GB of RAM.

$ cat > /var/lib/gridstore/conf/gs_node.json << EOF
{
        "dataStore":{
                "dbPath":"data",
                "backupPath":"backup",
                "storeMemoryLimit":"5120MB",
                "storeWarmStart":true,
                "concurrency":8,
                "logWriteMode":1,
                "persistencyMode":"NORMAL",
                "affinityGroupSize":4
        },
        "checkpoint":{
                "checkpointInterval":"1200s",
                "checkpointMemoryLimit":"1024MB",
                "useParallelMode":false
        },
        "cluster":{
                "servicePort":10010
        },
        "sync":{
                "servicePort":10020
        },
        "system":{
                "servicePort":10040,
                "eventLogPath":"log"
        },
        "transaction":{
                "servicePort":10001,
                "connectionLimit":5000
        },
        "trace":{
                "default":"LEVEL_ERROR",
                "dataStore":"LEVEL_ERROR",
                "collection":"LEVEL_ERROR",
                "timeSeries":"LEVEL_ERROR",
                "chunkManager":"LEVEL_ERROR",
                "objectManager":"LEVEL_ERROR",
                "checkpointFile":"LEVEL_ERROR",
                "checkpointService":"LEVEL_INFO",
                "logManager":"LEVEL_WARNING",
                "clusterService":"LEVEL_ERROR",
                "syncService":"LEVEL_ERROR",
                "systemService":"LEVEL_INFO",
                "transactionManager":"LEVEL_ERROR",
                "transactionService":"LEVEL_ERROR",
                "transactionTimeout":"LEVEL_WARNING",
                "triggerService":"LEVEL_ERROR",
                "sessionTimeout":"LEVEL_WARNING",
                "replicationTimeout":"LEVEL_WARNING",
                "recoveryManager":"LEVEL_INFO",
                "eventEngine":"LEVEL_WARNING",
                "clusterOperation":"LEVEL_INFO",
                "ioMonitor":"LEVEL_WARNING"
        }
}
EOF
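The example above sizes concurrency to the core count and storeMemoryLimit to 5120MB out of 8GB of RAM, roughly five-eighths. The sketch below derives comparable starting values for whatever host it runs on; the 5/8 ratio is just this example's rule of thumb, not an official sizing formula:

```shell
# Sketch: derive starting gs_node.json values from this host.
# concurrency = core count; storeMemoryLimit ~ 5/8 of RAM (this post's ratio).
cores=$(nproc)
ram_mb=$(awk '/MemTotal/ {print int($2 / 1024)}' /proc/meminfo)
echo "\"concurrency\": $cores"
echo "\"storeMemoryLimit\": \"$((ram_mb * 5 / 8))MB\""
```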

Finally, set the password for the admin user. Use something more secure than admin if you are doing anything beyond testing!

$ gs_passwd -u admin -p admin

#3 Start GridDB

$ sudo su - gsadm
$ gs_startnode
$ gs_joincluster -u admin/admin -n 3
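Nodes can take a moment to find each other after gs_joincluster. The polling sketch below waits for the cluster to form; wait_for_cluster is a hypothetical helper, and STAT_CMD defaults to the real gs_stat call but is overridable so the loop can be exercised without a live cluster:

```shell
# Sketch: poll until gs_stat reports the expected number of active nodes.
# STAT_CMD is overridable so the loop can be tested without a live cluster.
wait_for_cluster() {
    local want="$1" tries=0
    while [ "$tries" -lt 60 ]; do
        if ${STAT_CMD:-gs_stat -u admin/admin} 2>/dev/null \
                | grep -q "\"activeCount\": $want"; then
            echo "cluster up with $want active nodes"
            return 0
        fi
        tries=$((tries + 1))
        sleep 1
    done
    echo "timed out waiting for $want nodes" >&2
    return 1
}

# Example: wait_for_cluster 3
```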

#4 Finished

Finally, run gs_stat to verify that the cluster is running correctly. If the complete node list is not shown as below, run gs_stat on the master node instead. If gs_stat shows SUB_CLUSTER, the cluster is _NOT_ running correctly, and the nodes most likely cannot communicate with each other.

$ sudo su - gsadm
$ gs_stat -u admin/admin
... snip ...
    "cluster": {
        "activeCount": 3,
        "clusterName": "defaultCluster",
        "clusterStatus": "MASTER",
        "designatedCount": 3,
        "loadBalancer": "ACTIVE",
        "master": {
            "address": "192.168.1.10",
            "port": 10040
        },
        "nodeList": [
            {
                "address": "192.168.1.10",
                "port": 10040
            },
            {
                "address": "192.168.1.11",
                "port": 10080
            },
            {
                "address": "192.168.1.12",
                "port": 10080
            }
        ],
        "nodeStatus": "ACTIVE",
        "notificationMode": "FIXED_LIST",
        "partitionStatus": "NORMAL",
        "startupTime": "2019-03-15T04:57:16Z",
        "syncCount": 173
    },
... snip ...
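If you do see SUB_CLUSTER, a quick first check is whether each node can reach its peers' GridDB ports. The sketch below uses bash's built-in /dev/tcp so it needs no extra tools on a minimal CentOS install; port_open is a hypothetical helper, and you would run this from each node against the other two:

```shell
# Sketch: test TCP reachability of each peer's GridDB ports using bash's
# built-in /dev/tcp (timeout keeps unreachable hosts from hanging the loop).
port_open() {
    timeout 1 bash -c "exec 3<> /dev/tcp/$1/$2" 2>/dev/null
}

for host in 192.168.1.10 192.168.1.11 192.168.1.12; do
    for port in 10001 10010 10020 10040 20001; do
        if port_open "$host" "$port"; then
            echo "$host:$port reachable"
        else
            echo "$host:$port unreachable"
        fi
    done
done
```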
