Tuesday, July 21, 2015

How to run two Drill clusters on the same nodes.


This article shows how to run two Drill clusters on the same nodes, for example a production cluster and a QA cluster.
This setup only separates the functionality of the two clusters.
Because they share the same hardware, the two clusters can still affect each other's performance.


Drill 1.1


1. Decide the architecture of the two Drill clusters.

For example, the production cluster runs on nodes h1, h2 and h3, and the QA cluster will run on nodes h2 and h3.

2. Create a separate DRILL_HOME for the QA cluster.

For example, the production cluster's DRILL_HOME is /opt/mapr/drill/drill-1.1.0, and the QA cluster's DRILL_HOME will be /opt/mapr/drill/drill-1.1.0-QA.
On nodes h2 and h3:
cd /opt/mapr/drill
cp -r drill-1.1.0 drill-1.1.0-QA

3. In the QA cluster's drill-env.sh, point DRILL_LOG_DIR to a different log directory.

export DRILL_LOG_DIR="/opt/mapr/drill/drill-1.1.0-QA/logs"
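
The edit in step 3 can also be scripted. The following is a minimal sketch, not part of Drill itself: set_log_dir is a hypothetical helper that replaces an existing DRILL_LOG_DIR export in the given file, or appends one if none is present. It assumes the directory path contains no '|' or '&' characters, since those are special in the sed expression.

```shell
#!/usr/bin/env bash
# set_log_dir FILE DIR (hypothetical helper): point DRILL_LOG_DIR in FILE at DIR.
set_log_dir() {
  local file="$1" dir="$2"
  if grep -q '^export DRILL_LOG_DIR=' "$file"; then
    # Rewrite the existing export; '|' is the sed delimiter because DIR contains '/'.
    sed "s|^export DRILL_LOG_DIR=.*|export DRILL_LOG_DIR=\"$dir\"|" "$file" > "$file.tmp" \
      && mv "$file.tmp" "$file"
  else
    # No export yet, so add one at the end of the file.
    echo "export DRILL_LOG_DIR=\"$dir\"" >> "$file"
  fi
}
```

On nodes h2 and h3 it could be called as: set_log_dir /opt/mapr/drill/drill-1.1.0-QA/conf/drill-env.sh /opt/mapr/drill/drill-1.1.0-QA/logs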

4. Modify the following settings in drill-override.conf.

  • drill.exec.cluster-id
  • drill.exec.zk.root
  • drill.exec.rpc.user.server.port
  • drill.exec.rpc.bit.server.port
  • drill.exec.http.port
A sample configuration:
drill.exec: {
  cluster-id: "MyCluster-drillbits-QA",
  zk.connect: "h2.poc.com:5181,h3.poc.com:5181,h4.poc.com:5181",
  zk.root: "drill-QA",
  sys.store.provider.zk.blobroot: "maprfs:///mydrill/",
  rpc.user.server.port: 21010,
  rpc.bit.server.port: 21011,
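  http.port: 7047
}
The key change is to move the 4 ports used by the drillbit. Three of them are set explicitly, and the data port always follows the control port, as the table shows. Before starting the drillbit, make sure none of the new ports is occupied by another process.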

Port Name        Configuration                    Default Value
user port        drill.exec.rpc.user.server.port  31010
control port     drill.exec.rpc.bit.server.port   31011
data port        N/A                              control port + 1
web server port  drill.exec.http.port             8047
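
The port check can be sketched in bash as follows. The function names are hypothetical helpers, not Drill commands. drillbit_ports derives the data port from the control port as described in the table, and port_in_use only detects listeners that accept connections on 127.0.0.1, so a process bound to another interface only would be missed.

```shell
#!/usr/bin/env bash
# drillbit_ports USER_PORT CONTROL_PORT HTTP_PORT
# Prints the four ports a drillbit listens on: user, control, data, web.
# The data port is not configured directly; it is control port + 1.
drillbit_ports() {
  local user_port="$1" control_port="$2" http_port="$3"
  echo "$user_port $control_port $((control_port + 1)) $http_port"
}

# port_in_use PORT: succeeds if something accepts TCP connections on
# 127.0.0.1:PORT. Relies on bash's /dev/tcp redirection.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# check_ports USER_PORT CONTROL_PORT HTTP_PORT
# Reports every occupied port and returns non-zero if any is taken.
check_ports() {
  local port busy=0
  for port in $(drillbit_ports "$@"); do
    if port_in_use "$port"; then
      echo "port $port is already in use"
      busy=1
    fi
  done
  return "$busy"
}
```

With the sample values above, running check_ports 21010 21011 7047 on h2 and h3 checks ports 21010, 21011, 21012 and 7047.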

5. Start the QA Drill cluster.

On nodes h2 and h3:
export DRILL_HOME=/opt/mapr/drill/drill-1.1.0-QA
/opt/mapr/drill/drill-1.1.0-QA/bin/drillbit.sh start

6. Confirm the new znodes are created in ZooKeeper.

This is for the production cluster:
[zk: h2.poc.com:5181,h3.poc.com:5181,h4.poc.com:5181(CONNECTED) 4] ls /drill
[sys.options, running, MyCluster-drillbits, sys.storage_plugins]
This is for the QA cluster:
[zk: h2.poc.com:5181,h3.poc.com:5181,h4.poc.com:5181(CONNECTED) 5] ls /drill-QA
[sys.options, running, MyCluster-drillbits-QA, sys.storage_plugins]
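
If this check needs to be automated, the listing printed by ls can be tested for an exact znode name. A minimal sketch, assuming the listing has already been captured as a single string in the format shown above; has_znode is a hypothetical helper, and it matches whole names so that MyCluster-drillbits is not confused with MyCluster-drillbits-QA.

```shell
#!/usr/bin/env bash
# has_znode LISTING NAME (hypothetical helper): succeeds if NAME is one of
# the children in a ZooKeeper "ls" result such as "[a, b, c]".
has_znode() {
  # Drop brackets and spaces, put one child per line, then require an
  # exact, fixed-string match of the whole line.
  printf '%s\n' "$1" | tr -d '[] ' | tr ',' '\n' | grep -qxF -- "$2"
}

listing='[sys.options, running, MyCluster-drillbits-QA, sys.storage_plugins]'
if has_znode "$listing" "MyCluster-drillbits-QA"; then
  echo "QA cluster znode found"
fi
```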

Note: the two clusters have separate pstores, which hold configurations, storage plugins, etc.
So users need to maintain two sets of configurations, one for each cluster.
