Thursday, May 14, 2015

Drill Workshop -- Persistent Configuration Storage

Env:

Drill 0.9

Theory:

  • Drill stores persistent configuration data in a persistent configuration store (PStore).
  • This data is encoded in JSON or Protobuf format.
  • Drill can use the local file system, ZooKeeper, HBase, or MapR-DB to store this data. 

Goal:

Know what is stored in PStore.
Know how to change PStore locations.

Workshop:

1. ZooKeeper as PStore (Default)

The ZooKeeper PStore provider stores all of the persistent configuration data in ZooKeeper except for query profile data.
You can confirm the active provider with:
(Before Drill 1.0)
> select * from sys.options where name = 'drill.exec.sys.store.provider.class';
+------------+------------+------------+------------+------------+------------+------------+------------+
|    name    |    kind    |    type    |   status   |  num_val   | string_val |  bool_val  | float_val  |
+------------+------------+------------+------------+------------+------------+------------+------------+
| drill.exec.sys.store.provider.class | STRING     | BOOT       | BOOT       | null       | "org.apache.drill.exec.store.sys.zk.ZkPStoreProvider" | null       | null       |
+------------+------------+------------+------------+------------+------------+------------+------------+
(Drill 1.0 or later)
> select * from sys.boot where name = 'drill.exec.sys.store.provider.class';
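The version split above can be captured in a small helper. This is only a sketch: the function name provider_check_query and the (major, minor) version tuple are made up for illustration and are not part of Drill; the two queries themselves are the ones shown above.

```python
# Option name taken from the queries above.
OPTION = "drill.exec.sys.store.provider.class"

def provider_check_query(version):
    """Return the SQL that shows the PStore provider class.

    `version` is a (major, minor) tuple, e.g. (0, 9) or (1, 0).
    Hypothetical helper for illustration; not part of Drill.
    """
    # Before Drill 1.0 the boot option is listed in sys.options;
    # from 1.0 on it is listed in sys.boot.
    table = "sys.boot" if version >= (1, 0) else "sys.options"
    return "select * from {} where name = '{}';".format(table, OPTION)

print(provider_check_query((0, 9)))
print(provider_check_query((1, 0)))
```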

ZooKeeper stores non-default sys.options, the cluster ID, storage plugin configurations, semaphores, etc.:
[zk] ls /drill
[sys.options, running, MyCluster-drillbits, semaphore, sys.storage_plugins]

[zk] get /drill/sys.options/exec.queue.small
{
  "name" : "exec.queue.small",
  "kind" : "LONG",
  "type" : "SYSTEM",
  "num_val" : 1
}
[zk] get /drill/sys.storage_plugins/hive
{
  "type" : "hive",
  "enabled" : true,
  "configProps" : {
    "hive.metastore.uris" : "thrift://h1.poc.com:9083"
  }
}
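Because each znode holds plain JSON, its content is easy to inspect programmatically. The sketch below parses the two payloads shown above with Python's standard json module. Fetching the data from ZooKeeper is deliberately left out, and the helper name describe_option is hypothetical.

```python
import json

# Payloads copied from the zk output above.
OPTION_NODE = '''{
  "name" : "exec.queue.small",
  "kind" : "LONG",
  "type" : "SYSTEM",
  "num_val" : 1
}'''

PLUGIN_NODE = '''{
  "type" : "hive",
  "enabled" : true,
  "configProps" : {
    "hive.metastore.uris" : "thrift://h1.poc.com:9083"
  }
}'''

def describe_option(raw):
    """Summarize a sys.options znode as 'name = value (type)'. Illustrative only."""
    node = json.loads(raw)
    # Pick whichever *_val field is present; this example only carries num_val.
    value = next(v for k, v in node.items() if k.endswith("_val"))
    return "{} = {} ({})".format(node["name"], value, node["type"])

plugin = json.loads(PLUGIN_NODE)
print(describe_option(OPTION_NODE))
print(plugin["type"], plugin["enabled"], plugin["configProps"]["hive.metastore.uris"])
```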
SQL profile data is stored on each node in the local directory "/opt/mapr/drill/drill-<version>/logs/profiles":
[root@h1 profiles]# pwd
/opt/mapr/drill/drill-1.0.0/logs/profiles
[root@h1 profiles]# ls -altr
total 112
drwxrwxrwx 3 mapr root 4096 May 14 17:42 ..
-rw-r--r-- 1 mapr mapr   20 May 14 17:52 .2aab2061-c1d7-ab52-ab20-7d7385c0fcbe.sys.drill.crc
-rw-r--r-- 1 mapr mapr 1379 May 14 17:52 2aab2061-c1d7-ab52-ab20-7d7385c0fcbe.sys.drill
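In the listing above, the profile is a "<query id>.sys.drill" file accompanied by a hidden ".crc" checksum file. A minimal sketch of extracting query IDs from such a listing follows. The function name profile_query_ids is hypothetical, and the file-name pattern is inferred from this one listing only.

```python
SUFFIX = ".sys.drill"  # suffix seen in the listing above

def profile_query_ids(filenames):
    """Return the query IDs of profile files, skipping hidden .crc companions."""
    ids = []
    for name in filenames:
        # Hidden files (".<id>.sys.drill.crc") and directory entries are ignored.
        if name.startswith(".") or not name.endswith(SUFFIX):
            continue
        ids.append(name[: -len(SUFFIX)])
    return ids

listing = [
    "..",
    ".2aab2061-c1d7-ab52-ab20-7d7385c0fcbe.sys.drill.crc",
    "2aab2061-c1d7-ab52-ab20-7d7385c0fcbe.sys.drill",
]
print(profile_query_ids(listing))
```

On a real node, the result of os.listdir() on the profiles directory could be passed in instead of the hard-coded list.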
You can also store the SQL profiles from all nodes on MapR-FS, so that any Drill web UI can view any SQL profile. To do this, set drill.exec.sys.store.provider.zk.blobroot in drill-override.conf on all nodes, and restart all drillbits.
# cat drill-override.conf
drill.exec: {
  cluster-id: "MyCluster-drillbits",
  zk.connect: "h2.poc.com:5181,h3.poc.com:5181,h4.poc.com:5181",
  sys.store.provider.zk.blobroot: "maprfs:///mydrill/"
}
After the drillbits are restarted, the MapR-FS directory /mydrill/profiles should be created to hold the SQL profiles from all nodes.
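The relationship between the blobroot setting and the resulting directory can be sketched as follows. The helper name profiles_dir is hypothetical, and the "profiles" subdirectory name is taken from the single example in this post, not from Drill documentation.

```python
import posixpath
from urllib.parse import urlparse

def profiles_dir(blobroot):
    """Return the file-system path where profiles are expected under blobroot."""
    # Strip the scheme: "maprfs:///mydrill/" -> "/mydrill/"
    root = urlparse(blobroot).path
    # Assumption: profiles land in a "profiles" subdirectory of the blob root.
    return posixpath.join(root, "profiles")

print(profiles_dir("maprfs:///mydrill/"))
```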

2. MapR-DB as PStore

Unlike the ZooKeeper PStore, the MapR-DB PStore also stores SQL profiles.
To enable it, modify drill-override.conf and restart the drillbits.
$ cat drill-override.conf
drill.exec: {
  cluster-id: "MyCluster-drillbits",
  zk.connect: "h2.poc.com:5181,h3.poc.com:5181,h4.poc.com:5181",
  sys.store.provider: {
     class: "org.apache.drill.exec.store.hbase.config.HBasePStoreProvider",
     hbase: {
       table: "/tables/drill_store"
     }
  }
}
You can confirm the active provider with:
(Before Drill 1.0)
> select * from sys.options where name = 'drill.exec.sys.store.provider.class';
+------------+------------+------------+------------+------------+------------+------------+------------+
|    name    |    kind    |    type    |   status   |  num_val   | string_val |  bool_val  | float_val  |
+------------+------------+------------+------------+------------+------------+------------+------------+
| drill.exec.sys.store.provider.class | STRING     | BOOT       | BOOT       | null       | "org.apache.drill.exec.store.hbase.config.HBasePStoreProvider" | null       | null       |
+------------+------------+------------+------------+------------+------------+------------+------------+
(Drill 1.0 or later)
> select * from sys.boot where name = 'drill.exec.sys.store.provider.class';
Run some queries, then verify that the SQL profiles are stored in MapR-DB as well:
hbase> get '/tables/drill_store', "profiles\x002aa9d18a-85b2-addf-cc80-df9b2d77cee0"
COLUMN                                                               CELL
 s:d                                                                 timestamp=1431711350588, value={"id":{"part1":3074218613535845855,
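The row key in the get command above appears to consist of the store name "profiles", a NUL byte (\x00), and the query ID. The sketch below builds and splits such a key. The layout is inferred from this single example rather than from Drill documentation, and both function names are hypothetical.

```python
SEPARATOR = b"\x00"  # NUL byte seen between "profiles" and the query ID

def make_row_key(store, key):
    """Build a row key as <store> NUL <key> (layout assumed from the example)."""
    return store.encode("utf-8") + SEPARATOR + key.encode("utf-8")

def split_row_key(row_key):
    """Split a row key back into (store, key)."""
    store, _, key = row_key.partition(SEPARATOR)
    return store.decode("utf-8"), key.decode("utf-8")

row_key = make_row_key("profiles", "2aa9d18a-85b2-addf-cc80-df9b2d77cee0")
print(repr(row_key))
print(split_row_key(row_key))
```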

