Thursday, November 5, 2015

How to use Hive custom serde in Drill

Env:

Drill 1.2

Goal:

How to use Hive custom serde in Drill.

Solution:

Take Hive json serde for example.
1. Create a Hive external table based on json serde in above link.
ADD JAR /home/mapr/hive-json-serde-0.2.jar;
CREATE EXTERNAL TABLE h1_jsonserde ( field1 string, field2 int, field3 string, field4 double )
       ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.JsonSerde'
       LOCATION '/user/hive/warehouse/h1_jsonserde';
2. Drill fails to read this Hive table because there is no such serde jar file.
select * from hive.h1_jsonserde;
Caused By (java.lang.ClassNotFoundException) org.apache.hadoop.hive.contrib.serde2.JsonSerde
3. Put the serde jar in /opt/mapr/drill/drill-<version>/jars/3rdparty/ on all drill nodes.
[root@h1 ~]# clush -a ls -altr /opt/mapr/drill/drill-1.2.0/jars/3rdparty/hive-json-serde-0.2.jar
h1.poc.com: -rw-r--r-- 1 mapr root 112508 Dec 22  2010 /opt/mapr/drill/drill-1.2.0/jars/3rdparty/hive-json-serde-0.2.jar
h2.poc.com: -rw-r--r-- 1 mapr root 112508 Nov  5 22:34 /opt/mapr/drill/drill-1.2.0/jars/3rdparty/hive-json-serde-0.2.jar
h3.poc.com: -rw-r--r-- 1 mapr root 112508 Nov  5 22:34 /opt/mapr/drill/drill-1.2.0/jars/3rdparty/hive-json-serde-0.2.jar
4. Restart all drillbits.
5. Now Drill can read the Hive table based on custom serde.
0: jdbc:drill:zk=> select * from hive.h1_jsonserde;
+---------+---------+-------------+----------+
| field1  | field2  |   field3    |  field4  |
+---------+---------+-------------+----------+
| data1   | 100     | more data1  | 123.001  |
| data2   | 200     | more data2  | 123.002  |
| data3   | 300     | more data3  | 123.003  |
| data4   | 400     | more data4  | 123.004  |
+---------+---------+-------------+----------+
4 rows selected (2.518 seconds)

No comments:

Post a Comment

Popular Posts