Wednesday, August 12, 2015

How to run Hive UDF in Drill

Env:

Drill 1.1
Hive 1.0
MapR 4.1

Goal:

How to run Hive UDF in Drill

Solution:

1. Create a Hive UDF 

Sample code is https://github.com/viadea/HiveUDF

a. Make sure UDF code has the annotation which contains the function name.

For example in MyUpper.java:
@Description(
        name = "my_upper",
        value = "_FUNC_(str) - Converts a string to uppercase",
        extended = "Example:\n" +
        "  > SELECT my_upper(a) FROM test;\n" +
        "  ABC"
        )

b. Make sure UDF code package contains an empty drill-module.conf so that Drill can find this function.
# ls -altr src/main/resources/drill-module.conf
-rw-r--r-- 1 root root 0 Aug 12 23:16 src/main/resources/drill-module.conf
After building the jar file, make sure the drill-module.conf is in the jar.
eg:
# jar tf target/MyUDF-1.0.0.jar  |grep -i drill
drill-module.conf

Note: Starting from Drill 1.4, the drill-module.conf needs to contain the classpath like below:
[root@v1 HiveUDF]# cat src/main/resources/drill-module.conf
drill.classpath.scanning.packages += "openkb.hive.udf.MyUpper"
Otherwise, below error may show up:
Caused By (org.apache.calcite.sql.validate.SqlValidatorException) No match found for function signature my_upper(<CHARACTER>)

2. Test the UDF in Hive

hive> create function h1db.MyUpper as 'openkb.hive.udf.MyUpper' USING JAR 'maprfs:///github/hiveudf/target/MyUDF-1.0.0.jar';
converting to local maprfs:///github/hiveudf/target/MyUDF-1.0.0.jar
Added [/tmp/82c83467-2484-41aa-833c-1e12fcba7d13_resources/MyUDF-1.0.0.jar] to class path
Added resources: [maprfs:///github/hiveudf/target/MyUDF-1.0.0.jar]
OK
Time taken: 0.117 seconds
hive> select h1db.MyUpper('abc');
OK
ABC
Time taken: 0.355 seconds, Fetched: 1 row(s)

3. Put the JAR file into /jars/3rdparty directory on all Drill nodes.

clush -a cp /xxx/target/MyUDF-1.0.0.jar /opt/mapr/drill/drill-1.1.0/jars/3rdparty/

4. Restart all drill-bits.

maprcli node services -name drill-bits -action restart -filter csvc=="drill-bits"

5. Test the UDF in Drill

0: jdbc:drill:zk=h2.poc.com:5181> select my_upper('abc') from sys.version;
+---------+
| EXPR$0  |
+---------+
| ABC     |
+---------+
1 row selected (1.516 seconds)

No comments:

Post a Comment

Popular Posts