Tuesday, January 27, 2015

How to access a Hive table with RCFile storage in Pig

Goal:

This article shows how to access a Hive table stored as RCFile from Pig.

Solution:

1. Create a Hive table with RCFile storage.

create table rcfile_table (x int) stored as rcfile;

2. Locate piggybank.jar, which provides the HiveColumnarLoader class used in step 3.

For example, in mapr-pig-0.13, it is located at:
/opt/mapr/pig/pig-0.13/contrib/piggybank/java/piggybank.jar
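If the jar is not at that exact path on your installation, a search along the following lines may help. This is a minimal sketch: the helper name find_piggybank is hypothetical, and the default search root /opt/mapr/pig is an assumption based on the example path above, so adjust it for your own layout.

```shell
# Hypothetical helper: print every piggybank.jar found under a root directory.
# The default root (/opt/mapr/pig) is assumed from the example path above.
find_piggybank() {
    root="${1:-/opt/mapr/pig}"
    # Nothing to search if the directory does not exist.
    [ -d "$root" ] || return 0
    find "$root" -type f -name 'piggybank.jar'
}

find_piggybank /opt/mapr/pig
```

Whatever path this prints is the one to pass to the register command in the next step.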

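3. Register the jar in Pig, then load the RCFile table. The LOAD path is the table's directory in the Hive warehouse, and the argument to HiveColumnarLoader is the table schema, which must match the columns defined in step 1.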

pig -useHCatalog
grunt> register /opt/mapr/pig/pig-0.13/contrib/piggybank/java/piggybank.jar
grunt> a = LOAD '/user/hive/warehouse/rcfile_table' USING org.apache.pig.piggybank.storage.HiveColumnarLoader('x int');
