However, Hive currently does not validate the storage format when you run "load data ... into table". This means that if you accidentally load a plain text file into an ORC Hive table, the load itself goes through and the error below only shows up when you query the table:
CREATE TABLE IF NOT EXISTS orctest (
  id string,
  id2 string,
  id3 string,
  id4 string
) STORED AS ORC;

load data local inpath "/opt/tmp/testload2.txt" into table orctest;

hive> select * from orctest limit 1;
OK
Failed with exception java.io.IOException:java.lang.RuntimeException: serious problem
Time taken: 0.279 seconds
The correct way is to first load the file into an intermediate Hive table stored as text, and then insert overwrite from it into the ORC table, so that Hive itself writes the rows out in ORC format.
For example:
CREATE TABLE IF NOT EXISTS orctest_text (
  id string,
  id2 string,
  id3 string,
  id4 string
) STORED AS TEXTFILE;

load data local inpath "/opt/tmp/testload2.txt" into table orctest_text;

INSERT OVERWRITE TABLE orctest SELECT * FROM orctest_text;
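Since Hive will not catch the mistake at load time, you may want a small guard of your own before loading a file directly into an ORC table. The sketch below is only an illustration and not part of Hive: it relies on the fact that ORC files begin with the three bytes "ORC", and the helper name looks_like_orc is made up for this post. Passing the check does not prove the file is a valid ORC file or that its schema matches the table; it only catches the obvious case of a plain text file.

```python
# Hypothetical pre-load guard (not a Hive feature): ORC files start with
# the 3-byte magic header b"ORC", while a plain text file normally does not.
ORC_MAGIC = b"ORC"


def looks_like_orc(path):
    """Return True if the file at `path` starts with the ORC magic bytes."""
    with open(path, "rb") as f:
        return f.read(len(ORC_MAGIC)) == ORC_MAGIC
```

For the file used in this post you would call looks_like_orc("/opt/tmp/testload2.txt"). If it returns False, load the file through the intermediate text table shown above instead of loading it straight into orctest.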