Monday, May 19, 2014

How to load windows newline characters to Greenplum using goload

Question: 

Could We load windows characters such as ^M(ctrl-V-M) and $(end of line) to Greenplum database using gpload?

Below is an example(one line):
1|2|"abc ^M     def^M     abc ^M     end of line$
another word"|col4|3$

You can vi the file and see above special characters by typing ":set list".

Answer:

If the data contains newline characters, you must use CSV
format. If you use Text format and the data contains newline
characters, the data after the newline character is treated as a new
record. In such situations, the gpload utility might reject or insert
incorrect data into the tables.

Here is a good YAML file which is using CSV instead of TEXT format.
%YAML 1.1
---
VERSION: 1.0.0.1

DATABASE: gpload
USER: username
PASSWORD: password
HOST: localhost
PORT: 5432
GPLOAD:
   INPUT:
     - SOURCE:
        FILE:
          - /Users/xxx/Documents/xxx/xxx/xxx/test.dat
     - COLUMNS:
         -  "col1":
         -  "col2":
         -  "col3":
         -  "col4":
         -  "col5":
     - FORMAT: CSV
     - DELIMITER: "|"
     - NULL_AS: '\N'
     - QUOTE: '"'
     - ENCODING: utf8
     - ERROR_LIMIT: 20000
     - ERROR_TABLE: xxx.xxx
   OUTPUT:
     - TABLE: xxx.test
     - MODE: INSERT
   PRELOAD:
     - TRUNCATE: False
     - REUSE_TABLES: False
Then the result:
gpload=# select * from xxx.test;
 col1| col2 |                        col3                 | col4 |col5
-----+------+---------------------------------------------+------+-----
 1   |    2 | abc \r     def\r     abc \r     end of line | col4 | 3
(1 row)

No comments:

Post a Comment

Popular Posts