Monday, August 4, 2014

Greenplum performance issue of INSERT SQL caused by index_insert

Symptom:

Below SQL can not finish after 5 hours and only 30% finished.
insert into target select * from source;
Both source table and target table are AOCO table with same distribution key.
Source table has more than 2 billion rows.

Env:

Greenplum 4.2.7.2

Troubleshooting:

On the first sight, this issue should be IO bound, however "iostat" did not show IO was used up on segments. By checking the session process stacktrace on segment, it is showing:
[gpadmin@sdw1 ~]$ pstack 101924
Thread 2 (Thread 0x7fa0b84f2700 (LWP 101929)):
#0  0x00007fa0b607ea63 in poll () from /lib64/libc.so.6
#1  0x0000000000b24914 in rxThreadFunc ()
#2  0x00007fa0b5b86851 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fa0b608811d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7fa0b8554700 (LWP 101924)):
#0  0x00007fa0b608a747 in semop () from /lib64/libc.so.6
#1  0x000000000085e298 in PGSemaphoreLock ()
#2  0x00000000008fa050 in LWLockAcquire ()
#3  0x000000000050d9bc in XLogInsert_Internal ()
#4  0x000000000050ef14 in XLogInsert ()
#5  0x00000000004daae2 in _bt_insertonpg ()
#6  0x00000000004dc64e in _bt_doinsert ()
#7  0x00000000004e27a4 in btinsert ()
#8  0x0000000000a7688b in FunctionCall6 ()
#9  0x00000000004d880d in index_insert ()
#10 0x0000000000707db3 in ExecInsertIndexTuples ()
#11 0x00000000006f1158 in ExecutePlan ()
#12 0x00000000006f1d50 in ExecutorRun ()
#13 0x0000000000913f2a in ProcessQuery ()
#14 0x0000000000916906 in PortalRun ()
#15 0x000000000090f249 in PostgresMain ()
#16 0x0000000000877415 in BackendStartup ()
#17 0x000000000087e480 in PostmasterMain ()
#18 0x000000000078289a in main ()
Then we found source table has an extra b-tree index.

Root cause:

INSERT is talking lot of time on inserting index.

Solution:

After dropping the index, such INSERT finishes in 20mins.
Then re-create the index.

No comments:

Post a Comment

Popular Posts