Sunday, June 15, 2014

How to populate a large HBase table

Sometimes you need a large sample table in HBase for performance testing. Here is one way to build it:
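1. Create the table in the HBase shell with pre-split points chosen for the expected size of the table. Row keys are compared as byte strings, so these four 6-digit split points divide the 8-digit keys used below into five regions.
create 'mytesttable', 'mycf', { SPLITS => ['200000', '400000', '600000', '800000'] }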
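2. Insert as much data as you want using the HBase shell. The loop below generates up to 100,000,000 rows, with keys running from 00000000 through 99999999.
Press Ctrl-C whenever you want to stop inserting.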

# Each loop variable supplies one digit of the 8-digit row key.
for a in '0'..'9' do
 for b in '0'..'9' do
  for c in '0'..'9' do
   for d in '0'..'9' do
    for e in '0'..'9' do
     for f in '0'..'9' do
      for g in '0'..'9' do
       for h in '0'..'9' do
        put 'mytesttable', "#{a}#{b}#{c}#{d}#{e}#{f}#{g}#{h}", "mycf:col1", "data-value-is-#{a}#{b}#{c}#{d}#{e}#{f}#{g}#{h}"
end end end end end end end end
3. Check the row count of the table:
hbase(main):005:0> count 'mytesttable', CACHE => 1000000, INTERVAL => 1000000
Current count: 1000000, row: 00999999
1283384 row(s) in 13.8240 seconds
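
To see how the zero-padded keys from step 2 spread across the regions created in step 1, here is a minimal Ruby sketch. It runs in plain Ruby and does not talk to HBase at all. The helper names row_key and region_index are made up for this illustration, and the sketch assumes that comparing these ASCII keys as Ruby strings matches the byte order HBase uses for row keys.

```ruby
# Split points copied from the create statement in step 1.
SPLITS = %w[200000 400000 600000 800000].freeze

# Hypothetical helper: zero-padded 8-digit row key, the same shape
# the nested loops in step 2 produce.
def row_key(i)
  format('%08d', i)
end

# Hypothetical helper: index of the region a key falls into.
# Region 0 holds keys below the first split point; a key equal to a
# split point belongs to the region that starts at that point.
def region_index(key, splits = SPLITS)
  splits.count { |split| key >= split }
end

# Sample every 10,000,000th key of the full 0..99,999,999 range
# and print the region each one would land in.
sample = (0...100_000_000).step(10_000_000).map { |i| row_key(i) }
sample.each { |key| puts "#{key} -> region #{region_index(key)}" }
```

Since the HBase shell is JRuby, the same format call should also work there, so the eight nested loops could probably be collapsed into a single loop over integers that builds each key with format('%08d', i) before calling put. I have not timed that variant against the nested version.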
