All tests below were run on MapR 4.0.1.
Theory
1. Auto split is enabled by default. Check using "maprcli table info", for example:
# maprcli table info -path /maprtable -json|grep -i autosplit
"autosplit":true,
2. The average region size is set by the "regionsizemb" table attribute (4096 MB here). Check using "maprcli table info", for example:
# maprcli table info -path /maprtable -json|grep -i regionsizemb
"regionsizemb":4096,
If autosplit is set to true, MapR-DB splits a region when the size of the region exceeds the average value by 50%, that is, when it is larger than 150% of "regionsizemb". For example, if the average value is 4096 MB, MapR-DB splits a region that is larger than 6144 MB.
Note that while a table has fewer than 4 regions, MapR-DB ignores the regionsizemb parameter and aggressively distributes the table data.
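The 150% rule above is simple arithmetic, and a small shell sketch makes the numbers easy to check. The function name split_threshold_mb is a hypothetical helper made up for this example. It is not part of maprcli, and it only models the rule as described here.

```shell
#!/bin/sh
# split_threshold_mb is a hypothetical helper, not a maprcli command.
# It returns 150% of the given regionsizemb value, in MB.
split_threshold_mb() {
    # Multiply before dividing so integer arithmetic keeps the extra half.
    echo $(( $1 * 3 / 2 ))
}

split_threshold_mb 4096   # prints 6144
split_threshold_mb 512    # prints 768
```

With the 4096 MB value shown above, a region becomes a split candidate once it grows past 6144 MB.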
Lab
To verify the theory above, first create a table "/maprtable" and bulk load 76733449 rows using Spark, following the steps here.
1. Disable auto split manually and merge the regions into one region of about 4.6 GB.
# maprcli table edit -autosplit false -path /maprtable
# maprcli table region list -path /maprtable
numberofrows  fid               secondarynodes  primarynode  numberofrowswithdelete  startkey          logicalsize  lastheartbeat  endkey            physicalsize
76733449      2115.523.263486   yarn-92         yarn-94      0                       -INFINITY         4899782656   0              INFINITY          4964327424
2. Set regionsizemb to 512 MB.
# maprcli table edit -regionsizemb 512 -path /maprtable
3. Enable auto split again. The single region is split into 9 regions of roughly 512 MB each.
# maprcli table edit -autosplit true -path /maprtable
# maprcli table region list -path /maprtable
numberofrows  fid               secondarynodes  primarynode  numberofrowswithdelete  startkey          logicalsize  lastheartbeat  endkey            physicalsize
9288340       2189.761.132748   yarn-94         yarn-92      0                       -INFINITY         551845888    0              \x00Q\x03S        565313536
8218724       2191.465.132312   yarn-94         yarn-92      0                       \x00Q\x03S        538714112    0              \x00\xAC\x89\xDE  553336832
7547911       2192.1708.134654  yarn-94         yarn-92      0                       \x00\xAC\x89\xDE  486211584    0              \x01\x1E\xF2C     490905600
7628796       2193.34.131220    yarn-94         yarn-92      0                       \x01\x1E\xF2C     489406464    0              \x01\x91\xC7\xCA  494075904
8536673       2194.34.131186    yarn-94         yarn-92      0                       \x01\x91\xC7\xCA  547258368    0              \x02\x12-j        552493056
8650526       2195.723.132698   yarn-94         yarn-92      0                       \x02\x12-j        557539328    0              \x02\x95v\xA6     562798592
8927659       2196.569.132256   yarn-94         yarn-92      0                       \x02\x95v\xA6     573784064    0              \x03\x1CB\xAF     579248128
8973834       2116.322.263128   yarn-92         yarn-94      0                       \x03\x1CB\xAF     578256896    0              \x03\xA4jS        583835648
8960986       2190.720.133176   yarn-94         yarn-92      0                       \x03\xA4jS        576765952    0              INFINITY          582320128
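To read a region listing like this one at a glance, the rows can be summarized with awk. The sketch below is one possible way to do it. The helper name summarize_regions is hypothetical, and it assumes the column order shown in the listing (numberofrows in column 1, logicalsize in bytes in column 7). The input is pasted from the listing rather than fetched from a live cluster.

```shell
#!/bin/sh
# summarize_regions is a hypothetical helper. It assumes the column order of
# the listing above: numberofrows in column 1, logicalsize (bytes) in column 7.
summarize_regions() {
    awk '
        $1 ~ /^[0-9]+$/ {            # skip the header line
            regions++
            rows += $1
            if ($7 + 0 > max + 0) max = $7
        }
        END {
            printf "regions=%d rows=%d max_logical_mb=%d\n",
                   regions, rows, int(max / 1048576)
        }'
}

# Region rows copied from the listing above.
summarize_regions <<'EOF'
numberofrows fid secondarynodes primarynode numberofrowswithdelete startkey logicalsize lastheartbeat endkey physicalsize
9288340 2189.761.132748 yarn-94 yarn-92 0 -INFINITY 551845888 0 \x00Q\x03S 565313536
8218724 2191.465.132312 yarn-94 yarn-92 0 \x00Q\x03S 538714112 0 \x00\xAC\x89\xDE 553336832
7547911 2192.1708.134654 yarn-94 yarn-92 0 \x00\xAC\x89\xDE 486211584 0 \x01\x1E\xF2C 490905600
7628796 2193.34.131220 yarn-94 yarn-92 0 \x01\x1E\xF2C 489406464 0 \x01\x91\xC7\xCA 494075904
8536673 2194.34.131186 yarn-94 yarn-92 0 \x01\x91\xC7\xCA 547258368 0 \x02\x12-j 552493056
8650526 2195.723.132698 yarn-94 yarn-92 0 \x02\x12-j 557539328 0 \x02\x95v\xA6 562798592
8927659 2196.569.132256 yarn-94 yarn-92 0 \x02\x95v\xA6 573784064 0 \x03\x1CB\xAF 579248128
8973834 2116.322.263128 yarn-92 yarn-94 0 \x03\x1CB\xAF 578256896 0 \x03\xA4jS 583835648
8960986 2190.720.133176 yarn-94 yarn-92 0 \x03\xA4jS 576765952 0 INFINITY 582320128
EOF
# prints: regions=9 rows=76733449 max_logical_mb=551
```

The largest region is 551 MB, which is under 768 MB (150% of 512 MB), so none of the 9 regions is due for another split. The row total still matches the 76733449 rows that were loaded.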
4. Disable auto split and merge the regions back into one region.
# maprcli table edit -autosplit false -path /maprtable
# maprcli table region list -path /maprtable
numberofrows  fid               secondarynodes  primarynode  numberofrowswithdelete  startkey          logicalsize  lastheartbeat  endkey            physicalsize
76733449      2198.945.133124   yarn-92         yarn-94      0                       -INFINITY         4899782656   0              INFINITY          4964327424
5. Set regionsizemb to 8192 MB.
# maprcli table edit -regionsizemb 8192 -path /maprtable
6. Enable auto split again. Although the whole table is smaller than 8192 MB, it is split into 4 regions.
# maprcli table edit -autosplit true -path /maprtable
# maprcli table region list -path /maprtable
numberofrows  fid               secondarynodes  primarynode  numberofrowswithdelete  startkey          logicalsize  lastheartbeat  endkey            physicalsize
38305525      2190.721.133178   yarn-94         yarn-92      0                       -INFINITY         2426093568   0              \x01\xE6&R        2466988032
17322521      2192.1709.134656  yarn-94         yarn-92      0                       \x01\xE6&R        1114742784   0              \x02\xECP\xBD     1125318656
16298948      2198.945.133124   yarn-92         yarn-94      0                       \x02\xECP\xBD     1050279936   0              \x03\xE3\x9DM     1060331520
4806455       2191.466.132314   yarn-94         yarn-92      0                       \x03\xE3\x9DM     308666368    0              INFINITY          311689216
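A quick calculation shows why this second result cannot come from the regionsizemb rule. The sketch below converts the merged region's logicalsize to MB and compares it with 150% of 8192 MB. The helper bytes_to_mb is hypothetical, and it assumes 1 MB means 1048576 bytes.

```shell
#!/bin/sh
# bytes_to_mb is a hypothetical helper (integer MB, assuming 1 MB = 1048576 bytes).
bytes_to_mb() {
    echo $(( $1 / 1048576 ))
}

table_mb=$(bytes_to_mb 4899782656)   # logicalsize of the single merged region
threshold_mb=$(( 8192 * 3 / 2 ))     # 150% of regionsizemb

echo "table=${table_mb}MB threshold=${threshold_mb}MB"
if [ "$table_mb" -lt "$threshold_mb" ]; then
    echo "below threshold: the split is not explained by regionsizemb"
fi
```

The whole table is 4672 MB against a 12288 MB threshold, so the split into 4 regions is consistent with the fewer-than-4-regions behavior described in the theory section.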
Conclusion
1. MapR-DB splits a region once its size reaches 150% of the "regionsizemb" table attribute.
2. MapR-DB aggressively splits a table until it has at least 4 regions.