Goal:
When running count(*) queries on parquet files, the physical plan may show below:Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@331f5e71...This article is to explain what does "org.apache.drill.exec.store.pojo.PojoRecordReader" mean.
Root Cause:
For select count(*) or count( not-nullable-expr) queries on parquet files, Drill may do an optimization to read from Parquet metadata instead of reading the whole parquet file.This logic is in code exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ConvertCountToDirectScan.java:
/** * This rule will convert * " select count(*) as mycount from table " * or " select count( not-nullable-expr) as mycount from table " * into * * Project(mycount) * \ * DirectGroupScan ( PojoRecordReader ( rowCount )) * * or * " select count(column) as mycount from table " * into * Project(mycount) * \ * DirectGroupScan (PojoRecordReader (columnValueCount)) * * Currently, only parquet group scan has the exact row count and column value count, * obtained from parquet row group info. This will save the cost to * scan the whole parquet files. */For example:
create table dfs.tmp.`prune2` as select 1 as id from hive.bigtable; refresh table metadata dfs.tmp.`prune2`; select count(*) from dfs.tmp.`prune2` ;The physical plan contains below line:
Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@4b7000eb[columns = null, isStarQuery = false, isSkipQuery = false]])
49E7A33D35
ReplyDeletekiralık hacker
hacker arıyorum
kiralık hacker
hacker arıyorum
belek
766BD759
ReplyDeletekartal rus esçort
ereğli esçort
aydın esçort
pendik olgun esçort
altındağ esçort
adıyaman esçort
mudanya esçort
anadolu yakası yabancı esçort
kartal yabancı esçort
F027482A
ReplyDeleteesçort bayan kırşehir
esçort kilis
hopa esçort
eşme esçort
alpu esçort
ürgüp esçort
pursaklar esçort
datça esçort
kapaklı esçort