A small number of blocks are large. Is it OK?
Posted: Thu Jan 28, 2010 12:40 am
Hi Experts,
I have a very large file A (>20 MG rows) that I need to match against another large file B (about 10 GB rows). I am using QualityStage 7.0 Designer.
In one of the passes, I had better block on last name, first name, and birth year. In file B, every block has a reasonable number of records (<20). But in file A, about 10 blocks have many records (>20), and the largest block has 1000 records.
I read the user guide: the block size should not be large (around 20). If I have to use this blocking, how bad is it if a small number of blocks are large?
Thanks in advance.
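To put numbers on the question, here is a rough sketch in Python (not QualityStage) of how the block sizes and the comparison work per block could be estimated. The field names, the sample records, and the assumption that every record of a block in file A is compared with every record of the same block in file B are all hypothetical; the actual matcher may behave differently.

```python
from collections import Counter

def block_key(record):
    # Hypothetical blocking key: last name, first name, birth year.
    return (record["last_name"].upper(),
            record["first_name"].upper(),
            record["birth_year"])

def block_sizes(records):
    # Number of records that fall into each block.
    return Counter(block_key(r) for r in records)

def oversized_blocks(sizes, limit=20):
    # Blocks holding more records than the suggested limit.
    return {key: n for key, n in sizes.items() if n > limit}

def comparison_counts(sizes_a, sizes_b):
    # Assumed cost model: |block in A| * |block in B| pair comparisons.
    return {key: n * sizes_b[key]
            for key, n in sizes_a.items() if key in sizes_b}

# Made-up sample data: one 1000-record block in A, like the worst case described.
file_a = ([{"last_name": "Smith", "first_name": "John", "birth_year": 1970}] * 1000
          + [{"last_name": "Doe", "first_name": "Jane", "birth_year": 1980}] * 2)
file_b = ([{"last_name": "Smith", "first_name": "John", "birth_year": 1970}] * 15
          + [{"last_name": "Doe", "first_name": "Jane", "birth_year": 1980}] * 3)

sizes_a = block_sizes(file_a)
sizes_b = block_sizes(file_b)
large_a = oversized_blocks(sizes_a)
costs = comparison_counts(sizes_a, sizes_b)

for key, cost in sorted(costs.items(), key=lambda item: -item[1]):
    print(key, "A:", sizes_a[key], "B:", sizes_b[key], "pairs:", cost)
```

Under that cost model, a 1000-record block in A against fewer than 20 records in B means at most about 20,000 pair comparisons for that block, so ten such blocks would add a bounded amount of work rather than a quadratic blow-up.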