Performance in job containing Column Import stage
Posted: Thu Sep 01, 2011 10:51 pm
I'm testing the performance of our system processing a very large file (400 million records) using some very simple jobs.
Here are two examples of my test jobs:
Job 1
Fileset ---> Copy
Job 2
Fileset ---> Column Import ---> Copy
The processing time increased from 10 minutes for Job 1 to 25 minutes for Job 2.
The Column Import stage splits one field, DATA (varchar 2000), into 7 fields according to the given schema file. I also tried an explicit column definition instead of the schema file, and it takes nearly the same amount of time.
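For reference, the schema file I'm using follows the usual Column Import record-schema shape; the sketch below is illustrative only, since the real field names, types, and delimiter differ:

```
// Hypothetical schema: splits the varchar(2000) DATA field into 7
// delimited columns. Field names, lengths, and the '|' delimiter
// are placeholders, not the actual definitions.
record
  { final_delim=end, delim='|' }
(
    FIELD1: string[max=200];
    FIELD2: string[max=200];
    FIELD3: string[max=200];
    FIELD4: string[max=200];
    FIELD5: string[max=200];
    FIELD6: string[max=200];
    FIELD7: string[max=200];
)
```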
I have turned on partitioning on the input link of the Column Import stage (Hash on the DATA column), and it helped reduce the processing time to 15 minutes.
I wonder what I can do to reduce this time even further.
I am using a 16-node configuration file.
Am I correct in assuming the Column Import stage works in parallel? If not, how can I make it so?
Here is the job score:
Code:
main_program: This step has 1 dataset:
ds0: {op0[16p] (parallel fs_DataIn)
eAny=>eCollectAny
op1[16p] (parallel APT_CombinedOperatorController:ci_Data)}
It has 2 operators:
op0[16p] {(parallel fs_DataIn)
on nodes (
node1[op0,p0] node2[op0,p1] node3[op0,p2] node4[op0,p3] node5[op0,p4] node6[op0,p5]
node7[op0,p6] node8[op0,p7] node9[op0,p8] node10[op0,p9] node11[op0,p10] node12[op0,p11]
node13[op0,p12] node14[op0,p13] node15[op0,p14] node16[op0,p15]
)}
op1[16p] {(parallel APT_CombinedOperatorController:
(ci_Data)
(Copy_188)
) on nodes (
node1[op1,p0] node2[op1,p1] node3[op1,p2] node4[op1,p3] node5[op1,p4] node6[op1,p5]
node7[op1,p6] node8[op1,p7] node9[op1,p8] node10[op1,p9] node11[op1,p10] node12[op1,p11]
node13[op1,p12] node14[op1,p13] node15[op1,p14] node16[op1,p15]
)}
It runs 32 processes on 16 nodes.
Thanks.