slow read from dataset - combinability mode issue?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

miwinter
Participant
Posts: 396
Joined: Thu Jun 22, 2006 7:00 am
Location: England, UK

slow read from dataset - combinability mode issue?

Post by miwinter »

Hi,

I have a job with a Join stage where one input is a sequential file and the other is a dataset (don't ask why; I wonder myself why the sequential input isn't also a dataset, as surely that would be more efficient?).

The sequential side reads 500k rows in seconds, but the dataset side of the join is reading at just 66 rows/sec. Surely that is very low? It seems to be a real bottleneck in the job. I've also noticed that the combinability mode on the Join stage is set to "Don't Combine", but I can't see any apparent reason why.

I've already searched on here and read about the Combinability Mode, but it didn't reveal much more than the Advanced PX Developer's Guide did.

I hope that made sense... any input appreciated!

Cheers,

M
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Are you using the same configuration file and the same partitioning when reading the Data Set that were used when it was written? If not you are incurring the cost of repartitioning these data, as well as of partitioning the sequential file data (which is unavoidable for parallel execution).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
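To illustrate Ray's point about configuration files, here is a minimal sketch (plain Python, not DataStage internals; the function and key names are made up for illustration) of why reading a Data Set with a different node count than it was written with forces a repartition:

```python
# Sketch: rows are assigned to partitions by hashing the key. If the
# reader runs with a different number of nodes than the writer, most
# rows land in a different partition and must move between nodes.
def hash_partition(rows, key, n_parts):
    """Assign each row to a partition by hashing its key."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[hash(row[key]) % n_parts].append(row)
    return parts

rows = [{"AGG_ID": i, "val": i * 10} for i in range(1000)]

written = hash_partition(rows, "AGG_ID", 4)  # writer ran on 4 nodes
reread = hash_partition(rows, "AGG_ID", 3)   # reader runs on 3 nodes

# Count rows whose partition assignment changes between 4 and 3 nodes:
# that movement between nodes is the repartitioning cost.
moved = sum(
    1 for row in rows
    if hash(row["AGG_ID"]) % 4 != hash(row["AGG_ID"]) % 3
)
print(moved, "of", len(rows), "rows must move between nodes")
```

Reading with the same configuration file (and Same partitioning) avoids that movement entirely.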
Andet
Charter Member
Charter Member
Posts: 63
Joined: Mon Nov 01, 2004 9:40 am
Location: Clayton, MO

Post by Andet »

When reading datasets it's important to keep in mind what is actually happening. Datasets are not ordinary sequential files: much like a partitioned database, you have separate partitions that should be on separate devices, or at least separate file systems. If you monitor the job you should be able to see throughput per partition/stream, which might give you a clue. Check your configuration file. Are you using the configuration file the dataset was saved with, or another? If the configuration file you're using doesn't match the dataset, you can get poor performance or even a crash.
You're reading this into a join: is the dataset sorted on the join key? Is the sort key also the partition key of the dataset?
Do you have more partitions than CPUs? What's your page/swap rate? Are you buffering? We had a job that ran for 15 minutes until we specified buffering, and now it runs in 4 minutes. Etc. etc.

Ande
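Andet's questions about the sort and partition keys come down to this: a parallel join can run independently per partition only when both inputs are hash-partitioned on the join key and sorted within each partition, after which each partition is a cheap sorted-merge. A minimal sketch in plain Python (not DataStage code; simplified to unique join keys):

```python
# Sorted-merge inner join: both inputs must already be sorted on `key`.
# With matching hash partitioning, DataStage can run this per partition
# with no cross-node traffic; this sketch shows one partition's work.
def merge_join(left, right, key):
    """Inner join of two lists of dicts, both sorted on `key` (unique keys)."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1          # left key too small: advance left
        elif lk > rk:
            j += 1          # right key too small: advance right
        else:
            out.append({**left[i], **right[j]})  # match: emit joined row
            i += 1
            j += 1
    return out

left = sorted([{"k": n, "a": n} for n in (3, 1, 2)], key=lambda r: r["k"])
right = sorted([{"k": n, "b": n * 2} for n in (2, 3)], key=lambda r: r["k"])
print(merge_join(left, right, "k"))
```

If the two inputs are partitioned on different keys, matching rows can sit on different nodes and no per-partition merge is possible, which is why the sort key and partition key need to agree.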
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

In addition, it also depends on the number of fields: hundreds of fields in a dataset will extract more slowly than a few fields in a sequential file. Is it an outer join? If so, on which stream?
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
miwinter
Participant
Posts: 396
Joined: Thu Jun 22, 2006 7:00 am
Location: England, UK

Post by miwinter »

Thanks for the replies so far. In response:

- It's an inner join and both inputs are partitioned/sorted on the join key already

- We run on 4 nodes, that is, the job is 4-way partitioned, and we operate across 10 CPUs

- The buffering on the join stage is set to 'Default'

- The dataset metadata is just 3 columns... approx 70 bytes per record across 2.1 million records in all
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Look at the score (set APT_DUMP_SCORE to True). Is DataStage inserting tsort operators or buffer operators?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
miwinter
Participant
Posts: 396
Joined: Thu Jun 22, 2006 7:00 am
Location: England, UK

Post by miwinter »

Cheers Ray - I've checked and the score shows none of these being added:

main_program: This step has 8 datasets:
ds0: {/gcdm/prd/workingarea/MI/Post-results/LCDM_2875_MI_POSTRESULTS/ALGO_SA_CONTR_2875_DS
[pp] eSame=>eCollectAny
op1[4p] (parallel SAContr_Read_DS)}
ds1: {op0[1p] (sequential TrDep_Read_FF)
eAny<>eCollectAny
op2[4p] (parallel APT_TransformOperatorImplV0S3_GleamMIPostAlgoTrDepDervJob_TrDepDerv_Tfp in TrDepDerv_Tfp)}
ds2: {op1[4p] (parallel SAContr_Read_DS)
[pp] eSame=>eCollectAny
op4[4p] (parallel buffer(0))}
ds3: {op2[4p] (parallel APT_TransformOperatorImplV0S3_GleamMIPostAlgoTrDepDervJob_TrDepDerv_Tfp in TrDepDerv_Tfp)
eOther(APT_HashPartitioner { key={ value=AGG_ID,
subArgs={ cs }
}
})#>eCollectAny
op3[4p] (parallel TrDep_Srt)}
ds4: {op3[4p] (parallel TrDep_Srt)
[pp] eSame=>eCollectAny
op5[4p] (parallel buffer(1))}
ds5: {op4[4p] (parallel buffer(0))
[pp] eSame=>eCollectAny
op6[4p] (parallel APT_JoinSubOperator in TrDep_Jon)}
ds6: {op5[4p] (parallel buffer(1))
[pp] eSame=>eCollectAny
op6[4p] (parallel APT_JoinSubOperator in TrDep_Jon)}
ds7: {op6[4p] (parallel APT_JoinSubOperator in TrDep_Jon)
eOther(APT_DB2Partitioner {})#>eCollectAny
op7[10p] (parallel MIResults_Update_EETab)}
It has 8 operators:
op0[1p] {(sequential TrDep_Read_FF)
on nodes (
node1[op0,p0]
)}
op1[4p] {(parallel SAContr_Read_DS)
on nodes (
node1[op1,p0]
node2[op1,p1]
node3[op1,p2]
node4[op1,p3]
)}
op2[4p] {(parallel APT_TransformOperatorImplV0S3_GleamMIPostAlgoTrDepDervJob_TrDepDerv_Tfp in TrDepDerv_Tfp)
on nodes (
node1[op2,p0]
node2[op2,p1]
node3[op2,p2]
node4[op2,p3]
)}
op3[4p] {(parallel TrDep_Srt)
on nodes (
node1[op3,p0]
node2[op3,p1]
node3[op3,p2]
node4[op3,p3]
)}
op4[4p] {(parallel buffer(0))
on nodes (
node1[op4,p0]
node2[op4,p1]
node3[op4,p2]
node4[op4,p3]
)}
op5[4p] {(parallel buffer(1))
on nodes (
node1[op5,p0]
node2[op5,p1]
node3[op5,p2]
node4[op5,p3]
)}
op6[4p] {(parallel APT_JoinSubOperator in TrDep_Jon)
on nodes (
node1[op6,p0]
node2[op6,p1]
node3[op6,p2]
node4[op6,p3]
)}
op7[10p] {(parallel MIResults_Update_EETab)
on nodes (
db2node[op7,p0]
db2node[op7,p1]
db2node[op7,p2]
db2node[op7,p3]
db2node[op7,p4]
db2node[op7,p5]
db2node[op7,p6]
db2node[op7,p7]
db2node[op7,p8]
db2node[op7,p9]
)}
ajit
Participant
Posts: 16
Joined: Wed Oct 05, 2005 7:43 am

Check partitioning when creating the dataset

Post by ajit »

Hi
Have you also checked the partitioning type used when the dataset was created? This is just to ensure there is no repartitioning when the data is read back. Just a suggestion... :)
miwinter
Participant
Posts: 396
Joined: Thu Jun 22, 2006 7:00 am
Location: England, UK

Post by miwinter »

miwinter wrote:...both inputs are partitioned/sorted on the join key already
I meant this in reference to the job which produces the dataset; that dataset is then the input which the 'problem job' is slow to read.

The sequential file input to the 'problem job' is partitioned/sorted within that job.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

op4 and op5 look like buffers to me. These have probably been inserted to avoid deadlocks due to different throughput rates. The Sequential File stage (import operator) uses the C I/O STREAMS module, and is very fast compared to all other read mechanisms.

Also, be very wary of rows/sec as a metric; there are lots of reasons it can be misleading.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
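The deadlock-avoidance role Ray describes can be pictured as a bounded queue between a fast producer and a slow consumer: the buffer lets each side run at its own pace without one stalling the whole pipeline. A rough sketch in plain Python threading (not how DataStage implements its buffer operator, just the same idea):

```python
# Bounded buffer between pipeline stages: the producer blocks only when
# the buffer is full, the consumer drains at its own pace, and order is
# preserved (FIFO), like a buffer operator inserted between two operators.
import queue
import threading

buf = queue.Queue(maxsize=100)   # bounded, like the fixed-size buffer operator
results = []

def producer():
    for n in range(1000):        # fast side (e.g. the sequential file read)
        buf.put(n)               # blocks only when the buffer is full
    buf.put(None)                # end-of-stream marker

def consumer():
    while True:
        n = buf.get()            # slow side (e.g. the dataset read / join)
        if n is None:
            break
        results.append(n)

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
print(len(results), "rows passed through the buffer")
```

With two such buffers feeding the Join (op4 and op5 in the score above), neither input link can deadlock the other while their throughput rates differ.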
miwinter
Participant
Posts: 396
Joined: Thu Jun 22, 2006 7:00 am
Location: England, UK

Post by miwinter »

These operators seem to be linked to the join, so that the two streams being joined are managed; I assume they need to remain for that. Is there any tuning that can be done on these links?
Mark Winter
<i>Nothing appeases a troubled mind more than <b>good</b> music</i>
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

What happens if you allow the partitioning from the Data Set to be (Auto) rather than Same? Do you get a repartitioning icon on the link? (You don't need to re-run the job - just change the job, then exit without saving, to answer this question.)

The inserted buffer operators are there to attempt to keep pipeline parallelism happening. Each is 3MB by default, which should be adequate unless you've got really wide rows. They are tunable, but this should be the very last thing tuned.

Do you have explicit sorts on the input links to the Join stage? If not, one thing that might help is explicit Sort stages, with the Sort Mode set to "Don't sort (previously sorted)".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
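A back-of-envelope check on Ray's "3MB should be adequate" point, using the approximate 70-byte record size quoted earlier in the thread (the 3MB figure is the default from the post above; the exact byte value is an assumption):

```python
# Rough capacity of one default buffer operator for this job's records.
buffer_bytes = 3 * 1024 * 1024   # assumed 3 MB default per buffer
record_bytes = 70                # ~70 bytes/record, from the earlier post
rows_per_buffer = buffer_bytes // record_bytes
print(rows_per_buffer, "rows fit in one buffer per partition")
```

Tens of thousands of rows per partition fit in a single buffer, so with records this narrow the default buffer size is unlikely to be the bottleneck, consistent with Ray's advice to tune it last.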