Hi,
I am trying to understand the score dump of my PX job, and my question is about buffering. I read in the Advanced guide that the score dump shows where data is buffered.
But I could not understand what the data sets and operators for buffering mean in the score dump, for example:
Data Set example:
ds18: {op18[4p] (parallel APT_TransformOperatorImplV0S3_PatientAccountStdFileStgPX_File_Tra_DischargeDate in Tra_DischargeDate)
eAny=>eCollectAny
op20[4p] (parallel buffer(0))}
Another example,
ds29: {op20[4p] (parallel buffer(0))
eSame=>eCollectAny
op21[4p] (parallel APT_LUTProcessOp in Lookup_EffDate)}
Operator Example:
op19[1p] {(parallel APT_LUTCreateOp in Lookup_EffDate)
on nodes (
node1[op19,p0]
)}
op20[4p] {(parallel buffer(0))
on nodes (
node1[op20,p0]
node2[op20,p1]
node3[op20,p2]
node4[op20,p3]
)}
op23[4p] {(parallel APT_TransformOperatorImplV0S22_PatientAccountStdFileStgPX_File_Tra_GetEffDate in Tra_GetEffDate)
on nodes (
node1[op23,p0]
node2[op23,p1]
node3[op23,p2]
node4[op23,p3]
)}
op24[4p] {(parallel buffer(1))
on nodes (
node1[op24,p0]
node2[op24,p1]
node3[op24,p2]
node4[op24,p3]
)}
Queries:
1) What do these data sets and operators for buffering mean?
2) What impact do they have on performance?
3) Do these operators execute according to the serial number assigned to them?
Thanks,
Shashank
Query on score dump
Data Sets, if the descriptor file name ends in ".v", are virtual Data Sets, corresponding with links in the job. Buffer operators are inserted by the Orchestrate framework to handle conditions in which flows from multiple inputs are likely to be arriving at different rates and which, without the buffer operators, would likely cause a deadlock situation.
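To see which links the framework has placed a buffer on, the Data Sets section can be scanned for entries whose producer or consumer is a buffer operator. The following is only a rough sketch: it assumes the three-line entry layout quoted in the question (producer, partitioner=>collector, consumer), and the names DATASET_RE and buffered_links are made up for illustration, not part of any DataStage tool. Real score dumps can contain other entry forms that this pattern will not match.

```python
import re

# Data set entries copied from the score dump quoted in the question.
SCORE = """\
ds18: {op18[4p] (parallel APT_TransformOperatorImplV0S3_PatientAccountStdFileStgPX_File_Tra_DischargeDate in Tra_DischargeDate)
eAny=>eCollectAny
op20[4p] (parallel buffer(0))}
ds29: {op20[4p] (parallel buffer(0))
eSame=>eCollectAny
op21[4p] (parallel APT_LUTProcessOp in Lookup_EffDate)}
"""

# Hypothetical pattern: producer line, "partitioner=>collector" line, consumer line.
DATASET_RE = re.compile(
    r"(?P<ds>ds\d+): \{(?P<producer>op\d+)\[\d+p\] \((?P<pdesc>.*?)\)\s*\n"
    r"\s*(?P<part>\w+)=>(?P<coll>\w+)\s*\n"
    r"\s*(?P<consumer>op\d+)\[\d+p\] \((?P<cdesc>.*?)\)\}"
)


def buffered_links(score_text):
    """Return (data set, producer, partitioner, collector, consumer) tuples
    for every data set that has a buffer operator at either end."""
    links = []
    for m in DATASET_RE.finditer(score_text):
        if "buffer(" in m["pdesc"] or "buffer(" in m["cdesc"]:
            links.append(
                (m["ds"], m["producer"], m["part"], m["coll"], m["consumer"])
            )
    return links


for link in buffered_links(SCORE):
    print(link)
```

Read this way, ds18 is the link feeding buffer op20 from the Transformer, and ds29 is the link carrying the buffered data on to the Lookup.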
The "serial numbers" as you call them exist purely to provide for unique generic names. They do not have any effect on the order of execution. Execution is parallel on as many nodes as the operator executes on, and data are spread over those nodes in accordance with the partitioning information given in the Data Sets section of the score.
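The operator section can be read the same way: the number in "[4p]" is the partition count, and each "nodeN[opX,pY]" line says which node runs which partition. A minimal sketch, assuming the layout shown in the question; operator_placement and the two patterns are illustrative names only, and the operator number is used purely as a dictionary key, not as an execution order.

```python
import re

# Operator entries copied from the score dump quoted in the question.
OPERATORS = """\
op19[1p] {(parallel APT_LUTCreateOp in Lookup_EffDate)
on nodes (
node1[op19,p0]
)}
op20[4p] {(parallel buffer(0))
on nodes (
node1[op20,p0]
node2[op20,p1]
node3[op20,p2]
node4[op20,p3]
)}
"""

# Hypothetical patterns for the operator header line and the node lines.
OP_HEADER_RE = re.compile(r"^(op\d+)\[(\d+)p\] \{\((.*)\)\s*$")
NODE_RE = re.compile(r"^(\w+)\[(op\d+),p(\d+)\]\s*$")


def operator_placement(score_text):
    """Map each operator name to its partition count, description,
    and a {partition number: node name} dictionary."""
    placement = {}
    for raw_line in score_text.splitlines():
        line = raw_line.strip()
        header = OP_HEADER_RE.match(line)
        if header:
            op, parts, desc = header.groups()
            placement[op] = {
                "partitions": int(parts),
                "description": desc,
                "nodes": {},
            }
            continue
        node = NODE_RE.match(line)
        if node and node.group(2) in placement:
            node_name, op, part = node.groups()
            placement[op]["nodes"][int(part)] = node_name
    return placement


for op, info in operator_placement(OPERATORS).items():
    print(op, info["partitions"], info["description"], info["nodes"])
```

For the sample above, op19 runs as a single partition on node1, while buffer op20 runs as four partitions, one per node.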
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Hi Ray,
Thanks a lot. After carefully examining the job score, I am able to relate what you said to the actual score.
By sequence I meant that any particular record in a given partition is passed from operator to operator in the given sequence only. That is obvious, though, because we place the stages in that order.
Thanks,
Shashank