Query on score dump

LD · Post by LD » Fri Dec 24, 2010 12:08 am

Hi,

I am trying to understand the score dump of my PX job. My question is about buffering. I read in the Advanced guide that the score dump shows where data is buffered.
But I could not understand what the buffer-related data sets and operators in the score dump mean, e.g.:

Data Set example:
ds18: {op18[4p] (parallel APT_TransformOperatorImplV0S3_PatientAccountStdFileStgPX_File_Tra_DischargeDate in Tra_DischargeDate)
eAny=>eCollectAny
op20[4p] (parallel buffer(0))}

Another example,

ds29: {op20[4p] (parallel buffer(0))
eSame=>eCollectAny
op21[4p] (parallel APT_LUTProcessOp in Lookup_EffDate)}

Operator Example:
op19[1p] {(parallel APT_LUTCreateOp in Lookup_EffDate)
on nodes (
node1[op19,p0]
)}
op20[4p] {(parallel buffer(0))
on nodes (
node1[op20,p0]
node2[op20,p1]
node3[op20,p2]
node4[op20,p3]
)}

op23[4p] {(parallel APT_TransformOperatorImplV0S22_PatientAccountStdFileStgPX_File_Tra_GetEffDate in Tra_GetEffDate)
on nodes (
node1[op23,p0]
node2[op23,p1]
node3[op23,p2]
node4[op23,p3]
)}
op24[4p] {(parallel buffer(1))
on nodes (
node1[op24,p0]
node2[op24,p1]
node3[op24,p2]
node4[op24,p3]
)}

Queries:

1) What do these data sets and operators for buffering mean?
2) What impact do they have on performance?
3) Do these operators execute in the order of the serial numbers assigned to them?

Thanks,

Shashank

ray.wurlod · Post by ray.wurlod » Fri Dec 24, 2010 4:02 pm

Data Sets, if the descriptor file name ends in ".v", are virtual Data Sets, corresponding with links in the job. Buffer operators are inserted by the Orchestrate framework to handle conditions in which flows from multiple inputs are likely to be arriving at different rates and which, without the buffer operators, would likely cause a deadlock situation.
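To see which links the framework has buffered, the Data Sets section of the score can be scanned for entries whose producer or consumer is a buffer operator. The following is only a rough Python sketch, not an IBM tool: the function name `parse_data_sets` is made up for illustration, the sample text is copied from the two data set entries quoted in the question, and the pattern assumes each entry spans exactly three lines with `=>` between partitioner and collector, as in those samples. Other score layouts would need a different pattern.

```python
import re

# Sample copied from the two data set entries quoted earlier in this thread.
SCORE = """\
ds18: {op18[4p] (parallel APT_TransformOperatorImplV0S3_PatientAccountStdFileStgPX_File_Tra_DischargeDate in Tra_DischargeDate)
eAny=>eCollectAny
op20[4p] (parallel buffer(0))}
ds29: {op20[4p] (parallel buffer(0))
eSame=>eCollectAny
op21[4p] (parallel APT_LUTProcessOp in Lookup_EffDate)}
"""

# Assumed layout: producer line, "partitioner=>collector" line, consumer line.
DS_RE = re.compile(
    r"^(ds\d+): \{(op\d+)\[(\d+)p\] \((.*)\)\n"
    r"(\w+)=>(\w+)\n"
    r"(op\d+)\[(\d+)p\] \((.*)\)\}$",
    re.MULTILINE,
)


def parse_data_sets(score):
    """Return one dict per data set entry found in the score text."""
    links = []
    for m in DS_RE.finditer(score):
        ds, prod, _, prod_desc, part, coll, cons, _, cons_desc = m.groups()
        links.append({
            "ds": ds,
            "producer": prod,
            "consumer": cons,
            "partitioner": part,
            "collector": coll,
            # True when either end of the virtual data set is a buffer operator.
            "buffered": "buffer(" in prod_desc or "buffer(" in cons_desc,
        })
    return links


for link in parse_data_sets(SCORE):
    flag = " [buffer]" if link["buffered"] else ""
    print(f"{link['ds']}: {link['producer']} -> {link['consumer']} "
          f"({link['partitioner']}=>{link['collector']}){flag}")
```

Read this way, ds18 and ds29 are the two halves of one job link: the Transformer (op18) writes into buffer op20, and op20 feeds the Lookup (op21).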

The "serial numbers" as you call them exist purely to provide for unique generic names. They do not have any effect on the order of execution. Execution is parallel on as many nodes as the operator executes on, and data are spread over those nodes in accordance with the partitioning information given in the Data Sets section of the score.
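The operator section can be read the same way: the number after "op" is only a label, while the `[Np]` count and the "on nodes" list show how many partitions the operator runs in and where. Below is a small Python sketch under the assumption that the operator entries look exactly like the ones quoted in the question; the function name `operators_by_node` is hypothetical and the sample text is copied from that post.

```python
import re
from collections import defaultdict

# Sample copied from the operator entries quoted earlier in this thread.
SCORE = """\
op19[1p] {(parallel APT_LUTCreateOp in Lookup_EffDate)
on nodes (
node1[op19,p0]
)}
op20[4p] {(parallel buffer(0))
on nodes (
node1[op20,p0]
node2[op20,p1]
node3[op20,p2]
node4[op20,p3]
)}
"""

# Assumed layout: "opN[Mp] {(description)" header, then "node[opN,pK]" lines.
OP_RE = re.compile(r"^(op\d+)\[(\d+)p\] \{\((.*)\)$")
NODE_RE = re.compile(r"^(\w+)\[(op\d+),p(\d+)\]$")


def operators_by_node(score):
    """Return (operators, placement) parsed from the operator section."""
    ops = {}
    placement = defaultdict(list)
    for raw in score.splitlines():
        line = raw.strip()
        op_match = OP_RE.match(line)
        node_match = NODE_RE.match(line)
        if op_match:
            name, count, desc = op_match.groups()
            ops[name] = {"partitions": int(count), "desc": desc}
        elif node_match:
            node, name, part = node_match.groups()
            # Record which node runs which partition of this operator.
            placement[name].append((node, int(part)))
    return ops, dict(placement)


ops, placement = operators_by_node(SCORE)
for name, info in ops.items():
    print(name, info["partitions"], info["desc"], placement.get(name, []))
```

For the sample above, op19 runs as a single partition on node1, while the buffer op20 runs one partition on each of the four nodes.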

LD · Post by LD » Wed Dec 29, 2010 12:20 pm

Hi Ray,

Thanks a lot. After carefully examining the job score, I am able to relate what you said to the actual score.

By sequence I meant that any particular record in a given partition is passed from operator to operator in the given sequence only. But that is obvious, because we place the stages in that order.

Thanks,

Shashank