What is a Combine Records Stage?

Adam_Clone · Post by **Adam_Clone** » Mon Apr 18, 2005 5:09 am

Hi,
I have tried hard to get a Combine Records stage working, but I just can't figure out the idea given in the documentation.
Well, I did get the idea that it combines records from the input link based on the key fields specified, but I cannot digest the idea of subrecords and all that. I did try it out, and even though my job compiles, it gives this error:
"
Sequential_File_2: Error when checking operator: Could not find input field ID
Error when checking operator: Could not find input field Name
Error when checking operator: Input "Addr" is not a value field
"

The table definition I've used contains three columns: ID (the key) and two other varchar fields. I've used the same definition for the output Sequential File stage as well.
Should I be using different metadata on the output link?

SOS !!!

ray.wurlod · Post by **ray.wurlod** » Mon Apr 18, 2005 5:23 am

Strange as this suggestion may seem, go back to Chapter 3 of the Parallel Job Developer's Guide, and find the table where all the stage types are listed. (Don't have access at the moment, or I'd be more precise.) I've found that seeing the groups of stages together makes it easier to understand what each particular stage in that group does. For example (and from memory), the Combine Records stage is one of the "Restructure" group of stages, and does a vertical pivot of sets of rows into vectors of subrecords. Even just looking at the icons gives some idea how the restructure stages are treating the data.

roy · Post by **roy** » Mon Apr 18, 2005 6:17 am

Hi,
In the EE manual, starting at page 895 (section 45-1),
you can see the documented info plus two examples.
(It also explains the metadata you'll need.)

From what I understood, it forms a single row in the target data set containing a vector of all the rows with the same key column(s) (that you specify) from the source data set.
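To make that description concrete, here is a rough sketch in plain Python. It is not DataStage code and not taken from the manual; it only mimics the grouping idea as described above. The function name `combine_records`, the vector name `subrec`, the sample values, and the choice to keep the key at the top level of each output row are all assumptions made for illustration. The column names ID, Name and Addr come from the error message earlier in the thread.

```python
from itertools import groupby


def combine_records(rows, key_fields, subrec_name="subrec"):
    """Collapse rows sharing the same key into one row holding a vector of subrecords.

    Hypothetical helper for illustration only; it is not the stage's real API.
    """
    def key_of(row):
        # Build a tuple so single and multi-column keys are handled the same way.
        return tuple(row[k] for k in key_fields)

    combined = []
    # groupby only merges adjacent rows, so sort by key first (the sort is stable).
    for key, group in groupby(sorted(rows, key=key_of), key=key_of):
        out = dict(zip(key_fields, key))  # assumption: key kept at the top level
        # Every non-key column of each input row becomes one subrecord in the vector.
        out[subrec_name] = [
            {name: value for name, value in row.items() if name not in key_fields}
            for row in group
        ]
        combined.append(out)
    return combined


# Made-up sample data using the column names from the error message.
rows = [
    {"ID": 1, "Name": "Ann", "Addr": "12 High St"},
    {"ID": 2, "Name": "Bob", "Addr": "7 Low Rd"},
    {"ID": 1, "Name": "Ann", "Addr": "3 Mill Lane"},
]

result = combine_records(rows, ["ID"])
for row in result:
    print(row)
```

Three flat input rows become two output rows, and each output row carries a list of subrecords instead of plain Name and Addr columns. If the stage works along these lines, the output link would presumably need a column definition that describes the subrecord vector rather than the same flat three-column layout used on the input.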

(can't test it myself right now)

IHTH,