Page 1 of 1

Flat File with Parent Child relation

Posted: Wed Oct 16, 2013 6:54 am
by gssr
I am having a fixed width file with different record layout.
The file is having a hierarchical structure with a parent Child Relation.

Example:
000HEADER
001PARENT1_COL1_COL2
010CHILD1_COL4_COL5_COL6
020CHILD2_COL7
030CHILD3_COL_8_COL9
001PARENT2_COL1_COL2
010CHILD1_COL4_COL5_COL6
020CHILD2_COL7
030CHILD3_COL_8_COL9
..........
999TRAILER


i need to extract the file as
001PARENT1_COL1_COL2_COL4_COL5_COL6_COL7_COL_8_COL9
001PARENT2_COL1_COL2_COL4_COL5_COL6_COL7_COL_8_COL9
...........
I tried it with CFF stage, but i cannot able to preserver the parent child relation.
Any suggestion is mostly appriciated
Thanks in advance

Posted: Wed Oct 16, 2013 8:39 am
by shank
Do you have any record field identifier which says it is a parent/Child record ?
I guess there should be some. (Lets say .. First three bytes in your example. If it is '001' it is parent record else its a child record)
In that case, use Sequential file to read the file and a Filter on that Identifier field to write Parent and Child files seperately.

Hope this helps..

Posted: Thu Oct 17, 2013 3:51 am
by gssr
Shank,
I dont want to read the parent and Child records seperatly.

the first three characters represents the record type.
Parent record - 001
Child record - 010,020,030...

Here , i am getting set of records with thhis combination.
and i donk have any key to differentiate each group

In the ablove example,
Below is the first set of record,
001PARENT1_COL1_COL2
010CHILD1_COL4_COL5_COL6
020CHILD2_COL7
030CHILD3_COL_8_COL9

And i have to preserver the relation and write this as below
001PARENT1_COL1_COL2_COL4_COL5_COL6_COL7_COL_8_COL9

Similarly , i have to extract each set of records.

Posted: Thu Oct 17, 2013 3:05 pm
by ray.wurlod
Read the entire record as a single VarChar and parse it within the Transformer stage.

Use stage variables to detect the arrival of a new parent and to accumulate the output record.

Use a downstream RemoveDuplicates stage to preserve the last of each group (with the same parent record).

You didn't mention needing to do anything with the header and trailer records, so I haven't discussed these.

Posted: Fri Oct 18, 2013 9:31 am
by FranklinE
Some assumptions:

That the number of bytes between the first byte of the 001 "parent" and the last byte prior to the next 001 is always the same.

That there is always a fixed-width (padded if necessary) set of 010, 020 and 030 children.

Then, a variation of Ray's suggestion will work for you out of a CFF stage:

Record keys are 000 for header, 999 for trailer (they can be dropped or have their own output links) and 001 for details, and the record is parsed into a table definition for the "parent" and "child" columns.

I put quotes around the terms because what you've described is standard for a Cobol-generated source.

The most important implication is that every record -- 000, 001 and 999 -- is fixed on the same width. If the header and trailer are different lengths, you have a different problem.

Posted: Mon Oct 21, 2013 6:54 am
by gssr
its working good now.
I missed to add and information that, there could be multiple child records with same identifier (Say there could be more than 010 record.)

The job design is,
1. CFF Stage to extract all type of records(Header, Parent, Child and Trailer)
2. Transformer Stage (run in sequence) to generate a column to identify each new parent record.
The stage variable derivation is,

Code: Select all

vChk ==>  if DSLink118.Field001 = "001" Then vChk+1 else vChk
3.Used different links for each type of record
4.Join all the Links with key column as vChk

Thanks for all your responces