Append Header and Trailer Record for each group

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
rao_2004
Charter Member
Charter Member
Posts: 7
Joined: Wed Nov 10, 2004 3:38 pm
Contact:

Append Header and Trailer Record for each group

Post by rao_2004 »

Hi,
I have a scenario like this reading data from a file and writing to a file
Say for ex their are some 100 groups in a file and each group might have 50 records.
And i need to append HDR record with seq no and TRL record with count of the records and sum of one column in that group
Any Suggestions on how to implement this logic

Here is the sample output i need for the input.

Input:
A B C
1 24 25
1 24 25
2 88 89
2 88 89
3 44 47
3 44 47

OutPut:
BHDR01
1 24 25
1 24 25
BTLR02(COUNT OF RECORDS)+SUM OF COLUMN B
BHDR02
2 88 89
2 88 89
BTLR02(COUNT OF RECORDS)+SUM OF COLUMN B
BHDR03
3 44 47
3 44 47
BTLR02(COUNT OF RECORDS)+SUM OF COLUMN B
Ronetlds
Participant
Posts: 28
Joined: Thu Mar 30, 2006 12:48 pm

Post by Ronetlds »

To get you started:

3 fields in - Grp field, sum field, 3rd field.

Sort stage:
key change on field Grp field.

Transformer:
3 out links, addl column rec type - 1 for hdr, 2 for detail, 3 for tlr

make all rec layouts match for merge

key change = 1 constraint for hdr, tlr, refresh rec ctr & sum stage variables

Merge:

Sort:

fld1, rec type, fld 2, fld 3, output only needed flds.
rao_2004
Charter Member
Charter Member
Posts: 7
Joined: Wed Nov 10, 2004 3:38 pm
Contact:

Post by rao_2004 »

Can you explain me little bit briefly after sorting data..

Thanks
rao
Ronetlds
Participant
Posts: 28
Joined: Thu Mar 30, 2006 12:48 pm

Post by Ronetlds »

You need to build two addl record types (header and trailer) with layouts matching your detail rec so you can merge them to one file. You can play around with the existing columns or append new columns to do that.

So you need three output links. Two of the output links (header and trailer) will need to be created on break - get this by first using key change feature in a sort stage.

You accumulate the group record counter and group sum in stage variables. You also need a stage variable as a hold area for the group value. Upon a break (key change =1) you will write the header and trailer recs using key change =1 as constraint, populate from the stage variable values as needed, then initalize the stage variables for the next group.

You may need two sets of the above stage variables to do this, one set for hold/write, one set for initalizing on refresh.

Additionally you will need to add a column for rec type 1=header, 2=detail, and 3=trailer, to be used for sorting later.

So your output from the transformer will be something like:

rec type, group, other columns

2 1 other (detail grp 1)
2 1 other (detail grp 1)
1 1 other (Header grp 1)
3 1 other (Trailer grp 1)
2 2 other (detail grp 2)
2 2 other (detail grp 2)
2 2 other (detail grp 2)
1 2 other (Header grp 2)
3 2 other (Trailer grp 2)

Sort on group, rec type

This is the logic flow, you will need to fill in the blanks from here.
throbinson
Charter Member
Charter Member
Posts: 299
Joined: Wed Nov 13, 2002 5:38 pm
Location: USA

Post by throbinson »

Are you using a single node configuration file? If not, then partitioning will severely impact this design if not done correctly. That is, the partitioning must exactly match the sorting scheme. All sort keys must be in the same partition.
rao_2004
Charter Member
Charter Member
Posts: 7
Joined: Wed Nov 10, 2004 3:38 pm
Contact:

Post by rao_2004 »

Thanks for breif explanation of the logic..
I worry little bit abt performance ,how this logic will impact on 10 - 20 million records.
Once i implement this logic ,i'll share my results ...
i use 4 nodes config file..
Post Reply