Complex Flat File and multple, conditional record types.

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
SettValleyConsulting
Premium Member
Premium Member
Posts: 72
Joined: Thu Sep 04, 2003 5:01 am
Location: UK & Europe

Complex Flat File and multple, conditional record types.

Post by SettValleyConsulting »

Currently pondering the most elegant solution to this :-

I have an incoming mainframe file, EBCDIC, fixed 80-char width. It has a variety of different record types within the file, identified by a record type column.

First thoughts it's a no-brainer to use a Complex Flat File. BUT once we have stripped out various headers and trailers we are interested in two record types , Transaction and Addendum - Record Type 5 and 6. The transaction record has an addendum indicator and if this is set to 1 then it will be immediately followed in the file by an addendum record. Where this happens a single output record needs to be built containing information merged from both the transaction and the addendum. Unfortunately there are no key or common fields on both records, the only thing that connects transaction and addendum is the fact that the addendum immediately follows its parent transaction. Not all transactions have addendums (addenda?).

SO I am now thinking of :

Reading the file in a CFF

Using two output links

1. Transactions with no addendum. (Constraint: Rec type = 5 AND Addendum Indicator = '0') - use CFF to parse into columns.

2. Transactions with addendums, and addendums. (Constraint: (Rec type = 5, AND Addendum Indicator = '1') OR Rec type = 6)


Write the output from link 2 to a transformer, for each Transaction record write the record to a Stage Variable, without writing it forward, when the Addendum row arrives write a record containing the required columns from the Transaction and Addendum then downstream combine these into the required format using some combination of Transformer and Column Import (I have some zoned decimals to parse out). Add in a few checks (e.g. every expected addendum actually exists), and funnel with Link 1 (Once we've done this, sort order becomes irrelevant).

I am not a Complex Flat File Stage guru by any means, so I was wondering if there is a more elegant solution using the native capabilities of this stage? I've posted this in the PX forum, however there is no objection to doing this in Server it it is easier ....
Phil Clarke
SettValleyConsulting
Premium Member
Premium Member
Posts: 72
Joined: Thu Sep 04, 2003 5:01 am
Location: UK & Europe

Doh!

Post by SettValleyConsulting »

Well that's not gonna work is it? :roll:

The CFF only permits you to interrogate a single Record Id column and send all rows of that record type down the same link. I need to send Types 5 AND 6 down the same link, and as the sequence is the only thing that connects them, to preserve the sort order.

Scratch the above and implement the same logic but using a sequential file stage, transformers and Column Import .... it ain't gonna be pretty....
Phil Clarke
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Can't you send 6 and 5 along separate links into a Funnel?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
JRodriguez
Premium Member
Premium Member
Posts: 425
Joined: Sat Nov 19, 2005 9:26 am
Location: New York City
Contact:

Post by JRodriguez »

SettValleyConsulting,

You want to use CFF stage to get this requirement done

If your input records are like:

type 1 + columns
type 2 + columns
...
..
type 5 + transaction columns
type 6 + addendum columns

Then don't worry about the sort order and don't worry about the common field between transaction and addendum records. For each set of input records (Type1, 2,3,4,5,6 ...) CFF stage will output one single flatten record with all columns from the set of input records

In those cases where a parent transaction is not follow by an addendum record ... the columns coming from the addendum record will be empty

Post any issues
Julio Rodriguez
ETL Developer by choice

"Sure we have lots of reasons for being rude - But no excuses
SettValleyConsulting
Premium Member
Premium Member
Posts: 72
Joined: Thu Sep 04, 2003 5:01 am
Location: UK & Europe

Post by SettValleyConsulting »

Thanks for the suggestions -

Ray -
Can't you send 6 and 5 along separate links into a Funnel?
Well, yes I could but (a) they have different metadata and (b) how would I guarantee sort order?

Maybe I did not express the requirement sufficiently clearly - A Type 5 record may or may not be immediately followed in the physical input file by a Type 6. Where a Type 6 does exist I need to build a single output row from fields from both the Type 5 and the Type 6. The only thing that conencts a Type 6 (addendum) with its parent Type 5 (transaction) is the fact that it is physically adjacent in the input file.


JRodriguez - Could you expand upon your answer? What properties do I need to set to instruct a CFF stage to combine two input rows of different types into a single output row?

thanks again,
Phil Clarke
priyadharsini
Participant
Posts: 40
Joined: Mon May 11, 2009 12:19 am
Location: Madurai

Post by priyadharsini »

Try reading the file as single column. process the records sequentially. using stage variables you can combine the transaction and addenda record.
SettValleyConsulting
Premium Member
Premium Member
Posts: 72
Joined: Thu Sep 04, 2003 5:01 am
Location: UK & Europe

Post by SettValleyConsulting »

Try reading the file as single column. process the records sequentially. using stage variables you can combine the transaction and addenda record.
Yep. I am reading, using a sequential file stage, the whole record into a single column RECORD, type char, substringing column 1 to get the record type, and column 80 to get the addendum indicator. Types 5 and 6 go down a single link to a transformer. This has

A constraint RECTYPE = 6 OR (Rectype = '5' and ADDENDUM_IND = '0' )

A stage variable that stores the RECORD if the record type is '5' And ADDENDUM_IND = '1'

Derivations that retrieve svRECORD and substring out the 'Addendum' columns (all char type) if the RECTYPE is '6'.

So:-

If Rectype = '5' [Transaction] and ADDENDUM_IND = '0' (No addendum) the row is written forward. Addendum columns are empty.

If Rectype = '5' [Transaction] and ADDENDUM_IND = '1' (Addendum to follow) the row stored in svRECORD, and not written forward.

If Rectype = '6' [Addendum] svRECORD is written forward and addendum columns populated from the input row.

Think I'll add a flag to indicate 'addendum expected' and throw a warning if the following row is not type 6. Now I'd better get on and code the thing ..... :lol:

All stages that are sort order sensitive to be set to sequential, (Do it in Server?).
Phil Clarke
Post Reply