How to read this file?

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

ds_user78
Participant
Posts: 23
Joined: Thu Nov 11, 2004 5:39 pm

How to read this file?

Post by ds_user78 »

Hi,
I have a data file which has the following layout. How do I read this in DS?
    10|ASBT115072 25C4|05|50001670|01|05192003|1|4|EA
    40|ASBT115072 25C4|05|50001670|01|SINGLE PHASE PAD TYPE 2 (line 1)
    20|ASBT115072 25C4|05|50001670|01|0002||02005|CP05|5RTST|Ratio Test|1.000|EA|X|1|1|1.500|H|5.650|H|3.300|H|0.00
    21|ASBT115072 25C4|05|50001670|01|0002||Ratio Test
    40|ASBT115072 25C4|05|50001670|01|Header Long Text line 2
The structure of each record varies based on its first two characters. There is no fixed number of records per record type, except that record type 10 always comes first and occurs only once per set of records containing the record types 10, 20, 21, and 40. I tried using the CFF stage, but I am not sure how I should go about it.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

How about declaring the record as containing just one column, then getting the record type in a Transformer using the function FIELD(InRecord,"|",1)? You can then split and parse the rest of the columns with the FIELD function as required. This will work in your case; sometimes these types of data rows contain over 50 columns, and then it becomes a bit tedious to program all the splits manually.
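The FIELD-based dispatch described above can be prototyped outside DataStage. This is a minimal sketch using awk on a few rows from the question's layout (the /tmp path and the trimmed sample are assumptions, not part of the original job):

```shell
# A few rows from the question's layout (trimmed; /tmp path assumed).
cat > /tmp/sample.dat <<'EOF'
10|ASBT115072 25C4|05|50001670|01|05192003|1|4|EA
20|ASBT115072 25C4|05|50001670|01|0002||Ratio Test
40|ASBT115072 25C4|05|50001670|01|Header Long Text line 2
EOF

# FIELD(InRecord,"|",1) in DS BASIC returns the first '|'-delimited
# field; awk's $1 with -F'|' does the same, so each record type can
# be routed to its own parsing logic in the Transformer.
awk -F'|' '{ print "record type " $1 " has " NF " fields" }' /tmp/sample.dat
```

In the job itself, the same idea becomes a constraint or derivation per output link, one per record type.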
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
To add to what was said, you could also use the one-column definition and a Transformer to split the file into several files, then read each one with its real table definition, if you can afford another write of the source file to disk.

By the way, for the creative mind: you could combine a filter command with OS-level grep or sed to read the file several times in parallel, each read using a different table definition, picking up only the rows relevant to that definition, and running in its own flow.
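The split-per-record-type idea can be sketched in shell (a minimal sketch; the tiny sample file and the output file names are assumptions for illustration): one grep pass per record type, so each output file can be read with its own table definition.

```shell
# Tiny stand-in for the source file (contents assumed).
cat > /tmp/source.dat <<'EOF'
10|ASBT115072 25C4|05|50001670|01|05192003|1|4|EA
20|ASBT115072 25C4|05|50001670|01|0002||02005|CP05
21|ASBT115072 25C4|05|50001670|01|0002||Ratio Test
40|ASBT115072 25C4|05|50001670|01|Header Long Text line 2
20|ASBT115072 25C4|05|50001670|01|0003||02006|CP06
EOF

# One filter pass per record type. In a job, each grep could instead
# be the filter command of a Sequential File stage carrying that
# type's table definition, with all four reads running in parallel.
for t in 10 20 21 40; do
  grep "^${t}|" /tmp/source.dat > "/tmp/type_${t}.dat"
done
wc -l /tmp/type_*.dat
```

Anchoring the pattern with `^` and including the trailing `|` keeps a type like 20 from also matching rows whose second column happens to start with 20.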

Usually a CFF stage is the natural choice, but if for some reason you can't use it in your case, you have several alternatives :)

IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

If the record types carry different meanings and must be processed differently, use 'grep' as a filter command and extract only the relevant rows.

Alternatively, use something like 'awk' as the filter to make the number of columns the same in every row, and define all columns as VARCHAR(9999) delimited by the '|' symbol.
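The awk normalization suggested above could look like this (a sketch: the sample rows and the maximum width of 5 are assumptions; in the real file the maximum would be the widest layout, i.e. the type-20 rows). Every shorter row is padded with empty fields so a single '|'-delimited, all-VarChar table definition fits every record type.

```shell
# Rows of different widths (sample data assumed).
cat > /tmp/raw.dat <<'EOF'
10|a|b
21|a|b|c
40|a|b|c|d
EOF

# Pad every row out to 'max' fields. Assigning $i for i > NF makes
# awk rebuild the record with OFS ('|'), appending empty fields.
awk -F'|' -v OFS='|' -v max=5 \
  'NF < max { for (i = NF + 1; i <= max; i++) $i = "" } 1' /tmp/raw.dat
```

Downstream, the job can then branch on column 1 and interpret the remaining VarChar columns according to that record type.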

You can define the processing logic in your job.
clshore
Charter Member
Charter Member
Posts: 115
Joined: Tue Oct 21, 2003 11:45 am

Post by clshore »

So, is this a translated COBOL file or what? Is the FD available?
Perhaps you can deal with it as a Complex Flat File.

Carter
vigneshra
Participant
Posts: 86
Joined: Wed Jun 09, 2004 6:07 am
Location: Chennai

Post by vigneshra »

Hi
Instead of juggling with DS logic, you could use the grep command to separate the records based on the value in the first column, writing each record type to its own file, and later develop jobs for these individual files. That is the better option when the number of values the first column can take is small; otherwise, go with the options the other gurus are suggesting.
Vignesh.

"A conclusion is simply the place where you got tired of thinking."
sumitgulati
Participant
Posts: 197
Joined: Mon Feb 17, 2003 11:20 pm
Location: India

Post by sumitgulati »

Is the number of record types fixed? In other words, are there at most four record types: 10, 20, 21, and 40?

-Sumit