Mainframe COBOL file

Posted: Wed Nov 17, 2004 7:09 am
by mleroux
I have a mainframe file that's been FTP'd to the DataStage server and this data has to go out into a normal Unix flat file. I haven't worked with an EBCDIC mainframe file in DataStage before, and it shows, so I could do with some help please.

I don't have access to the CFD but the file definition has been given me:

X72002-CO-ID              PIC XX.
X72002-ACCOUNT-NO         PIC S9(15)      COMP-3.
X72002-PRDCT-CD           PIC X(03).
X72002-SUB-PRODUCT        PIC X(02).
X72002-CUST-NO            PIC 9(15).
X72002-PRIMARY-OFFICER    PIC X(05).
X72002-BALANCE            PIC S9(15)      COMP-3.
X72002-UNCL-FUNDS         PIC S9(11)V99   COMP-3.
X72002-OVERDRAFT-LIMIT    PIC S9(11)V99   COMP-3.
X72002-UNCLRD-FUNDS-IND   PIC X.
X72002-GROUP-NO           PIC 9(04).
X72002-GROUP-LIMIT        PIC S9(11)V99   COMP-3.

As far as I know there is only one level in the file. I used a Complex Flat File (CFF) stage and defined all the above as source columns, typing PIC X's as CHARACTER, PIC 9's as DISPLAY_NUMERIC and the PIC S9(11)V99 as DECIMAL. I couldn't find anything special to do about the packed COMP-3's except for ticking the Verify sign value in DECIMAL COMP-3 data box.

I've defined the data format as EBCDIC and the record style as Binary (CR/LF record style produces a View Data that looks much more chaotic). I've made the level for all columns 02.

When I do a View Data I can see some patterns which look like data but occurring in different fields, e.g. a pattern appears in BALANCE, then a few rows later in UNCL-FUNDS and then in OVERDRAFT-LIMIT so it looks like the rows are too long, but there's also tons of garbage in the view and I can't see all the data I'm supposed to be seeing.

As far as I know the FTP from the mainframe was done by the mainframe guys and as a binary FTP so the file should be OK.

Am I barking up the wrong tree here?

Re: Mainframe COBOL file

Posted: Wed Nov 17, 2004 11:41 am
by ogmios
From experience... it's much easier to have the mainframe guys unpack all fields, strip all low-values and high-values, and do the EBCDIC/ASCII conversion.

In the very beginning I spent weeks trying to get a couple of EBCDIC files processed.

Ogmios

Posted: Wed Nov 17, 2004 1:27 pm
by chucksmith
Since you are on a Unix server, try using the od -xc command to produce a hex dump of the file. True, it will be EBCDIC, but the EBCDIC conversion table available via online help is a handy tool.
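
If od is awkward to read, roughly the same dump can be sketched in Python with the text column already translated from EBCDIC. This is only an illustration: the function name ebcdic_dump and the sample bytes are made up, and code page cp037 (US EBCDIC) is an assumption, since the thread does not say which EBCDIC code page the file uses.

```python
def ebcdic_dump(data: bytes, width: int = 16, codec: str = "cp037") -> str:
    """Return a hex dump with an EBCDIC-decoded text column.

    codec="cp037" is an assumption; other EBCDIC code pages exist.
    """
    pad = width * 3 - 1  # width of the hex column for a full row
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        decoded = chunk.decode(codec, errors="replace")
        # Show anything non-printable (nulls, packed bytes) as a dot.
        text = "".join(c if c.isprintable() else "." for c in decoded)
        lines.append(f"{offset:04X}  {hex_part:<{pad}}  {text}")
    return "\n".join(lines)


# Made-up sample bytes: "DDA", two nulls, then "0124507" as EBCDIC digits.
sample = bytes.fromhex("C4C4C1" "0000" "F0F1F2F4F5F0F7")
print(ebcdic_dump(sample))
```

Character and zoned-decimal fields show up as readable text in the right-hand column, while COMP-3 fields mostly show up as dots, which makes the field boundaries easier to spot.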

With the dump file, verify your metadata. COMP-3 fields will be obvious in the dump: each hex digit is one decimal digit, and the last hex digit of the field is the sign (typically C or F for positive, D for negative).
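
To make the packed layout concrete, here is a minimal Python sketch of decoding a COMP-3 field by hand. The function name unpack_comp3 and the sample byte strings are invented for illustration; the scale argument stands for the number of digits after the implied decimal point (the V in the PIC clause).

```python
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal (COMP-3) field.

    Every byte holds two 4-bit digits, except that the low nibble of
    the last byte is the sign: 0xB or 0xD is negative, the other
    values 0xA-0xF are treated as positive here.
    """
    if not raw:
        raise ValueError("empty field")
    nibbles = []
    for b in raw:
        nibbles.append(b >> 4)
        nibbles.append(b & 0x0F)
    sign = nibbles.pop()
    if sign < 0x0A or any(d > 9 for d in nibbles):
        raise ValueError(f"not valid packed decimal: {raw.hex()}")
    value = int("".join(str(d) for d in nibbles))
    if sign in (0x0B, 0x0D):
        value = -value
    # Shift the implied decimal point left by `scale` digits.
    return Decimal(value).scaleb(-scale)


# Invented examples: an 8-byte S9(15) field and a 7-byte S9(11)V99 field.
print(unpack_comp3(bytes.fromhex("000000000124507C")))
print(unpack_comp3(bytes.fromhex("0000000012345D"), scale=2))
```

A field that raises ValueError here is a good hint that the offset or length in the metadata is wrong, because misaligned bytes rarely end in a valid sign nibble.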

The CFF stage should do what you need, but, like all stages, it is only as good as your metadata.

Posted: Wed Nov 17, 2004 2:12 pm
by coolkhan08
The PIC S9 field is the one that could be creating the offset for you. In the Transformer you need to use fmt with DatatypePicComp to convert that field. I had the same problem, but using fmt(DatatypePicComp(link.columnname),"R#9") for a PIC S9(09) field solved it. Hope this helps.
Sam.

Posted: Thu Nov 18, 2004 1:03 am
by ray.wurlod
Where do you get 95 from? The addresses down the left are still in hex, and I observe that 0x92 (the end of the first record) is 146 in decimal.

Sorry I can't help with the rest of it just now; deadlines loom!

Posted: Thu Nov 18, 2004 1:05 am
by mleroux
Apologies, "DDA" must be something else; it's too early in the record. The string "000000000124507" must be ACCOUNT-NO though, since it's 15 bytes long.

00 0F 00 00 consistently occurs at the start of each new record...

Posted: Thu Nov 18, 2004 1:17 am
by ray.wurlod
Are you beginning to suspect that the file definition does not match the data? There are certainly more than two characters preceding the 15 digit number! :x

Posted: Thu Nov 18, 2004 1:17 am
by mleroux
Apologies again, I forgot about the COMP-3 packing, so the string mentioned earlier won't necessarily be ACCOUNT-NO, right? The next 15-digit number that's not COMP-3'd is CUST-NO.

So, considering the COMP-3 packing, "DDA" could very well be PRDCT-CD. :?

Posted: Thu Nov 18, 2004 1:26 am
by mleroux
I am suspecting that the definition doesn't match the data but don't want to jump to conclusions with my limited mainframe file knowledge.

Posted: Thu Nov 18, 2004 2:16 am
by mleroux
From my guesswork it looks like it could be:

0Ah - 0Ch (bytes 11 - 13): PRDCT-CD ("DDA")
0Dh - 0Eh (bytes 14 - 15): SUB-PRODUCT (null value)
0Fh - 1Dh (bytes 16 - 30): CUST-NO ("000000000124507")
1Eh - 22h (bytes 31 - 35): PRIMARY-OFFICER (spaces?)

The rest is gobbledygook again, since there are three COMP-3 fields following these fields in the definition.
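
The guessed offsets above can be cross-checked by adding up the storage lengths of the leading fields in the supplied definition. A small Python sketch, with the byte lengths written out by hand (ACCOUNT-NO is taken as 8 bytes, i.e. 15 packed digits plus a sign nibble); the names LEADING_FIELDS and field_offsets are made up for this example:

```python
# (name, storage bytes) for the leading fields of the supplied definition.
LEADING_FIELDS = [
    ("CO-ID", 2),            # PIC XX
    ("ACCOUNT-NO", 8),       # PIC S9(15) COMP-3: 15 digits + sign nibble
    ("PRDCT-CD", 3),         # PIC X(03)
    ("SUB-PRODUCT", 2),      # PIC X(02)
    ("CUST-NO", 15),         # PIC 9(15), one byte per digit
    ("PRIMARY-OFFICER", 5),  # PIC X(05)
]


def field_offsets(fields):
    """Return {name: (start, end)} as 0-based, inclusive byte offsets."""
    offsets = {}
    start = 0
    for name, length in fields:
        offsets[name] = (start, start + length - 1)
        start += length
    return offsets


for name, (start, end) in field_offsets(LEADING_FIELDS).items():
    print(f"{start:02X}h - {end:02X}h (bytes {start + 1} - {end + 1}): {name}")
```

With these lengths PRDCT-CD lands on 0Ah - 0Ch and CUST-NO on 0Fh - 1Dh, which agrees with where "DDA" and "000000000124507" appear in the dump.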

I have to attend to more pressing matters for now but thanks for everyone's help thus far.

To quote some Austrian-turned-Californian guy, "AYE'LL BEEE BAAAK." :wink:

Posted: Thu Nov 25, 2004 5:52 am
by mleroux
At last I got a bit of time to spend on this again. It turned out that the definitions supplied are indeed slightly off from the actual data:

X72002-CO-ID              PIC XX. 
X72002-ACCOUNT-NO         PIC S9(15)      COMP-3. 
X72002-PRDCT-CD           PIC X(03). 
X72002-SUB-PRODUCT        PIC X(02). 
X72002-CUST-NO            PIC 9(15). 
X72002-PRIMARY-OFFICER    PIC X(05). 
X72002-BALANCE            PIC S9(11)V99   COMP-3.   <= *
X72002-UNCL-FUNDS         PIC S9(11)V99   COMP-3. 
X72002-OVERDRAFT-LIMIT    PIC S9(11)V99   COMP-3. 
X72002-UNCLRD-FUNDS-IND   PIC X. 
X72002-GROUP-NO           PIC X(04).                <= *
X72002-GROUP-LIMIT        PIC S9(11)V99   COMP-3.
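
The asterisks (*) above mark the two column definitions that differ. BALANCE is the one that matters for alignment: PIC S9(15) COMP-3 occupies 8 bytes but PIC S9(11)V99 COMP-3 only 7, so every field after it was shifted by one byte. The actual data is only the first 68 bytes (00h - 43h), while the record length is 146 bytes. The remaining 78 bytes at the end of each record are null padding.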
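
A quick way to sanity-check the record length outside DataStage is to cut the file into 146-byte slices and confirm that everything past byte 68 is null. A rough Python sketch; split_records and padding_is_null are hypothetical helper names, and the demo data is fabricated filler rather than real file content:

```python
RECORD_LEN = 0x92  # 146 bytes per record, as seen in the hex dump
DATA_LEN = 68      # bytes 00h - 43h carry the actual fields


def split_records(blob: bytes, record_len: int = RECORD_LEN):
    """Cut a binary file image into fixed-length records."""
    if len(blob) % record_len:
        raise ValueError(
            f"{len(blob)} bytes is not a multiple of {record_len}"
        )
    return [blob[i:i + record_len] for i in range(0, len(blob), record_len)]


def padding_is_null(record: bytes, data_len: int = DATA_LEN) -> bool:
    """True if everything after the data portion is 0x00."""
    return all(b == 0 for b in record[data_len:])


# Fabricated demo: three records of EBCDIC spaces plus null padding.
demo = (b"\x40" * DATA_LEN + b"\x00" * (RECORD_LEN - DATA_LEN)) * 3
records = split_records(demo)
print(len(records), all(padding_is_null(r) for r in records))
```

If split_records raises, or the padding check fails on real data, the record length or the field layout is probably still off.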

So all's well and the data is looking good. A nice learning experience on the complex flat file stage!

A tip for anyone who might stumble across this thread in some future search: right-clicking a column entry in the CFF's Source Columns tab and then clicking Edit row... opens a form where the column definitions can be edited. There, a read-only box showing the Storage length (in bytes) is very useful when reverse-engineering COMP-3 packed-decimal fields.
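
The same storage lengths can be worked out by hand: a COMP-3 field with n digits takes n // 2 + 1 bytes, while character and unpacked numeric fields take one byte per position. A small Python sketch of that arithmetic; the helper names are made up, it only handles the simple 9/X/S/V pictures used in this thread, and it assumes the sign of a DISPLAY field is not stored separately:

```python
import re


def pic_positions(pic: str) -> int:
    """Count the 9/X positions in a PIC string such as 'S9(11)V99'.

    S (sign) and V (implied decimal point) take no position.
    """
    total = 0
    for _symbol, repeat in re.findall(r"([9X])(?:\((\d+)\))?", pic.upper()):
        total += int(repeat) if repeat else 1
    return total


def storage_length(pic: str, comp3: bool = False) -> int:
    """Bytes occupied: two digits per byte plus a sign nibble for COMP-3."""
    n = pic_positions(pic)
    return n // 2 + 1 if comp3 else n


# The corrected definition from this post: (PIC string, is COMP-3).
CORRECTED = [
    ("XX", False), ("S9(15)", True), ("X(03)", False), ("X(02)", False),
    ("9(15)", False), ("X(05)", False), ("S9(11)V99", True),
    ("S9(11)V99", True), ("S9(11)V99", True), ("X", False),
    ("X(04)", False), ("S9(11)V99", True),
]

# Total data length in bytes; should agree with 00h - 43h (68 bytes).
print(sum(storage_length(pic, comp3) for pic, comp3 in CORRECTED))
```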