COBOL flat file import

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

FTP is notorious for stripping trailing spaces in records. Can you find another mechanism to move the files, say like scp?
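
For illustration, a minimal Python sketch of that kind of binary-safe pull, here via SFTP with paramiko (the host, credentials, and paths are placeholders, and paramiko itself is an assumption):

Code:

import paramiko

# Hypothetical SFTP pull: moves the file byte-for-byte, so trailing
# blanks and packed fields survive untouched.
client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect("mvs.example.com", username="user", password="password")
sftp = client.open_sftp()
sftp.get("/data/trans.dat", "trans.dat")   # no translation, no blank stripping
sftp.close()
client.close()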
-craig

"You can never have too many knives" -- Logan Nine Fingers
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

Code:

Record type = implicit 
Delimiter = none 
Character set = EBCDIC 
Data format = binary 
Allow all zeroes = yes
Check your FTP properties against this list. If the session is dropping trailing spaces, I would guess (not having experimented directly) that you have something different for record type, delimiter and/or data format.

I always get trailing spaces. I can't think of an EBCDIC file I read that doesn't have trailing spaces at the end of every record.

Good luck.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
leathermana
Premium Member
Posts: 19
Joined: Wed Jul 14, 2010 1:10 pm

Post by leathermana »

I found that the FTP command "quote site trailingblanks", executed before the "get" command, solves the trailing-blank problem. I am now getting "BAD_FIELD" messages in my log for the packed decimal fields. I think I've tried about every combination of parameters and am beginning to think that the FTP transfer from EBCDIC to ASCII is causing problems. The very first (of many) log error messages is:
Complex_Flat_File_0,0: Importing CL_TRANS_AMT1: {00 00 00 00 fc 1a @}, at offset: 158
Complex_Flat_File_0,0: -> BAD_FIELD
The hex values for this field are 00 00 00 00 FC 1A 40. The field is defined as Decimal (13,2), and the properties in the Records tab of the CFF stage are as follows:
Native type = DECIMAL
length = 13
scale = 2
level number = 02
usage = COMP-3
Derived Attributes:
SQL type = Decimal
Storage length = 7
Picture = PIC 9(11)V9(2) COMP-3

I don't know how to determine whether this is bad data for a packed decimal, except that DataStage is telling me it is.
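
One way to check the raw bytes yourself, outside DataStage, is a quick hex dump of that field. A Python sketch, where the file name is a placeholder and the offset/length come from the log above:

Code:

# Dump the 7 bytes of CL_TRANS_AMT1 at offset 158 of the first record.
# A valid COMP-3 field has digit nibbles 0-9 and a sign nibble A-F.
with open("trans.dat", "rb") as f:          # placeholder file name
    record = f.read(165)                    # enough to cover offset 158 + 7 bytes
print(record[158:165].hex(" "))             # e.g. "00 00 00 00 fc 1a 40"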
Alden

"All we need is here." -Wendell Berry
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

That one might be easy, and might help you clean up your table definition.

00 00 00 00 FC 1A 40

That is not a valid value for a packed-decimal field. The FAQ covers this (looking with a frown at our site support, ahem). In a COMP-3 field, each decimal digit is stored in a half-byte, with the right-most half-byte reserved for the numeric sign. Using your 9(11)V9(2) example, +123456.78 would look like this in storage: 00 00 01 23 45 67 8C. The last half-byte would be "D" for negative or "F" for unsigned.
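
That nibble layout is easy to check mechanically. A minimal Python sketch of a COMP-3 decoder, using the sample values from this thread:

Code:

from decimal import Decimal

def unpack_comp3(raw, scale=0):
    """Decode an IBM packed-decimal (COMP-3) field, or raise if invalid."""
    nibbles = []
    for b in raw:
        nibbles.extend([(b >> 4) & 0xF, b & 0xF])
    *digits, sign = nibbles                  # the last nibble is the sign
    if sign < 0xA or any(d > 9 for d in digits):
        raise ValueError("not a valid COMP-3 value: " + raw.hex(" "))
    value = int("".join(map(str, digits)))
    if sign == 0xD:                          # D = negative; C/F = positive/unsigned
        value = -value
    return Decimal(value) / (10 ** scale)

print(unpack_comp3(bytes.fromhex("0000012345678c"), 2))  # +123456.78
unpack_comp3(bytes.fromhex("00000000fc1a40"), 2)         # raises: F/A digit nibbles, sign nibble 0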

I believe your source system is writing that field as something other than COMP-3. The only other possibility is that your field position and length are incorrect for the location of the packed-decimal field.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You cannot simply take a record that contains both string and packed decimal data in EBCDIC and convert the entire record to ASCII. Any possibility that is what is happening? You need to convert only the string fields; the packed fields will be happily destroyed if any 'conversion' is done on them. Packed is packed, and they need to make the trip unmolested. Been there, been bit by that, got the t-shirt. :wink:

Back in the day (and from what little I recall as it was long ago and far away), we were using an FTP process that allowed us to specify the byte ranges to translate and the byte ranges to leave alone.
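
A sketch of that idea in Python, assuming code page cp037 (EBCDIC US/Canada) and made-up byte ranges:

Code:

def translate_text_ranges(rec, text_ranges):
    """EBCDIC->ASCII for the character fields only; packed/binary bytes pass through."""
    out = bytearray(rec)
    for start, end in text_ranges:
        out[start:end] = rec[start:end].decode("cp037").encode("ascii", "replace")
    return bytes(out)

# Hypothetical layout: display fields in bytes 0-158, a COMP-3 amount in
# bytes 158-165, more display fields after that.
# ascii_rec = translate_text_ranges(ebcdic_rec, [(0, 158), (165, 200)])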

If you're already handling this, then perhaps your field positions/lengths are off, as Franklin noted.
-craig

"You can never have too many knives" -- Logan Nine Fingers
leathermana
Premium Member
Posts: 19
Joined: Wed Jul 14, 2010 1:10 pm

Post by leathermana »

An update here.... I have now been able to bring in all the data types except the two PIC S9(4) COMP fields. I discovered an article in IBM's documentation about the iSeries servers (which is not the server I'm transferring from) that suggested using type EBCDIC and mode BLOCK for the FTP. That gets the packed decimal data across and eliminates the trailing-blanks issue as well. I am now trying to figure out what to do about the PIC S9(4) COMP fields. They come in as hex 07 DE and 07 DD (EBCDIC) and are supposed to be four-digit year data, and the DS log tells me BAD_FIELD. Guess I'll try the BINARY FTP again. Maybe I can get something working with that this time.
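
For reference, the same command sequence driven from Python's ftplib (host, credentials, and dataset name are placeholders; ftplib has no block-mode support, so the MODE B step a full client performs is not reproduced here):

Code:

from ftplib import FTP

ftp = FTP("mvs.example.com")
ftp.login("user", "password")
ftp.sendcmd("SITE TRAILINGBLANKS")     # keep trailing blanks in each record
ftp.sendcmd("TYPE E")                  # EBCDIC transfer type, no ASCII translation
with open("trans.dat", "wb") as out, ftp.transfercmd("RETR 'PROD.TRANS.DAILY'") as conn:
    while True:
        chunk = conn.recv(8192)
        if not chunk:
            break
        out.write(chunk)
ftp.voidresp()                         # consume the final 226 reply
ftp.quit()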

On another lovely note.... it's been 8 days since I got the email telling me that the access issue to the Using Mainframe Source Data FAQ would be fixed in 12 hours. I have a feeling that this could be REALLY REALLY helpful.
Alden

"All we need is here." -Wendell Berry
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

For the COMP data issue: you are clearly getting the correct values; you just need to get them "translated" properly to their decimal values.

I discovered a floating-point problem in there -- it's been a while, and I didn't search for which thread discusses it -- but try forcing the EBCDIC to ASCII to see the data as an integer instead of a float. It could look something like this:

Code:

DecimalToDecimal(AsInteger(inlink.COMP_FIELD))
Edit: See my opening post on viewtopic.php?t=146765
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
leathermana
Premium Member
Posts: 19
Joined: Wed Jul 14, 2010 1:10 pm

Post by leathermana »

Franklin, I think I've picked up that you wrote the FAQ I've seen referenced many times in regard to working with mainframe files. I am still not able to get access to that topic. Is it possible you could get that info to me via email or another link? I tried the DecimalToDecimal(AsInteger...) expression you suggested, and every variation and expanded variation of it I could think of, but am stymied on this. I am getting 0000 for these "year" fields. I have also tried defining the CFF output columns in all kinds of different formats, and nothing is getting non-zero data. If I leave the input SQL type as Binary in the Transformer, I get a compile error: "Invalid conversion requested from a raw to a dfloat". How did you determine that I am "clearly getting the correct values"?
Alden

"All we need is here." -Wendell Berry
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

Alden, the easy answer first: the hex values you posted were (in order) decimal 2014 and 2013. You indicated that they were years.
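
A one-liner to verify, assuming the usual mainframe layout where PIC S9(4) COMP is a signed 16-bit big-endian halfword:

Code:

import struct

for halfword in (b"\x07\xDE", b"\x07\xDD"):
    print(struct.unpack(">h", halfword)[0])   # 2014, then 2013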

DataStage has a serious flaw, in my never-humble opinion: it has no transform functions converting from raw to complement the several that convert to raw. I don't know why.

Let me see if I have any local (to me) examples. I'm puzzled that you can't find the right formatting.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
leathermana
Premium Member
Posts: 19
Joined: Wed Jul 14, 2010 1:10 pm

Post by leathermana »

I just had success on all fields with the Sequential File stage, without a Transformer. The combination of setting the Output Format parameters along with editing the individual column metadata for the packed decimal and binary fields works like a charm. Clearly this wouldn't work with OCCURS fields etc., but apparently packed decimal and binary in combination with character fields is very doable.... once I got the file transferred from the mainframe properly using the FTP EBCDIC block transfer options. If anyone would like details on the settings used in the Sequential File stage, I can provide that.
Alden

"All we need is here." -Wendell Berry
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

Do me a favor and look at the setting for Byte Order. I just got my binary (COMP) dates to go through successfully using CFF on a sequential file by changing "Native-endian" to "Big-endian". My guess is that you are using "Big-endian".

I consider this worthy of a warning: for EBCDIC reads in CFF, check this setting (on the Record Options tab) and adjust it after checking the byte order of your binary fields.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
leathermana
Premium Member
Posts: 19
Joined: Wed Jul 14, 2010 1:10 pm

Post by leathermana »

I've been running with Native-endian. I tried running with both Big and Little and see no difference in my results.
Alden

"All we need is here." -Wendell Berry
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

Thanks, Alden. This is very interesting. When I ran under Native, the COMP fields were both "read" in Little -- bytes in reverse order. The binary storage for "20130128" is 01 33 29 50, and it was coming through as 50 29 33 01. It was corrected only after I manually changed to Big.
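
The effect is easy to reproduce; a Python sketch using the bytes from this post:

Code:

raw = bytes.fromhex("01332950")          # COMP storage of 20130128, big-endian
print(int.from_bytes(raw, "big"))        # 20130128 -- correct
print(int.from_bytes(raw, "little"))     # 1344877313 -- the bytes read as 50 29 33 01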
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
leathermana
Premium Member
Posts: 19
Joined: Wed Jul 14, 2010 1:10 pm

Post by leathermana »

Just to be clear about my last post: the "no difference in results" applies to both the CFF stage, where I got bad results (0000 for year), and the Sequential File stage, where I got correct results for the year (COMP) data. Tried 'em all. Does seem strange.
Alden

"All we need is here." -Wendell Berry