compressed character fields
-
- Participant
- Posts: 150
- Joined: Tue Mar 13, 2007 1:17 am
Hi,
We are receiving files from assembler programs that contain compressed character fields (these are not COMP-3 fields). When we try to read them through DataStage, they show up as junk characters. The character set is EBCDIC. To read the fields properly, we currently have to read them as ASCII (we define this in the OSH after the job is compiled) and then append a byte for each character using a C program (when I try to append a byte with the prefix byte option, it still doesn't work). Is there an NLS we can define so that DataStage does all of this itself, instead of us declaring the fields as ASCII and then appending the byte?
Can we define our own NLS? If yes, how?
"compressed character fields"?
OK, so they're not COMP-3, but what exactly are they? I've never heard of them, and the only occurrence of that particular string that Google returns an exact match on is your post here. Can you get a more "official" name or description for what is in those fields? How exactly were they "compressed"?
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 150
- Joined: Tue Mar 13, 2007 1:17 am
This is my understanding of the fields. By "compressed characters" I mean the following: in general, one character is stored in one byte in both ASCII and EBCDIC. Here, two digits are stored in one byte. To do that, the data is stored in binary format, so that each digit takes only one nibble and the nibble saved can hold another digit. The values are also held as hex, which means one nibble can store a value up to 15 (F) and one byte can store a value up to 255 (FF), that is, two digits stored in one byte. As a result, reading these files is a tedious process, so I wanted to know whether any of you have come across such a scenario or can shed any light on how to proceed.
Trying to read the field as binary has not helped either.
Thanks!!!!
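To make the packing described above concrete, here is a minimal Python sketch, assuming each byte holds two decimal digits (one per nibble) with no sign nibble. The helper name `split_nibbles` and the sample bytes are illustrative only and are not taken from the actual files. It also shows why decoding such bytes as EBCDIC text (code page 037 is an assumption) gives unprintable junk.

```python
def split_nibbles(byte):
    """Return the (high, low) 4-bit halves of one byte."""
    return byte >> 4, byte & 0x0F


# Hypothetical sample: the digits 1, 2, 3, 4 packed into two bytes.
raw = bytes([0x12, 0x34])

# Each byte yields two digits, high nibble first.
digits = [nibble for byte in raw for nibble in split_nibbles(byte)]
print(digits)  # [1, 2, 3, 4]

# Decoded as EBCDIC text, the same bytes are control characters,
# which is why a character-mode read shows junk.
as_text = raw.decode("cp037")
print(as_text.isprintable())  # False
```

Any real decoder would also need to know how many bytes each field occupies, which only the record layout can tell you.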
-
- Participant
- Posts: 150
- Joined: Tue Mar 13, 2007 1:17 am
A couple of example fields may suffice, so provided you can read them on your screen, you could type them into a post rather than copying them from your client's network.
It would probably be easier for you in the long run if the data can be identified and handled by a CFF, rather than you reinventing the wheel in a custom stage.
-
- Participant
- Posts: 150
- Joined: Tue Mar 13, 2007 1:17 am
I can't seem to read this data using a CFF. However:
If you read the data in a sequential file, define the column as binary with a length equal to the number of bytes.
In a BASIC transformer, use the transform DataTypePicComp3Unsigned(<column>) in the column derivation.
Define the output column as integer (or decimal, as you require) with the appropriate length.
Your example of
01357
02468
returns 12345678.
Hope that helps unless/until someone can find a better way.
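The example above can be reproduced outside DataStage with a short Python sketch. It assumes the two rows are a vertical hex display, with the high nibble of each byte on the first row and the low nibble on the second, and that the field is unsigned packed decimal. The function names are hypothetical; this mimics the result quoted above and is not how DataTypePicComp3Unsigned is implemented.

```python
def bytes_from_vertical_hex(high_row, low_row):
    """Rebuild raw bytes from a two-row vertical hex display."""
    if len(high_row) != len(low_row):
        raise ValueError("rows must be the same length")
    # Column i supplies the high and low hex digit of byte i.
    return bytes(int(h + l, 16) for h, l in zip(high_row, low_row))


def unsigned_packed_to_int(raw):
    """Read the hex digits of the bytes as decimal digits.

    int() raises ValueError if any nibble is A-F, i.e. not a digit.
    """
    return int(raw.hex())


raw = bytes_from_vertical_hex("01357", "02468")
print(raw.hex())                    # 0012345678
print(unsigned_packed_to_int(raw))  # 12345678
```

The five bytes are 00 12 34 56 78, so the leading zero byte simply disappears when the digits are read as an integer.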
-
- Participant
- Posts: 150
- Joined: Tue Mar 13, 2007 1:17 am
Finally, I have managed to get some dummy data out of the client's network. Below are the file details.
01 XXXXX.
   05 XXXXX-ID PIC X(1).
   05 XXXXX-NAME PIC X(1).
   05 XXXXX-XC PIC X(1).
   05 XXXXX-SC OCCURS 0 TO 255 TIMES
      DEPENDING ON XXXXX-XC.
      10 XXXXX-SC-TAMT PIC X(2).
      10 XXXXX-SC-TDT PIC X(2).
The file is a variable block file, for example:
BDW: 0407 (stored as a hex value)
RDW: 0403 (stored as a hex value)
XXXXX-ID --> 10 (although this is a string, it holds packed values without the sign nibble)
XXXXX-NAME --> 20 (although this is a string, it holds packed values without the sign nibble)
XXXXX-SC --> FF (hex value)
XXXXX-SC-TAMT --> 100 (255 times, in packed format without the sign nibble)
XXXXX-SC-TDT --> 200 (255 times, in packed format without the sign nibble)
Here is the link to the sample file:
http://rapidshare.com/files/400142047/packed.html
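As a cross-check of the layout above, the following Python sketch builds one block from the dummy values and parses it back. It rests on several assumptions: the BDW and RDW are the usual 4-byte descriptors (2-byte big-endian length that includes the descriptor itself, then 2 bytes of zeros), the FF value is the binary occurrence count held in XXXXX-XC, and every other field is unsigned packed decimal. All function names are hypothetical. Under those assumptions the lengths agree with the post: 3 + 255 * 4 = 1023 data bytes, plus 4 for the RDW gives 1027 (hex 0403), plus 4 for the BDW gives 1031 (hex 0407).

```python
import struct


def unpack_digits(raw):
    """Decode unsigned packed decimal: two digits per byte, no sign nibble."""
    value = 0
    for byte in raw:
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            raise ValueError("non-decimal nibble in byte 0x%02X" % byte)
        value = value * 100 + high * 10 + low
    return value


def parse_record(rec):
    """Parse one record body (RDW already stripped) per the copybook."""
    count = rec[2]  # assumed: XXXXX-XC is a 1-byte binary count
    items = []
    offset = 3
    for _ in range(count):
        tamt = unpack_digits(rec[offset:offset + 2])
        tdt = unpack_digits(rec[offset + 2:offset + 4])
        items.append((tamt, tdt))
        offset += 4
    return {"id": unpack_digits(rec[0:1]),
            "name": unpack_digits(rec[1:2]),
            "items": items}


def read_block(block):
    """Walk the records inside one block, using the BDW and RDW lengths."""
    block_len = struct.unpack(">H", block[0:2])[0]
    pos = 4  # skip the 4-byte BDW
    records = []
    while pos < block_len:
        rec_len = struct.unpack(">H", block[pos:pos + 2])[0]
        records.append(parse_record(block[pos + 4:pos + rec_len]))
        pos += rec_len
    return records


# Rebuild the dummy record described in the post.
body = bytes([0x10, 0x20, 0xFF]) + bytes([0x01, 0x00, 0x02, 0x00]) * 255
rdw = struct.pack(">HH", len(body) + 4, 0)
bdw = struct.pack(">HH", len(rdw) + len(body) + 4, 0)
block = bdw + rdw + body

records = read_block(block)
print(rdw[:2].hex(), bdw[:2].hex())            # 0403 0407
print(records[0]["id"], records[0]["name"])    # 10 20
print(len(records[0]["items"]), records[0]["items"][0])  # 255 (100, 200)
```

If the real file was transferred without its BDW/RDW bytes, or the count field is stored differently, the offsets here would need adjusting.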