How to convert a hexadecimal file into text format

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

And if not, what did you do to create this view that you are showing us? What utility/command/steps?
- james wiles


All generalizations are false, including this one - Mark Twain.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

To keep this going, you're most likely going to need to use the Complex Flat File stage to read this file--it would likely be the simplest method for you to use. However, you MUST have the layout/metadata info for the file and the various record types within it or you will not be able to accurately read the data.

If you are unable to obtain the metadata for this file, perhaps you can request that the data provider convert it to an ASCII csv file or similar, which will be much easier for you to recognize and read into DataStage.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
maheshkumar.bm
Participant
Posts: 19
Joined: Tue Feb 22, 2011 10:02 pm
Location: mumbai

Post by maheshkumar.bm »

I don't have any layout or COBOL copybook, but I need to run that hex file through the CFF stage line by line, so I created the record definitions myself.
Here I defined the first field with level number 02, native type VARCHAR, length 5, and the second field with level number 02, native type VARCHAR, length 80.

When I ran this, I got a warning like:

Stg_Seq: When checking operator: When binding input interface field "FIRST_1" to field "FIRST_1": Implicit conversion from source type "string[max=5]" to result type "int32": Converting string to number.

and an error like:

APT_CombinedOperatorController,0: Un-handled conversion error on field "FIRST_1 " from source type "string[max=5]" to destination type "int32":
source value=""; the result is non-nullable and there is no handle_null to specify a default value.
My question is: can we extract the data from the hexadecimal file without any layout? Is it possible?
mahesh kumar B.M.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

You will find it very, very, very difficult to accomplish this without a layout from your data provider, especially given the fact that you are not familiar with EBCDIC/mainframe data formats. That unfamiliarity makes it much harder to recognize where individual data columns begin and end (which is already hard enough when dealing with non-character data). This is why I have pushed the need for layout information so strongly.

Above all else, I recommend that you request a layout and/or ask for additional help from someone who is familiar with EBCDIC/mainframe data to aid you in examining the raw data.

For this error, I would first recommend disabling operator combination ($APT_DISABLE_COMBINATION=1). This will tell you exactly which operator is raising the error. The error is happening because the stage is attempting to convert an empty/null varchar to a non-nullable integer. I'm confused at this point because, based on the examples you gave previously, I can't figure out why you defined a 5-byte varchar field.
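
To illustrate the failure mode outside of DataStage, here is a minimal Python sketch of the same logic (not DataStage code; the function name to_int32 is just for illustration). An empty string simply has no integer value, so a non-nullable int32 target needs an explicit default, which is what handle_null would supply:

Code: Select all

    def to_int32(value, default=None):
        """Mimic the job's implicit string -> int32 conversion.

        An empty or non-numeric string has no integer value; with no
        default (DataStage's handle_null) the conversion is "un-handled".
        """
        try:
            return int(value)
        except ValueError:
            if default is None:
                raise ValueError(f"un-handled conversion: source value={value!r}")
            return default

    print(to_int32("12345"))        # -> 12345
    print(to_int32("", default=0))  # -> 0: a default absorbs the empty value
    try:
        to_int32("")                # no default: the same failure your job logs
    except ValueError as err:
        print(err)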

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Mahesh,

I can no longer find the original dump of your data in this thread, so I will have to make a few guesses here.

First, as I mentioned earlier: This file appears to have been created on a mainframe as a variable-length record file. This means that each record varies in length---they are not fixed-length records. Although you want to create fixed-length records from them, you must read them as variable-length. Also, the data is in native mainframe format (EBCDIC characters and binary numeric data), which you have to view in hexadecimal form to see what is there. If you have a text editor that can support EBCDIC, you can probably read at least the character portion of the data with it.
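
As a quick illustration of the character-encoding side of this (a Python sketch; code page 037, the common US/Canada EBCDIC code page, is an assumption, since your file may use a different one):

Code: Select all

    # EBCDIC bytes for "HELLO" under IBM code page 037.
    raw = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])

    print(raw.decode("cp037"))    # -> HELLO
    print(raw.decode("latin-1"))  # -> ÈÅÓÓÖ, which is why the file looks
                                  #    like gibberish in an ASCII editor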

FYI: The physical layout of this file is something like this:

The first 4 bytes are probably what's known as the "Block Descriptor Word", which can describe how long the block of data containing records is. In this case it was empty (all zeros).

The next 4 bytes are the "Record Descriptor Word", which describe the length of an individual record (including the RDW). The first two bytes are the actual record length in SmallInt form (values from 5-32768 bytes), the next two are filler.

Behind the RDW is the record data itself. The data length is RDW-4 bytes. So, if the RDW value is 80, then the record data is 80-4=76 bytes in length. I attempted to give you possible layout information on the first two records in your file (they were two different lengths) in my earlier posts.

Once a "block" of data has been filled by the record data, a new Block Descriptor Word is written, followed by additional RDW/RecordData combiniations.

In order to read the file, you will need to set the Record Type option on the File tab of CFF to either Variable or Variable-block (I'm leaning towards Variable-block at the moment). The Record Prefix, IIRC, is meant to describe the RDW (above). It should be either 2 or 4. If 2, then you need to add 2 bytes of filler to the record definition. I'm not in a position to try it out myself right now. Try either to see what happens.

Remember, the record length is a Small (2-byte) big-endian integer, not 1 byte.

The rest of the record data is where you really need the help, but you might initially try defining the records as VarChar() and concentrate on the options surrounding the RDW to try to at least read the file. Then you can work out the record layouts themselves.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
maheshkumar.bm
Participant
Posts: 19
Joined: Tue Feb 22, 2011 10:02 pm
Location: mumbai

Post by maheshkumar.bm »

jwiles wrote:Mahesh,

The first 4 bytes are probably what's known as the "Block Descriptor Word", which can describe how long the block of data containing records is. In this case it was empty (all zeros).

The next 4 bytes are the "Record Descriptor Word", which describe the length of an individual record (including the RDW). The first two bytes are the actual record length in SmallInt form (values from 5-32768 bytes), the next two are filler.

Behind the RDW is the record data itself. The data length is RDW-4 bytes. So, if the RDW value is 80, then the record data is 80-4=76 bytes in length. I attempted to give you possible layout information on the first two records in your file (they were two different lengths) in my earlier posts.

It should be either 2 or 4. If 2, then you need to add 2 bytes of filler to the record definition. I'm not in a position to try it out myself right now. Try either to see what happens.

Remember, the record length is a Small (2-byte) big-endian integer, not 1 byte.

Regards,
Hi James,

I followed what you said earlier, and thank you.
I tried the job specifying the first record length as 76 and the second record length as 42. The job executed successfully and 2 rows were processed, but the output contains no data: two empty columns were created.
mahesh kumar B.M.