Low values in Mainframe file
-
- Participant
- Posts: 40
- Joined: Mon May 11, 2009 12:19 am
- Location: Madurai
Is it possible to identify low values in complex flat stage (source stage)?
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
If you are coming from a mainframe, then low values are usually char(0). This can mess up a lot of stuff. Remember that PX jobs are really C++ under the covers, and char(0) can trigger an end of string, so len() and other string functions may give funny results. You need to pick the right charset, and you will probably need to use convert to strip out char(0).
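To make the end-of-string problem concrete, here is a minimal sketch in Python (not DataStage code) that uses the standard `ctypes` module to read the same bytes as a C-style string and as a raw buffer. The sample field contents and the helper names `c_string_view` and `strip_low_values` are made up for illustration; how a given PX function actually behaves is not shown here.

```python
import ctypes


def c_string_view(field: bytes) -> bytes:
    """Return the bytes a NUL-terminated C string read would see."""
    buf = (ctypes.c_char * len(field)).from_buffer_copy(field)
    # .value stops at the first x"00" byte, like strlen() would in C/C++.
    return buf.value


def strip_low_values(field: bytes) -> bytes:
    """Remove every x"00" byte (this shortens the field)."""
    return field.replace(b"\x00", b"")


# Hypothetical 6-byte field: "AB", one low-value byte, then "CDE".
field = b"AB\x00CDE"

print(len(field), field)                  # all six bytes are present
print(len(c_string_view(field)))          # the C-style view ends at the x"00"
print(strip_low_values(field))            # low value removed
```

Note that stripping changes the length of the value, which matters for fixed-length records.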
Mamu Kim
Assuming that your source system is in Cobol -- that being where the term "low values" is normally used -- the definition of each field is your correct starting point. Low values means that every byte of the field is filled with x"00". The difficulties start because x"00" is the ASCII NUL character.
The safest way to handle this is to import your table definitions from the Cobol copybooks. They should be text files that show the field definitions using Cobol syntax. You also need to consider how the Cobol programs were developed and how well they hold to coding standards, one of which is very simple: Character fields -- PIC X -- must be initialized or filled with spaces, not low values.
If the Cobol standards are not being met, you must inspect your character fields and replace low values with spaces. Again, this is the best practice: variable length fields are the exception rather than the rule, so you need to maintain field and record lengths or you will have problems at every step afterwards.
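The inspect-and-replace step can be sketched outside DataStage as follows. This is only an illustration in Python: it assumes the field arrives as raw EBCDIC bytes in code page cp037, and the function names and sample bytes are hypothetical. The point is that replacing x"00" with a space, rather than deleting it, keeps the field length unchanged.

```python
LOW_VALUE = 0x00  # x"00", the same byte in EBCDIC and ASCII


def low_value_positions(raw: bytes) -> list:
    """Return the offsets within the field that hold x"00"."""
    return [i for i, b in enumerate(raw) if b == LOW_VALUE]


def clean_char_field(raw: bytes, encoding: str = "cp037") -> str:
    """Decode a PIC X field and replace low values with spaces.

    The encoding is an assumption; use the code page of your source system.
    """
    text = raw.decode(encoding)
    # One-for-one replacement, so the field length is preserved.
    return text.replace("\x00", " ")


# Hypothetical PIC X(5) field: "AB" in EBCDIC, two low values, one EBCDIC space.
sample = bytes([0xC1, 0xC2, 0x00, 0x00, 0x40])

positions = low_value_positions(sample)
cleaned = clean_char_field(sample)

print(positions)
print(repr(cleaned), len(cleaned) == len(sample))
```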
The FAQ Using Mainframe Source Data at viewtopic.php?t=143596 provides a few more details to consider.
kduke: The charset itself is not critical. What matters is understanding how the mainframe charset translates to the one required on the DataStage side. Changing charsets should be a last resort, in my opinion.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson
Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
-
- Participant
- Posts: 40
- Joined: Mon May 11, 2009 12:19 am
- Location: Madurai