Handling hierarchical file in DataStage 6.0
Posted: Tue Feb 10, 2004 5:38 pm
We are using DataStage 6.0 and are having a problem with a particular type
of source file. These are hierarchical files on Unix. They were originally
XML files, but the source system that sends them converts them into
pipe-delimited Unix text files. I am attaching a sample file with a few
records.
Record description: Each record starts with TroubleHeaderInfo and has the
following subtitles: 1. TroubleActivityInfo, 2. LRDetailsInfo,
3. LRCustomerInfo, 4. LRCentralOfficeInfo, 5. LRFacilityInfo,
6. LRServiceInfo. Each of these subtitles can repeat any number of times;
as you can see in the sample file, TroubleActivityInfo repeats many times
in each record. Moreover, each subtitle has a different number of columns.
For example, TroubleActivityInfo has 81 columns, LRDetailsInfo has 18
columns, LRCustomerInfo has 14 columns, and so on. Every subtitle line
starts with two asterisks and ends with a carriage return in Unix.
I would like to know whether DataStage can handle such files, or whether we
have to preprocess the source files first. Can you also tell me how we
would handle it if it were an XML file, and which approach is best? Please
let me know the possibilities as soon as possible, because we have to work
on the design of the jobs.
For example:
>**HeadedRecord|2001|7329063231|POMS|20|NE
>**TroubleInfo|2001|7329063231|..............................................80 columns
>**TroubleInfo|2001|7329063231|..............................................80 columns
>**TroubleInfo|2001|7329063231|..............................................80 columns
>**TroubleInfo|2001|7329063231|..............................................80 columns
>**LR-CustInfo|2001|7329063231|..............................................20 columns
>**LR-ServiceInfo|2001|7329063231|.......................................10 columns
>**HeadedRecord|2002|9081063231|POMS|50|NY
>**TroubleInfo|2002|9081063231|..............................................80 columns
>**TroubleInfo|2002|9081063231|..............................................80 columns
>**LR-CustInfo|2001|9081063231|..............................................20 columns
>**LR-ServiceInfo|2001|9081063231|.......................................10 columns
>
>This file will load many tables. Is it possible to handle this kind
>of file with the CFF stage?
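
If preprocessing outside DataStage turns out to be necessary, one possible approach is to split the file into one flat file per record type before the jobs read it. The following is only a minimal sketch in Python, not a DataStage feature. It assumes every line starts with "**" followed by the record type and pipe-delimited values, as in the sample above, and that each line already carries its own key columns. The function names split_by_record_type and write_groups and the ".txt" output naming are hypothetical.

```python
import os
from collections import defaultdict


def split_by_record_type(lines, delimiter="|", marker="**"):
    """Group pipe-delimited rows by the record type tag that follows the marker.

    Assumption: each data line looks like '**<RecordType>|v1|v2|...'.
    Lines that do not start with the marker are skipped.
    """
    groups = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith(marker):
            continue  # blank or unexpected line
        fields = line[len(marker):].split(delimiter)
        record_type, values = fields[0], fields[1:]
        groups[record_type].append(values)
    return dict(groups)


def write_groups(groups, out_dir, delimiter="|"):
    """Write one pipe-delimited file per record type (hypothetical naming)."""
    os.makedirs(out_dir, exist_ok=True)
    for record_type, rows in groups.items():
        path = os.path.join(out_dir, record_type + ".txt")
        with open(path, "w", newline="\n") as handle:
            for values in rows:
                handle.write(delimiter.join(values) + "\n")
```

Each output file then has a fixed column count, so it could be read as an ordinary sequential file and joined back to the header rows on the key columns. This only works if every child line repeats the parent keys, as the sample suggests; otherwise the script would have to carry the most recent header key forward onto each child row.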