XML Read with Different layout

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

dsdoubt
Participant
Posts: 106
Joined: Sat Jul 15, 2006 12:17 am

XML Read with Different layout

Post by dsdoubt »

I have an XML file whose records have different layouts. The layout can be identified by the first field. A sample follows.

Code:

<HEAD>
<TRANS cd="2520" no="235" sls_amt="8657.75" trans_cnt="1224"/>
<TRANS cd="2521" no="235" dt_of_tx="20080213" sls_typ="MCP"/>
</HEAD>
The metadata importer recognizes only the first type of record and not the later one. For example, dt_of_tx and sls_typ are not detected during import.

What is the simplest way to handle this kind of scenario?
I know that preprocessing the file would be one option. What are the other options?
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

So, you want your XML output to look like the following?
<HEAD>
<TRANS cd="2520" no="235" sls_amt="8657.75" trans_cnt="1224"/>
</HEAD>
<HEAD>
<TRANS cd="2521" no="235" dt_of_tx="20080213" sls_typ="MCP"/>
</HEAD>
If that's the case, then just use a custom stylesheet.
dsdoubt
Participant
Posts: 106
Joined: Sat Jul 15, 2006 12:17 am

Post by dsdoubt »

Hi,
Do you mean that if I change the XML as shown, I can import the different layouts at the same time?
Also, what is a custom stylesheet?
dsdoubt
Participant
Posts: 106
Joined: Sat Jul 15, 2006 12:17 am

Post by dsdoubt »

Even if I edit the XML file into the following form,

Code:

<HEAD>
<TRANS cd="2520" no="235" sls_amt="8657.75" trans_cnt="1224"/>
</HEAD>
<HEAD>
<TRANS cd="2521" no="235" dt_of_tx="20080213" sls_typ="MCP"/>
</HEAD> 
I am not able to import the two fields dt_of_tx and sls_typ.
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

No, I was asking whether that is what you want your XML output to look like.
If that's the case, then using a custom stylesheet (XSLT), you can handle those two different layouts based on the value in "cd".
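To illustrate the idea of routing each TRANS element by its "cd" value, here is a minimal sketch in Python rather than XSLT. It is not the stylesheet itself and does not run inside DataStage; the function name split_by_cd is hypothetical, and the sample data is the one posted above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Sample document from the original post.
SAMPLE = """<HEAD>
<TRANS cd="2520" no="235" sls_amt="8657.75" trans_cnt="1224"/>
<TRANS cd="2521" no="235" dt_of_tx="20080213" sls_typ="MCP"/>
</HEAD>"""


def split_by_cd(xml_text):
    """Group the attribute sets of all TRANS elements by their cd value."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for trans in root.iter("TRANS"):
        # Each TRANS becomes a plain dict of attribute name -> value.
        groups[trans.get("cd")].append(dict(trans.attrib))
    return dict(groups)


layouts = split_by_cd(SAMPLE)
for cd, rows in sorted(layouts.items()):
    print(cd, rows)
```

Each group then holds only one layout, which is the same separation a stylesheet keyed on "cd" would produce.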
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I get the impression they are trying to read it and failing. Can you do it in two jobs / passes, one for each TRANS type?
-craig

"You can never have too many knives" -- Logan Nine Fingers
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Is this a metadata problem or a run-time problem? Just enter those two attributes yourself, manually, or get an XSD instead of an instance document to import the metadata... or just edit the document and copy those two attributes into the first TRANS element. Because this is not an XSD, the importer does the best it can, reading through the first instance of the element. Whatever attributes or sub-elements are there will show up for import. Any others can be entered by hand, using the same pattern. Of course, the real XSD will have them all defined. These aren't different records, per se; they are simply additional, optional attributes that are allowed in the TRANS element.
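If it helps to know which attributes need to be entered by hand, a small script outside DataStage can list every attribute that appears on any TRANS element, not just the first one. This is only a sketch in Python under that assumption; the function name all_trans_attributes is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample document from the original post.
SAMPLE = """<HEAD>
<TRANS cd="2520" no="235" sls_amt="8657.75" trans_cnt="1224"/>
<TRANS cd="2521" no="235" dt_of_tx="20080213" sls_typ="MCP"/>
</HEAD>"""


def all_trans_attributes(xml_text):
    """Return every attribute name seen on any TRANS element, without repeats."""
    root = ET.fromstring(xml_text)
    names = []
    for trans in root.iter("TRANS"):
        for name in trans.attrib:
            if name not in names:
                names.append(name)
    return names


columns = all_trans_attributes(SAMPLE)
print(columns)
```

The resulting list is the full set of columns to define manually in the table definition.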

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

As our XML specialist Ernie suggested, you can include the two fields dt_of_tx and sls_typ in the same TRANS. Then, when you read the file, you'll get zeros or empty strings for those fields where data is not available. Later you can consolidate the rows with a Remove Duplicates or Aggregator stage based on the key.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
dsdoubt
Participant
Posts: 106
Joined: Sat Jul 15, 2006 12:17 am

Post by dsdoubt »

Thanks for all your input.
I am able to read the data using the suggested method, i.e., creating the metadata manually with all the necessary columns and then reading the XML file. But now I have one more question.

Code:

<HEAD>
<TRANS cd="2520" no="235" sls_amt="8657.75" trans_cnt="1224" sls_typ="MCP"/>
</HEAD>
<HEAD>
<TRANS cd="2520" no="235" dt_of_tx="20080213" />
</HEAD>
From the above data, the output should be:
CD, No, Sls_Amt, Trns_cnt, Sls_Typ, Dt_Of_Tx
i.e., the first two fields, CD and NO, are the keys, so that all the remaining fields can be added as extra columns.
But how can I mark two fields as keys?
If I do so, I get the following warning.

Code:

Error: The link contains more than one repetition rule
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

What I hope you are getting is two rows from this XML snippet. Each should have the same value for cd and no, and then have four additional columns... which, in this example, will be mutually exclusive on the two rows, with the nonexistent columns carrying blanks. Pass it through the Aggregator or probably Remove Duplicates (Remdup).

The "key" is not a key. It's merely an indicator for which element is your lowest-level "repeating row". You can only have one, and in this case it may even be necessary for you to also carry a dummy "TRANS" column and make it the key, because everything else is an attribute.
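As a rough picture of what that consolidation step does, here is a sketch in Python (not a DataStage stage) that merges the two rows on cd and no, keeping the non-blank value for each column. The rows are written out by hand as the XML read is expected to produce them; the function name merge_by_key is hypothetical.

```python
# Columns in the order requested for the output.
COLUMNS = ["cd", "no", "sls_amt", "trans_cnt", "sls_typ", "dt_of_tx"]

# The two rows expected from the XML read, with blanks for absent attributes.
rows = [
    {"cd": "2520", "no": "235", "sls_amt": "8657.75", "trans_cnt": "1224",
     "sls_typ": "MCP", "dt_of_tx": ""},
    {"cd": "2520", "no": "235", "sls_amt": "", "trans_cnt": "",
     "sls_typ": "", "dt_of_tx": "20080213"},
]


def merge_by_key(rows, key=("cd", "no")):
    """Collapse rows sharing the same key, keeping non-blank column values."""
    merged = {}
    for row in rows:
        k = tuple(row[c] for c in key)
        # Start each key with an all-blank row, then fill in what is present.
        target = merged.setdefault(k, {c: "" for c in COLUMNS})
        for col, value in row.items():
            if value != "":
                target[col] = value
    return list(merged.values())


result = merge_by_key(rows)
print(result)
```

With the sample data this yields a single row per cd/no pair carrying all six columns.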

Ernie