Split XML file based on an input field

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
naveend
Participant
Posts: 11
Joined: Mon May 12, 2008 12:49 pm

Split XML file based on an input field

Post by naveend »

I am getting multiple XML responses for different data elements like (Business Units, Country codes, currency codes) from MDM-WCC in the same MQ. I need to get the message and based on <response type> field in the XML, move the message content to the element specific XML file. I thought I could use an XML input stage and xfmr stage, but I do not know what columns to load into the XML stage. Can anyone help?
eostic
Premium Member
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

naveend...

It's possible you can easily do what you want to here..... is the <reponse type> element the root of your xml document? ...or at least close to it? ...like a sequential file, it sounds as though this "record type" may be dictating the format for the rest of the structure....if that's the case, we can probably assume that it's near the top of the structure and that getting to it is the same for each of the document types...... if so we can review it and branch accordingly to othre xml stages...

...however, depending on what the xml looks like, you may not have to do anything special....is each "response type" in its own sublevel node of the same structure?

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
naveend
Participant
Posts: 11
Joined: Mon May 12, 2008 12:49 pm

Post by naveend »

Ernie,

The response type is at the top of the structure, but not the root. And yes, it does tell what the rest of the structure will be. Here is how it looks:
<?xml version="1.0"?>
<!DOCTYPE TCRMService SYSTEM "tCRMResponse.dtd">
<TCRMService>
<ResponseControl>
<ResultCode>SUCCESS</ResultCode>
<ServiceTime>50</ServiceTime>
<DWLControl>
<requesterLanguage>100</requesterLanguage>
<requesterName>Susan 000107564</requesterName>
</DWLControl>
</ResponseControl>
<TxResponse>
<RequestType>getAllCodeTypes</RequestType>

The rest of the XML has the deatils for this request type. This is how I am looking at building the job:

MQ ==> XML INPUT ==> XFMR ==> MULTIPLE XMLs

I can load part of the XML definition to the first XML input stage and pull the request type out and if it is "Country Code", then I need to load the whole structure to one of the output files.

How do I do this?
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

Where is the closing tag </TxResponse>?
naveend
Participant
Posts: 11
Joined: Mon May 12, 2008 12:49 pm

Post by naveend »

The closing tab is at the end of the XML details. The last two tabs are </TxResponse>
</TCRMService>. I hope this is what you are asking about.
eostic
Premium Member
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Paste a bit deeper view into the document, and also paste another response type, because there may be some alternatives....but one easy way to do this is to have two columns on an output liink...

ReponseType /TCRMService/TxReponse/RequestType/text()
theRest /TCRMService/TxResponse/


first char, other very long longvarchar. branch accordingly in a followup stage. make theRest the key.

the other XMLInputs will have xpath on their output links with with fewer high level elements (send the outputs to disk to review them first and you'll see what I mean).

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
naveend
Participant
Posts: 11
Joined: Mon May 12, 2008 12:49 pm

Post by naveend »

Ernie,

Here is the XML:
<?xml version="1.0"?>
<!DOCTYPE TCRMService SYSTEM "tCRMResponse.dtd">
<TCRMService>
<ResponseControl>
<ResultCode>SUCCESS</ResultCode>
<ServiceTime>50</ServiceTime>
<DWLControl>
<requesterLanguage>100</requesterLanguage>
<requesterName>Susan 000107564</requesterName>
</DWLControl>
</ResponseControl>
<TxResponse>
<RequestType>getAllCodeTypes</RequestType>
<TxResult>
<ResultCode>SUCCESS</ResultCode>
</TxResult>
<ResponseObject>
<EObjCdBillTp>
<description></description>
<lang_tp_cd>100</lang_tp_cd>
<lang_type>English</lang_type>
<last_update_dt>2002-05-15 14:58:23.0</last_update_dt>
<name>No Billing</name>
<tp_cd>1</tp_cd>
</EObjCdBillTp>
<EObjCdBillTp>
<description></description>
<lang_tp_cd>100</lang_tp_cd>
<lang_type>English</lang_type>
<last_update_dt>2001-07-23 21:41:04.0</last_update_dt>
<name>Regular Billing</name>
<tp_cd>2</tp_cd>
</EObjCdBillTp>
<EObjCdBillTp>
<description></description>
<lang_tp_cd>100</lang_tp_cd>
<lang_type>English</lang_type>
<last_update_dt>2001-07-23 21:41:04.0</last_update_dt>
<name>Paid in Advance</name>
<tp_cd>3</tp_cd>
</EObjCdBillTp>
<EObjCdBillTp>
<description></description>
<lang_tp_cd>100</lang_tp_cd>
<lang_type>English</lang_type>
<last_update_dt>2001-07-23 21:41:04.0</last_update_dt>
<name>List Billing</name>
<tp_cd>4</tp_cd>
</EObjCdBillTp>
<EObjCdBillTp>
<description></description>
<lang_tp_cd>100</lang_tp_cd>
<lang_type>English</lang_type>
<last_update_dt>2001-07-23 21:41:04.0</last_update_dt>
<name>Payroll Deduction</name>
<tp_cd>5</tp_cd>
</EObjCdBillTp>
<EObjCdBillTp>
<description></description>
<lang_tp_cd>100</lang_tp_cd>
<lang_type>English</lang_type>
<last_update_dt>2001-07-23 21:41:04.0</last_update_dt>
<name>Electronic Funds Transfer</name>
<tp_cd>6</tp_cd>
</EObjCdBillTp>
<EObjCdBillTp>
<description></description>
<lang_tp_cd>100</lang_tp_cd>
<lang_type>English</lang_type>
<last_update_dt>2001-07-23 21:41:04.0</last_update_dt>
<name>Government Allotment</name>
<tp_cd>7</tp_cd>
</EObjCdBillTp>
<EObjCdBillTp>
<description></description>
<lang_tp_cd>100</lang_tp_cd>
<lang_type>English</lang_type>
<last_update_dt>2001-07-23 21:41:04.0</last_update_dt>
<name>Credit Card Billing</name>
<tp_cd>8</tp_cd>
</EObjCdBillTp>
<EObjCdBillTp>
<description></description>
<lang_tp_cd>100</lang_tp_cd>
<lang_type>English</lang_type>
<last_update_dt>2001-07-23 21:41:04.0</last_update_dt>
<name>Irregular Billing</name>
<tp_cd>9</tp_cd>
</EObjCdBillTp>
</ResponseObject>
</TxResponse>
</TCRMService>

In the output, I have the columns Resultcode, Service Time, requesterLanguage, requesterName, Request Type with their Xpaths. I defined another column RestofTheXML as LongVarchar=999999999. I tried the job like this, and I ended up with no tags for the RestofTheXML. Where did I go wrong??

Here is another response:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE TCRMService SYSTEM "tCRMResponse.dtd">
<TCRMService>
<ResponseControl>
<ResultCode>SUCCESS</ResultCode>
<ServiceTime>123</ServiceTime>
<DWLControl>
<requesterLanguage>100</requesterLanguage>
<requesterLocale>en</requesterLocale>
<requesterName>Mach 1 Pilot 1</requesterName>
<requestID>1000</requestID>
</DWLControl>
</ResponseControl>
<TxResponse>
<RequestType>getAllPartyAdminSysKeys</RequestType>
<TxResult>
<ResultCode>SUCCESS</ResultCode>
</TxResult>
<ResponseObject>
<TCRMAdminContEquivBObj>
<AdminContEquivIdPK>1161200637734949</AdminContEquivIdPK>
<AdminPartyId>01</AdminPartyId>
<AdminSystemType>1000000</AdminSystemType>
<AdminSystemValue>CDID</AdminSystemValue>
<ContEquivLastUpdateDate>2008-05-16 01:11:14.904</ContEquivLastUpdateDate>
<ContEquivLastUpdateTxId>263121091827476702</ContEquivLastUpdateTxId>
<ContEquivLastUpdateUser>settidr</ContEquivLastUpdateUser>
<PartyId>9151200637504807</PartyId>
</TCRMAdminContEquivBObj>
<TCRMAdminContEquivBObj>
<AdminContEquivIdPK>8771204905441573</AdminContEquivIdPK>
<AdminPartyId>1000</AdminPartyId>
<AdminSystemType>1000002</AdminSystemType>
<AdminSystemValue>Mach 1 Company Code</AdminSystemValue>
<ContEquivLastUpdateDate>2008-03-07 10:13:25.024</ContEquivLastUpdateDate>
<ContEquivLastUpdateTxId>1051204906404997</ContEquivLastUpdateTxId>
<ContEquivLastUpdateUser>settidr</ContEquivLastUpdateUser>
<PartyId>9151200637504807</PartyId>
</TCRMAdminContEquivBObj>
<TCRMAdminContEquivBObj>
<AdminContEquivIdPK>8491204906383748</AdminContEquivIdPK>
<AdminPartyId>01</AdminPartyId>
<AdminSystemType>1000004</AdminSystemType>
<AdminSystemValue>Facility Code</AdminSystemValue>
<ContEquivLastUpdateDate>2008-05-16 01:09:37.635</ContEquivLastUpdateDate>
<ContEquivLastUpdateTxId>374121091817682102</ContEquivLastUpdateTxId>
<ContEquivLastUpdateUser>settidr</ContEquivLastUpdateUser>
<PartyId>9151200637504807</PartyId>
</TCRMAdminContEquivBObj>
<TCRMAdminContEquivBObj>
<AdminContEquivIdPK>4171204910722301</AdminContEquivIdPK>
<AdminPartyId>1001</AdminPartyId>
<AdminSystemType>1000003</AdminSystemType>
<AdminSystemValue>SPM Company Code</AdminSystemValue>
<ContEquivLastUpdateDate>2008-04-09 14:55:35.257</ContEquivLastUpdateDate>
<ContEquivLastUpdateTxId>3121207770935217</ContEquivLastUpdateTxId>
<ContEquivLastUpdateUser>settidr</ContEquivLastUpdateUser>
<PartyId>9151200637504807</PartyId>
</TCRMAdminContEquivBObj>
<TCRMAdminContEquivBObj>
<AdminContEquivIdPK>661204910932511</AdminContEquivIdPK>
<AdminPartyId>12852</AdminPartyId>
<AdminSystemType>1000005</AdminSystemType>
<AdminSystemValue>Secretariat</AdminSystemValue>
<ContEquivLastUpdateDate>2008-03-07 11:28:52.511</ContEquivLastUpdateDate>
<ContEquivLastUpdateTxId>7621204910932501</ContEquivLastUpdateTxId>
<ContEquivLastUpdateUser>settidr</ContEquivLastUpdateUser>
<PartyId>9151200637504807</PartyId>
</TCRMAdminContEquivBObj>
</ResponseObject>
</TxResponse>
</TCRMService>
naveend
Participant
Posts: 11
Joined: Mon May 12, 2008 12:49 pm

Post by naveend »

:( I am still looking for help on this topic.
eostic
Premium Member
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

first, work on getting the RequestType out into a single column:


RequestType /TCRMService/TxReponse/RequestType/text()


Assuming you can retrieve that and put each request type into a flat file, try adding more...such as the whole response:


theRest /TCRMServcie/TxResponse/

Does either work? For the second one, it's possible that you'll run into issues with a 8.x problem (there's a patch you can get from your provider) that removes all the tags from the resulting xml chunk. But even so, you should be able to see how the technique will operate, even if the problem exists.

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
eostic
Premium Member
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

actually...re-reading your latest entry, where you say "no tags"...you probably are doing everything right and just ran into this issue of tags being removed. You've got it right then --- get the patch. Sorry that I don't know the number of it.

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
naveend
Participant
Posts: 11
Joined: Mon May 12, 2008 12:49 pm

Post by naveend »

Thanks Ernie. I ran the job in version 7.5 and it works just fine. I wish someone knew the patch number.
eostic
Premium Member
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

I think it's 117664, but review it just the same with someone in support...there were a couple of XML fixes.

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
Post Reply