Extracting the Data from XML File

Posted: Tue Mar 24, 2009 11:56 pm
by balu536
Hi all,
I'm having a problem extracting data from an XML file. I've used an External Source stage and an XML Input stage to extract the data. I can get the data out of the External Source stage, but no data comes out of the XML Input stage. I have loaded the metadata too. Please explain the procedure for extracting data from this XML file, which has the structure shown below.

Part of the XML file structure (the rest of the file has the same structure):

<?xml version="1.0" encoding="UTF-8"?>
<CRRDownload:allocatedCRRs xmlns:CRRDownload="http://crr.caiso.org/download/xml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://crr.caiso.org/download/xml http://ftapjbos10:8080/crr/download/xml ... esults.xsd">
<CRRDownload:crr>
<CRRDownload:NominationID>48733</CRRDownload:NominationID>
<CRRDownload:CRRID></CRRDownload:CRRID>
<CRRDownload:Category>PTP</CRRDownload:Category>
<CRRDownload:Portfolio>Allocation_S2_LT</CRRDownload:Portfolio>
<CRRDownload:AssetOwner>1014</CRRDownload:AssetOwner>
<CRRDownload:Source>MALIN_5_RNDMTN</CRRDownload:Source>
<CRRDownload:Sink>LAP_SCE</CRRDownload:Sink>
<CRRDownload:StartDate>2009-04-01</CRRDownload:StartDate>
<CRRDownload:EndDate>2009-06-30</CRRDownload:EndDate>
<CRRDownload:HedgeType>OBL</CRRDownload:HedgeType>
<CRRDownload:CRRType>LSE</CRRDownload:CRRType>
<CRRDownload:TimeOfUse>ON</CRRDownload:TimeOfUse>
<CRRDownload:NominatedMW>6.900</CRRDownload:NominatedMW>
<CRRDownload:AllocatedMW>0.000</CRRDownload:AllocatedMW>
</CRRDownload:crr>
<CRRDownload:crr>
<CRRDownload:NominationID>48734</CRRDownload:NominationID>
<CRRDownload:CRRID></CRRDownload:CRRID>
<CRRDownload:Category>PTP</CRRDownload:Category>
<CRRDownload:Portfolio>Allocation_S2_LT</CRRDownload:Portfolio>
<CRRDownload:AssetOwner>1014</CRRDownload:AssetOwner>
<CRRDownload:Source>SYLMAR_2_NOB</CRRDownload:Source>
<CRRDownload:Sink>LAP_SCE</CRRDownload:Sink>
<CRRDownload:StartDate>2009-04-01</CRRDownload:StartDate>
<CRRDownload:EndDate>2009-06-30</CRRDownload:EndDate>
<CRRDownload:HedgeType>OBL</CRRDownload:HedgeType>
<CRRDownload:CRRType>LSE</CRRDownload:CRRType>
<CRRDownload:TimeOfUse>ON</CRRDownload:TimeOfUse>
<CRRDownload:NominatedMW>174.800</CRRDownload:NominatedMW>
<CRRDownload:AllocatedMW>0.000</CRRDownload:AllocatedMW>
</CRRDownload:crr>
</CRRDownload:allocatedCRRs>


Regards,
Balakrishna

Re: Extracting the Data from XML File

Posted: Wed Mar 25, 2009 12:49 am
by Pagadrai
Hi Balakrishna,
Are you seeing any warnings or errors in the log?
There could be an issue with the XPath specified.
For example, for an XML of the form:
<a>
<b>123</b>
<c>456</c>
</a>
I would give the XPath as /a/b and /a/c.
Let me know if you have any questions.
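
The path idea above can be tried outside DataStage. The following is a minimal sketch in Python using the standard library's ElementTree; it is only an illustration of how /a/b and /a/c select values, not how the XML Input stage is implemented. Note that ElementTree paths are relative to the root element, so /a/b is written as ./b here.

```python
import xml.etree.ElementTree as ET

# The small example document from the post above.
doc = "<a><b>123</b><c>456</c></a>"
root = ET.fromstring(doc)  # root is the <a> element

# /a/b and /a/c, expressed relative to the root <a>.
b_values = [e.text for e in root.findall("./b")]
c_values = [e.text for e in root.findall("./c")]

print(b_values, c_values)  # ['123'] ['456']
```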

Re: Extracting the Data from XML File

Posted: Wed Mar 25, 2009 2:34 am
by balu536
I have 0 warnings in my director log

Re: Extracting the Data from XML File

Posted: Wed Mar 25, 2009 3:00 am
by Pagadrai
balu536 wrote:I have 0 warnings in my director log
Hi,
Can you give more details about the column derivation method you are using and which keys you have specified?

xml file extraction

Posted: Wed Mar 25, 2009 3:59 am
by bachi
Hi,
Use a Sequential File stage first, then the XML Input stage. In the Sequential File stage, define 2 columns (varchar 1000) and make one of them the key. Select the non-key field in the XML Input stage. From the XML Input stage, take 2 output datasets, one of which is the reject, and enable that option in the XML Input stage. The XML file will then be extracted.

Posted: Wed Mar 25, 2009 4:37 am
by eostic
Hi...

If you are not getting any errors, but also aren't getting any data, then the issue is most likely in your XPath. This is the stuff in the Description property for your link columns.

Follow Pagadrai's advice above... start small, and with just one column. We will assume you imported your metadata via the XML metadata importer, but still, choose just one column for initial testing. Something like:

NominationID (char data type and length) with a Description property of

/CRRDownload:allocatedCRRs/CRRDownload:crr/CRRDownload:NominationID/text()

Because you have namespace prefixes, you will also have to load the namespace details in the box at the bottom of the Transformation Settings tab. This is effectively xmlns:CRRDownload="http......." etc. [that's the longest namespace prefix I've ever seen!].

Make that column a key. Let us know if you get some rows.

Ernie
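
One way to sanity-check the XPath and the namespace declaration outside DataStage is a short Python script. This is a sketch under assumptions: it uses Python's standard ElementTree rather than the XML Input stage, the sample is cut down to the NominationID element, and the prefix-to-URI dictionary plays the role of the namespace box on the Transformation Settings tab. It shows that the prefixed path returns rows once the prefix is mapped, while unqualified names silently return nothing, which matches the "no errors, no data" symptom.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the posted file: two crr records, one child element each.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<CRRDownload:allocatedCRRs xmlns:CRRDownload="http://crr.caiso.org/download/xml">
<CRRDownload:crr>
<CRRDownload:NominationID>48733</CRRDownload:NominationID>
</CRRDownload:crr>
<CRRDownload:crr>
<CRRDownload:NominationID>48734</CRRDownload:NominationID>
</CRRDownload:crr>
</CRRDownload:allocatedCRRs>"""

# Prefix-to-URI map; the URI must match the xmlns declaration exactly.
ns = {"CRRDownload": "http://crr.caiso.org/download/xml"}

root = ET.fromstring(SAMPLE.encode("utf-8"))

# Path relative to the root element <CRRDownload:allocatedCRRs>.
ids = [e.text for e in
       root.findall("./CRRDownload:crr/CRRDownload:NominationID", ns)]

# Same path without namespace qualification: no error, but no matches.
unqualified = root.findall("./crr/NominationID")

print(ids)          # ['48733', '48734']
print(unqualified)  # []
```

If the script returns the IDs but the job still returns no rows, the XPath itself is probably fine and the stage settings are the next thing to check.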

Posted: Wed Mar 25, 2009 5:12 am
by balu536
Hi Ernie,
I'm following the same method that you mentioned, and with that I'm still unable to fetch any records. I tried with single and multiple fields, but there was no change in the outcome.

Regards,
Balakrishna

Posted: Wed Mar 25, 2009 5:20 am
by balu536
To summarize, here is the procedure I followed:

First I imported the table definitions (Import -> Table Definitions -> XML Table Definitions). I opened the respective file, then ran Auto check (on the Edit tab), selected NominationID as the key column, and saved the table definition. Next I loaded the namespace declarations under the Transformation Settings tab (on both the Stage and Output tabs) in the XML Input stage. Finally I loaded the columns on the Output tab directly, using the Load option.


Regards,
Balakrishna

Posted: Wed Mar 25, 2009 7:14 am
by verify
When you read the XML file using the External Source stage, make sure it is read as one record (trim the newlines if any are present in your file).

Use this command in your External Source stage:

cat file_name | tr -d '\n'

Also, in your XML input file the same element repeats twice, so select the key properly and enable the repetition element required property in your XML Input stage.

Hope this helps ..
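
For anyone who wants to see what the newline-stripping step does to the document, here is a small Python sketch. It is an illustration only: the function name to_single_record is made up for this example, and dropping carriage returns as well as line feeds is an added assumption for files with Windows line endings. The point is that the collapsed text is a single line and still parses to the same records.

```python
import xml.etree.ElementTree as ET

def to_single_record(xml_text: str) -> str:
    """Remove line breaks so the whole document is one record.

    Hypothetical helper for illustration; it drops CR and LF characters,
    similar in spirit to piping the file through tr.
    """
    return xml_text.replace("\r", "").replace("\n", "")

multi_line = (
    '<CRRDownload:allocatedCRRs '
    'xmlns:CRRDownload="http://crr.caiso.org/download/xml">\n'
    "<CRRDownload:crr>\n"
    "<CRRDownload:NominationID>48733</CRRDownload:NominationID>\n"
    "</CRRDownload:crr>\n"
    "</CRRDownload:allocatedCRRs>\n"
)

record = to_single_record(multi_line)

# The collapsed text is still well-formed XML with the same content.
ns = {"CRRDownload": "http://crr.caiso.org/download/xml"}
root = ET.fromstring(record)
ids = [e.text for e in
       root.findall("./CRRDownload:crr/CRRDownload:NominationID", ns)]

print(len(record.splitlines()), ids)  # 1 ['48733']
```

One caveat: this also removes line breaks that occur inside element values, so it is only safe when the values themselves do not contain meaningful newlines.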

Posted: Wed Mar 25, 2009 7:39 am
by balu536
Hi Raju,
The output of the External Source stage contains only one record, and I'm using NominationID as the key field, which has unique values across the entire file.


Regards,
Balakrishna

Posted: Wed Mar 25, 2009 8:09 am
by chulett
Hmmm... the External Source stage should be passing the file/pathname to the XML Input stage and the XML stage should be the one "reading" the file. What is yours doing, exactly? :?

Posted: Wed Mar 25, 2009 2:29 pm
by eostic
Exactly...you should be sending filenames to XMLInput.....be sure you check the right "XML Content" radio button.
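
The difference between handing a parser a file path and handing it the document content can be shown with a loose analogy in Python. This is not the XML Input stage's behavior, only a sketch of why the two modes must not be mixed up: ET.parse expects a path, ET.fromstring expects the XML text, and giving a path to the content-based call fails because the path string is not XML.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

content = "<a><b>123</b></a>"

# Write the document to a temporary file so there is a real path to pass.
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as fh:
    fh.write(content)
    path = fh.name

try:
    from_path = ET.parse(path).getroot()   # "file path" mode
    from_content = ET.fromstring(content)  # "document content" mode

    # Mismatch: a path handed to the content-based parser is not valid XML.
    try:
        ET.fromstring(path)
        mismatch_failed = False
    except ET.ParseError:
        mismatch_failed = True
finally:
    os.remove(path)

print(from_path.find("b").text, from_content.find("b").text, mismatch_failed)
# 123 123 True
```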

Posted: Thu Mar 26, 2009 12:40 am
by bachi
Hi,
Take 2 columns (datarow, keyrow), load the XML metadata in the XML Input stage, enable the repetition option there as your business requirement dictates, then run the job.

Posted: Thu Mar 26, 2009 2:00 am
by balu536
The radio button checked is URL/File path. It gets checked automatically when we use the External Source stage. If we use the Sequential File stage, then XML Document is checked by default. In my case, as I've used the External Source stage, the URL/File path radio button is checked.


Regards,
Balakrishna