Reading an XML file on Windows through the External Source stage

lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

I have a job (with a * wildcard), External Source > XML Input stage > Peek stage, that works just fine.
So in your case, you should put ls /Datastage/Testinputfiles/XML_in_Books.xml in the Source Program of the External Source stage.
Coding it with C:\Datastage\Testinputfiles\XML_in_Books.xml is never going to work. Then enter xmldoc as the Column Name under the Columns tab and send it down to the XML Input stage.
Then click View Data and tell me what you get before I go further with the XML Input stage.
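
For anyone following along: as far as this thread shows, the Source Program only has to write file paths to standard output, one per line, and each line then travels down the link in the xmldoc column. Below is a minimal Python sketch of the kind of output an ls with a wildcard produces. The function name list_xml_paths and the temporary demo directory are made up for illustration; nothing here is a DataStage API, it only mimics what the stage would read.

```python
import glob
import os
import tempfile


def list_xml_paths(pattern):
    """Return the paths matching a wildcard pattern, sorted, one entry per file."""
    return sorted(glob.glob(pattern))


# Hypothetical demo directory standing in for /Datastage/Testinputfiles
demo_dir = tempfile.mkdtemp()
for name in ("XML_in_Books.xml", "XML_in_Authors.xml", "notes.txt"):
    open(os.path.join(demo_dir, name), "w").close()

# One path per line, which is what the downstream stage would receive as xmldoc
for path in list_xml_paths(os.path.join(demo_dir, "*.xml")):
    print(path)
```

Only the two .xml files are listed; notes.txt is filtered out by the pattern, just as it would be with ls *.xml.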
Maximus_Jack
Premium Member
Posts: 139
Joined: Fri Apr 11, 2008 1:02 pm

Post by Maximus_Jack »

Hi,
I tried it as
External Source (without a wildcard) > XML Input > Peek stage, and I have enabled tracing in the job.

External source config:
===============
Source Program: ls /Datastage/Testinputfiles/XML_in_Books.xml

Output in the tracing for external source
==========================
XML_Input_26.DSLink24_Peek,0: filename:/Datastage/Testinputfiles/XML_in_Books.xml

and in the next step I'm getting the error below

Datastage error
==========
APT_CombinedOperatorController,0: Fatal Error: Fatal: XML input document parsing failed. Reason: Xalan fatal error (publicId: , systemId: , line: 0, column: 0): An exception occurred! Type:RuntimeException, Message:The primary document entity could not be opened. Id=C:\IBM\InformationServer\Server\Projects\Training//Datastage/Testinputfiles/XML_in_Books.xml


Am I doing something wrong here?

Thanks for your response.


cheers
MJ
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

As noted above, it at least "looks" like External Source is now working great!

Now look at your XML document.

Does it have a schemaLocation attribute? This points to a schema. Unless you say you want validation, which you should NOT have checked at this time, it shouldn't be looking for it, but you never know. Edit your XML document, DELETE this entire attribute, and try that.

Make sure you truly have read access to that subdirectory. I suspect that you do, but it's something to check.

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
Maximus_Jack
Premium Member
Posts: 139
Joined: Fri Apr 11, 2008 1:02 pm

Post by Maximus_Jack »

Hi,
This is the XML file I'm using in my job:

http://msdn.microsoft.com/en-us/library ... s.85).aspx

I tried to find out what the schemaLocation attribute is, and it seems there is nothing like that in the XML file; correct me if I'm wrong.

FYI, I have disabled "validate input XML", and I have tried both enabling and disabling "include namespace declaration" as well, but there is no change in the result.

Can you please let me know how to tell whether the schemaLocation attribute is present in the XML file? Thanks.

cheers
MJ
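
One way to answer that question outside DataStage is to parse the file and look for the two standard schema hints, xsi:schemaLocation and xsi:noNamespaceSchemaLocation. The sketch below uses Python's standard xml.etree.ElementTree; the function name schema_location_attrs and the books.xsd value in the second sample are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace that the xsi: prefix is conventionally bound to
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def schema_location_attrs(xml_text):
    """Return {attribute: value} for any xsi schema-location attributes found."""
    wanted = {
        "{%s}schemaLocation" % XSI,
        "{%s}noNamespaceSchemaLocation" % XSI,
    }
    found = {}
    # Usually these sit on the root element, but scan every element to be safe
    for elem in ET.fromstring(xml_text).iter():
        for key, value in elem.attrib.items():
            if key in wanted:
                found[key] = value
    return found


plain = '<catalog><book id="bk101"/></catalog>'
hinted = (
    '<catalog xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:noNamespaceSchemaLocation="books.xsd"><book id="bk101"/></catalog>'
)

print(schema_location_attrs(plain))   # empty dict: nothing to delete
print(schema_location_attrs(hinted))  # one entry pointing at books.xsd
```

An empty result means there is no schema hint in the document, so there is nothing for the parser to go looking for.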
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Another thought came to mind also...

Check the encoding value at the top. See if it says something like encoding="UTF8" or similar, then try a run with a sample version of your XML document where you remove that attribute from the header entirely.

Ernie
Ernie Ostic
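
A quick way to check the encoding value without opening an editor is to read the first bytes of the file and look at the XML declaration. This is only a sketch: the function name declared_encoding is made up, and the regular expression is deliberately lenient rather than a full XML parser.

```python
import re

# Matches an XML declaration at the start of the data and captures its encoding value
_DECL = re.compile(rb'^\s*<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']')


def declared_encoding(xml_bytes):
    """Return the encoding named in the XML declaration, or None if there is none."""
    head = xml_bytes[:200]
    # Skip a UTF-8 byte order mark if one is present
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]
    match = _DECL.match(head)
    return match.group(1).decode("ascii") if match else None


print(declared_encoding(b'<?xml version="1.0" encoding="UTF-8"?><catalog/>'))
print(declared_encoding(b'<?xml version="1.0"?><catalog/>'))
print(declared_encoding(b"<catalog/>"))
```

If the function returns None, the header carries no encoding attribute and there is nothing to remove for this test.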

Maximus_Jack
Premium Member
Posts: 139
Joined: Fri Apr 11, 2008 1:02 pm

Post by Maximus_Jack »

Hi Ernie, what I have pasted below is the only thing in the XML document.

FYI, when I look at the Director log, the variable PWD is assigned as below:
PWD=C:\IBM\InformationServer\Server\Projects\Training

I couldn't find this in the environment variables section. Maybe DataStage is picking up this path and searching in this directory. Is there any way we can change it to "C:"?

contents in XML doc
================

<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at creating applications
with XML.</description>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-12-16</publish_date>
<description>A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.</description>
</book>
</catalog>
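
As a sanity check that the document itself is well-formed, it can be parsed outside DataStage. The sketch below uses Python's standard xml.etree.ElementTree on a shortened copy of the two books above (genre, price, date and description are left out for brevity); the function name book_titles is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the catalog posted above
SAMPLE = """<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide</title>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
<title>Midnight Rain</title>
</book>
</catalog>"""


def book_titles(xml_text):
    """Return (id, title) pairs for every book element in the document."""
    root = ET.fromstring(xml_text)
    return [(book.get("id"), book.findtext("title")) for book in root.iter("book")]


for book_id, title in book_titles(SAMPLE):
    print(book_id, title)
```

If this parses cleanly, the "primary document entity could not be opened" error is about locating the file, not about its contents.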
Maximus_Jack
Premium Member
Posts: 139
Joined: Fri Apr 11, 2008 1:02 pm

Post by Maximus_Jack »

Hi all,
I tried placing the file where DataStage is pointing to:
C:\IBM\InformationServer\Server\Projects\Training\Datastage\Testinputfiles\

It's working as expected, and I can see the data in the Peek stage, and in a sequential file as well when I replace the Peek stage with one. So the only question here is how to change the default behavior of DataStage pointing to the present working directory (PWD). Or is there a better way of reading XML by sending the path rather than the actual data itself?

kindly respond

thanks

cheers
MJ
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

No, you should be able to put the file under any directory that you want.
Maximus_Jack
Premium Member
Posts: 139
Joined: Fri Apr 11, 2008 1:02 pm

Post by Maximus_Jack »

Yes, I understand that, but I don't know why DataStage is prefixing the PWD to the path I specify. Any idea?
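
One plausible reading of the error, offered as an assumption rather than documented DataStage or Xalan behavior: the path emitted by ls has no drive letter, so on Windows it is treated as relative and appended to the job's working directory. The Python sketch below reproduces the exact Id string from the error message under that assumption. The function names has_drive and resolve_like_error are hypothetical.

```python
import ntpath

# Values taken from the Director log and the External Source config in this thread
PWD = r"C:\IBM\InformationServer\Server\Projects\Training"
SOURCE_PATH = "/Datastage/Testinputfiles/XML_in_Books.xml"


def has_drive(path):
    """True if the path starts with a Windows drive such as C:."""
    drive, _rest = ntpath.splitdrive(path)
    return drive != ""


def resolve_like_error(pwd, path):
    """Assumed behavior: a path without a drive letter is appended to pwd."""
    return path if has_drive(path) else pwd + "/" + path


print(has_drive(SOURCE_PATH))                # no drive letter in the ls output
print(resolve_like_error(PWD, SOURCE_PATH))  # matches the Id in the error message
```

The doubled slash in the result is the joining "/" followed by the leading "/" of the ls path. Windows tolerates it, which would be consistent with the job succeeding once the file was copied under the project directory.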