Reading an XML file on Windows through the External Source stage
I have a job (with a * wildcard), External Source > XML Input stage > Peek stage, and it works just fine.
So in your case, you should put ls /Datastage/Testinputfiles/XML_in_Books.xml in the Source Program of the External Source stage.
Coding it with C:\Datastage\Testinputfiles\XML_in_Books.xml is never going to work. Then code xmldoc as the Column Name under the Columns tab and send it down to the XML Input stage.
Then click View Data and tell me what you get before I go further with the XML Input stage.
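As a rough illustration of the setup above: the source program's standard output becomes the rows, so a command that prints a file path yields one row holding that path in the xmldoc column. The sketch below imitates this outside DataStage. The function name run_source_program is hypothetical, and a child Python process stands in for the ls command so the sketch runs without the actual file.

```python
import subprocess
import sys

def run_source_program(argv):
    """Run a command and return one row per non-empty line of its stdout."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    # Each output line becomes a row with a single column named xmldoc.
    return [{"xmldoc": line} for line in result.stdout.splitlines() if line]

# Stand-in for `ls /Datastage/Testinputfiles/XML_in_Books.xml`: a child
# process that simply prints the same path.
path = "/Datastage/Testinputfiles/XML_in_Books.xml"
rows = run_source_program([sys.executable, "-c", f"print({path!r})"])
print(rows)
```

The point is only that the row carries the path as text, not the document contents; opening the file is left to the downstream XML Input stage.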
-
- Premium Member
- Posts: 139
- Joined: Fri Apr 11, 2008 1:02 pm
Hi
I tried External Source (without a wildcard) > XML Input > Peek stage, and I have enabled tracing in the job.
External source config:
===============
Source Program: ls /Datastage/Testinputfiles/XML_in_Books.xml
Output in the tracing for external source
==========================
XML_Input_26.DSLink24_Peek,0: filename:/Datastage/Testinputfiles/XML_in_Books.xml
In the next step I get the error below:
Datastage error
==========
APT_CombinedOperatorController,0: Fatal Error: Fatal: XML input document parsing failed. Reason: Xalan fatal error (publicId: , systemId: , line: 0, column: 0): An exception occurred! Type:RuntimeException, Message:The primary document entity could not be opened. Id=C:\IBM\InformationServer\Server\Projects\Training//Datastage/Testinputfiles/XML_in_Books.xml
Am I doing something wrong here?
Thanks for your response.
cheers
MJ
As noted above, it at least "looks" like External Source is now working great!
Now look at your XML document.
Does it have a schemaLocation attribute? This points to a schema. Unless you say you want validation, which you should NOT have checked at this time, the stage shouldn't be looking for the schema, but you never know. Edit your XML document, DELETE this entire attribute, and try again.
Make sure you truly have read access to that subdirectory. I suspect that you do, but it's something to check.
Ernie
Ernie Ostic
blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
Hi,
This is the XML file I'm using in my job:
http://msdn.microsoft.com/en-us/library ... s.85).aspx
I tried to find out what the schemaLocation attribute is, and there seems to be nothing like it in the XML file; correct me if I'm wrong.
FYI, I have disabled "validate input XML", and I have tried enabling and disabling "include namespace declaration" as well, but there is no change in the result.
Can you please let me know how to tell whether the schemaLocation attribute is present in the XML file? Thanks.
cheers
MJ
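One way to answer that question outside DataStage is to scan the document for the two schema-hint attributes defined in the XML Schema instance namespace. This is a minimal sketch using Python's standard library; the function name find_schema_hints and the sample with_hint document (including its urn:example books.xsd value) are made up for illustration, while plain is an abbreviated copy of the catalog file from this thread.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
SCHEMA_ATTRS = ("schemaLocation", "noNamespaceSchemaLocation")

def find_schema_hints(xml_text):
    """Return (element tag, attribute name, value) for each schema hint found."""
    hints = []
    for element in ET.fromstring(xml_text).iter():
        for name in SCHEMA_ATTRS:
            # ElementTree stores namespaced attributes as {namespace}localname.
            value = element.get(f"{{{XSI_NS}}}{name}")
            if value is not None:
                hints.append((element.tag, name, value))
    return hints

# Abbreviated version of the catalog document: no schema hint at all.
plain = '<catalog><book id="bk101"><title>XML Developer\'s Guide</title></book></catalog>'

# Hypothetical document that does carry a schemaLocation attribute.
with_hint = (
    '<catalog xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:schemaLocation="urn:example books.xsd"><book id="bk101"/></catalog>'
)

print(find_schema_hints(plain))
print(find_schema_hints(with_hint))
```

An empty result for the real file would suggest the schemaLocation attribute is not what the parser is failing on.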
Another thought came to mind as well.
Check the encoding value at the top. See if it says something like encoding="UTF8" or similar, and try a run with a sample version of your XML document where you remove that attribute from the header entirely.
Ernie
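To check the header quickly without opening the file in an editor, the declared encoding can be read from the first bytes of the document. The sketch below is an assumption-laden helper, not part of DataStage: declared_encoding is a hypothetical name, and the regular expression only handles an ordinary XML declaration at the start of an ASCII-compatible file, optionally preceded by a UTF-8 byte order mark.

```python
import re

UTF8_BOM = b"\xef\xbb\xbf"
# Matches <?xml ... encoding="..." ...?> at the very start of the data.
DECLARATION = re.compile(rb'^\s*<\?xml[^>]*?encoding\s*=\s*["\']([^"\']+)["\']')

def declared_encoding(raw):
    """Return the encoding named in the XML declaration, or None if absent."""
    if raw.startswith(UTF8_BOM):
        raw = raw[len(UTF8_BOM):]
    match = DECLARATION.match(raw)
    return match.group(1).decode("ascii") if match else None

print(declared_encoding(b'<?xml version="1.0" encoding="UTF-8"?><catalog/>'))
print(declared_encoding(b"<catalog><book id='bk101'/></catalog>"))
```

For a real file, the bytes could be read with open(path, "rb").read(200), since the declaration, if present, sits at the very top.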
Hi Ernie, what I have pasted below is the only thing in the XML document.
FYI, when I look at the Director log, the variable PWD is assigned as below:
PWD=C:\IBM\InformationServer\Server\Projects\Training
I couldn't find this in the environment variables section. Maybe DataStage is picking up this path and searching in this directory. Is there any way we can change this to "C:"?
Contents of the XML doc
================
<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at creating applications
with XML.</description>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-12-16</publish_date>
<description>A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.</description>
</book>
</catalog>
Hi all
I tried placing the file where DataStage is pointing:
C:\IBM\InformationServer\Server\Projects\Training\Datastage\Testinputfiles\
It works as expected: I can see the data in the Peek stage, and in a Sequential File stage as well when I swap it in. So the only question is how to change DataStage's default behaviour of resolving the path against the present working directory (PWD). Or is there a better way of reading XML, by passing the path rather than the actual data itself?
Kindly respond.
thanks
cheers
MJ
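The observations in this thread are consistent with the path being treated as relative and appended to the job's working directory. The sketch below only reproduces the string seen in the error message; the joining rule (PWD, then "/", then the path as given) is an assumption inferred from that message, not documented DataStage behaviour, and resolve_like_error is a hypothetical name.

```python
def resolve_like_error(pwd, filename):
    # Assumption: the parser prepends the working directory and a "/"
    # to whatever path string it receives, without normalising it.
    return pwd + "/" + filename

pwd = r"C:\IBM\InformationServer\Server\Projects\Training"
filename = "/Datastage/Testinputfiles/XML_in_Books.xml"

resolved = resolve_like_error(pwd, filename)
print(resolved)
```

The result matches the Id reported in the Xalan error, double slash included, which would also explain why the job succeeds once the file exists under the project directory.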