Reading XML Files using seq file stage


kumar66
Participant
Posts: 265
Joined: Thu Jul 26, 2007 12:14 am

Post by kumar66 »

Hi All,


I am not able to read XML files through the Sequential File stage. My XML file is:


<?xml version="1.0" encoding="UTF-8" ?>
<!-- - Generated by Ascential Software Corporation, DataStage - XMLOutput stage -
- Wed Jan 16 14:21:40 2008

-->
<xtd:BO xmlns:xtd="http://GenericSchema" xmlns:esb="http://schemas/SHeader" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<esb:SHeader>
<esb:EnvironmentName>Development</esb:EnvironmentName>
</esb:SHeader>
<xtd:load>
<xtd:Data>
<![CDATA[ 800009885|700930010|S 9885 PLACE ST FOY|STORES|MALL ENCLOSED|000000980|000000000|000007910|000007910|000008890|BDC-0015|ST FOY||CA|CANADA|Can Dollar|000000000|ACTIVE|01/03/09| | | |COMP|02/03/08|000009885|
</xtd:Data>
</xtd:load>
</xtd:BO>

My job design is

SeqFile Stage------> XML input------->SQL Server.

The issue is that I see 10 rows going from the Sequential File stage to the XML Input stage, but no rows are transferred from the XML Input stage to the target.


In the Sequential File stage:

Read Method = File Pattern
Column = Record as LongVarChar

In the XML Input stage:

XML source column = Record
Column content = XML source content.

Please advise.


With Regards,
Kumar66
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Data in but not out of that stage would imply your XPath Expressions couldn't be matched back to the XML. How about posting them?
-craig

"You can never have too many knives" -- Logan Nine Fingers
dspxlearn
Premium Member
Posts: 291
Joined: Sat Sep 10, 2005 1:26 am

Post by dspxlearn »

kumar66,

If you can use the External Source stage instead, it would be easier, although the Sequential File stage can handle the wildcard pattern.

The properties in the External Source stage go like this:

Code:

Source Method: Specific Programs 
Source Program: cat /#Path#/#filename.xml# 
(RecordLevel) Final Delimiter: none 
Record Delimiter: null 
column Datatype/length: LongNVarchar(9999) 

Code:

XML file stage:Input Source Column: 'file name' 
Column Content: XML document 
Output tab: Enable repetition element required and tag an appropriate column as key. 
Refer to the URL below, posted by Ernie, and search the posts.
http://dsrealtime.wordpress.com/2007/12 ... -a-source/
Thanks and Regards!!
dspxlearn
kumar66
Participant
Posts: 265
Joined: Thu Jul 26, 2007 12:14 am

Post by kumar66 »

Hi Chulett,

I have

Data RFinancials.Data /xtd:BO/xtd:BOload/xtd:Data/text()

in my expression.

Thanks,

With Regards,
Kumar66
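
An XPath expression can be checked against a sample document outside the job before it goes into the stage. The snippet below is a minimal sketch using Python's standard `xml.etree.ElementTree`, not the XML Input stage itself. The sample is modelled on the document posted above, with the payload shortened and the closing `]]>` added as an assumption so that it parses. The helper `match_texts` is hypothetical and only handles simple prefixed paths.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map taken from the namespace declarations in the posted sample.
NS = {"xtd": "http://GenericSchema", "esb": "http://schemas/SHeader"}

# Assumption: shortened payload and an added "]]>" so the sample is well formed.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" ?>
<xtd:BO xmlns:xtd="http://GenericSchema" xmlns:esb="http://schemas/SHeader">
  <esb:SHeader>
    <esb:EnvironmentName>Development</esb:EnvironmentName>
  </esb:SHeader>
  <xtd:load>
    <xtd:Data><![CDATA[800009885|700930010|S 9885 PLACE ST FOY|STORES|]]></xtd:Data>
  </xtd:load>
</xtd:BO>"""


def match_texts(root, path, ns):
    """Return the text of elements matched by an absolute path such as
    /p:a/p:b/text(). Hypothetical helper: every step must be prefixed."""
    steps = [s for s in path.split("/") if s and s != "text()"]
    prefix, _, local = steps[0].partition(":")
    # The first step has to name the document element itself.
    if root.tag != "{%s}%s" % (ns.get(prefix), local):
        return []
    if len(steps) == 1:
        return [root.text or ""]
    # The remaining steps are resolved relative to the document element.
    return [el.text or "" for el in root.findall("/".join(steps[1:]), ns)]


root = ET.fromstring(SAMPLE)
posted = match_texts(root, "/xtd:BO/xtd:BOload/xtd:Data/text()", NS)
adjusted = match_texts(root, "/xtd:BO/xtd:load/xtd:Data/text()", NS)
print(len(posted), len(adjusted))
```

Against this sample the posted path matches nothing, because the sample contains an `xtd:load` element and no `xtd:BOload` element, while the path through `xtd:load` returns the one `xtd:Data` value. Whether the real files look like the sample is something only a check against those files can confirm.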
kumar66
Participant
Posts: 265
Joined: Thu Jul 26, 2007 12:14 am

Post by kumar66 »

Hi dspxlearn,

Thanks for your reply. I have tried what you suggested, but I am not getting any records from the XML Input stage.

I am also not able to view the data in the External Source stage. I get the following error when I click the View Data button.

##E TFPM 000371 15:27:11(005) <APT_CombinedOperatorController,0> <_PEEK_IDENT_>: Error message (261662) exceeds the maximum message size (131056); dropping.


And I get these warnings:
Missing record delimiter "\x00", saw EOF instead
Import warning at record 0.


Please advise.

Thanks ,
With regards,
kumar66
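
One way to picture the "Missing record delimiter" warning, assuming the import simply scans for the configured delimiter byte (0x00 here, from "Record Delimiter: null"): a text file that contains no 0x00 byte comes through as a single record that runs to end of file without a terminator. The sketch below is plain Python, not DataStage code, and `split_records` and `FILE_BYTES` are hypothetical names.

```python
# Hypothetical file content: XML text that ends without a 0x00 byte.
FILE_BYTES = b"<a>\n<b>1</b>\n</a>\n"


def split_records(data, delimiter=b"\x00"):
    """Split data on the delimiter and report whether the last record
    was properly terminated by it."""
    terminated = data.endswith(delimiter)
    parts = data.split(delimiter)
    if terminated:
        parts.pop()  # drop the empty piece after the final delimiter
    return parts, terminated


records, terminated = split_records(FILE_BYTES)
print(len(records), terminated)
```

Under that assumption the whole file is record 0, and reaching EOF before the delimiter is the condition the warning describes.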
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Now it may be more obvious why I send the url via External Source when using EE. It's not worth the headache. Let XMLInput read it.

Alternatively, how much data are we talking about? Use Server for this piece of your app and use Folder.

With either solution you'll be up and running 10 minutes from now. ;)

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
kumar66
Participant
Posts: 265
Joined: Thu Jul 26, 2007 12:14 am

Post by kumar66 »

Hi Ernie ,

Thanks for your reply. Actually, I have already done this job in Server. Now the requirement is to do it in Parallel.

Please help me solve this.


Thanks,

With Regards,
Kumar66
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Are you doing something with the content prior to the XMLInput Stage? If not, use the URL method.... is there another reason that you cannot?

Ernie
Ernie Ostic

kumar66
Participant
Posts: 265
Joined: Thu Jul 26, 2007 12:14 am

Post by kumar66 »

Hi Ernie,

How do I use the "URL" method? Can you please explain in detail?

Thanks,

With Regards,
Kumar66
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Check out the link above (dsrealtime.wordpress.com ...there's an XML content entry) ...the syntax is there.... let us know if you find it ok and how it works out....
Ernie Ostic

kumar66
Participant
Posts: 265
Joined: Thu Jul 26, 2007 12:14 am

Post by kumar66 »

Hi Ernie,

But the document doesn't say how to use the URL method. Can you please explain how it actually works?


Thanks ,

With Regards,
Kumar66
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Inside the External Source stage you can specify a command. The command to use is "ls" (unix list), with some other arguments. This creates a link column containing the results of the command (varchar and a length of 200 works for me). This link goes into your XMLInput Stage....the stage properties ask you for the column and whether the column contains "XML Content" or "URL"... Pick URL. That's it. The XMLInput stage will go and find the XML and read it from disk, so you don't have to.

XML documents are sometimes little chunks of character data with no CRLFs, LFs or other special characters.....but sometimes they aren't...being entirely variable in length, and sometimes with a mixed set of CRLF's and other things. From an XML perspective, it's all noise --- CRLFs are meaningless --- but not so with the Sequential Stage.

Unless you need specifically to process the xml content on your own before the XMLInput Stage, then stop trying to use the Sequential Stage and just go with External Source. I'll check my blog entry again to make it more clear how to do this.
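
The difference between the two approaches can be sketched in plain Python (an illustration only, not DataStage code). `list_paths` stands in for the "ls" command and `read_by_path` for the URL option; both names, the tiny `DOC` sample, and the temporary file are assumptions made for the example.

```python
import glob
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical three-line document; prefix and URI mirror the sample in the thread.
DOC = b'<xtd:BO xmlns:xtd="http://GenericSchema">\n<xtd:Data>a|b</xtd:Data>\n</xtd:BO>\n'


def parses(chunk):
    """True if chunk is a well-formed XML document on its own."""
    try:
        ET.fromstring(chunk)
        return True
    except ET.ParseError:
        return False


def list_paths(pattern):
    """Stand-in for the ls command: one file path per output row."""
    return sorted(glob.glob(pattern))


def read_by_path(paths):
    """Stand-in for the URL option: the parser opens each file itself."""
    return [ET.parse(p).getroot() for p in paths]


# Line-oriented reading: every physical line becomes its own record.
line_records = [ln for ln in DOC.split(b"\n") if ln]
lines_ok = [parses(ln) for ln in line_records]

# Path-oriented reading: write the document out, list it, parse it whole.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "sample.xml"), "wb") as fh:
        fh.write(DOC)
    roots = read_by_path(list_paths(os.path.join(tmp, "*.xml")))

data_values = [
    r.findtext("xtd:Data", namespaces={"xtd": "http://GenericSchema"})
    for r in roots
]
print(lines_ok, data_values)
```

None of the individual lines parses as a document, while the file read by path yields its `xtd:Data` value, which is the point about line breaks being noise to XML but significant to a line-oriented reader.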

Ernie
Ernie Ostic

kumar66
Participant
Posts: 265
Joined: Thu Jul 26, 2007 12:14 am

Post by kumar66 »

Hi Ernie ,

Thanks very much for your reply. I tried your suggestion and still the problem is the same. I could not see any records from the xml input stage.


Please Advise.

Thanks,

With Regards,
Kumar66
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

chulett wrote:Data in but not out of that stage would imply your XPath Expressions couldn't be matched back to the XML.
-craig

kumar66
Participant
Posts: 265
Joined: Thu Jul 26, 2007 12:14 am

Post by kumar66 »

Hi chulett,

I tried the same in a server job. It works fine.

Thanks,

With Regards,
Kumar66