Read XML file in PX

Mat01
Premium Member
Posts: 50
Joined: Wed Jun 02, 2004 11:12 am
Location: Montreal, Canada

Read XML file in PX

Post by Mat01 »

Hi All,

I am trying to use the XML stages to read and transform XML data. I was able to read my file in a Server job (thanks, Kim, for your ETLstats examples), but I haven't found a way to do this in PX. The main problem seems to be that I cannot read an entire file as a single field in PX, so the XML Input stage cannot parse the XML data. I have also tried passing the stage a file name, as specified in the XML source tag, but this doesn't work either (the job is successful, but the output is empty).

Has anybody found a way to read XML files with PX?

Thanks,

MAT
jatinrheen
Participant
Posts: 10
Joined: Sat Jan 01, 2005 1:32 pm

Re: Read XML file in PX

Post by jatinrheen »

Hi,

We read the input XML from a transactional MQ source, then use a Column Import stage to move that input into a named column. After that, we use the XML Input and Transformer stages to parse it.

Let me know if you have any problems so that I can send you the dsx.

Regards
Mat01
Premium Member
Posts: 50
Joined: Wed Jun 02, 2004 11:12 am
Location: Montreal, Canada

Post by Mat01 »

Thank you for your reply,

I'm not sure I follow you... How do you use the Column Import stage? I'd like to see an example if you can.

Thanks

Mat
michaeld
Premium Member
Posts: 88
Joined: Tue Apr 04, 2006 8:42 am
Location: Toronto, Canada

Post by michaeld »

This is how I did it:

Using a sequential file stage:

1) Create a LongVarChar (or LongNVarChar) column that will hold your XML.

2) Set the format properties as follows:

Record level -> Record type = implicit
Field level -> Delimiter = none

NOTE: Make sure that you remove all other record level or field level properties!

This will put all of the data from the source file into a single column in a single row.
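For readers who want to see the idea outside DataStage, here is a minimal Python sketch of what those settings achieve: the whole file becomes one row with one column, and the parser downstream gets the complete document. This is only an illustration, not DataStage code. The function name, the `xml_data` column name, and the sample document are all made up for the example.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def read_file_as_single_field(path):
    """Return a one-row, one-column 'dataset': the whole file in one field."""
    # No record delimiter and no field delimiter: the entire file is one value.
    xml_text = Path(path).read_text(encoding="utf-8")
    return [{"xml_data": xml_text}]


def demo():
    # Hypothetical sample document, written to a temporary file for the demo.
    sample = "<orders><order id='1'/><order id='2'/></orders>"
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "orders.xml"
        path.write_text(sample, encoding="utf-8")
        rows = read_file_as_single_field(path)
    # The downstream parser receives the complete document in one field.
    root = ET.fromstring(rows[0]["xml_data"])
    return len(rows), [order.get("id") for order in root.findall("order")]
```

Calling `demo()` gives one row, and both `order` elements can be parsed from that single field.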

no charge:-)
Mike
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

IBM have published a tutorial on using the XML input stage in parallel jobs. See my blog post for details: 7 DataStage Tutorials.
JoshGeorge
Participant
Posts: 612
Joined: Thu May 03, 2007 4:59 am
Location: Melbourne

Re: Read XML file in PX

Post by JoshGeorge »

To avoid the "record is too big to fit in a block" error, APT_DEFAULT_TRANSPORT_BLOCK_SIZE needs to be set to double the maximum message size expected.
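As a worked example of that rule of thumb, here is a small Python sketch that does the arithmetic. The 300 KB figure is a made-up example, and the function name is hypothetical. The sketch does not check the value against whatever limits your DataStage version places on this variable, so verify those in the product documentation.

```python
def suggested_transport_block_size(max_message_bytes):
    """Apply the rule of thumb above: twice the largest expected message."""
    if max_message_bytes <= 0:
        raise ValueError("max_message_bytes must be positive")
    return 2 * max_message_bytes


# Hypothetical example: the largest XML message expected is 300 KB.
largest = 300 * 1024
block_size = suggested_transport_block_size(largest)
print(f"APT_DEFAULT_TRANSPORT_BLOCK_SIZE={block_size}")
```

For a 300 KB (307200 byte) message this prints a value of 614400.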
Mat01 wrote:The main problem seems to be that I cannot read an entire file as one single field in PX. Hence, the XML Input stage cannot parse the XML data.
Joshy George
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Re: Read XML file in PX

Post by chulett »

Mat01 wrote:I have tried passing the stage a file name as specified in the XML source tag but this doesn't work either (job successful, but empty output).
I'll wager you read the XML file just fine with that setup. Stuff in with no stuff out is generally a sign of improper XPath expressions in the XML Input stage.
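To show the "stuff in, no stuff out" symptom in general terms, here is a small Python sketch using the standard library's ElementTree. It is not the XML Input stage, and the sample document and paths are invented, but the behaviour is the same in spirit: a path that doesn't match the document raises no error and simply returns nothing.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document for the illustration.
doc = ET.fromstring(
    "<customers><customer><name>Ann</name></customer>"
    "<customer><name>Bob</name></customer></customers>"
)

# Path that matches the document structure.
good = [e.text for e in doc.findall("./customer/name")]

# Wrong case in the element name: no error is raised, the result is just empty.
bad = [e.text for e in doc.findall("./Customer/name")]

print(good)
print(bad)
```

The first path finds both names. The second finds none and fails silently, which is why an empty output with a successful job is worth checking against the XPath expressions first.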
-craig

"You can never have too many knives" -- Logan Nine Fingers
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

I haven't got it to work by passing just the file name; I've usually passed the contents as well. You might need to tweak your imported XML definition, for example by switching the key field. You might also need to test on smaller files to see whether the XML is valid and the issue is not a volume problem.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Really? It certainly can be done in a Server job, and I don't see why that aspect of this would be different in PX. [shrug]

It's the exact same 'XML Input' stage in both products, yes? There's no PX-specific flavor that I am aware of, so it sure seems like you should be able to make that work the same way.

Make sure the filename is fully qualified, of course.
-craig
balajisr
Charter Member
Posts: 785
Joined: Thu Jul 28, 2005 8:58 am

Post by balajisr »

The URL/File Path option of the XML Input stage works in PX. However, my XML file was small, about 2 KB, so I still need to verify it with a bigger file.
rajesh223
Participant
Posts: 26
Joined: Mon Dec 19, 2005 4:37 am

Post by rajesh223 »

I tried to read the XML by putting the XML file path in a sequential file, which feeds the XML Input stage, followed by a Transformer stage and finally an output sequential file.

I am able to read the data successfully, but there is one problem. As per our standards, all file names and locations should be parameterized. In this case, how can we pass the XML file location into the flat file?

I tried using Row Generator >> Transformer >> XML Input stage.

I passed a job parameter in the Transformer as the XML file path, then sent it downstream to the XML Input stage. I cannot get the job to work, yet there is no error and everything looks fine. The data flows as far as the Transformer and does not proceed past the XML Input stage; the XML Input stage seems unable to read the data from the prior stage (the Transformer). The same design works with a sequential file.

Any thoughts on this?

-Rajesh
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

This subject continues to be a small nightmare, and I share everyone's frustration here. Still another way to approach this, and the best solution that I've found so far, is to use the External Source stage, choose "specific program" as the source, and then send a subdirectory name in a job parameter to an "ls". Here's the property value for Source Program, right from a working job:

ls #xmlSubDirectory#*.xml | sort

That is then passed to a Transformer where (on Windows) I had to manually concatenate a C:.
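For anyone who wants to prototype the same file list outside DataStage, here is a rough Python equivalent of that "ls ... | sort" step. It is only a sketch. The function name and the `drive_prefix` argument are hypothetical; the prefix stands in for the manual "C:" concatenation done in the Transformer.

```python
import tempfile
from pathlib import Path


def list_xml_files(xml_sub_directory, drive_prefix=""):
    """Return sorted *.xml paths, one per row, like `ls dir/*.xml | sort`."""
    names = sorted(p.as_posix() for p in Path(xml_sub_directory).glob("*.xml"))
    # drive_prefix mimics the manual "C:" concatenation done in the Transformer.
    return [drive_prefix + name for name in names]


def demo():
    # Build a throwaway directory with two XML files and one non-XML file.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("b.xml", "a.xml", "notes.txt"):
            (Path(tmp) / name).write_text("<x/>", encoding="utf-8")
        return [Path(p).name for p in list_xml_files(tmp)]
```

Each returned path would then be one input row for the XML Input stage, which reads the file itself through the URL/File Path option.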

We need a Stage that does exactly this, or an implementation of the Folder Stage with size flexibility.

Ernie
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There are three different kinds of custom stage you can create in (for) parallel jobs. Any volunteers?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Maveric
Participant
Posts: 388
Joined: Tue Mar 13, 2007 1:28 am

Post by Maveric »

Ray, are Custom, Build, and Wrapped the three different kinds of custom stage? Just a wild guess.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Yes, that's them.