
How to import XML files in parallel jobs?

Posted: Wed Oct 25, 2006 11:10 pm
by peterxu
Hi guys, has anybody had experience importing XML files in parallel jobs?

I tried the Sequential File stage to import them; it works sometimes, but not always.

So any suggestions? Thanks in advance.

Posted: Thu Oct 26, 2006 1:12 am
by kumar_s
Hi peterxu, welcome aboard! :D

What difficulty do you face when reading the XML file through the Sequential File stage? After all, you can treat it as a flat file.
There are also dedicated XML stages available for this. Try those out.

Posted: Thu Oct 26, 2006 1:57 am
by peterxu
Hi kumar, thanks for your reply.

I sometimes get the fatal error "Consumed more than 100,000 bytes looking for record delimiter; aborting" when using the Sequential File stage.

I've found the XML stages, such as XML Input, XML Output and others, but I can't find a suitable one to import the whole XML file. Could you name one? Thank you!

Posted: Thu Oct 26, 2006 2:02 am
by kumar_s
The error tells you that the delimiter you set in the Sequential File stage (maybe ',' or '|'...) was not found even after reading 100,000 bytes.
As the name implies, you can use XML Input as your input stage.
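
A minimal sketch of how one might check this outside DataStage, assuming the file is reachable from a plain Java program. The class name DelimiterProbe and its method are hypothetical, and the 100,000-byte limit is taken only from the error message; this does not reproduce the stage's internal logic.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of DataStage.
class DelimiterProbe {
    // Returns the zero-based offset of the first occurrence of `delimiter`
    // within the first `limit` bytes of the file, or -1 if it is not found there.
    static long firstDelimiterOffset(Path file, byte delimiter, int limit) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            long offset = 0; // number of bytes already scanned
            int n;
            while (offset < limit && (n = in.read(buf)) != -1) {
                for (int i = 0; i < n && offset + i < limit; i++) {
                    if (buf[i] == delimiter) {
                        return offset + i;
                    }
                }
                offset += n;
            }
        }
        return -1;
    }
}
```

Calling DelimiterProbe.firstDelimiterOffset(path, (byte) '|', 100000) on a failing file and getting -1 would be consistent with the error above; a non-negative result would suggest the cause lies elsewhere.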

Posted: Thu Oct 26, 2006 2:26 am
by peterxu
Thanks a lot for your quick reply. :)

Because I want to import the whole file as a single record for later parsing, I set the record delimiter to "null", and the whole file is much smaller than 100 KB. But it still failed. Do you think there could be an encoding issue to consider?

As for the XML Input stage, I tried it but couldn't find the property that specifies the source file location. How did you get that working? Any hints will be appreciated.
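
One way to look into the encoding question, sketched under assumptions: the class EncodingProbe and its methods are hypothetical, and whether the stage's "null" delimiter corresponds to a 0x00 byte is not confirmed anywhere in this thread. The sketch only reports a byte-order mark and counts 0x00 bytes, which appear throughout UTF-16 encodings of ASCII text.

```java
// Hypothetical helper, not part of DataStage.
class EncodingProbe {
    // Names the byte-order mark at the start of the data, or "none".
    // Note: a UTF-32LE BOM (FF FE 00 00) would be reported as UTF-16LE here.
    static String bomName(byte[] b) {
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        return "none";
    }

    // Counts 0x00 bytes in the data.
    static int countNulBytes(byte[] b) {
        int count = 0;
        for (byte value : b) {
            if (value == 0) {
                count++;
            }
        }
        return count;
    }
}
```

The bytes can be obtained with java.nio.file.Files.readAllBytes. A UTF-16 BOM or a non-zero count would be a reason to check the character set configured for the job before looking further.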

Posted: Thu Oct 26, 2006 3:03 am
by BalageBaju
kumar_s wrote: The error tells you that the delimiter you set in the Sequential File stage (maybe ',' or '|'...) was not found even after reading 100,000 bytes.
As the name implies, you can use XML Input as your input stage.
Hi,
Kumar is correct. Use the XML Input stage to import data from XML files. Import the metadata for that file and use it in your XML Input stage.

Posted: Thu Oct 26, 2006 5:32 am
by peterxu
Many thanks for all of your replies, especially to kumar. But there is some misunderstanding about the problem now. That's my fault; sorry I didn't describe the scenario clearly.

What I actually want is to parse XML files with a Java application wrapped as a stage. So I need each XML file imported whole, so that the downstream wrapped stage can handle it.

In this case, I don't think the XML Input stage can help (if I'm wrong, please correct me).

So I have to find another stage to import the XML files as ordinary files. What I found is the Sequential File stage, but it does not work reliably; whether it works depends on the format of the XML file. Can anyone give me better ideas or hints to solve this problem? Many thanks in advance.
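
For illustration, the Java side of this scenario could look like the sketch below. It assumes the application receives the complete file content as bytes (read here from a path); how a wrapped stage actually hands data to the program is not covered in this thread. The class WholeFileXml and its methods are hypothetical names; the parser calls are the standard JAXP DOM API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper: treats one XML file as one unit of work.
class WholeFileXml {
    // Reads the entire file into memory without looking for record delimiters.
    static byte[] readWhole(Path file) throws IOException {
        return Files.readAllBytes(file);
    }

    // Parses the bytes as one XML document. Passing bytes rather than a String
    // lets the parser detect the encoding from the BOM or XML declaration.
    static Document parse(byte[] xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml));
    }

    // Convenience method: name of the document's root element.
    static String rootName(byte[] xml)
            throws ParserConfigurationException, SAXException, IOException {
        return parse(xml).getDocumentElement().getNodeName();
    }
}
```

Because the file is read as raw bytes, line breaks or the absence of any delimiter inside the XML do not matter to this approach.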

import xml files

Posted: Thu Jul 30, 2009 12:34 am
by indukuri
kumar_s wrote: Hi peterxu, welcome aboard! :D

What difficulty do you face when reading the XML file through the Sequential File stage? After all, you can treat it as a flat file.
There are also dedicated XML stages available for this. Try those out.
Hi kumar,
I don't know how to import XML files.
Could you please send me a description of the XML Input stage?

Posted: Thu Jul 30, 2009 6:04 am
by eostic
Do some searches... there are many entries that describe how this is done, and entries that point to other resources here and on the web.

Ernie