How to load large XML files with External Source Stage

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

jpnrd
Participant
Posts: 19
Joined: Wed May 05, 2010 3:35 pm

How to load large XML files with External Source Stage

Post by jpnrd »

Hi

I am trying to create a parallel job that reads one XML file as input to an XML Input stage for further processing.
I used a Sequential File stage to read the entire XML file "test.xml" as a stream into a single column, "XML Record", with Record Type set to implicit and Field Delimiter set to none on the Format page.
This column is then fed into the XML Input stage.

The problem I am facing is that when "test.xml" is large (say, more than 500 bytes), the input XML data gets split across two rows in the Sequential File stage, so we cannot feed it into the XML Input stage.

Now I am trying to use the External Source stage to resolve this problem. Can anybody tell me how to use the External Source stage in this scenario?
What settings do I need to make in the External Source stage?

[*Note - Title changed to be more descriptive - Andy*]
Thanx
Jaya
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Have you looked here yet?
-craig

"You can never have too many knives" -- Logan Nine Fingers
jpnrd
Participant
Posts: 19
Joined: Wed May 05, 2010 3:35 pm

Post by jpnrd »

Craig,

I looked at it. It's really helpful, but I don't know why it's not working for me. My server is on D:/. Is that the problem? If so, please tell me a solution.
Thanx
Jaya
jpnrd
Participant
Posts: 19
Joined: Wed May 05, 2010 3:35 pm

Post by jpnrd »

I kept all the XML files in a shared folder and tried using that shared folder. I took the output link from the External Source stage and passed it to the XML Input stage. The job ran, but 0 records were imported. It produced only one warning:


XML_Input_3,0: Warning: XMLTest4.XML_Input_3: XML input document parsing failed. Reason: Xalan fatal error (publicId: , systemId: , line: 0, column: 0): An exception occurred! Type:RuntimeException, Message:The primary document entity could not be opened. Id=D:\IBM\InformationServer\Server\Projects\ETL_DEV///skfb234/sharedata/EnterpriseDataArchitecture/ODS/PolicyCenterXMLdata/KFB_Liability_APIPCov.xml


Can you please help me with this warning?
Thanx
Jaya
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Do some careful debugging of the "list command" that you are using in the External Source stage. I am surprised to see it appending the DS project directory. Run the External Source to a flat file instead of the XML stage. What do you get? You should have a complete list of your XML docs and nothing else.
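
For anyone who wants to check the list output outside DataStage first, here is a minimal sketch of a listing script. It is only an illustration: the thread never shows the actual list command, so the script name `list_xml.py` and the function `list_xml_files` are hypothetical, and it assumes Python is available on the server. It prints one absolute path per line and nothing else, which is the shape of output described above.

```python
import sys
from pathlib import Path


def list_xml_files(directory):
    """Return sorted absolute paths of the .xml files directly inside `directory`.

    Hypothetical helper for illustration; not part of DataStage.
    """
    base = Path(directory).resolve()  # make the directory absolute first
    return sorted(
        str(entry)
        for entry in base.iterdir()
        if entry.is_file() and entry.suffix.lower() == ".xml"
    )


if __name__ == "__main__":
    # Hypothetical usage: python list_xml.py <directory>
    if len(sys.argv) == 2 and Path(sys.argv[1]).is_dir():
        for path in list_xml_files(sys.argv[1]):
            print(path)  # one full path per line, nothing else
```

If the lines written to the flat file are not absolute paths, that may be why the project directory ends up in front of them, as in the warning quoted earlier. That is a guess from the error text, not something confirmed in this thread.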

Also, check the check box on the input link side of the XML stage. When you use External Source you must check URI/URL, not XML Content. Perhaps, if you were using your old job, you still have XML Content checked.

Ernie
Ernie Ostic

blogit!
Open IGC is Here! (https://dsrealtime.wordpress.com/2015/0 ... ere/)
arunkumarmm
Participant
Posts: 246
Joined: Mon Jun 30, 2008 3:22 am
Location: New York

Re: How to load large XML files with External Source Stage

Post by arunkumarmm »

You can use a Sequential File stage and define your column as LongVarChar with a length of 999999999. This will work.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

No... not really. At least not well.
-craig

"You can never have too many knives" -- Logan Nine Fingers