How to load large XML files with External Source Stage
Posted: Tue Jun 01, 2010 3:00 pm
Hi
I am trying to create a parallel job that reads an XML file as input and passes it to an XML Input stage for further processing.
I used a Sequential File stage to read the entire XML file "test.xml" as a stream into a single column, "XML Record", with Record Type set to implicit and Field Delimiter set to none on the Format page.
I then fed this column into the XML Input stage.
The problem is that when "test.xml" is large (say, more than 500 bytes), the input XML data is split across two rows in the Sequential File stage, so we cannot feed it into the XML Input stage as one document.
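To make the goal concrete: the whole file should arrive as one record in one column. Outside DataStage, the behaviour I am after looks roughly like the Python sketch below. This is only an illustration, not a DataStage setting; the function name is hypothetical, and it assumes the file is UTF-8 and that collapsing line breaks does not change the meaning of the XML.

```python
from pathlib import Path


def read_xml_as_single_record(path):
    """Return the whole XML file as one string with line breaks collapsed."""
    # Assumption: the file is UTF-8 encoded.
    text = Path(path).read_text(encoding="utf-8")
    # Strip each line and join with a single space so the document
    # cannot be split into several records at a newline.
    return " ".join(line.strip() for line in text.splitlines() if line.strip())
```

Calling read_xml_as_single_record("test.xml") would give one string for the whole document, which is what the single "XML Record" column is meant to hold.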
I am now trying to use the External Source stage to resolve this problem. Can anybody tell me how to use the External Source stage in this scenario?
What settings do I need to make in the External Source stage?
[*Note - Title changed to be more descriptive - Andy*]