row too big for inter stage rowbuffer

murali · Post by **murali** » Wed Jul 26, 2006 12:50 am

Hi All,
I am using a Folder stage with the wildcard *.xml, then an XML Input stage to extract the data from the XML, and then a Sequential File stage.

Folder ----> Transformer ----> XML Input ----> Transformer ----> Sequential File

I am getting:

Test job..XMLInputT.link6: ds_intput() - row too big for inter stage rowbuffer

When I run this job I get the above warning message.
Can anyone suggest why we get this error?
Thanks

kumar_s · Post by **kumar_s** » Wed Jul 26, 2006 1:33 am

Hi Murali,
There is a section called "Setting Up Properties for the XML Input Stage" in the XMLPACK PDF. You might find more information there.

murali · Post by **murali** » Wed Jul 26, 2006 2:59 am

Hi Kumar,
I had set everything according to that, but it is still not able to handle large XML files.
If my file size is 200 KB, I do not get the error.
If my file size is 3000 KB, the error occurs:
ds_intput() - row too big for inter stage rowbuffer

kumar_s · Post by **kumar_s** » Wed Jul 26, 2006 3:21 am

Have you tried increasing the row buffer, as the message asks?
May I know what the Transformer after the Folder stage is for?
Try using an IPC stage before the XML Input stage, with the necessary settings.

murali · Post by **murali** » Wed Jul 26, 2006 3:56 am

Kumar,

* I increased the row buffer to 256; previously it was 128.
* The Transformer is there to apply the constraint @INROWNUM = 1 because, as I told you, I am using a wildcard in the Folder stage.

murali · Post by **murali** » Wed Jul 26, 2006 4:32 am

Hi,

I tried placing an IPC stage in front of the Transformer, but the job still fails with an error:

InterProcess_10.Linkname: ds_ipcgetnext() - timeout waiting for mutex

chulett · Post by **chulett** » Wed Jul 26, 2006 6:28 am

Don't let the Folder stage bring in huge files in the second field. Only use the first 'Filename' field in the Folder (delete the second) and then switch the XML Input stage to use the Column content option of URL/File path on the Input tab.

That should allow pretty much any size XML file to be read.
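
As a rough illustration of why this helps, here is a Python analogy (not DataStage code; the function names and the sample file are hypothetical). A row that carries the whole document grows with the file, while a row that carries only the path stays small no matter how large the file is.

```python
import os
import tempfile


def row_with_content(path):
    """Mimic a Folder stage with two fields: file name plus full content."""
    with open(path, "rb") as handle:
        return (os.path.basename(path), handle.read())


def row_with_path_only(path):
    """Mimic a Folder stage that passes only the file name field."""
    return (path,)


def row_size(row):
    """Approximate the row size in bytes as the sum of the field lengths."""
    return sum(len(field) for field in row)


def demo():
    """Compare both row shapes for a hypothetical XML file of about 3000 KB."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "big.xml")
        with open(path, "wb") as handle:
            handle.write(b"<root>" + b"<r>x</r>" * 384000 + b"</root>")
        return row_size(row_with_content(path)), row_size(row_with_path_only(path))


if __name__ == "__main__":
    big, small = demo()
    print(f"row with content: {big} bytes, row with path only: {small} bytes")
```

In the second shape the downstream reader opens the file itself, which is the role the URL/File path option plays in the XML Input stage.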

vijayindukuri · Post by **vijayindukuri** » Wed Jul 26, 2006 7:25 am

Chulett,
In the Folder stage we are using the wildcard *.txt. Assume it selects 4 files; in the Folder stage itself we set the sort order to descending, so the files are arranged in descending order, for example file4, file3, file2, file1.
We then use a Transformer with the constraint @INROWNUM = 1, so that only the file name file4 reaches the Transformer's output link. We pass this file name to the XML Input stage and, as you said, we selected the Column content option of URL/File path.
But then we are not able to get the content.
Do we need to set anything else in the XML Input stage?

Thanks for your reply.

chulett · Post by **chulett** » Wed Jul 26, 2006 8:03 am

Hi-jacking other people's threads is frowned upon, especially for non-related topics. Both being about 'XML' doesn't count.

Please start your own post on this topic and include all the relevant details - job type, O/S, version, etc.

murali · Post by **murali** » Wed Jul 26, 2006 8:08 am

Chulett,
Actually, he belongs to my team, and we are sorry for that.
Can you suggest something regarding our last post?
Thanks in advance.

chulett · Post by **chulett** » Wed Jul 26, 2006 8:31 am

Ok, wasn't easy to tell as the latest questions are about '*.txt' files not xml files. So, are these really XML files or text files?

Verify you are only using one field in the Folder stage. Expand on what 'we are not able to get the content' means.

murali · Post by **murali** » Wed Jul 26, 2006 8:39 am

Thanks Chulett,
We are using only one field, the file name, in the Folder stage, and the files we are using are XML files.

By "content" we mean the data in the particular file. Because we are using a wildcard, the job selects a particular file in the folder, and we need to get its content.
How can we achieve this?
As per your last post, we selected only one field in the Folder stage (the file name), and in the XML Input stage we selected the Column content option of URL/File path. But we get only the file name, not the data.

chulett · Post by **chulett** » Wed Jul 26, 2006 9:32 am

Ok, when 'stuff' goes into the XML Input stage but no 'stuff' comes out, that is typically a problem with your XPath expressions in the stage. That is what drives the parsing of the XML files.

Best way to get that right is to import the metadata of the file(s) you are trying to process, either directly from the XML or better yet from an .xsd you should have. That process will generate the XPath Expressions for you and then you can import that metadata into your job.
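
To show the general effect outside DataStage, here is a minimal Python sketch using the standard library's limited XPath support. The sample document, the `extract` helper, and the column names are all made up for illustration. A path that matches the document yields rows; a path that is off by one level silently yields nothing.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the root element is <orders>.
SAMPLE = """<orders>
  <order><id>1</id><item>pen</item></order>
  <order><id>2</id><item>ink</item></order>
</orders>"""


def extract(xml_text, row_path, column_paths):
    """Return one dict per element matched by row_path (relative to the root)."""
    root = ET.fromstring(xml_text)
    rows = []
    for node in root.findall(row_path):
        rows.append({name: node.findtext(path) for name, path in column_paths.items()})
    return rows


columns = {"id": "id", "item": "item"}

# Correct: <order> elements are direct children of the root.
good = extract(SAMPLE, "./order", columns)

# Wrong: the root is already <orders>, so this path matches nothing.
bad = extract(SAMPLE, "./orders/order", columns)

if __name__ == "__main__":
    print(good)
    print(bad)
```

No error is raised in the second case, which is why a wrong expression looks like "nothing comes out" rather than a failure.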

trobinson · Post by **trobinson** » Thu Feb 08, 2007 8:38 am

I happened to read this while searching, and it struck me that the suggestions are way off the mark. I'm sure you must have solved this by now, but I thought I would attempt to clarify the problem for others who see it.
So here goes:
When XML is parsed by the XMLInput stage, it must be read in its entirety, that is, from the root start tag to the root end tag. If link-to-link performance tuning such as in-process or inter-process row buffering is enabled for the job, then a single XML "row" must fit into the buffer for the XMLInput stage. The buffer maximum is 9999K (I think). The buffer default is 128K. If the XML "row" exceeds the configured buffer size, the error that started this thread is the result.
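
To make the arithmetic concrete, here is a minimal Python sketch (not DataStage code). It assumes the row size is simply the document size in bytes and ignores any per-row overhead; the function names are made up for illustration.

```python
def fits_in_row_buffer(row_bytes, buffer_kb):
    """True if a single row of row_bytes fits in a buffer of buffer_kb kilobytes."""
    return row_bytes <= buffer_kb * 1024


def min_buffer_kb(row_bytes):
    """Smallest whole-kilobyte buffer that holds the row (ceiling division)."""
    return -(-row_bytes // 1024)


if __name__ == "__main__":
    small_doc = 200 * 1024   # the 200 KB file from earlier in the thread
    large_doc = 3000 * 1024  # the 3000 KB file from earlier in the thread

    for size in (small_doc, large_doc):
        print(size, fits_in_row_buffer(size, 256), min_buffer_kb(size))
```

Under these assumptions, the 3000 KB document needs a buffer of at least 3000K, which is still under the 9999K maximum I quoted above from memory.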