External Source stage for XML input

Posted: Tue Mar 18, 2008 12:43 pm
by reddy
Hi,

I have a job that uses the XML Input stage to read XML and write it to datasets. I was using the External Source stage to read the XML by specifying URL as the input column content. It was working fine until now. Today when I ran it, I did not get any records as output. The job shows as completed in Director, but no records were exported. I get this warning for the External Source stage:

"<External_Source_8,0> Source subproc: sh: /bin/ls: 0403-027 The parameter list is too long."

I think the problem is with the External Source stage. It was running fine until yesterday. Can anyone please help? Thanks in advance.

reddy

Posted: Tue Mar 18, 2008 12:47 pm
by chulett
It's a UNIX problem, and as noted your list of filenames is 'too long' to be handled. How many XML files exist out there? How long is each pathname?

Posted: Tue Mar 18, 2008 12:57 pm
by reddy
There are 400 XML files in that location. Each file is named after a number, like "12000.xml". All these files are in a separate folder. I was using the same path earlier and it worked. Can you please tell me what I need to do now to resolve this issue? In future there are going to be up to 13000 XML files in that folder. Thank you very much.


reddy

Posted: Tue Mar 18, 2008 1:14 pm
by eostic
There's nothing fancy happening in the External Source stage: assuming you are using some form of "list" command, it merely issues an "ls" and sends the result down the output link. There are probably hundreds of ways to address this requirement, using whatever Unix shell or other commands are needed to obtain the fully qualified list that has to be sent into XMLInput. Whether you do that directly via a single command in the External Source stage, or via a shell script that creates a flat file that you feed into the job later, will depend on other things. I'll leave it to the deeper Unix gurus here to suggest the best ways to capture an enormously large file list.

It would also be interesting to see if you can run the job more frequently, so that fewer filenames accumulate between runs.

Ernie

Posted: Thu Mar 20, 2008 1:54 pm
by jatayl
Instead of using the ls command, use the find command and pipe its output to your input.
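
A minimal sketch of that suggestion. The thread never shows the actual source command, so this assumes it was an ls with a wildcard (something like ls /path/*.xml), which the shell expands into one argument per file before ls even starts. The directory name and the list_xml helper below are hypothetical, and the sample files exist only so the sketch runs on its own.

```shell
#!/bin/sh
# Hypothetical directory; substitute the real folder that holds the XML files.
XML_DIR="${XML_DIR:-/tmp/xml_demo.$$}"

# Demo setup only: create a few empty sample files so the sketch is runnable.
mkdir -p "$XML_DIR"
for n in 12000 12001 12002; do
    : > "$XML_DIR/$n.xml"
done

# The pattern is quoted, so find receives it as a single argument and does
# the matching itself. The shell never builds a long list of filenames,
# which is what triggers "The parameter list is too long".
list_xml() {
    find "$1" -type f -name '*.xml' -print
}

# Prints one fully qualified path per line, ready to feed to XMLInput.
list_xml "$XML_DIR"
```

Note that find also descends into subdirectories, so this matches the ls behaviour only if the folder is flat. The same command could presumably be given directly as the source program in the External Source stage, or redirected to a flat file that the job reads later, as Ernie describes above.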