External Source stage for XML input

reddy
Premium Member
Posts: 168
Joined: Tue Dec 07, 2004 12:54 pm

Post by reddy »

Hi,

I have a job that uses the XML Input stage to read XML and write it to datasets. I use an External Source stage to supply the XML, specifying URL as the input column content. It was working fine until now. When I ran it today, I did not get any output records. The job shows as completed in Director, but no records were exported. The External Source stage logs this warning:

"<External_Source_8,0> Source subproc: sh: /bin/ls: 0403-027 The parameter list is too long."

I think the problem is with the External Source stage. It was running fine until yesterday. Can anyone please help? Thanks in advance.

reddy
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

It's a UNIX problem, and as noted your list of filenames is 'too long' to be handled. How many XML files exist out there? How long is each pathname?
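
To answer those two questions from the command line, something like the following might help. This is only a sketch: the function names are made up for illustration, and it assumes the failure comes from the shell expanding a wildcard into more argument bytes than the system allows. Note that find also descends into subdirectories.

```shell
#!/bin/sh
# Hypothetical helpers for sizing the file list; names are illustrative only.

# Print the system limit (bytes) on arguments plus environment for a new process.
arg_limit() {
    getconf ARG_MAX
}

# Print how many *.xml files live under directory $1.
# The pattern is quoted so find matches it, not the shell.
count_xml_files() {
    find "$1" -type f -name '*.xml' | wc -l | tr -d ' '
}

# Print roughly how many bytes the full pathnames would take as arguments
# (each path plus one separator byte).
xml_list_bytes() {
    find "$1" -type f -name '*.xml' | wc -c | tr -d ' '
}

# Example usage (the directory is a placeholder):
#   arg_limit
#   count_xml_files /path/to/xml/folder
#   xml_list_bytes /path/to/xml/folder
```

If the byte count is anywhere near the limit, a wildcard passed to ls is likely to fail in the way the warning describes.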
-craig

"You can never have too many knives" -- Logan Nine Fingers
reddy
Premium Member
Posts: 168
Joined: Tue Dec 07, 2004 12:54 pm

Post by reddy »

There are 400 XML files in that location. Each file is named after a number, like "12000.xml". All these files are in a separate folder. I was using the same path earlier and it worked. Can you please tell me what I need to do now to resolve this issue? In the future there may be up to 13000 XML files in that folder. Thank you very much.


reddy
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

There's nothing fancy happening in the External Source stage: assuming you are using some form of "list" command, it merely issues an "ls" and sends the result down the output link. There are probably hundreds of ways to address this requirement, using whatever unix shell or other commands are needed to obtain the fully qualified list of filenames that has to be sent into XMLInput. Whether you do that directly via a single command in the External Source stage, or via a shell script that creates a flat file that you feed into the job later, will depend on other things. I'll leave it to the deeper unix gurus here to suggest the best ways to capture an enormously large file list.
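
As one possible shape for the flat-file route, a small script could write the fully qualified names to a list file before the job runs. This is a minimal sketch, not a tested DataStage recipe: the function name and both paths are hypothetical, and it assumes the directory is given as an absolute path so that each line comes out fully qualified.

```shell
#!/bin/sh
# Hypothetical helper: write one XML pathname per line into a list file.
#   $1 = directory holding the XML files (give it as an absolute path)
#   $2 = list file to create or overwrite (keep it outside $1)
write_xml_list() {
    src_dir=$1
    list_file=$2
    # find walks the directory itself, so no wildcard is expanded by the
    # shell and the argument-length limit does not come into play.
    find "$src_dir" -type f -name '*.xml' -print | sort > "$list_file"
}

# Example usage (both paths are placeholders):
#   write_xml_list /path/to/xml/folder /path/to/work/xml_files.lst
```

The job would then read the list file, for instance with a Sequential File stage, and pass each line on as the filename column.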

It would also be interesting to see if you can run the job more frequently, and thus perhaps not have such a large number of filenames collected.

Ernie
Ernie Ostic

jatayl
Premium Member
Posts: 47
Joined: Thu Jan 19, 2006 11:20 am
Location: Rogers, AR

Post by jatayl »

Instead of using the ls command, use the find command and pipe its output to your input.
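
A rough sketch of that substitution follows. The function name and directory are placeholders, and it assumes the current source program is an ls with a wildcard, which the thread does not actually show.

```shell
#!/bin/sh
# Presumed failing form: the shell expands the wildcard into one argument
# per file before ls even starts, which is where the length limit bites.
#   ls /path/to/xml/folder/*.xml
#
# find is handed only the directory and a quoted pattern, and does the
# matching itself, so the number of files no longer matters.
list_xml_files() {
    find "$1" -type f -name '*.xml' -print
}

# Example usage (the directory is a placeholder):
#   list_xml_files /path/to/xml/folder
```

The find command inside the function is what would go into the External Source stage as the source program. Be aware that find also descends into subdirectories, which ls with a wildcard does not.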