Sequential file stage to read the multiple xml files in one

madhav62
Premium Member
Posts: 15
Joined: Sun Aug 15, 2010 9:26 pm

Post by madhav62 »

Hi,
I have designed a parallel job that uses a Sequential File stage to read an XML document.
In production, however, it is supposed to read multiple XML documents. I tried the File Pattern option but wasn't successful.

The workflow: every 6 hours, XML documents are stored in a server location, and they are meant to be processed in DataStage.
The file names look like this:
META.INVOICEDOC.invoiceServices1-urlmsg01.625.200601011541687341.xml
META.INVOICEDOC.invoiceServices2-urlmsg01.694.200601011541147341.xml

META.INVOICEDOC is common to all the files.
invoiceServices may change.
The next three digits are system generated, followed by a dot (.) and a timestamp that starts with yyyymmdd and ends with another system-generated number, followed by .xml.

Can anyone suggest a file pattern for this?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

What "wasn't successful" about your file pattern attempt? Switch to using an External Source stage as detailed here.
-craig

"You can never have too many knives" -- Logan Nine Fingers

Post by madhav62 »

chulett wrote: What "wasn't successful" about your file pattern attempt? Switch to using an External Source stage as detailed here ...
I used ls -l /filelocation/META.INVOICEDOC.* for the Sequential File stage.

I tried the External Source stage, but it shows an error in args().

Post by chulett »

Well... you can't use the Sequential File stage to reliably read XML files. Follow the link to Ernie's blog and use the ESS to get a list of the filenames only and then set the XML Input stage to do the actual reading of the files. Works way more better. Almost as good as the Folder stage in a Server job. :wink:
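
As an illustration of "a list of the filenames only", here is a minimal sketch of a command that could serve as the source program for the External Source stage. The function name list_invoice_files is hypothetical, and the linked blog may use a different command. Note that there is no -l: the ls -l tried earlier adds permissions, sizes and dates to every line, so its output is no longer a bare path that the XML Input stage could open.

```shell
# Hypothetical helper: print one full path per line for every invoice XML file.
# $1 is the directory holding the files (for example, the value of a job parameter).
list_invoice_files() {
  # The shell expands the wildcard; plain ls (no -l) then prints each matching
  # path on its own line when its output is not a terminal.
  ls "$1"/META.INVOICEDOC.*.xml
}
```

Each output line is then a complete path to one file, which is the form the XML Input stage needs when it is set to the "URL/File path" option.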
-craig

Post by madhav62 »

Hi chulett, my concern is that to read the files I need to give the ls command,
like: ls #filePath#/Meta.Invoice.*
because the system generates the file names automatically.
I don't have any problem reading a single file using the Sequential File or External Source stage; my concern is how to do it for multiple files.
:!:

Post by chulett »

One or multiple files, doesn't make any difference. :?

The details are in the blog I linked you to and they've been posted here a bajillion times; search for them if you want to see what others are doing. All you should be delivering to the XML Input stage are the filenames, and the stage should be set to the "URL/File path" option. Do that correctly and it will process 1 or 100,000 files, no problem.

If you are still having problems, be specific. Tell us exactly how you have everything set up and exactly what your specific error message(s) are, then maybe someone can provide more specific help.
-craig

Post by madhav62 »

Design:
Sequential File stage ----> XML Input ----> Data Set
Table definitions:
Data LongVarChar 9999
XML schema definition.
Sequential File settings:
File pattern: #FilePath#/META.INVOICEDOC.*
Error: file not found in directory.
One more thing: apart from the invoice XML, I get one more kind of XML, claims:
META.CLAIMDOC.----->xml, delivered to the same folder.


***How do i get a list of the filenames only***

Post by chulett »

madhav62 wrote:***How do i get a list of the filenames only***
That's in the blog I linked you to. Did you read it? :?

You'll also need to fix your "file pattern" as it doesn't seem to be finding any files.
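
One way to narrow down a pattern that finds nothing is to test the directory and the pattern separately from the command line before putting them into the job. The sketch below is only an illustration: count_matches is a hypothetical helper, and -maxdepth assumes a find that supports it (GNU and BSD find do). Patterns are case-sensitive on UNIX, and the ls command quoted earlier in the thread uses Meta.Invoice.*, which differs from META.INVOICEDOC in both case and spelling.

```shell
# Hypothetical helper: report how many regular files in directory $1
# match the file name pattern $2.
count_matches() {
  if [ ! -d "$1" ]; then
    echo "directory not found: $1" >&2
    return 1
  fi
  # Quoting "$2" keeps the shell from expanding the pattern; find does the matching.
  # tr strips the padding that some wc implementations add around the count.
  find "$1" -maxdepth 1 -type f -name "$2" | wc -l | tr -d ' '
}
```

A count of 0 for a directory that does exist points at the pattern; a "directory not found" message points at the value of the path parameter.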

As to your "one more thing" - you need to have all of the proper metadata / XPath expressions generated for each of your XML files. Typically one would not process two completely different files at once. I imagine it could be done if you split to two XML Input stages or know what you are doing XPath-wise, but typically there would be two jobs, since there are two different sources and (I imagine) two different targets. Or are you somehow planning on merging your invoice and claim data together simultaneously?
-craig

Post by madhav62 »

I used the same expression mentioned in the article, with an External Source stage instead of the Sequential File stage, and it gave me an error saying the argument list is too long.
I worked around it, and now it's importing 303 files, but only the file names, not the XML data.
When I hit View Data, it shows just the file names, not the content of the files.
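
The thread does not say what the workaround was. For reference, "argument list too long" usually means the shell expanded a wildcard into more arguments than the operating system allows for a single command. A common way around that, sketched here under that assumption, is to let find do the matching so the pattern is never expanded on the command line. The function name find_invoice_files is hypothetical, and -maxdepth assumes GNU or BSD find.

```shell
# Hypothetical helper: list invoice XML files without shell wildcard expansion.
# $1 is the directory holding the files.
find_invoice_files() {
  # The quoted pattern reaches find as a single argument, so the number of
  # matching files has no effect on the length of the command line.
  # -maxdepth 1 keeps find from descending into subdirectories.
  find "$1" -maxdepth 1 -type f -name 'META.INVOICEDOC.*.xml' | sort
}
```

The output is still one full path per line, sorted by name.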

Post by chulett »

That's how it works. All it does is supply the file pathnames to the XML Input stage and that stage does the actual reading - as long as you selected the URL / File path option in the XML Input stage. Did you? What happens when the job runs?
-craig

Post by madhav62 »

Yes, it's working. Thank you :)