multiple files into a single seq file???

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

ushasunkara
Participant
Posts: 23
Joined: Wed Jan 18, 2006 10:43 am

multiple files into a single seq file???

Post by ushasunkara »

Hello all,
What is the best stage to use for joining or adding multiple XML files into a single seq file
in parallel jobs (there is no Folder stage),
from where this seq file -> XML Input stage?
I thought it was File Set, but I've never used the File Set stage. If that is the stage, how do I go about it?
Thanks a lot in advance,
Usha
kwwilliams
Participant
Posts: 437
Joined: Fri Oct 21, 2005 10:00 pm

Re: multiple files into a single seq file???

Post by kwwilliams »

So you have multiple files that you want to use to create one data set, like a union? If so, you would use the Funnel stage. If that's not what you want, could you clarify what you are trying to accomplish? Describe how many files you have and what you would like to happen to the data in those files.
Raog
Participant
Posts: 8
Joined: Thu Oct 13, 2005 8:53 am

Re: multiple files into a single seq file???

Post by Raog »

Usha:

Can you give more details about the file content, i.e. whether all the files have the same XML type or different XML schemas?

I think that if they all have different schema definitions, you had better go for the Join or Merge stage.

Rgds,
Rao.
ameyvaidya
Charter Member
Posts: 166
Joined: Wed Mar 16, 2005 6:52 am
Location: Mumbai, India

Post by ameyvaidya »

Hi Usha,
The Sequential File stage supports reading both a single file and multiple files (given a file pattern).
The property to look for is:

Read Method, specify whether to read specific files (the
default) or all files whose name fits a pattern.

Parjdev.pdf Page 144

The caveat here is that all the files must have the same metadata.
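
As a rough way to check that caveat before reading several XML files as one input, the sketch below compares the element layout of a few documents. It is only an illustration in plain Python, not anything DataStage does itself: the helper name `element_paths` and the sample documents are hypothetical, and "same metadata" is approximated here as "same set of element paths".

```python
import xml.etree.ElementTree as ET

def element_paths(xml_text):
    """Return the set of element paths (e.g. 'customers/customer/id') in a document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        # Build the path of this element and record it, then visit its children.
        path = f"{prefix}/{node.tag}" if prefix else node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths

# Hypothetical sample documents: doc_a and doc_b share a layout, doc_c does not.
doc_a = "<customers><customer><id>1</id><name>A</name></customer></customers>"
doc_b = "<customers><customer><id>2</id><name>B</name></customer></customers>"
doc_c = "<customers><customer><id>3</id><city>X</city></customer></customers>"

same_ab = element_paths(doc_a) == element_paths(doc_b)
same_ac = element_paths(doc_a) == element_paths(doc_c)
print(same_ab, same_ac)
```

A check like this ignores attributes and data types, so it can only flag obvious layout differences.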
IHTH
Amey Vaidya
I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.
- Douglas Adams
ushasunkara
Participant
Posts: 23
Joined: Wed Jan 18, 2006 10:43 am

Post by ushasunkara »

Hi all,
Thanks for all your opinions.
ameyvaidya, I'm looking for a wildcard option there. With the property Read Method set to File Pattern, how do I give this expression?
For a specific file I give the pathname of the file location,
so for a file pattern, what path do I give? The path of the folder location?
Correct me if I am wrong: what I understood is that in Server edition the Folder stage can have many files, and when you link it to a flat file you get the names of the files,
but I need this here, in Parallel edition.
I have 3 XML files of the same type, just from different sources. The columns would be the same, and I need all 3 XML files written into one single seq file,
so that after the seq file, the XML Input stage can easily take the whole XML file and go ahead with the other transformations.

So the design is seqfile --> XML Input stage --> different ODBC stages.

This seq file should take 3 XML files with the same content, just from different source systems (company1, company2, company3), into one single seq file.

Please let me know what I need to use in this case.
Thank you so much.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

You can develop a job in PX using multiple instances, provided the target is the same and the source files are of the same type (i.e. seq, xml, etc.).
Then load the record schema file, which overrides any settings in the Format and Columns tabs. You can keep multiple record schema files for your sources in your Unix directory.
This design would save you from building multiple jobs of the same kind.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ameyvaidya
Charter Member
Posts: 166
Joined: Wed Mar 16, 2005 6:52 am
Location: Mumbai, India

Post by ameyvaidya »

I haven't had to work with XML files, but here's how to go about it:

Directory /home/dsadm/infiles

File 1 name: DB_Cust_XML_1.xml
File 2 name: DB_Cust_XML_2.xml
File 3 name: DB_Cust_XML_3.xml
File 4 name: DB_Cust_XML_4.xml
File 5 name: DB_Cust_XML_5.xml
File 6 name: DB_Cust_XML_6.xml
File 7 name: DB_Cust_XML_7.xml

Set the Sequential File stage to read from a file pattern.
Set the File property to:
/home/dsadm/infiles/DB_Cust_XML_*.xml
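
To see which names a wildcard like the one above would pick up, here is a minimal sketch in plain Python. It assumes the pattern behaves like an ordinary shell-style wildcard; how the stage itself expands the pattern is not shown here. The extra name `DB_Order_XML_1.xml` is hypothetical and is included only to show a non-match.

```python
from fnmatch import fnmatchcase

# The seven file names from the example, plus one hypothetical name that should not match.
names = [f"DB_Cust_XML_{i}.xml" for i in range(1, 8)] + ["DB_Order_XML_1.xml"]

# File-name part of the pattern used in the File property.
pattern = "DB_Cust_XML_*.xml"

# Case-sensitive shell-style matching: '*' stands for any run of characters.
matched = [n for n in names if fnmatchcase(n, pattern)]
print(matched)
```

Only the seven `DB_Cust_XML_` files are selected; anything else in the directory is left out.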

Also, if your file names do not fit any pattern: while reading up on the Sequential File stage I came across this:

File
This property defines the flat file that data will be read from. You can type in a pathname, or browse for a file. You can specify multiple files by repeating the File property. Do this by selecting the Properties item at the top of the tree, and clicking on File in the Available properties to add window. Do this for each extra file you want to specify.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Tip: If you use multiple File property values, the Read Method to use is Specific File(s), not File Pattern.
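
The difference between the two approaches can be sketched outside DataStage in plain Python. This is an analogy only: the helper names `files_by_pattern` and `files_by_list` and the sample file names are hypothetical. A pattern selects whatever happens to match at run time, while an explicit list names each file and keeps the order you gave.

```python
import glob
import os
import tempfile

def files_by_pattern(pattern):
    """Every file whose path fits the wildcard pattern, in sorted order."""
    return sorted(glob.glob(pattern))

def files_by_list(paths):
    """Exactly the named files, in the order given; fail if one is missing."""
    missing = [p for p in paths if not os.path.isfile(p)]
    if missing:
        raise FileNotFoundError(f"missing input files: {missing}")
    return list(paths)

# Demo in a temporary directory with hypothetical file names.
with tempfile.TemporaryDirectory() as d:
    for name in ("company1.xml", "company2.xml", "company3.xml", "notes.txt"):
        with open(os.path.join(d, name), "w") as f:
            f.write("<root/>")

    by_pattern = files_by_pattern(os.path.join(d, "company*.xml"))
    by_list = files_by_list(
        [os.path.join(d, "company3.xml"), os.path.join(d, "company1.xml")]
    )
    pattern_names = [os.path.basename(p) for p in by_pattern]
    list_names = [os.path.basename(p) for p in by_list]

print(pattern_names)
print(list_names)
```

The explicit list is the safer choice when the file names share no common pattern or when only some of the files in a directory should be read.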
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ushasunkara
Participant
Posts: 23
Joined: Wed Jan 18, 2006 10:43 am

Post by ushasunkara »

sorry for the late reply...
thank you so much...ameyvaidya and ray...
it works :D
thank you...