Loading multiple files using the same job

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

tsktsk123
Participant
Posts: 32
Joined: Thu Dec 25, 2003 11:59 am

Post by tsktsk123 »

Hi,

We have a requirement to load from multiple source files. The source system sends data based on extract date; if the extract date is earlier than the sysdate, we need to load the file as part of the reload process. For example, if the sysdate is 05-Dec-2008 and we receive three files from the same source system ABC with extract dates 02-Dec-2008, 03-Dec-2008 and 04-Dec-2008, we need to extract data from all three files and load it to the target. We currently have a job that extracts data from a single file and loads it to the target; I need your help to understand how it can be extended to handle multiple files. The number of files is dynamic: sometimes the source sends a single file for reload, other times two or three. Currently a UNIX script checks whether the reporting date is earlier than the sysdate, and if so the file is processed as a reload file.

Appreciate any help on this.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

How are you currently identifying the files with a previous date?
If it's based on the file name, you can still use the script to consolidate the files and pass the result to DataStage. Or execute the same script from DataStage.
If the files are identified by their data, use DataStage and discard the unnecessary records.
You can use the Filter option in the Sequential File stage to read multiple files with a wildcard.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
bkumar103
Participant
Posts: 214
Joined: Wed Jul 25, 2007 2:29 am
Location: Chennai

Post by bkumar103 »

You can use a file pattern to read multiple files, provided the metadata is the same for all of them.
Birendra
tsktsk123
Participant
Posts: 32
Joined: Thu Dec 25, 2003 11:59 am

Post by tsktsk123 »

Hi,

Thanks for your reply. We are getting three control files and three data files for three different dates. We need to load the extract date from the control file, so we can't use a file pattern, as there is one control file for each data file.
bkumar103 wrote: You can use a file pattern to read multiple files, provided the metadata is the same for all of them.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

How about a job sequence with a StartLoop activity specifying the "list of things" (file names), possibly initialized by an upstream command or routine into a user variable?
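
A minimal sketch of how such a "list of things" might be produced by an upstream command, assuming the loop is set up to iterate over a delimited list. The file name pattern `ABC_*.dat` and the function name `build_loop_list` are hypothetical and only for illustration; the thread does not state the real naming convention.

```python
from pathlib import Path


def build_loop_list(directory, pattern="ABC_*.dat", delimiter=","):
    """Return the names of matching files as one delimited string.

    The pattern is an assumed naming convention, not one given in the thread.
    Names are sorted so that the loop processes them in a stable order.
    """
    names = sorted(p.name for p in Path(directory).glob(pattern) if p.is_file())
    return delimiter.join(names)
```

Printing the returned string from a small script would let the sequence capture it as the command output and hand it to the loop as its list of values.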
Last edited by ray.wurlod on Tue Apr 15, 2008 6:44 pm, edited 1 time in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Do the control files at least follow any pattern?
Read all the control files and check the date in each.
I believe the control file will have the date and the data file name.
Based on the date, create a SourceProgram file for the External Source stage, which simply contains a cat <filename> line for each file, in sequence.
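
The steps above could be sketched roughly as follows. This assumes, purely for illustration, that each control file holds `key=value` lines with the keys `extract_date` (in DD-Mon-YYYY form, as in the original question) and `data_file`; the real control file layout is not given in the thread, and all function names here are hypothetical.

```python
import shlex
from datetime import date
from pathlib import Path

# Month abbreviations are mapped by hand so parsing does not depend on locale.
MONTHS = {name: number for number, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}


def parse_extract_date(text):
    """Convert a string such as '02-Dec-2008' into a date."""
    day, month, year = text.strip().split("-")
    return date(int(year), MONTHS[month.lower()], int(day))


def read_control_file(path):
    """Parse 'key=value' lines (an assumed layout) into a dict."""
    fields = {}
    for line in Path(path).read_text().splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            fields[key.strip()] = value.strip()
    return fields


def reload_files(control_dir, today=None, pattern="*.ctl"):
    """Return (extract_date, data_file) pairs dated before today, oldest first."""
    today = today or date.today()
    selected = []
    for ctl in Path(control_dir).glob(pattern):
        fields = read_control_file(ctl)
        extract_date = parse_extract_date(fields["extract_date"])
        if extract_date < today:
            selected.append((extract_date, fields["data_file"]))
    return sorted(selected)


def source_program_text(pairs):
    """Build script text with one 'cat <filename>' line per selected data file."""
    return "".join(f"cat {shlex.quote(name)}\n" for _, name in pairs)
```

Writing the result of `source_program_text(reload_files(...))` to a file gives the list of `cat` commands described above. Note that concatenating the data files this way drops the link between each record and its extract date unless the date is also present in the data itself.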
Rubu
Premium Member
Posts: 82
Joined: Sun Feb 27, 2005 9:09 pm
Location: Bangalore

Post by Rubu »

Hi tsktsk123

I believe handling multiple files in the same job is not a good idea if you are maintaining process control metadata for the status of each processed file. The solution will depend on how the jobs are scheduled.

1. If you are scheduling through a scheduling tool (such as Control-M), event-based triggering on file arrival will solve the problem. Hopefully the date is part of the file names.

2. Otherwise, the loop activity Ray suggested is a good idea. You can write a routine to get the file names as they arrive, return them to a User Variable and use that inside the loop.
Regards
Palas
JoshGeorge
Participant
Posts: 612
Joined: Thu May 03, 2007 4:59 am
Location: Melbourne

Post by JoshGeorge »

Amend the Unix script to read the control files and call your existing job as multiple instances.
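
A rough sketch of the command lines such a script might build, one per reload file. It assumes the existing job has been made multi-instance and takes two job parameters; the parameter names `ExtractDate` and `SourceFile`, the use of the extract date as the invocation ID, and the function name are all hypothetical. The sketch only constructs the `dsjob` argument lists and does not run them.

```python
def dsjob_commands(project, job, loads):
    """Return one dsjob argument list per (extract_date, data_file) pair.

    extract_date is expected to be a datetime.date. Each run gets its own
    invocation ID (the extract date as YYYYMMDD) so that instances of the
    same job can be told apart.
    """
    commands = []
    for extract_date, data_file in loads:
        invocation_id = extract_date.strftime("%Y%m%d")
        commands.append([
            "dsjob", "-run", "-jobstatus",
            # Parameter names below are assumptions, not taken from the thread.
            "-param", f"ExtractDate={extract_date.isoformat()}",
            "-param", f"SourceFile={data_file}",
            project, f"{job}.{invocation_id}",
        ])
    return commands
```

Each list could then be passed to `subprocess.run` in turn, or launched in parallel if the target can cope with concurrent loads.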
Joshy George
tsktsk123
Participant
Posts: 32
Joined: Thu Dec 25, 2003 11:59 am

Post by tsktsk123 »

Hello All,

Thanks for all your help on this. I will try to implement the solutions you have suggested and will let you know if I need any further help.

thanks