Page 1 of 1

Dynamically create as many files as values in a column

Posted: Sun Apr 08, 2012 2:58 am
by jerome_rajan
Hi,

I came across this interesting requirement. Pardon me if this has already been discussed but I couldn't find it anywhere.

I have a source table with columns:

Name, Age, Location

The Location column can contain any value and is not fixed. For e.g. It may contain New York. it may also contain US. So it can be a city, state, country or even a district.

The requirement is simply to load records belonging to different locations to different files. So if I have 20 locations, the job would have to determine the number of values in Location at runtime and create as many files.

I have devised a logic to implement this using a UNIX script and a looped sequencer. But I was wondering if there's an easier way to implement this.

Help Appreciated

Posted: Sun Apr 08, 2012 6:41 am
by chulett
Off the top of my head, a Server job (or Server Shared Container) writing to the Folder stage or using the UNIX csplit function after the file has been written could do this.

Posted: Sun Apr 08, 2012 9:26 am
by jerome_rajan
I'm not sure if I'd want to use server components in my parallel job. Will look into the UNIX command. Thanks you!

Posted: Sun Apr 08, 2012 2:00 pm
by chulett
In the immortal words of Blue Oyster Cult: Don't Fear the Server.

:wink:

Posted: Sun Apr 08, 2012 11:05 pm
by jerome_rajan
ha...i am not yet a seasoned campaigner to justify the presence of a server component in my jobs! :(

Re: Dynamically create as many files as values in a column

Posted: Sun Apr 08, 2012 11:31 pm
by vamsi.4a6
Can you explain the logic in simple steps how you implmented the above requirement by using Looping and Unix?

Posted: Mon Apr 09, 2012 6:26 am
by jerome_rajan
1. Create a single job with a source connector that extracts data for one location. This condition should be parametrized (e.g. WHERE LOCATION=#jp_Location_Nm#).

2. Implement your logic and create a file where again the file's name should be parametrized (e.g. File Name= Target_#jp_Location_Nm#).

3. Now create a looped sequencer with one UNIX script out of the loop. The UNIX script should execute a query against the DB to extract a unique list of all locations and store it in a file (delimited)

4. The file created in the above step gets passed in the Start Loop activity which will pick up one location at a time from the file and pass it to the parameter #jp_Location_Nm#.

5. The sequencer loop gets executed as many times as there are locations in the file and a separate file gets created for each location.

I know that this logic sounds complicated but is actually pretty easy to implement. There might be an easier way out but I can't think of any.