Dynamic Output File Names

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

wjfitzgerald
Participant
Posts: 72
Joined: Tue Feb 05, 2008 4:38 am

Dynamic Output File Names

Post by wjfitzgerald »

Hi,

I have spent the last day or so trawling previous posts but cannot find anything to satisfy my requirement. If any one can help I would really appreciate it.

I need to extract a large volume of data from a DB and then split the data into appropriately named files.

For the purposes of the exercise, the task can be simplified as follows:
1. extract all orders sorting by customer number
2. create a new output file for each customer, such that for customer C002 the file name would be EXT/CUST/C002

Obviously the "EXT/CUST/" is fine. I just cannot see how to get the C002 on to the end.

Thanks in advance for any suggestions or help.
John Fitz
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

There is no easy, built-in way to do this in DataStage. There are ways to code it that are more or less complicated. If you have a known and limited number of output files, you can hard-code the paths in a DataStage job.
If your list of customers/output files changes over time or is large, then this method won't work and you will need another approach. Which one really depends more on your particular skill set or site standards than anything else.
I like DS/BASIC and would use a custom routine from a server job that reads the file and uses the OPENSEQ/WRITESEQ commands to direct the output depending on the data.
A similarly designed routine written in C++ and called from a PX job could do effectively the same thing.
You can also use UNIX shell scripting to do this, either with Perl, awk, sed, or even iterations of simple cat & grep commands.
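As a sketch of the shell approach: assuming the extract lands in a pipe-delimited file (here called orders.txt, with the customer number in the first field; both the file name and the layout are assumptions for illustration), a single awk pass can route each row to its customer's file:

```shell
# Hypothetical input: orders.txt, pipe-delimited, customer number in field 1.
mkdir -p EXT/CUST
printf 'C001|widget|5\nC001|gadget|2\nC002|sprocket|9\n' > orders.txt

# Route each row to EXT/CUST/<customer>. Because the extract is sorted by
# customer, the previous file can be closed as soon as the key changes,
# keeping only one output file handle open at a time.
awk -F'|' '{
    if ($1 != prev) { if (prev != "") close("EXT/CUST/" prev); prev = $1 }
    print > ("EXT/CUST/" $1)
}' orders.txt
```

Because the input is sorted by customer, this scales to any number of customers without running into the per-process open-file limit.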
battaliou
Participant
Posts: 155
Joined: Mon Feb 24, 2003 7:28 am
Location: London
Contact:

Post by battaliou »

Try a type 1 hashed file. Key the hashed file on customer number and call the hashed file EXT/CUST.
3NF: Every non-key attribute must provide a fact about the key, the whole key, and nothing but the key. So help me Codd.
wjfitzgerald
Participant
Posts: 72
Joined: Tue Feb 05, 2008 4:38 am

Post by wjfitzgerald »

Any chance you could expand on the idea of the hashed file approach, please? I'm fairly new to all this so not too clued in, I'm afraid.
John Fitz
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Martin - GREAT suggestion, that will work. The OP didn't mention whether they are willing to use a server job, but if so, using a type 1 or 19 hashed file and updating/appending records will work like a charm. Both types are stored as operating-system directories in which each record is a separate file named by its key, so writing records keyed on customer number to a hashed file called EXT/CUST produces exactly the EXT/CUST/C002-style files required. The only downside is performance, but the coding will take only minutes.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Prefer type 19 if the names are going to exceed 14 characters in length.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
JoshGeorge
Participant
Posts: 612
Joined: Thu May 03, 2007 4:59 am
Location: Melbourne

Post by JoshGeorge »

For a parallel job, a parallel (C++) routine that creates files dynamically is the best way to do this in a single job. Aggregate/collect all the records for each customer in a Transformer using stage variables (appending a newline character to each record), use a Remove Duplicates stage, and pass each customer's record to a final Transformer stage that calls the parallel routine. The file name can also be passed dynamically. See if this POST helps.
Joshy George
wjfitzgerald
Participant
Posts: 72
Joined: Tue Feb 05, 2008 4:38 am

Post by wjfitzgerald »

Thanks, that looks very promising.
John Fitz
jdmiceli
Premium Member
Posts: 309
Joined: Wed Feb 22, 2006 10:03 am
Location: Urbandale, IA

Post by jdmiceli »

Depending on your scripting skills, this would be a breeze in Perl. Extract the data to a sequential file ordered by customer number, then parse the rows by that field location into separate files. It would be wicked fast and could be fired off from a DS job if you wanted.

Just a thought!
Bestest!

John Miceli
System Specialist, MCP, MCDBA
Berkley Technology Services


"Good Morning. This is God. I will be handling all your problems today. I will not need your help. So have a great day!"