Hi,
I have spent the last day or so trawling previous posts but cannot find anything that satisfies my requirement. If anyone can help I would really appreciate it.
I need to extract a large volume of data from a DB and then split the data into appropriately named files.
For the purposes of the exercise, the task can be simplified as follows:
1. extract all orders sorting by customer number
2. create a new output file for each customer, such that for customer C002 the file name would be EXT/CUST/C002
Obviously the "EXT/CUST/" is fine. I just cannot see how to get the C002 on to the end.
Thanks in advance for any suggestions or help.
Dynamic Output File Names
Moderators: chulett, rschirm, roy
- Participant
- Posts: 72
- Joined: Tue Feb 05, 2008 4:38 am
John Fitz
There is no easy, built-in way to do this with DataStage. There are ways to code it that are more or less complicated. If you have a known, limited number of output files, you can hard-code the paths in a DataStage job.
If your list of customers/output files changes over time or is large then this method won't work and you will need to use another approach. This approach really depends more on your particular skill set or site standards than anything else.
I like DS/BASIC and would use a custom routine from a server job that reads the file and uses the OPENSEQ/WRITESEQ commands to direct the output depending on data.
A similarly designed routine written in C++ and called from a PX job could effectively do the same thing.
You can also use UNIX shell scripting to do this - either with perl, awk, sed or even iterations of simple cat & grep commands.
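To make the shell-scripting option concrete, here is a minimal awk sketch of the split. The file layout is an assumption for illustration: a comma-delimited extract with the customer number in the first field, already sorted by customer as the OP describes.

```shell
#!/bin/sh
# Sample extract: comma-delimited, customer number in field 1,
# already sorted by customer (layout assumed for this sketch).
cat > orders.csv <<'EOF'
C001,ORD100,10.00
C001,ORD101,12.50
C002,ORD102,99.95
EOF

mkdir -p EXT/CUST

# One pass through the sorted extract: route each row to
# EXT/CUST/<customer number>, e.g. EXT/CUST/C002.
awk -F',' '
{
    out = "EXT/CUST/" $1
    if (out != prev && prev != "")
        close(prev)    # input is sorted, so the previous file is finished
    prev = out
    print > out
}' orders.csv
```

Because the input is sorted by customer, only one output file is open at a time, which keeps a very large customer list from hitting awk's open-file limit.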
- Participant
- Posts: 72
- Joined: Tue Feb 05, 2008 4:38 am
Martin - GREAT suggestion, that will work. The OP didn't mention whether they are willing to use a server job, but if so, using a type 1 or 19 hashed file and updating/appending records will work like a charm (type 1 and 19 hashed files are directory files, so each record lands on disk as a separate operating-system file named by its key). The only downside is performance, but the coding will take only minutes.
- Participant
- Posts: 612
- Joined: Thu May 03, 2007 4:59 am
- Location: Melbourne
For Job Type: Parallel, a parallel (C++) routine that creates files dynamically is the best way to do this in a single job. Aggregate all the records for each customer in a Transformer stage using stage variables (appending a newline character for each record), use a Remove Duplicates stage to keep one row per customer, then pass each customer's row to a final Transformer stage that calls the parallel routine. The file name can also be passed in dynamically. See if this POST helps.
Joshy George
- Participant
- Posts: 72
- Joined: Tue Feb 05, 2008 4:38 am
Depending on your scripting skills, this would be a breeze in Perl. Extract the data to a sequential file ordered by customer number, then parse the rows by that field location into separate files. It would be wicked fast and could be fired off from a DS job if you wanted.
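The post suggests Perl, but the same extract-then-split idea can be sketched in plain shell as well. The file names, delimiter, and field order below are assumptions for illustration; the `sort` step supplies the "ordered by customer number" extract, and the loop appends each row to its customer's file.

```shell
#!/bin/sh
# Sample unsorted dump (layout assumed for illustration).
cat > raw_extract.csv <<'EOF'
C002,ORD102,99.95
C001,ORD100,10.00
C001,ORD101,12.50
EOF

# Start clean so reruns don't append the same rows twice.
rm -rf EXT/CUST
mkdir -p EXT/CUST

# Order by the customer field, then route each row to its file.
sort -t',' -k1,1 raw_extract.csv |
while IFS=',' read -r cust rest; do
    printf '%s,%s\n' "$cust" "$rest" >> "EXT/CUST/$cust"
done
```

As the post says, a script like this could be fired off from a DS job once the extract lands on disk.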
Just a thought!
Bestest!
John Miceli
System Specialist, MCP, MCDBA
Berkley Technology Services
"Good Morning. This is God. I will be handling all your problems today. I will not need your help. So have a great day!"