Adding a Flush record to the end of a file using DataStage

abhilashnair · Post by **abhilashnair** » Fri Jun 19, 2009 4:39 am

I need to convert a Unix shell script into a DS PX job. The shell script takes a fixed-width file as input, sorts it, and then adds an extra flush record to the end of the sorted output file. The extra record is nothing but spaces in all fields. For example, if the input file is 100 bytes wide and has 100 rows, the Unix shell script creates a sorted output file containing 101 rows, the last row being 100 spaces. How can this be done in a DS job?
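To pin down the required behaviour, here is a minimal Python sketch of what the shell script is described as doing. The function name `sort_and_append_flush` is hypothetical, and sorting on the whole record is an assumption, since the post does not say which columns form the sort key.

```python
def sort_and_append_flush(records, width):
    """Sort fixed-width records, then append one all-space flush record.

    Assumption: records are sorted on their full content; the original
    script's sort key is not stated in the post.
    """
    for record in records:
        if len(record) != width:
            raise ValueError(f"expected {width} characters, got {len(record)}")
    # The flush record is appended after sorting, so it is always last.
    return sorted(records) + [" " * width]


if __name__ == "__main__":
    result = sort_and_append_flush(["bbb", "aaa", "ccc"], 3)
    for row in result:
        print(repr(row))
```

The point to carry over to a DS job is the order of operations: the space-only row is added after the sort, so it never takes part in the sort itself.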

ArndW · Post by **ArndW** » Fri Jun 19, 2009 5:01 am

The simplest way to do this is to call an after-job subroutine that shells out to UNIX and issues the command to append those spaces to the file.

abhilashnair · Post by **abhilashnair** » Fri Jun 19, 2009 5:42 am

This option was considered initially but then rejected. Is there any way to do it in the job itself?

balajisr · Post by **balajisr** » Fri Jun 19, 2009 5:46 am

abhilashnair wrote: This option was considered initially but then rejected. Is there any way to do it in the job itself?

Why was it rejected? Knowing the reason may help us suggest a solution.

Sreenivasulu · Post by **Sreenivasulu** » Fri Jun 19, 2009 6:17 am

The suggestion by ArndW is a simple solution for this problem. A DataStage solution could be complex, and most probably you can do it only in a server job (i.e. not in a parallel job).

Regards
Sreeni

ArndW wrote: The simplest way to do this is to call an after-job subroutine that shells out to UNIX and issues the command to append those spaces to the file. ...

ArndW · Post by **ArndW** » Fri Jun 19, 2009 7:31 am

Another solution is to add a merge stage to append a line to your output; just ensure that the order is set correctly. This line would be created using a row generator stage.

rcanaran · Post by **rcanaran** » Fri Jul 24, 2009 11:51 am

ArndW wrote: Another solution is to add a merge stage to append a line to your output; just ensure that the order is set correctly. This line would be created using a row generator stage.

I was looking for something similar to finish off a vertical pivot being coded in a transformer (7.5.1 Parallel).

I used a row generator stage to generate the single row, then a transformer to overwrite the key value with high values (not COBOL/mainframe high values, i.e. hex 'FF's, but the upper end of the valid range for the datatype). The stream gets sorted later by key, and I needed to ensure the generated row is the last record after the sort.

The last transformer performed the vertical pivot logic, writing out only the previously accumulated record on a key change. I needed this "last record" to flush out the real last accumulated data record.
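The reason the extra row is needed can be sketched in Python. This is only an illustration of key-change pivot logic, not the actual transformer derivations: the names `pivot_on_key_change` and `HIGH_KEY` are hypothetical, and string keys with `"\uffff"` as the high value are assumptions made for the example.

```python
# Hypothetical high value: sorts after any ordinary key used in this example.
HIGH_KEY = "\uffff"


def pivot_on_key_change(rows):
    """Collect values per key, emitting a group only when the key changes.

    rows: (key, value) pairs already sorted by key.
    The group that is still being accumulated when the input ends is never
    emitted, which mirrors a transformer that writes only on a key change.
    """
    output = []
    current_key = None
    accumulated = []
    for key, value in rows:
        if current_key is not None and key != current_key:
            output.append((current_key, accumulated))
            accumulated = []
        current_key = key
        accumulated.append(value)
    return output


data = [("a", 1), ("b", 2), ("a", 3)]
without_flush = sorted(data, key=lambda r: r[0])
with_flush = sorted(data + [(HIGH_KEY, None)], key=lambda r: r[0])

print(pivot_on_key_change(without_flush))  # group "b" is missing
print(pivot_on_key_change(with_flush))     # group "b" is flushed out
```

Because the generated row carries the highest key, it sorts last, triggers one final key change, and is itself never written out.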

ray.wurlod · Post by **ray.wurlod** » Fri Jul 24, 2009 7:32 pm

There may be an "end of data" token available in a future release (8.5?).

datisaq · Post by **datisaq** » Sun Jul 26, 2009 8:40 am

You can use a row generator to generate the last row (with the same metadata) and combine it with the original dataset using a funnel stage. The funnel stage has an option to append the first dataset to the output and then the second (the row generator).

DS experts, please correct me if I'm wrong.

rcanaran · Post by **rcanaran** » Sun Jul 26, 2009 8:26 pm

I did indeed have a funnel after the row generator and transformer. The funnel can physically insert the row as the last one. But a subsequent stage needs the input sorted by a key field, and a key value randomly generated by the row generator would then have ended up in an unpredictable position. I needed to guarantee that the row ended up last before going to that stage. I'm sure I could have funneled in the generated row at a different point in the DS job to avoid the sort, but for other reasons it was better to have the funnel at that particular point.

Also, I think someone elsewhere in this thread mentioned that they couldn't use an after-job call to the OS (Unix shell script) to concatenate the last row. At my current site, I cannot do this either, because of a site standard. I couldn't even do this when it was more efficient to use an external filter (sed in this case) to cleanse some extra data. Cleansing in a parallel derivation or stage variable would have resulted in slow execution. And none of the sites I have been at were using parallel routines while I was there.

Supportability often wins instead of efficiency, simplicity or elegance.
