Page 1 of 1

Concatenate 3 files and generate sequence number

Posted: Wed Feb 04, 2009 9:33 pm
by sujaoschin
I have generated 3 output sequential files from 3 different input files.

All the output files has column headings and values separated by '|' Symbol

OUTPUT FILE1 NAME- FN23_1a.dat
======================

GLORDDER|COMPVALUE|COMPTEXT|ORGID|SID
1|1111|ANZ|X601|1
1|1112|ANE|X601|2
1|1113|ANB|X601|3
1|1114|ANY|X601|4
1|1114|ANO|X601|5
1|1115|ANU|X601|6


OUTPUT FILE2 NAME- FN23_1b.dat and it's content is
===================================

GLORDDER|COMPVALUE|COMPTEXT|ORGID|SID
1|100DAF|FNZ|X601|1
1|110DAF|KAZ|X601|2
1|112DAF|UNZ|X601|3
1|113DAF|AIZ|X601|4
1|140DAF|ANP|X601|5

OUTPUT FILE3 NAME- FN23_1c.dat and it's content is
===================================

GLORDDER|COMPVALUE|COMPTEXT|ORGID|SID
1|A.00000034.1|XXX|X601|1
1|A.00000036.1|YYY|X601|2
1|A.00000037.1|ZZZ|X601|3
1|A.00000039.1|AAA|X601|4

The requirement is I need to generate a single file out of this above 3 files and provide the sequence number continously to the column 'SID'

eg- After combining the 3 files, if the first file 'FN23_1a.dat' has 6 records(SID will end by 6) and then SID for 2nd file should start with 7 and end with 11 (as it has 5 records ) and sID for 3rd file start with 12 and end by 15.

I know 'CAT' command will concatenate all the 3 files.But don't know how to implement this in datastage and generate the sequence number consecutively. Please advise

Posted: Wed Feb 04, 2009 11:10 pm
by DSguru2B
You are on the right track. Pass the cat command to concatenate them and then generate the numbers.
You have multiple ways to generate sequence numbers. It has been discussed many times here. Use the search facility.

Posted: Thu Feb 05, 2009 12:23 am
by chulett
Cat the files together pre-job, read in the concatenated file and ignore the current sequence number. Assign a new one as you write the file back out using @OUTROWNUM as the derivation.

Constrain the output to skip the embedded header records when the first field has a value of "GLORDDER".

Posted: Thu Feb 05, 2009 12:40 am
by monaz
Here are the steps you could proceed with
1. In job properties, in after job subroutine tab select EXEC DOS

2. Can use the below command
(copy /b /y File1Name+File2Name+File3Name FinalOutputFileName)

Posted: Thu Feb 05, 2009 12:49 am
by sujaoschin
If the single input file is producing 3 different output files, I can concatenate using cat command in after job subroutine. But these files are produced using 3 different input files.

Should I use a 'link collector' to combine the 3 files and then use after job subroutine? Please advise.

Posted: Thu Feb 05, 2009 1:06 am
by ray.wurlod
No, just use the cat command as the Filter command of the Sequential File stage, which will then read the output of the cat command. Handle the column headings either in the Filter command (perhaps using head +2) or within your job. Use @OUTROWNUM to generate the line numbers.

Posted: Thu Feb 05, 2009 6:19 am
by vjonnala1516
Generate new Column (SID- with int Datatype) in Col Gen Stage just before Sequential file stage and in the output-Columns tab for SID use Generator as cycle with initial value as 1 and increment value also 1 and col gen stage should be on sequential mode.

Posted: Thu Feb 05, 2009 6:33 am
by chulett
Server not PX job, so no "Col Gen" stage and a UNIX server so no ExecDOS or "copy" command, guys.

This is an easy problem to solve and I posted a complete solution. Sure, you can cat in the Filter if you like rather than before job but I find that problematic at times and harder for new folks to understand.

Posted: Thu Feb 05, 2009 6:46 am
by dr.murthy
monaz wrote:Here are the steps you could proceed with
1. In job properties, in after job subroutine tab select EXEC DOS

2. Can use the below command
(copy /b /y File1Name+File2Name+File3Name FinalOutputFileName)
hi ,

can we concatenate theree files into a single file with out using funnel in datastage.how can i do that one in before job subroutine.could you explain me bit more clearly

Posted: Thu Feb 05, 2009 8:31 am
by chulett
What's unclear about that? For UNIX rather than DOS, use "cat". And in DOS you can use "type" as well.

Posted: Thu Feb 05, 2009 12:10 pm
by ray.wurlod
dr.murthy, your question is not the same as the original, because they are on UNIX and you are on Windows. You really ought to have begun a new thread.

Posted: Thu Feb 05, 2009 4:38 pm
by sujaoschin
Thank you all, for your responses.

I can understand chulett and Ray's solution.

Let me explain the scenario.

For eg- If a Job X is creating 3 different files FILE1,FILE2,FILE3 then in the job properties of Job X, in after sub routine - EXECSH- I can use the command of cat FILE1 FILE2 FILE3 > FILE4


But here the problem is
JOB A(its own job properties) is producing FILE1,
JOB B(its own job properties) is producing FILE2,
JOB C(its own job properties) is producing FILE3.

I mean 3 different jobs producing 3 different files.

The comman thing between these 3 files is all the 3 files are available in the same path
of
au05uap010btox2:/home/dsadm/projects/eswm/dev/output$FILE1
au05uap010btox2:/home/dsadm/projects/eswm/dev/output$FILE2
au05uap010btox2:/home/dsadm/projects/eswm/dev/output$FILE3

So I need to create a new job which will bring these 3 files together and concatenate. This is were I am facing the problem and I don't know how to create a job which will bring files together.Appreciate your help.

Posted: Thu Feb 05, 2009 4:44 pm
by ray.wurlod
My earlier post to this IS your solution.

Read the files with a Sequential File stage with a filter command (cat).
You now have a single stream of rows.
Use a Transformer stage to generate the line number (@OUTROWNUM) and to constrain so that the heading lines from job2 and job3 aren't passed (better is not to generate them in the first place).

Posted: Thu Feb 05, 2009 6:22 pm
by chulett
As is mine. :?

Posted: Tue Feb 10, 2009 1:21 am
by sujaoschin
The problem is resolved. I did like this,

Before -job subroutine
--------------------------
cat #sct_eswm_mig.$PROJECT_PATH#/#sct_eswm_mig.$OUTPUT_DIR#/#OUTPUT_FILE# > #sct_eswm_mig.$PROJECT_PATH#/#sct_eswm_mig.$STAGE_DIR#/#GLCOMP_FILE#.tmp

after job subroutine
------------------------
mv #sct_eswm_mig.$PROJECT_PATH#/#sct_eswm_mig.$OUTPUT_DIR#/#GLCOMP_FILE#_#BATCH_NUM#.dat #sct_eswm_mig.$PROJECT_PATH#/#sct_eswm_mig.$OUTPUT_DIR#/#GLCOMP_FILE#_#BATCH_NUM#_`date +%Y%m%d`.dat

Parameters-
--------------
OUTPUT_FILE- MAXIMO_FN23_*

GLCOMP_FILE- MAXIMO_GLCOMPONENTS

Then I gave the @OUTROWNUM in the transformer.

Thanks a lot to all .