Concatenate 3 files and generate sequence number

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
sujaoschin
Premium Member
Premium Member
Posts: 102
Joined: Tue Jan 31, 2006 4:13 am

Concatenate 3 files and generate sequence number

Post by sujaoschin »

I have generated 3 output sequential files from 3 different input files.

All the output files has column headings and values separated by '|' Symbol

OUTPUT FILE1 NAME- FN23_1a.dat
======================

GLORDDER|COMPVALUE|COMPTEXT|ORGID|SID
1|1111|ANZ|X601|1
1|1112|ANE|X601|2
1|1113|ANB|X601|3
1|1114|ANY|X601|4
1|1114|ANO|X601|5
1|1115|ANU|X601|6


OUTPUT FILE2 NAME- FN23_1b.dat and it's content is
===================================

GLORDDER|COMPVALUE|COMPTEXT|ORGID|SID
1|100DAF|FNZ|X601|1
1|110DAF|KAZ|X601|2
1|112DAF|UNZ|X601|3
1|113DAF|AIZ|X601|4
1|140DAF|ANP|X601|5

OUTPUT FILE3 NAME- FN23_1c.dat and it's content is
===================================

GLORDDER|COMPVALUE|COMPTEXT|ORGID|SID
1|A.00000034.1|XXX|X601|1
1|A.00000036.1|YYY|X601|2
1|A.00000037.1|ZZZ|X601|3
1|A.00000039.1|AAA|X601|4

The requirement is I need to generate a single file out of this above 3 files and provide the sequence number continously to the column 'SID'

eg- After combining the 3 files, if the first file 'FN23_1a.dat' has 6 records(SID will end by 6) and then SID for 2nd file should start with 7 and end with 11 (as it has 5 records ) and sID for 3rd file start with 12 and end by 15.

I know 'CAT' command will concatenate all the 3 files.But don't know how to implement this in datastage and generate the sequence number consecutively. Please advise
Sujatha K
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

You are on the right track. Pass the cat command to concatenate them and then generate the numbers.
You have multiple ways to generate sequence numbers. It has been discussed many times here. Use the search facility.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Cat the files together pre-job, read in the concatenated file and ignore the current sequence number. Assign a new one as you write the file back out using @OUTROWNUM as the derivation.

Constrain the output to skip the embedded header records when the first field has a value of "GLORDDER".
-craig

"You can never have too many knives" -- Logan Nine Fingers
monaz
Participant
Posts: 98
Joined: Sat Aug 23, 2008 3:14 am

Post by monaz »

Here are the steps you could proceed with
1. In job properties, in after job subroutine tab select EXEC DOS

2. Can use the below command
(copy /b /y File1Name+File2Name+File3Name FinalOutputFileName)
sujaoschin
Premium Member
Premium Member
Posts: 102
Joined: Tue Jan 31, 2006 4:13 am

Post by sujaoschin »

If the single input file is producing 3 different output files, I can concatenate using cat command in after job subroutine. But these files are produced using 3 different input files.

Should I use a 'link collector' to combine the 3 files and then use after job subroutine? Please advise.
Sujatha K
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

No, just use the cat command as the Filter command of the Sequential File stage, which will then read the output of the cat command. Handle the column headings either in the Filter command (perhaps using head +2) or within your job. Use @OUTROWNUM to generate the line numbers.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vjonnala1516
Participant
Posts: 18
Joined: Fri Jan 04, 2008 5:28 am
Location: Bangalore

Post by vjonnala1516 »

Generate new Column (SID- with int Datatype) in Col Gen Stage just before Sequential file stage and in the output-Columns tab for SID use Generator as cycle with initial value as 1 and increment value also 1 and col gen stage should be on sequential mode.
VJ
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Server not PX job, so no "Col Gen" stage and a UNIX server so no ExecDOS or "copy" command, guys.

This is an easy problem to solve and I posted a complete solution. Sure, you can cat in the Filter if you like rather than before job but I find that problematic at times and harder for new folks to understand.
-craig

"You can never have too many knives" -- Logan Nine Fingers
dr.murthy
Participant
Posts: 224
Joined: Sun Dec 07, 2008 8:47 am
Location: delhi

Post by dr.murthy »

monaz wrote:Here are the steps you could proceed with
1. In job properties, in after job subroutine tab select EXEC DOS

2. Can use the below command
(copy /b /y File1Name+File2Name+File3Name FinalOutputFileName)
hi ,

can we concatenate theree files into a single file with out using funnel in datastage.how can i do that one in before job subroutine.could you explain me bit more clearly
D.N .MURTHY
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

What's unclear about that? For UNIX rather than DOS, use "cat". And in DOS you can use "type" as well.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

dr.murthy, your question is not the same as the original, because they are on UNIX and you are on Windows. You really ought to have begun a new thread.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sujaoschin
Premium Member
Premium Member
Posts: 102
Joined: Tue Jan 31, 2006 4:13 am

Post by sujaoschin »

Thank you all, for your responses.

I can understand chulett and Ray's solution.

Let me explain the scenario.

For eg- If a Job X is creating 3 different files FILE1,FILE2,FILE3 then in the job properties of Job X, in after sub routine - EXECSH- I can use the command of cat FILE1 FILE2 FILE3 > FILE4


But here the problem is
JOB A(its own job properties) is producing FILE1,
JOB B(its own job properties) is producing FILE2,
JOB C(its own job properties) is producing FILE3.

I mean 3 different jobs producing 3 different files.

The comman thing between these 3 files is all the 3 files are available in the same path
of
au05uap010btox2:/home/dsadm/projects/eswm/dev/output$FILE1
au05uap010btox2:/home/dsadm/projects/eswm/dev/output$FILE2
au05uap010btox2:/home/dsadm/projects/eswm/dev/output$FILE3

So I need to create a new job which will bring these 3 files together and concatenate. This is were I am facing the problem and I don't know how to create a job which will bring files together.Appreciate your help.
Sujatha K
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

My earlier post to this IS your solution.

Read the files with a Sequential File stage with a filter command (cat).
You now have a single stream of rows.
Use a Transformer stage to generate the line number (@OUTROWNUM) and to constrain so that the heading lines from job2 and job3 aren't passed (better is not to generate them in the first place).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

As is mine. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
sujaoschin
Premium Member
Premium Member
Posts: 102
Joined: Tue Jan 31, 2006 4:13 am

Post by sujaoschin »

The problem is resolved. I did like this,

Before -job subroutine
--------------------------
cat #sct_eswm_mig.$PROJECT_PATH#/#sct_eswm_mig.$OUTPUT_DIR#/#OUTPUT_FILE# > #sct_eswm_mig.$PROJECT_PATH#/#sct_eswm_mig.$STAGE_DIR#/#GLCOMP_FILE#.tmp

after job subroutine
------------------------
mv #sct_eswm_mig.$PROJECT_PATH#/#sct_eswm_mig.$OUTPUT_DIR#/#GLCOMP_FILE#_#BATCH_NUM#.dat #sct_eswm_mig.$PROJECT_PATH#/#sct_eswm_mig.$OUTPUT_DIR#/#GLCOMP_FILE#_#BATCH_NUM#_`date +%Y%m%d`.dat

Parameters-
--------------
OUTPUT_FILE- MAXIMO_FN23_*

GLCOMP_FILE- MAXIMO_GLCOMPONENTS

Then I gave the @OUTROWNUM in the transformer.

Thanks a lot to all .
Sujatha K
Post Reply