Page 1 of 1

Transpose Row to column from two files

Posted: Tue Sep 20, 2011 4:16 am
by karthi_gana
All,

I have a sequential file as below.

UHDR|2011210
a|201101|1.2
b|201102|2.2
c|201103|3.2
d|201104|4.2
UTRL|005

I have designed a job to split the header and footer as below

SeqFile --> Transformer --> Head (1 Row) --> seq1.txt
|
v
Tail (1 Row)
|
v
seq2.txt
Now...I have two sequential file as below.

Just hardcoded the key value as '1' in both the file.

seq1.txt

key header
1 UHDR|2011210


seq2.text

key footer
1 UTRL|0015

Expected output:
key header footer
1 UHDR|2011210 UTRL|0015

I have to do this in a single job. I don't like to create another job to do this. is it possible to get the expected output in my above job design?

Posted: Tue Sep 20, 2011 4:30 am
by suse_dk
You can join the two output from the sequential file stage on the key value you are generating.

Posted: Tue Sep 20, 2011 4:38 am
by jyothisdasms
Add a join stage and join the two ouputs instead of keeping the files.Make the dummy field as the key
Then you will get the o/p as

Key col1 col2
1 header trailor

Hope it will help you
:)

Posted: Tue Sep 20, 2011 6:44 am
by karthi_gana
yes..i got it..now..i have to delete UHDR| and UTRL| from the output.

is there any function like substring() available in datastage?

Posted: Tue Sep 20, 2011 7:10 am
by karthi_gana
i found the way...thx

I would like to capture the current date in a datetime columns.

I used TimeDate() function and declared the column as TimeStamp 23 3
in datastage. But it is not working as expected.

Posted: Tue Sep 20, 2011 7:10 am
by chulett
All of the functions are documented.

Yes, there is a substring function and there is also square brackets "[]" which is the substring operator.

Posted: Tue Sep 20, 2011 2:49 pm
by ray.wurlod
TimeDate() is a server function. Maybe you needed CurrentTimestamp() function?

Posted: Wed Sep 21, 2011 4:11 am
by karthi_gana
yes.

i used Head & Tail stage to get the header & footer.

I tried with Filter stage too. It worked fine.

I tried with 'Constraint' too. This one also worked fine.

Which one is best (performance perspective) to filter the header and footer from a file?

a)Head & Tail stage
b) Filter stage
c) Constraint

how to identify which one is providing best performance? which paramater we need to verify?

Posted: Wed Sep 21, 2011 3:46 pm
by ray.wurlod
Define "performance".

You've run all three - did you bother to take relevant measures of whatever it is you consider performance to be, and did you use a statistically meaningful volume of data when doing so?

Posted: Thu Sep 22, 2011 12:44 am
by karthi_gana
When i say performance it would be

a) minimal execution time
b) minimal memory utilization
c) job design
4) may be partition

How should i take the relevant measure ? I have not taken any measurement in datastage so far. I have done lot of tuning work in DB side. As we know we should follow some steps like taking query plan, verifying whether proper indexes are used or not and so on, like that, is there any basic steps which we need to consider/see when we take the measurement?

I have not yet tested against huge volume of data. I just tested these jobs against 10,000 rows. Thats it. Which parameters i have to see to get tthe measurement?

Posted: Thu Sep 22, 2011 12:53 am
by ray.wurlod
Primarily DataStage's own Performance Analyzer.

You might also gather useful information from the Resource Estimation tool.

Both are available from within Designer. Performance analysis does require some preparation, all of which is described in the manuals.