Transpose Row to column from two files

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Transpose Row to column from two files

Post by karthi_gana »

All,

I have a sequential file as below.

UHDR|2011210
a|201101|1.2
b|201102|2.2
c|201103|3.2
d|201104|4.2
UTRL|005

I have designed a job to split the header and footer as below

SeqFile --> Transformer --> Head (1 Row) --> seq1.txt
|
v
Tail (1 Row)
|
v
seq2.txt
Now...I have two sequential file as below.

Just hardcoded the key value as '1' in both the file.

seq1.txt

key header
1 UHDR|2011210


seq2.text

key footer
1 UTRL|0015

Expected output:
key header footer
1 UHDR|2011210 UTRL|0015

I have to do this in a single job. I don't like to create another job to do this. is it possible to get the expected output in my above job design?
Karthik
suse_dk
Participant
Posts: 93
Joined: Thu Aug 11, 2011 6:18 am
Location: Denmark

Post by suse_dk »

You can join the two output from the sequential file stage on the key value you are generating.
_________________
- Susanne
jyothisdasms
Participant
Posts: 33
Joined: Wed May 19, 2010 12:15 am
Location: Pune

Post by jyothisdasms »

Add a join stage and join the two ouputs instead of keeping the files.Make the dummy field as the key
Then you will get the o/p as

Key col1 col2
1 header trailor

Hope it will help you
:)
" Dream like you will live forever, live like you will die today."
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

yes..i got it..now..i have to delete UHDR| and UTRL| from the output.

is there any function like substring() available in datastage?
Karthik
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

i found the way...thx

I would like to capture the current date in a datetime columns.

I used TimeDate() function and declared the column as TimeStamp 23 3
in datastage. But it is not working as expected.
Karthik
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

All of the functions are documented.

Yes, there is a substring function and there is also square brackets "[]" which is the substring operator.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

TimeDate() is a server function. Maybe you needed CurrentTimestamp() function?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

yes.

i used Head & Tail stage to get the header & footer.

I tried with Filter stage too. It worked fine.

I tried with 'Constraint' too. This one also worked fine.

Which one is best (performance perspective) to filter the header and footer from a file?

a)Head & Tail stage
b) Filter stage
c) Constraint

how to identify which one is providing best performance? which paramater we need to verify?
Karthik
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Define "performance".

You've run all three - did you bother to take relevant measures of whatever it is you consider performance to be, and did you use a statistically meaningful volume of data when doing so?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

When i say performance it would be

a) minimal execution time
b) minimal memory utilization
c) job design
4) may be partition

How should i take the relevant measure ? I have not taken any measurement in datastage so far. I have done lot of tuning work in DB side. As we know we should follow some steps like taking query plan, verifying whether proper indexes are used or not and so on, like that, is there any basic steps which we need to consider/see when we take the measurement?

I have not yet tested against huge volume of data. I just tested these jobs against 10,000 rows. Thats it. Which parameters i have to see to get tthe measurement?
Karthik
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Primarily DataStage's own Performance Analyzer.

You might also gather useful information from the Resource Estimation tool.

Both are available from within Designer. Performance analysis does require some preparation, all of which is described in the manuals.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply