help in job logic

singhald
Participant
Posts: 180
Joined: Tue Aug 23, 2005 2:50 am
Location: Bangalore

Post by singhald »

Read the data from both input links and add a Column Generator stage on each link to create a unique sequence number in a new dummy key column. Define the column in the Column Generator stage: edit the column metadata and set "part" as the start value and "partcount" as the increment value. This generates a unique sequence number on both input links.

Now you can use a Join stage to join the input data on the newly introduced column, which holds the unique sequence number.
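
To see why a start value of "part" and an increment of "partcount" never produces the same key twice, here is a minimal Python sketch of the arithmetic only. It is not DataStage code; the function name is made up for illustration, and it assumes partition numbers start at 0 and that four rows are dealt evenly over two partitions.

```python
def column_generator_keys(row_count, part, partcount):
    """Keys one partition would generate: start at `part`, step by `partcount`.

    Hypothetical helper; it only mimics the start/increment arithmetic.
    """
    return [part + i * partcount for i in range(row_count)]


partcount = 2  # assumption: a 2-node configuration

# Assumption: 4 input rows split evenly, so each partition generates 2 keys.
keys_by_partition = [
    column_generator_keys(2, part, partcount) for part in range(partcount)
]
all_keys = sorted(k for keys in keys_by_partition for k in keys)

print(keys_by_partition)  # [[0, 2], [1, 3]]
print(all_keys)           # [0, 1, 2, 3]
```

Each partition walks its own arithmetic progression, so the keys interleave without colliding. Whether row N of one link gets the same key as row N of the other still depends on both links being partitioned the same way.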

Hope you can implement this.

Thanks,
Deepak
peddakkagari
Participant
Posts: 26
Joined: Thu Aug 12, 2010 12:07 am

Re: help in job logic

Post by peddakkagari »

A
__

1
2
5
6


B
__

3
4
7


Connect your source A records to a Transformer, adding a sequence number with a Surrogate Key Generator stage.
Your source A data in the Transformer will then look like this:

sno,A
_____
1,1
2,2
3,5
4,6

Do the same for the second source and connect it to another Transformer. The source B data in that Transformer will then look like this:

sno,B
_____
1,3
2,4
3,7

Then join the outputs of these two Transformers in a Join stage (choose the join type based on your requirement) using the key sno (the sequence number).

If you use a normal (inner) join, the output in C will be:
C
___
1,3
2,4
5,7

If you use a full outer join, the output in C will be:

C
___
1,3
2,4
5,7
6,
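
The same logic can be sketched outside DataStage. The Python below is only an illustration of the number-then-join idea with the A and B values from this post; the helper names are made up and it is not how the Join stage is implemented.

```python
def add_sequence(values):
    """Attach a 1-based sequence number (sno) to each value, in input order."""
    return {sno: value for sno, value in enumerate(values, start=1)}


def join_on_sno(left, right, how="inner"):
    """Join two sno -> value mappings; `how` is 'inner' or 'full'."""
    if how == "inner":
        keys = sorted(left.keys() & right.keys())
    elif how == "full":
        keys = sorted(left.keys() | right.keys())
    else:
        raise ValueError(f"unsupported join type: {how}")
    # A missing side shows up as None, like the empty column in the output above.
    return [(left.get(k), right.get(k)) for k in keys]


a = add_sequence([1, 2, 5, 6])
b = add_sequence([3, 4, 7])

inner = join_on_sno(a, b, "inner")
full = join_on_sno(a, b, "full")

print(inner)  # [(1, 3), (2, 4), (5, 7)]
print(full)   # [(1, 3), (2, 4), (5, 7), (6, None)]
```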
vinodshinde369
Participant
Posts: 7
Joined: Tue Dec 21, 2010 4:30 am
Location: Pune

Re: help in job logic

Post by vinodshinde369 »

You can do it as follows:
1. Use 2 different sources to read the data, i.e. A and B.
2. Use a Column Generator on each stream to generate an integer column named srno with initial value = 1 and increment = 1 (run both Column Generators in sequential mode). Also sort the input data using link sort.
3. Use a Join stage and do a left outer join (A is left) on key = srno.
4. Output both columns from sources A and B; you will get the desired result.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Re: help in job logic

Post by chulett »

vinodshinde369 wrote: Also sort the input data using link sort
Don't.
-craig

"You can never have too many knives" -- Logan Nine Fingers
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

Hi Deepak,

I have implemented it as you said.

My input files are:

A
---
1
2
3

B
__

4
5
6

I have used the below design:

Seq file ---> Column Generator ---\
                                   Join stage (full outer join) ---> target Seq file
Seq file ---> Column Generator ---/

Finally, in the target sequential file, I am getting the below output:

"2","5"
"3","6"
"1","4"

But I want to get:
"1","4"
"2","5"
"3","6"

How to achieve this?

thanks
pandeeswaran
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

I have tried the Surrogate Key Generator instead of the Column Generator.

I am getting the below result:

But I want to get:
"1","4"
"2","5"
"3","6"


since my input files A and B are as given below:

A
---

1
2
3

B
__

4
5
6

thanks
pandeeswaran
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

What happens if you run the job on a single node? Or are you doing that now? And you don't need to keep repeating what your input looks like over and over, I think we get that part by now.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kogads
Premium Member
Posts: 74
Joined: Fri Jun 05, 2009 5:36 pm

Post by kogads »

In the Join stage, did you do hash partitioning and then sorting? If not, give it a try.
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

kogads wrote: In the Join stage, did you do hash partitioning and then sorting? If not, give it a try.

I have tried hash partitioning, both with and without sorting.
In both cases, I get the output as:

2,5
3,6
1,4
pandeeswaran
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

chulett wrote: What happens if you run the job on a single node? Or are you doing that now?
In my project, the APT_CONFIG_FILE parameter is set to /opt/app/dstage/DataStage752/Ascential/DataStage/Configurations/default.apt,
where default.apt is configured with 2 nodes.

So I want to change the config file for this job in the Join stage alone.
I have another configuration file, oneX, which is configured with one node.

But at the stage (Join) level, the option to change the config file is disabled, and it is set to default.apt by default.

So does only the admin have the rights to change the config file for a job?

How can this be enabled so that developers themselves can change the config file at the stage level?

thanks
pandeeswaran
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

Hi Craig,

I have chosen only one node in the node map constraint of the Join stage.

Now I am getting the result as expected.

So, what causes the problem when running on multiple nodes?

thanks
pandeeswaran
singhald
Participant
Posts: 180
Joined: Tue Aug 23, 2005 2:50 am
Location: Bangalore

Post by singhald »

If you choose the Sort Merge collection method in the target Sequential File stage, you can achieve your desired output.
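
As a rough illustration of why the row order changes on two nodes and why a sort-merge collector restores it, here is a small Python sketch. It is not DataStage code: the key values 1 to 3 are assumed, and "key modulo node count" is only a stand-in for whatever hash function the engine really uses.

```python
import heapq

# (key, A, B) rows as they would leave a join on the generated key.
rows = [(1, "1", "4"), (2, "2", "5"), (3, "3", "6")]
node_count = 2  # assumption: a 2-node configuration

# Assumption: key modulo node count stands in for the real hash partitioner.
partitions = [[] for _ in range(node_count)]
for row in rows:
    partitions[row[0] % node_count].append(row)

# Reading one partition completely, then the next, loses the global order.
concatenated = [row for part in partitions for row in part]

# Merging the partitions on the key (each is already sorted) restores it.
merged = list(heapq.merge(*partitions, key=lambda row: row[0]))

print(concatenated)  # [(2, '2', '5'), (1, '1', '4'), (3, '3', '6')]
print(merged)        # [(1, '1', '4'), (2, '2', '5'), (3, '3', '6')]
```

Each partition is ordered on its own; the final order depends only on how the partitions are collected into the single output file.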
Regards,
Deepak Singhal
Everything is okay in the end. If it's not okay, then it's not the end.
Tejas Pujari
Participant
Posts: 14
Joined: Thu Jul 10, 2008 7:37 am
Location: mumbai

Post by Tejas Pujari »

Use a stable sort in the Join stage and you will get the desired output.
jyothisdasms
Participant
Posts: 33
Joined: Wed May 19, 2010 12:15 am
Location: Pune

Post by jyothisdasms »

Generate a new key column for each file. You can use a Transformer for that. Declare a stage variable and set its initial value to 1, then increase its value by one for each row, so that its values come out as
1
2
3....

Apply the same logic to both files. Then your inputs will become

Input1

Key Col1
1 3
2 4
3 7

Input2

Key Col2
1 5
2 2


Use two Transformer stages for this.
Then use a Join/Lookup stage. Make the join on Key and take Col1, Col2 as output.
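
The counter idea can be sketched in Python as below. This is only an analogy for the stage variable, not Transformer syntax; the helper name is made up, and it assumes all rows pass through a single partition, since in a parallel Transformer each partition keeps its own copy of a stage variable.

```python
def number_rows(values):
    """Pair each value with a running counter, like a stage variable."""
    counter = 1  # initial value of the "stage variable"
    numbered = []
    for value in values:
        numbered.append((counter, value))
        counter += 1  # increase by one for each row
    return numbered


input1 = number_rows([3, 4, 7])  # Key, Col1
input2 = number_rows([5, 2])     # Key, Col2

# Lookup-style join on Key: every Input1 row is kept, Col2 is None if unmatched.
lookup = dict(input2)
output = [(col1, lookup.get(key)) for key, col1 in input1]

print(input1)  # [(1, 3), (2, 4), (3, 7)]
print(input2)  # [(1, 5), (2, 2)]
print(output)  # [(3, 5), (4, 2), (7, None)]
```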
" Dream like you will live forever, live like you will die today."
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

jyothisdasms wrote: Declare a stage variable. Make its initial value 1. Then increase its value by one for each row.
How do I do this?
pandeeswaran