Posted: Sun May 22, 2011 11:42 pm
by singhald
Read the data from both input links and add a Column Generator stage to each link to create a unique sequence number in a new dummy key column. The column is defined in the Column Generator stage: edit the column metadata and provide "part" as the start value and "partcount" as the increment value, which will generate a unique sequence number on both input links.

Now you can use a Join stage to join the input data on the newly introduced column, which holds the unique sequence number.
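
To illustrate the "part" / "partcount" idea outside DataStage, here is a minimal Python sketch. It assumes the rows are dealt round-robin to the partitions; that assumption and the helper names (partition_keys, round_robin, keyed) are only for illustration and are not DataStage functions.

```python
def partition_keys(part, partcount, rows):
    """Keys generated inside one partition: part, part + partcount, ..."""
    return [part + i * partcount for i in range(rows)]

def round_robin(records, partcount):
    """Deal records to partitions in turn (assumed partitioning, illustration only)."""
    return [records[p::partcount] for p in range(partcount)]

def keyed(records, partcount):
    """Attach a generated key to every record, partition by partition."""
    out = []
    for part, chunk in enumerate(round_robin(records, partcount)):
        keys = partition_keys(part, partcount, len(chunk))
        out.extend(zip(keys, chunk))
    return sorted(out)  # sorted only to make the printed result easy to read

a_keyed = keyed([1, 2, 5, 6], partcount=2)
b_keyed = keyed([3, 4, 7], partcount=2)
print(a_keyed)
print(b_keyed)
```

Because each partition starts at its own number and steps by the partition count, no two partitions can produce the same key, so the key is unique within each link.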

Hope you can implement this.

Thanks,
Deepak

Re: help in job logic

Posted: Mon May 23, 2011 1:05 am
by peddakkagari
A
__

1
2
5
6


B
__

3
4
7


Connect your source A records to a Transformer, adding a sequence number with a Surrogate Key Generator stage.
Your source A data in the Transformer will then look like this:

sno,A
_____
1,1
2,2
3,5
4,6

Do the same for the second source and connect it to another Transformer. The source B data in that Transformer will then look like this:

sno,B
_____
1,3
2,4
3,7

Then join the outputs of these two Transformers in a Join stage (choose the join type based on your requirement) using the key sno (sequence number).

If you use a normal (inner) join, the output in C will be:
C
___
1,3
2,4
5,7

If you use a full outer join, the output in C will be:

C
___
1,3
2,4
5,7
6,
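
The same logic can be sketched in Python to check the two results above. This is only a model of the join on sno, not DataStage code; join_on_sno is a made-up helper name, and it assumes the sequence numbers are assigned in file order starting at 1.

```python
def join_on_sno(a_rows, b_rows, how="inner"):
    """Pair rows from A and B that share the same sequence number."""
    a = dict(enumerate(a_rows, start=1))  # sno -> A value
    b = dict(enumerate(b_rows, start=1))  # sno -> B value
    if how == "inner":
        keys = sorted(a.keys() & b.keys())
    elif how == "full":
        keys = sorted(a.keys() | b.keys())
    else:
        raise ValueError("how must be 'inner' or 'full'")
    # A missing side is reported as None (an empty column in the output file).
    return [(a.get(k), b.get(k)) for k in keys]

A = [1, 2, 5, 6]
B = [3, 4, 7]
inner = join_on_sno(A, B, "inner")
full = join_on_sno(A, B, "full")
print(inner)
print(full)
```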

Re: help in job logic

Posted: Mon May 23, 2011 5:27 am
by vinodshinde369
You can do it as follows:
1. Use 2 different sources to read the data, i.e. A and B.
2. Use a Column Generator on each stream to generate an integer column named srno with initial value=1 and increment=1 (run both Column Generators in sequential mode). Also sort the input data using link sort.
3. Use a Join stage and do a left outer join (A is left) on key=srno.
4. Output both columns from sources A and B; you will get the desired result.

Re: help in job logic

Posted: Mon May 23, 2011 6:24 am
by chulett
vinodshinde369 wrote: Also sort the input data using link sort
Don't.

Posted: Mon May 23, 2011 7:03 am
by pandeesh
Hi Deepak,

I have implemented it as you said.

My input files are:

A
---
1
2
3

B
__

4
5
6

I have used the below design:

seqfile ---> column generator---------
                                 Joinstage(f.O.J)---->target Seq file
Seqfile--->column generator----------

Finally, in the target seq file, I am getting the below output:

"2","5"
"3","6"
"1","4"

But I want to get:
"1","4"
"2","5"
"3","6"

How to achieve this?

thanks

Posted: Mon May 23, 2011 7:24 am
by pandeesh
I have tried with a Surrogate Key Generator instead of the Column Generator.

I am getting the below result:

But I want to get:
"1","4"
"2","5"
"3","6"


since my input files A and B are given below:

A
---

1
2
3

B
__

4
5
6

thanks

Posted: Mon May 23, 2011 7:49 am
by chulett
What happens if you run the job on a single node? Or are you doing that now? And you don't need to keep repeating what your input looks like over and over, I think we get that part by now.

Posted: Mon May 23, 2011 2:40 pm
by kogads
In the Join stage, did you do hash partitioning and then sorting? If not, give it a try.

Posted: Mon May 23, 2011 11:11 pm
by pandeesh
kogads wrote: In the Join stage, did you do hash partitioning and then sorting? If not, give it a try.

I have tried hash partitioning both with and without sorting.
In both cases, I get the output as

2,5
3,6
1,4

Posted: Mon May 23, 2011 11:20 pm
by pandeesh
chulett wrote: What happens if you run the job on a single node? Or are you doing that now?
In my project the APT_CONFIG_FILE parameter is set to /opt/app/dstage/DataStage752/Ascential/DataStage/Configurations/default.apt,
where default.apt is configured with 2 nodes.

So I want to change the config file for this job in the Join stage alone.
I have another configuration file, oneX, which is configured with one node.

But at the stage (Join) level, the option to change the config file is disabled, and it is set to default.apt by default.

So does only the admin have the rights to change the config file for a job?

How can this be enabled so that developers themselves can change the config file at the stage level?

thanks

Posted: Mon May 23, 2011 11:39 pm
by pandeesh
Hi Craig,

I have chosen only one node in the node map constraint in the Join stage.

Now I am getting the result as expected.

So, what causes the problem while running on multiple nodes?

thanks

Posted: Mon May 23, 2011 11:50 pm
by singhald
If you choose the Sort Merge collection method in the target Sequential File stage, you can achieve your desired output.
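
As a rough model of why the two-node run reorders rows and why a sort-merge style collector helps, here is a small Python sketch. It assumes the join key is hashed as key modulo the node count and that the collector simply reads one partition after another; both are simplifications for illustration, and the real engine may behave differently.

```python
import heapq

def hash_partition(rows, partcount):
    """Send each (key, value) row to partition key % partcount (assumed hash)."""
    parts = [[] for _ in range(partcount)]
    for key, value in rows:
        parts[key % partcount].append((key, value))
    return parts

def join_partition(a_part, b_part):
    """Join one partition of A with the matching partition of B on the key."""
    b = dict(b_part)
    return [(key, a_val, b[key]) for key, a_val in sorted(a_part) if key in b]

a = [(1, 1), (2, 2), (3, 3)]  # (generated key, value from file A)
b = [(1, 4), (2, 5), (3, 6)]  # (generated key, value from file B)

joined = [join_partition(pa, pb)
          for pa, pb in zip(hash_partition(a, 2), hash_partition(b, 2))]

# Reading the partitions one after another gives one possible arrival order.
unordered = [row for part in joined for row in part]

# A sort-merge style collector interleaves the already-sorted partitions by key.
merged = list(heapq.merge(*joined))

print(unordered)
print(merged)
```

Each partition is correct on its own; only the order in which the partitions are collected changes, which is why a single-node run or an ordered collector gives 1,4 / 2,5 / 3,6.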

Posted: Tue May 24, 2011 1:11 am
by Tejas Pujari
Use a stable sort in the Join stage and you will get the desired output.

Posted: Tue May 24, 2011 1:27 am
by jyothisdasms
Generate a new key column for each file. You can use a Transformer for that. Declare a stage variable and set its initial value to 1, then increase its value by one for each row, so that its values come out as
1
2
3....

Apply the same logic to both files. Then your inputs will become

Input1

Key Col1
1 3
2 4
3 7

Input2

Key Col2
1 5
2 2


Use two Transformer stages for this.
Then use a Join/Lookup stage. Make the join on Key and take Col1 and Col2 as output.
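
The running counter can be pictured with a short Python sketch. The variable counter plays the role of the stage variable; here it starts at 0 and is incremented before use, which gives the same 1, 2, 3... keys. The names add_key and lookup are made up for the example, and it assumes all rows pass through a single counter (with several partitions, each would keep its own counter and the keys would repeat).

```python
def add_key(values):
    """Attach a running key to each row, like a stage variable counter."""
    counter = 0  # stands in for the stage variable
    out = []
    for value in values:
        counter += 1  # increase by one for each row
        out.append((counter, value))
    return out

input1 = add_key([3, 4, 7])  # Key, Col1
input2 = add_key([5, 2])     # Key, Col2

# Lookup of Col2 by Key; rows of Input1 without a match get None.
lookup = dict(input2)
result = [(col1, lookup.get(key)) for key, col1 in input1]

print(input1)
print(input2)
print(result)
```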

Posted: Tue May 24, 2011 1:29 am
by pandeesh
jyothisdasms wrote: Declare a stage variable. Make its initial value 1. Then increase its value by one for each row.
How to do this?