help in job logic

singhald · Post by **singhald** » Sun May 22, 2011 11:42 pm

Read the data from both input links and add a Column Generator stage on each link to create a unique sequence number in a new dummy key column. In the Column Generator stage, edit the column metadata and set "part" as the start value and "partcount" as the increment value; this generates a unique sequence number on both input links.

Now you can use a Join stage to join the input data on the newly introduced column, which holds the unique sequence number.
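As a rough illustration (plain Python, not DataStage code), the "part"/"partcount" numbering and the join on the generated key might be sketched as follows. It assumes rows reach the partitions round-robin, which the post does not state; `number_rows` and `join_on_key` are hypothetical helper names, and the row values are made up.

```python
def number_rows(rows, partcount):
    """Give each row a key: start at the partition number, step by partcount."""
    # Assumption: rows are dealt round-robin (row i goes to partition i % partcount).
    partitions = [rows[p::partcount] for p in range(partcount)]
    keyed = {}
    for part, part_rows in enumerate(partitions):
        for k, value in enumerate(part_rows):
            keyed[part + k * partcount] = value  # start = part, increment = partcount
    return keyed


def join_on_key(left, right):
    """Inner join of two keyed inputs on the generated key."""
    return [(left[k], right[k]) for k in sorted(left.keys() & right.keys())]


a = number_rows(["a1", "a2", "a3"], partcount=2)
b = number_rows(["b1", "b2", "b3"], partcount=2)
print(join_on_key(a, b))  # [('a1', 'b1'), ('a2', 'b2'), ('a3', 'b3')]
```

Because each partition starts at its own number and steps by the partition count, no two partitions can produce the same key.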

Hope you can implement this.

Thanks,
Deepak

peddakkagari · Post by **peddakkagari** » Mon May 23, 2011 1:05 am

A
__

1
2
5
6

B
__

3
4
7

Connect your source A records to a Transformer, adding a sequence number with a Surrogate Key Generator stage.
Your source A data in the Transformer will then look like this:

sno,A
_____
1,1
2,2
3,5
4,6

Do the same for the second source and connect it to another Transformer. The source B data in that Transformer will then look like this:

sno,B
_____
1,3
2,4
3,7

Then join these two Transformers in a Join stage (choose the join type based on your requirement) using the key sno (sequence number).

If you use a normal (inner) join, the output in C will be:
C
___
1,3
2,4
5,7

If you use a full outer join, the output in C will be:

C
___
1,3
2,4
5,7
6,
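The two join results above can be checked with a small Python sketch (an illustration only, not DataStage code). The helper names `add_sno` and `join` are hypothetical, and `None` stands in for the empty B value in the full outer join.

```python
from itertools import count


def add_sno(values):
    """Pair each value with a 1-based sequence number, like the sno column."""
    return dict(zip(count(1), values))


def join(a, b, how="inner"):
    """Join two sno-keyed inputs; 'full' keeps unmatched rows with None."""
    keys = a.keys() & b.keys() if how == "inner" else a.keys() | b.keys()
    return [(a.get(k), b.get(k)) for k in sorted(keys)]


A = add_sno([1, 2, 5, 6])
B = add_sno([3, 4, 7])
print(join(A, B))          # [(1, 3), (2, 4), (5, 7)]
print(join(A, B, "full"))  # [(1, 3), (2, 4), (5, 7), (6, None)]
```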

vinodshinde369 · Post by **vinodshinde369** » Mon May 23, 2011 5:27 am

You can do it as follows:
1. Use two different sources to read the data, i.e. A and B.
2. Use a Column Generator on each stream to generate an integer column named srno with initial value = 1 and increment = 1 (run both Column Generators in sequential mode). Also sort the input data using a link sort.
3. Use a Join stage and do a left outer join (A is the left input) on key = srno.
4. Output both columns from sources A and B; you will get the desired result.

chulett · Post by **chulett** » Mon May 23, 2011 6:24 am

vinodshinde369 wrote: Also sort the input data using link sort

Don't.

pandeesh · Post by **pandeesh** » Mon May 23, 2011 7:03 am

Hi Deepak,

I have implemented it as you said.

My input files are:

A
---
1
2
3

B
__

4
5
6

I have used the design below:

seqfile ---> column generator---------
                                 Joinstage(f.O.J)---->target Seq file
Seqfile--->column generator----------

Finally, in the target sequential file, I am getting the output below:

"2","5"
"3","6"
"1","4"

But I want to get:
"1","4"
"2","5"
"3","6"

How to achieve this?

thanks

pandeesh · Post by **pandeesh** » Mon May 23, 2011 7:24 am

I have tried the Surrogate Key Generator instead of the Column Generator.

I am getting the below result:

But I want to get:
"1","4"
"2","5"
"3","6"

My input files A and B are given below:

A
---

1
2
3

B
__

4
5
6

thanks

chulett · Post by **chulett** » Mon May 23, 2011 7:49 am

What happens if you run the job on a single node? Or are you doing that now? And you don't need to keep repeating what your input looks like over and over, I think we get that part by now.

kogads · Post by **kogads** » Mon May 23, 2011 2:40 pm

In the Join stage, did you do hash partitioning and then sorting? If not, give it a try.

pandeesh · Post by **pandeesh** » Mon May 23, 2011 11:11 pm

kogads wrote: In the Join stage, did you do hash partitioning and then sorting? If not, give it a try.

I have tried hash partitioning both with and without sorting.
In both cases, I get the output as:

2,5
3,6
1,4

pandeesh · Post by **pandeesh** » Mon May 23, 2011 11:20 pm

chulett wrote: What happens if you run the job on a single node? Or are you doing that now?

In my project the APT_CONFIG_FILE parameter is set to /opt/app/dstage/DataStage752/Ascential/DataStage/Configurations/default.apt,
where default.apt is configured with 2 nodes.

So I want to change the config file for this job in the Join stage alone.
I have another configuration file, oneX, which is configured with one node.

But at the stage (Join) level, the option to change the config file is disabled, and it is set to default.apt by default.

So does only the admin have the rights to change the config file for a job?

How can this be enabled so that developers themselves can change the config file at the stage level?

thanks

pandeesh · Post by **pandeesh** » Mon May 23, 2011 11:39 pm

Hi Craig,

I have chosen only one node in the node map constraint in the Join stage.

Now I am getting the result as expected.

So, what causes the problem when running on multiple nodes?

thanks

singhald · Post by **singhald** » Mon May 23, 2011 11:50 pm

If you choose the Sort Merge collection method in the target Sequential File stage, you can achieve your desired output.
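A minimal Python sketch (not DataStage code) of why merging the partitions on the key restores the order. It assumes the generated key column is still on the link when the rows are collected, and uses key modulo node count as a stand-in for hash partitioning. `hash_partition` is a hypothetical helper, and the sketch does not reproduce the exact row order reported earlier in the thread.

```python
import heapq


def hash_partition(rows, nodes):
    """Split (key, a, b) rows across nodes by key; stand-in for hash partitioning."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[row[0] % nodes].append(row)
    return parts


joined = [(1, 1, 4), (2, 2, 5), (3, 3, 6)]  # (key, A, B) rows after the join
parts = hash_partition(joined, nodes=2)

# Reading one partition after another does not restore the global order.
concatenated = [row for part in parts for row in part]

# Sort merge: each partition is already sorted on the key, so merge them by key.
sort_merged = list(heapq.merge(*parts, key=lambda row: row[0]))

print([row[1:] for row in concatenated])  # [(2, 5), (1, 4), (3, 6)]
print([row[1:] for row in sort_merged])   # [(1, 4), (2, 5), (3, 6)]
```

The merge only works because every partition is individually sorted on the key; it never re-sorts the whole data set.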

Tejas Pujari · Post by **Tejas Pujari** » Tue May 24, 2011 1:11 am

Use a stable sort in the Join stage and you will get the desired output.

jyothisdasms · Post by **jyothisdasms** » Tue May 24, 2011 1:27 am

Generate a new key column for each file. You can use a Transformer for that. Declare a stage variable, set its initial value to 1, then increase its value by one for each row, so that its values will be
1
2
3....

Apply the same logic to both files. Then your inputs will become

Input1

Key Col1
1 3
2 4
3 7

Input2

Key Col2
1 5
2 2

Use two Transformer stages for this.
Then use a Join or Lookup stage. Join on Key and take Col1 and Col2 as output.
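The counter logic can be sketched in Python (an illustration, not Transformer derivation syntax). `RowCounter` and `add_keys` are hypothetical names. The sketch assumes each Transformer processes its rows in a single stream; on more than one node each partition would presumably keep its own counter, so keys could repeat.

```python
class RowCounter:
    """Mimics a stage variable: initial value 1, increased by one per row."""

    def __init__(self):
        self.next_key = 1  # initial value of the stage variable

    def derive(self, value):
        key = self.next_key
        self.next_key += 1  # incremented once for every row processed
        return key, value


def add_keys(values):
    counter = RowCounter()  # one counter per input, as with two Transformers
    return [counter.derive(v) for v in values]


input1 = add_keys([3, 4, 7])  # [(1, 3), (2, 4), (3, 7)]
input2 = add_keys([5, 2])     # [(1, 5), (2, 2)]

# Lookup on Key: keep only rows of Input1 that find a match in Input2.
lookup = dict(input2)
output = [(col1, lookup[key]) for key, col1 in input1 if key in lookup]
print(output)  # [(3, 5), (4, 2)]
```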

pandeesh · Post by **pandeesh** » Tue May 24, 2011 1:29 am

jyothisdasms wrote: Declare a stage variable. Make its initial value 1. Then increase its value by one for each row.

How do I do this?
