Hi all,
I'm new to Server jobs, coming from a Parallel background. I'm using a Transformer stage to perform a lookup: two inputs (both DB2 stages) feed the Transformer, and there is one output (also a DB2 stage). Although the inputs match on the key column, the output column from the second (reference) link is zero, which suggests no match was found.
For example,
Data from stream is:
ID Name
1 XxX
2 YyY
Data from reference is:
ID Country_Name
1 Egypt
2 US
All data go to the reject link as follows:
ID Name Country_Name
1 XxX 0
2 YyY 0
Any suggestions regarding this behaviour?
Regards.
So... a direct database lookup rather than a hashed file? The latter is preferred and much more consistent, but you can do it directly like that as long as you understand the performance considerations. And since it's a SQL query, it behaves the same as if you had run the "select x from y where z" in your favorite query tool. That, and you can structure it to return whatever you like, hit or miss.
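In other words, a direct reference lookup amounts to running something like the following once per input row (a sketch only; country_lookup is a hypothetical table name standing in for your actual reference table):

    -- Hedged sketch of what a direct database lookup does per input row.
    -- DataStage binds the stream row's key value to the parameter marker;
    -- zero rows back means a lookup miss for that input row.
    SELECT Country_Name
    FROM   country_lookup
    WHERE  ID = ?;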
Not sure what you're asking - why is the lookup failing? We'd have to see the sql you are using and know your key v. non-key columns to be of any help there.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
chulett wrote: why is the lookup failing?
That's exactly what I'm asking. The expected result should be:
ID Name Country_Name
1 XxX Egypt
2 YyY US
What I didn't mention is that the lookup key column is 'ID', with data type Integer, while 'Name' and 'Country_Name' are non-key columns of type VarChar(50), not Numeric, Sainath.
The SQL statements are:
SELECT ID, Country_Name FROM <table_name>;
SELECT ID, Name FROM <table_name>;
Each table has only 2 records. When I used a hashed file on the reference link, it worked. I'm also wondering: if my reference table has millions of records, should I use a direct database lookup or a hashed file instead?
Regards.
SQL Statements, plural? Do you have two lookup stages?
Have you run the job through the debugger to see what happens? The Server product actually has a pretty nice one, something the [cough] other one doesn't. Or at least didn't the last time I was using it.
Just because you have "millions of rows" in your lookup table doesn't mean you need all of them on each job run, I assume? A common technique (which works very well, I might add) is what I've heard called a "key bucket table" or some such: you take the natural keys from your current input data and "bucket" them in a work table, then join that work table to the lookup table to drive the hashed file build. Essentially, each run the hashed file holds "only what it needs" to fulfill its obligations for that run. Process 10,000 records and there are at most 10,000 records in the hashed file. 5 records? No more than five. Etc. Part of the "Only What You Need" mantra. (See the sketch below.)
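Sketched in SQL, the key bucket idea looks something like this. The table names key_bucket and country_lookup are made up for illustration; in practice the bucket load and the extraction query would be driven by your DataStage jobs:

    -- 1. Stage the natural keys from the current input into a work table.
    --    (A DataStage job would normally load this; shown as plain SQL here.)
    CREATE TABLE key_bucket (ID INTEGER NOT NULL);
    INSERT INTO key_bucket (ID) VALUES (1);
    INSERT INTO key_bucket (ID) VALUES (2);

    -- 2. Join the bucket to the big lookup table so the hashed file build
    --    extracts only the rows this particular run actually needs.
    SELECT l.ID, l.Country_Name
    FROM   country_lookup l
    INNER JOIN key_bucket b ON b.ID = l.ID;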
Me, I don't recall the last time I did a direct database lookup in a Server job... maybe for a very small volume where speed was not an issue. Otherwise it's hashed files all the way. As I've said here many times, hashed files are (IMHO) the Heart And Soul of the Server product. Learn to love them, as they are your best friend in that world.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers