Page 1 of 1

Database Lookup issue

Posted: Thu Jan 14, 2010 4:09 pm
by vemisr
Hi Experts:

Env : DS8.1, Unix, Oracle in PX

My Lookup database having 300 Million Records,

incoming file having 100 K to 500 K records, it needs to perform database lookup , if match found i will get the ACCT Num from the Database.

Req:
incoming file having name, street name, unit, city, state and zip fields if match found then i will get the acct num from the database.

the question is how can get the best performance. performance is very critical, i need to process multiple files in multiple times a day.

can you suggest me the solutions.

thanks
Vemisr

Posted: Thu Jan 14, 2010 4:49 pm
by ray.wurlod
Join stage, sorted inputs, Partition on name and perhaps also on zip, state and city.

Posted: Thu Jan 14, 2010 5:26 pm
by Sainath.Srinivasan
If the database is same, then do the join within the source.

Else do a sparse lookup.

Posted: Sun Jan 17, 2010 2:02 pm
by Kryt0n
Is there potential to filter out some of those 300m records? i.e Expired etc

What are you doing the join on? All those fields mentioned? If so, then maybe maintaining a separate checksum table could help...

Posted: Mon Jan 18, 2010 8:22 am
by nagarjuna
Sainath.Srinivasan wrote:If the database is same, then do the join within the source.

Else do a sparse lookup. ...
Hi Sainath ,

I think 500K input records may be high for a sparse lookup ...

Is there any optimal value on number of records for a sparse lookup ??

Posted: Mon Jan 18, 2010 3:42 pm
by ray.wurlod
As always, "it depends".

For example, for very narrow rows, 500K rows is not such a large volume of data.

Posted: Fri Jan 29, 2010 6:06 am
by abhijain
As suggested -
1. Go for sql query join if both are from same database.
2. Can go for Join stage if not bothering about the unmatched records.
3. Also, if wants to perform the look up only then try to use minimum columns in look up stage and by pass the rest.

Also, Please elobrate more on your problem.

Re: Database Lookup issue

Posted: Mon Feb 01, 2010 3:42 pm
by makreddy
One simple Solutopn.
Insert the source records into a database if they come from other source, and do a database join.

vemisr wrote:Hi Experts:

Env : DS8.1, Unix, Oracle in PX

My Lookup database having 300 Million Records,

incoming file having 100 K to 500 K records, it needs to perform database lookup , if match found i will get the ACCT Num from the Database.

Req:
incoming file having name, street name, unit, city, state and zip fields if match found then i will get the acct num from the database.

the question is how can get the best performance. performance is very critical, i need to process multiple files in multiple times a day.

can you suggest me the solutions.

thanks
Vemisr