processing large resultsets

InfoSphere's Quality Product

Moderators: chulett, rschirm

seanc217
Premium Member
Posts: 188
Joined: Thu Sep 15, 2005 9:22 am

processing large resultsets

Post by seanc217 »

Hi there,

I am processing a large resultset (6M+ records) through the USPREP rule set.

My question is: what is the best way to run such a resultset through the Standardize stage efficiently? Right now I'm processing at 1240 rec/sec.

It appears that the Standardize stage is really slow. I can understand why; it's doing a lot of work. I'm just wondering what some of the best practices are for making this run efficiently.

Thanks.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

If the system is not running out of resources, use a configuration file with more nodes.

It might be possible to write a more efficient rule set, but the benefit is probably not worth the cost.
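
For example, a four-node configuration file might look something like the sketch below; the fastname and the resource paths are placeholders that have to match your own system:

{
    node "node1"
    {
        fastname "etlserver"
        pools ""
        resource disk "/data/datasets" {pools ""}
        resource scratchdisk "/data/scratch" {pools ""}
    }
    node "node2"
    {
        fastname "etlserver"
        pools ""
        resource disk "/data/datasets" {pools ""}
        resource scratchdisk "/data/scratch" {pools ""}
    }
    node "node3"
    {
        fastname "etlserver"
        pools ""
        resource disk "/data/datasets" {pools ""}
        resource scratchdisk "/data/scratch" {pools ""}
    }
    node "node4"
    {
        fastname "etlserver"
        pools ""
        resource disk "/data/datasets" {pools ""}
        resource scratchdisk "/data/scratch" {pools ""}
    }
}

Point APT_CONFIG_FILE at the new file and the job runs four-way parallel, provided CPU, memory and disk hold up.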
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
seanc217
Premium Member
Posts: 188
Joined: Thu Sep 15, 2005 9:22 am

Post by seanc217 »

Understood.

Thanks for the reply.

Sean
emeri1md
Participant
Posts: 33
Joined: Tue Jun 17, 2008 10:42 am

Post by emeri1md »

It might be worth taking out any columns that are not being processed and joining them back in later on. It adds a bit of complexity, but if it takes a lot of data out of the Standardize stage, it might be worth it.
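
Outside DataStage, the same split-and-rejoin pattern looks like the sketch below in pandas; standardize_name is just a stand-in for what the Standardize stage does, and the column names are made up for the example:

import pandas as pd

# Stand-in for the Standardize stage / USPREP rule set.
def standardize_name(name: str) -> str:
    return " ".join(name.upper().split())

df = pd.DataFrame({
    "id":      [1, 2, 3],
    "name":    ["  john  smith ", "MARY   JONES", "bob brown"],
    "balance": [100.0, 250.5, 75.25],   # pass-through column, never standardized
})

# Split: send only the key and the columns that need standardizing.
to_standardize = df[["id", "name"]].copy()
passthrough    = df[["id", "balance"]]

# Standardize the narrow dataset.
to_standardize["name"] = to_standardize["name"].map(standardize_name)

# Join the pass-through columns back on the key afterwards.
result = to_standardize.merge(passthrough, on="id")
print(result)

The less data each record carries through the Standardize stage, the less the job has to move and buffer, which is where the saving comes from.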

Matt
seanc217
Premium Member
Posts: 188
Joined: Thu Sep 15, 2005 9:22 am

Post by seanc217 »

Good point. I was thinking the same thing.

Thanks!