
Processing large result sets

Posted: Tue Jun 03, 2008 3:26 pm
by seanc217
Hi there,

I am processing a large result set (6M+ records) through the USPREP rule set.

My question is: what is the best way to run such a result set through the Standardize stage efficiently? Right now I'm processing at 1,240 rec/sec.

It appears that the Standardize stage is really slow. I can understand why; it's doing a lot of work. I'm just wondering what some of the best practices are for making this run efficiently.

Thanks.

Posted: Tue Jun 03, 2008 4:08 pm
by ray.wurlod
If the system is not running out of resources, use a configuration file with more nodes.

It might be possible to write a more efficient rule set, but the benefit is probably not worth the cost.
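For example, a four-node configuration file on a single SMP server might look roughly like the following (the host name and resource paths are placeholders; adjust them to your environment):

{
    node "node1"
    {
        fastname "etl_server"
        pools ""
        resource disk "/data/datasets" {pools ""}
        resource scratchdisk "/data/scratch" {pools ""}
    }
    node "node2"
    {
        fastname "etl_server"
        pools ""
        resource disk "/data/datasets" {pools ""}
        resource scratchdisk "/data/scratch" {pools ""}
    }
    node "node3"
    {
        fastname "etl_server"
        pools ""
        resource disk "/data/datasets" {pools ""}
        resource scratchdisk "/data/scratch" {pools ""}
    }
    node "node4"
    {
        fastname "etl_server"
        pools ""
        resource disk "/data/datasets" {pools ""}
        resource scratchdisk "/data/scratch" {pools ""}
    }
}

Point APT_CONFIG_FILE at this file (or set it as a job parameter) and the parallel engine will partition the Standardize work across all four nodes.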

Posted: Thu Jun 05, 2008 8:47 am
by seanc217
Understood.

Thanks for the reply.

Sean

Posted: Tue Jun 17, 2008 2:38 pm
by emeri1md
It might be worth taking out any columns that are not being processed and joining them back in later, as sketched below. It adds a bit of complexity, but if it takes a lot of data out of the Standardize stage, it might be worth it.
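A rough sketch of such a split-and-rejoin design (stage names are illustrative, and a unique key column is assumed for the Join):

    Source --> Copy --> key + name/address columns --> Standardize --> Join --> Target
                   \--> key + passthrough columns ---------------------^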

Matt

Posted: Fri Jun 20, 2008 1:28 pm
by seanc217
Good point. I was thinking the same thing.

Thanks!
