
Rows/sec keeps decreasing when writing data into a hashed file

Posted: Fri Dec 21, 2012 5:06 am
by Satwika
Hi,

I am reading data from a database (SQL Server 2005) using an OLE DB stage and writing it into a hashed file. The data flow starts at 25,000 rows/sec and keeps decreasing, down to 600 rows/sec within a few minutes. The input has around 5 million records. Can you please help me maintain the rows/sec so that the data loads into the hashed file in 5 to 10 minutes?

Posted: Fri Dec 21, 2012 8:20 am
by chulett
You need to properly size the hashed file when it is created, which involves computing the initial 'Minimum Modulus' value under the 'Create File' options in the stage. I'm assuming the Hashed File Calculator is still being provided... if you can find it, it will help with computing the correct value.
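For a rough sense of the arithmetic involved (the figures below are illustrative only, not taken from this job): a dynamic (Type 30) hashed file stores data in groups of GROUP.SIZE x 2048 bytes and, by default, splits a group when it is about 80% full, so a common rule of thumb for the starting value is

    minimum modulus ~ (row count x average row size in bytes) / (2048 x 0.8)

For example, 5,000,000 rows at roughly 100 bytes each works out to about 500,000,000 / 1638, or roughly 305,000 groups. The Hashed File Calculator does this sum for you (including per-record overhead), which is why it is worth tracking down.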

Posted: Fri Dec 21, 2012 1:55 pm
by ray.wurlod
Rows/sec is meaningless in this context. The initial burst is misleading, as it is showing you writes into the memory cache. The figure at the end is an average of that and of flushing the cache to disk, which is necessarily quite a bit slower.

As Craig mentioned, well-tuned hashed files will perform better than badly-tuned ones, but they will never be as fast as the memory cache. The real speed in hashed files comes when you perform lookups; in a well-tuned hashed file any lookup requires exactly one I/O operation.

Hashed File Calculator continues to be supplied on the installation media, as an unsupported utility.

Posted: Thu Dec 27, 2012 11:01 pm
by Satwika
Would you please help me find the Hashed File Calculator? In the Hashed File stage, 'Minimum Modulus' is defined as 1 by default.

Posted: Thu Dec 27, 2012 11:51 pm
by ray.wurlod
Hashed File Calculator (hfc.exe) is provided on your installation media in a directory called Utilities, in a sub-directory called Unsupported.

Posted: Fri Dec 28, 2012 12:11 am
by Satwika
Hi, thanks for providing the path. This is a migration project. The same job in DS 7.5 is able to load the data into the hashed file in around 4 to 5 minutes, with rows/sec staying constant between 15,000 and 20,000. But in DS 8.5 it keeps decreasing and drops to a few hundred records per second. We have around 6 million records in the input database to load into the hashed file.

Posted: Fri Dec 28, 2012 12:22 am
by ray.wurlod
Are the hashed files identically sized in each environment?

Assuming there is a VOC pointer for the hashed file, you can use the command ANALYZE.FILE hashedfilename STATS to determine the current sizing.
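For example (the file name and path below are placeholders; substitute your own), from the Administrator client command window attached to the project, something like:

    SETFILE C:\HashedFiles\MyHashedFile MyHashedFile OVERWRITING
    ANALYZE.FILE MyHashedFile STATS

The SETFILE step is only needed if the hashed file was created by pathname rather than in the project (account), since that is what creates the VOC pointer that ANALYZE.FILE needs.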

Posted: Fri Dec 28, 2012 12:36 am
by Satwika
Ray, do you mean whether the hashed file in 8.5 is sized identically to the one in 7.5? In the Hashed File stage, under the 'Create File' options button, all values are identical in the 7.5 and 8.5 jobs. The 8.5 job even gets aborted after a long time, usually with a warning like the one below.

CopyOfGSAP_CONTROL_ETL_change_8..Hashed_File_115.FrmIntfcCntrlSrc: ds_uvput() - Write failed for record id '4629563'

Posted: Fri Dec 28, 2012 3:26 am
by ray.wurlod
Was the 7.5 version created with the 64BIT option (not available through the Hashed File stage)?
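If it was, it would have had to be created (or resized) at TCL, since the stage GUI does not expose that keyword, with something along the lines of (file name and modulus here are placeholders):

    CREATE.FILE MyHashedFile DYNAMIC MINIMUM.MODULUS 300000 64BIT

A 32-bit hashed file is limited to roughly 2GB of data, which could explain the ds_uvput() write failure if the 8.5 file is hitting that ceiling.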

Posted: Mon Jan 21, 2013 1:03 pm
by Satwika
It's created with type 30 (Dynamic). In 8.5 I am facing this issue with rows/sec.