Load all Hashed files into memory


qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Read cache size (0 to 999 MB) is a per-project setting. Could a pre-job be run that reads the hashed file, which would pull it into the read cache? Is it dropped from the cache when that job ends, or does it remain available in the cache for the full processing job to use when it runs later? Is that what you are asking--persistent caching across job runs? If so, have you run any test to see if it appears to be cached across jobs or not?

With so many hashed files in play and a 999 MB cache size limit per project, it's likely that any given hashed file would be knocked out of the cache before it's used by the job that needs it. That's assuming read cache is persistent across jobs in the first place.
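Rough back-of-envelope sketch of that concern, in Python just for the arithmetic (the file names and sizes below are made up - plug in your own DATA.30 + OVER.30 sizes):

Code:

# Back-of-envelope check: can several hashed files coexist in the per-project
# read cache? Names and sizes are hypothetical; use DATA.30 + OVER.30 from ls -l.
CACHE_LIMIT_MB = 999  # per-project read cache ceiling

hashed_files_mb = {
    "HASH_LOOKUP_A": 450,   # hypothetical
    "HASH_LOOKUP_B": 700,   # hypothetical
    "HASH_CRC_BIG": 920,    # hypothetical, nearly the whole cache by itself
}

total = sum(hashed_files_mb.values())
print(f"Combined: {total} MB vs. {CACHE_LIMIT_MB} MB cache limit")
for name, size in hashed_files_mb.items():
    verdict = "fits only if little else is cached" if size <= CACHE_LIMIT_MB else "cannot be fully cached"
    print(f"  {name}: {size} MB - {verdict}")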

I am also wondering: what if you had a large solid-state drive on your server for reading and writing (storing and accessing) persistent hashed files? Has anyone tried that? Remember RAM disks? (Like that, but persistent storage.) Then all your hashed file I/O would be near the speed of a regular memory cache, or at least a lot faster than regular disk I/O. Just an idea...
Choose a job you love, and you will never have to work a day in your life. - Confucius
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

:idea: Even flash disks can perform better than spinning disks.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
mavrick21
Premium Member
Posts: 335
Joined: Sun Apr 23, 2006 11:25 pm

Post by mavrick21 »

@Ray - "Even flash disks can perform better than spinning disks."
By flash disks, do you mean USB flash pen drives?

@qt_ky - "Is that what you are asking--persistent caching across job runs? If so, have you run any test to see if it appears to be cached across jobs or not?"
My question answered by Ray - "With private link caching there is no way available to pre-load the contents of the hashed file into memory."

Now I'm wondering whether I'm approaching the problem the right way.

This is the job whose performance I'm trying to improve. All of our DS jobs were provided by Oracle (EPM product - server jobs). We have one single (longest running) job which runs for 3-4 hours, and that's the one I'm trying to improve. Here is the job description:
1) The source DRS stage selects all records (17.5 million) from the source table during each run.
2) Then a transformer calculates a CRC on the key for all records.
3) Another transformer looks up the hashed file (~17.5 mil records) on the key, compares the CRC values and determines whether each record is new or an update. If the record is new or an update, the same transformer writes it into another delta hashed file (see the sketch after this list).
4) The next transformer separates the inserts and updates into two flows, which are connected to the target DRS stages.
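To make step 3 concrete, here is the comparison logic sketched in plain Python (just an illustration, not the actual transformer code; the column names and what goes into the CRC are placeholders):

Code:

import zlib

def classify(source_row, hashed_lookup):
    """Return 'insert', 'update', or None (unchanged) - a sketch of step 3."""
    key = source_row["BUSINESS_KEY"]          # placeholder key column name
    # CRC over whichever columns the job concatenates (placeholder)
    crc = zlib.crc32(source_row["ALL_COLUMNS"].encode())
    cached_crc = hashed_lookup.get(key)       # hashed file lookup on the key
    if cached_crc is None:
        return "insert"                       # key not found in hashed file
    if cached_crc != crc:
        return "update"                       # key found but CRC differs
    return None                               # identical row, nothing to write

# 'insert' and 'update' rows go to the delta hashed file and are then
# split into the two target DRS links (step 4).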


Here are the things I have looked at to make sure the job runs fast:
1) The source and target SQLs do indexed selects and updates - no table scans. (We can't select a subset of records from the source since it's a transaction table with no create/update timestamp, and the OLTP team is adamant about not adding one.)
2) There are no unnecessary stage variables in any of the transformers.
3) Experimented with Array and Transaction size but that didn't help

I'm not sure about the below points, please advise me on each of them:
1) DISKCACHING is turned off. I wonder if turning it on would be of any help, since the lookup hashed file (~17.5 mil) is used in and by this job only. Private link caching seems the appropriate option for that, but then again I think it should be turned off - please read the next point.
2) I think the job runs faster if Pre-load file into memory is disabled, since each record in the lookup hashed file will be referenced only once by the incoming stream.
3) I think the lookup hashed file should be left as a dynamic file (Type 30) because its size increases on a regular basis.
4) Below are the ANALYZE.FILE <HF> STATS output and the ls -l output. Would bringing most of OVER.30 over to DATA.30 (by RESIZEing) give any drastic job run time improvement? (See the sizing sketch after point 5.)

Code:

Press any key to continue...
File name ..................   HASH_CRC_PS_NV1
Pathname ...................   HASH_CRC_PS_NV1
File type ..................   DYNAMIC
NLS Character Set Mapping ..   NONE
Hashing Algorithm ..........   GENERAL
No. of groups (modulus) ....   461801 current ( minimum 461801, 44 empty,
                                            9803 overflowed, 0 badly )
Number of records ..........   4992087
Large record size ..........   1628 bytes
Number of large records ....   0
Group size .................   2048 bytes
Load factors ...............   80% (split), 50% (merge) and 45% (actual)
Total size .................   965849088 bytes
Total size of record data ..   229516969 bytes
Total size of record IDs ...   204675567 bytes
Unused space ...............   531652456 bytes
Total space for records ....   965844992 bytes

Press any key to continue...
File name ..................   HASH_CRC_PS_ITEM_SF_CS9_EPM9_NV1
                               Number per group ( total of 461801 groups )
                               Average    Minimum    Maximum     StdDev
Group buffers ..............      1.02          1          2       0.14
Records ....................     10.81          1         40       4.72
Large records ..............      0.00          0          0       0.00
Data bytes .................    497.00         43       1832     217.01
Record ID bytes ............    443.21         41       1640     193.45
Unused bytes ...............   1151.26         24       2048     373.89
Total bytes ................   2091.47       2048       4096       0.00


                               Number per record ( total of 4992087 records )
                               Average    Minimum    Maximum     StdDev
Data bytes .................     45.98         39         47       1.75
Record ID bytes ............     41.00         41         41       0.00
Total bytes ................     86.98         80         88       1.75

Code:

 ls -l HASH_CRC_PS_NV1
total 943220
-rw-rw-r-- 1 dsadm dsadmprod 945770496 Sep 25 00:54 DATA.30
-rw-rw-r-- 1 dsadm dsadmprod  20078592 Sep 24 22:29 OVER.30
5) Have I missed anything else that could improve the run time, for example any changes to the UVCONFIG file?
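Regarding point 4, here is a rough sizing sketch from the ANALYZE.FILE numbers above (a common rule of thumb rather than an official formula, and the per-record overhead is a guess):

Code:

# Rough MINIMUM.MODULUS estimate from the ANALYZE.FILE STATS output above.
records    = 4_992_087
avg_data   = 45.98      # average data bytes per record
avg_id     = 41.00      # average record ID bytes per record
overhead   = 12         # assumed per-record header overhead (a guess)
group_size = 2048       # bytes per group
split_load = 0.80       # load factor at which dynamic groups split

bytes_needed = records * (avg_data + avg_id + overhead)
suggested    = int(bytes_needed / (group_size * split_load)) + 1
print(f"Suggested minimum modulus: about {suggested:,}")
# The current modulus (461,801) is already above this estimate and only
# ~2% of groups are overflowed, so a RESIZE may not give a drastic gain.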

Thanks in advance for your advice and time. Very much appreciated!
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

It looks like you've fallen into one of the classic traps we see here: people arrive with a perceived solution already in mind and ask us how to implement that something, rather than coming here with their actual problem and then asking whether their solution is appropriate. You'd be amazed at how many times it isn't... after going 'round and 'round talking. :wink:

First off, flash devices go by many names, it seems: flash drives, thumb drives, flash memory pen drives, SSDs, etc. All the same underlying technology.

As to your job description list:

1) So you have to extract and check everything every time? Sucks when that's true but sometimes it just is.
2) For Server, that's what I've always done. I'm assuming your hashed file just has the keys you need and the CRC32 value, yes? Do you persist it in the target?
3) Fine.
4) Also fine, though I'm not sure you really need separate transformers for this. But then again, combining them probably wouldn't affect things much either way.

Now the second list:

1) Good... and I assume you've confirmed (with your DBA or otherwise) that the indexes are actually being used.
2) Good. I'm assuming that also means the opposite, that you don't have any unnecessary repeated calculations in derivations that should be done in stage variables.
3) I would think one or both would actually help... to some extent at least... but we'll let that one go.


Last list:

1) Leave it off. Please read the next point. :wink:
2) Correct. Cache sets of data that need to be accessed multiple times during the run. Never mind the fact that you couldn't cache one that size even if you wanted to, so you're also saving the time the pre-load would take to make the attempt and then log the fact that everything didn't fit.
3) I would agree.
4) It should certainly help... don't know about the 'drastic' part but anytime a hashed file read doesn't have to sequentially check the overflow area you'll save time.
5) I doubt it.

Overall generic advice: what efforts have you made to isolate the actual bottleneck here? For example, if you swap out the target OCI stages and simply write the results to flat files, does the run time show your desired drastic improvement? If so, then you know you have some Oracle tuning to do. No? Then the hashed file lookup / write or transformations are the culprit. Et cetera.
-craig

"You can never have too many knives" -- Logan Nine Fingers
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

I worked on some Oracle EPM server jobs with DRS stages some time ago for one customer. Every one of the jobs had both the database target stage's array size and transaction size set to 1, causing very bad performance. Just changing those defaults to 1000 each cut 8-10 hours off their daily load process. I would check that first, or possibly again. Like Craig said, run a test on a copy of the job, swap out the DRS stage(s) for Sequential File stages, and report any difference in job time.
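To put numbers on why those two settings matter, here is a quick arithmetic sketch (Python just for the illustration; the row count is the one quoted in this thread):

Code:

# How many commits a 17.5 million row load implies at different settings.
rows = 17_500_000

for transaction_size in (1, 100, 1000):
    commits = -(-rows // transaction_size)   # ceiling division
    print(f"transaction size {transaction_size:>5}: {commits:,} commits per run")
# At transaction size 1 that would be 17.5 million commits (plus as many
# array round trips at array size 1), which is why raising these from the
# default of 1 can save hours.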
Choose a job you love, and you will never have to work a day in your life. - Confucius
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Yes, a USB flash drive, even in a pen, uses the same technology as an SSD. They make SSDs for servers that fit in a hard drive slot (not as a USB pen). I was just throwing that out there, but you would really have to determine the bottleneck first. It could be the database I/O. I think that's more likely than the hashed file.
Choose a job you love, and you will never have to work a day in your life. - Confucius
mavrick21
Premium Member
Posts: 335
Joined: Sun Apr 23, 2006 11:25 pm

Post by mavrick21 »

@Craig,
"after going 'round and 'round talking. Wink "
Sorry about that. Next time I will follow your advice promptly :)
"2) For Server, that's what I've always done. I'm assuming your hashed file just has the keys you need and the CRC32 value, yes? Do you persist it in the target?"
The keys and CRC32 values are written to another hashed file which is added to the lookup hashed file in the next job. Is this what you meant by "Do you persist it in the target?"?

I'll play around with the modulus and try to get all/most of the OVER.30 to DATA.30 and post the run time.
"For example, if you swap out the target OCI stages and simply write the result to flat files does the run time see your desired drastic improvement? If so, then you know you have some Oracle tuning to do."
Other than hitting the indexes for updates, I'm not sure what else I can play around with for Oracle tuning. Can you please provide some pointers?
"Then the hashed file lookup / write or transformations are the culprit."
I'm pretty sure hashed file lookup is the culprit. I'll replace target DRS stage with sequential file and post the results.


@Eric,
Here are the values I experimented with, and they didn't help.
For source: Array Size - 100, 1000 or 5000
For target: Array & Transaction Size - 100 & 200, 1000 & 2000 or 5000 & 10000
"It could be the database I/O."
My understanding is that the database is doing frequent reads and writes to the disk instead of caching them first. Is my understanding right? Can you please explain more about this?

Thanks a lot Gurus. I appreciate all your help and time.
-Mav
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

"Other than hitting the indexes for updates, I'm not sure what else I can play around with for Oracle tuning. Can you please provide some pointers?"
Partition.

DS User
mavrick21
Premium Member
Posts: 335
Joined: Sun Apr 23, 2006 11:25 pm

Post by mavrick21 »

SURA wrote:Partition.
Do server jobs load data in parallel into partitioned Oracle tables? I really doubt that!
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

To load the data, maybe no.

When loading the data into the DB, even index maintenance will consume time, and in some of the projects I have seen, they recreate the indexes after the load.

DS User
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

I would break the job into 3 jobs for testing:

1. DRS source --> text file1 (test DRS settings = 1 vs. 100 vs. 1/5/10K)
2. text file1 --> xfm/hash -> text file2 (find out how much time)
3. text file2 --> DRS target (test DRS settings = 1 vs. 100 vs. 1/5/10K)

Then you should know more about the effect of each setting change and be able to identify the bottleneck(s). Let us know what you find.

Talk to your DBA if you want to know how the target database handles caching internally. I would think every time the target DRS stage sends a COMMIT, based on transaction size, the database should write X records to disk.
Choose a job you love, and you will never have to work a day in your life. - Confucius
mavrick21
Premium Member
Posts: 335
Joined: Sun Apr 23, 2006 11:25 pm

Post by mavrick21 »

Hello Gurus,

Sorry for my slow response. I'm swamped with production issues and will update this post next week.

Thanks for your patience
-Mav