I am reading from and writing to the same mainframe/DB2 table using the DSDB2 stage. The only way I have been successful is by utilizing an intermediate file. I wondered if any users have tried using a hashed file instead of a flat file as the intermediate file. Are there any performance improvements using one vs. the other?
leslie
Flat File as intermediate vs Hash Table; which may be faster
Moderators: chulett, rschirm, roy
lclapp,
normally the sequential file should be faster as there is much less processing going on in the background. The fastest method would be to not land the data on disk at all, going straight to the output stage writing to DB/2 --> what errors do you get when you try this?
Might have something to do with database record locks if you are reading and updating the same record in the same table. Just an idea.

lclapp wrote: I agree with your response. I just walked into this account today and was told that was the reason for landing the data in the job. I will post some more information once I dig a bit deeper.
leslie
For staging, where you don't need individual key-based access, a sequential file will always be substantially faster both to write and to read than a hashed file.
The main reason for this is that you can't sort the data by hashed file group number, since the hashing algorithms are not in the public domain. But even setting that aside, the overheads in a hashed file (approximately 14 bytes per record, plus an average of (100 - SPLIT.LOAD) percent free space per group) mean that you end up writing a lot more data anyway.
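To put some rough numbers on those overheads, here is a back-of-envelope sketch comparing on-disk size for a sequential file versus a dynamic hashed file, using the 14-bytes-per-record and SPLIT.LOAD figures quoted above. The record size of 200 bytes is an assumption for illustration, and the formulas are approximations rather than exact UniVerse internals:

```python
def seq_file_bytes(n_records: int, avg_record_bytes: int) -> int:
    """Sequential file: roughly the raw data plus one delimiter byte per record."""
    return n_records * (avg_record_bytes + 1)

def hashed_file_bytes(n_records: int, avg_record_bytes: int,
                      per_record_overhead: int = 14,
                      split_load: int = 80) -> int:
    """Dynamic hashed file: per-record header overhead, then inflated by the
    (100 - SPLIT.LOAD) percent free space kept in each group on average."""
    data = n_records * (avg_record_bytes + per_record_overhead)
    return data * 100 // split_load

# The 5 million rows discussed in this thread, assuming 200-byte records.
n, avg = 5_000_000, 200
print(seq_file_bytes(n, avg))      # ~1.0 GB
print(hashed_file_bytes(n, avg))   # ~1.34 GB with the default SPLIT.LOAD of 80
```

So even before counting the hashing work itself, the hashed file carries roughly a third more bytes to disk in this scenario, which is consistent with the sequential file being the faster staging choice.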
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ArndW wrote: The fastest method would be to not land the data on disk at all, going straight to the output stage writing to DB/2

You might be surprised how often that is not the case in a Server job.
Can't speak directly to DB2, but take an Oracle job with two OCI stages and a Transformer in between. Nothing fancy: read from one database, write to another. Split it into two jobs, one to land the data and a second to read from that flat file and write to the target, and it can run substantially faster. I'm not saying it absolutely will run faster, just that it may, more often than you'd think.
Plus, you get the added bonus of having a static dataset to load, which can simplify any restart/recovery you may end up having to do.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Flat File in between is better...???
I am running a test as I type. The current 'best' approach appears to be to drop a flat file between the read from DB2 and the write to DB2. The claim is that if I have, say, 5 million rows to process, it will be faster to land them in a flat file before beginning the load to DB2. I can't use the bulk loader because the DB2 is on the mainframe. At first blush I can't see the flat file making things faster, because I still have to write the 5 million records and then read them again... we will see.
Also the reason for my post turned into a bust. The Developers assumed that there would be problems reading and writing to the same DB2 table in the same job but never tested.
I will post the results.
leslie