I have a job pulling data from Oracle tables (50 million records). It just transfers the rows into another table with small modifications (e.g. the date format is changed in the Xfr). I used the IPC stage as:
oci-----Xfr----IPC----collector ---IPC--oci
-----xfr----IPC---
-----xfr----IPC---
1) In Job Properties/Performance, "Enable row buffering" is set to In process. What is the difference between In process and Inter process?
2) What is the buffer size? I left it at the default value.
This is a server job. Will there be any change if I use Inter process?
Thanks in advance.
IPC Stage
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
In Process tells DataStage to move one or more buffers' worth of data from one transformer to the next within the same process. This is useful when you are running on a single-processor machine.
Inter-Process breaks the job up so that each transformer runs as a separate process with its own buffer memory. This is useful when you are running on a multi-processor machine.
Buffer size is the amount of memory you expect DataStage to allocate for each transformer in either of the above cases.
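The difference can be sketched with a small Python analogy (this is not DataStage internals; the function names, row counts, and buffer size are made up for illustration). Each stage runs as its own OS process, and a bounded queue stands in for the IPC link's buffer, so the "transformer" blocks only when the buffer is full:

```python
from multiprocessing import Process, Queue

BUFFER_ROWS = 4  # stands in for the IPC stage's buffer size

def extract(out_q):
    """Pretend OCI source: push rows into the IPC buffer."""
    for row in range(10):
        out_q.put(row)          # blocks when the buffer is full
    out_q.put(None)             # end-of-data marker

def transform(in_q, out_q):
    """Pretend transformer: a trivial per-row change."""
    while True:
        row = in_q.get()
        if row is None:
            break
        out_q.put(row * 2)
    out_q.put(None)

def run_pipeline():
    # Two bounded queues act as the IPC links; each stage is its own
    # process, so the stages can run on separate CPUs at the same time.
    q1, q2 = Queue(BUFFER_ROWS), Queue(BUFFER_ROWS)
    stages = [Process(target=extract, args=(q1,)),
              Process(target=transform, args=(q1, q2))]
    for p in stages:
        p.start()
    results = []
    while True:
        row = q2.get()
        if row is None:
            break
        results.append(row)
    for p in stages:
        p.join()
    return results

if __name__ == "__main__":
    print(run_pipeline())
```

In-process buffering would correspond to running both functions inside one process, which only helps overlap I/O, not CPU work.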
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
It works by passing multiple rows (depending on the buffer size) to each transformer in sets, so that the next transformer starts processing before the previous one has finished. This way, if you have multiple processors, you may have three processes working on small logical units in parallel.
Please note that referencing rows committed in the target (i.e. commit every row and reference it in a lookup within the same job) will not function as expected, because rows may have moved down the chain in groups and skipped past the reference altogether.
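That pitfall can be shown with a minimal Python sketch (again an analogy, not DataStage code; the table, buffer size, and row values are invented). Because rows travel in buffered sets, a lookup against a "just committed" earlier row can run before that row's set has actually been committed:

```python
# Minimal sketch of the pitfall: rows move through the chain in sets,
# so an in-job lookup against rows you just committed can run before
# those rows are visible in the target.
target = set()                 # committed rows in the target table

def lookup(key):               # in-job reference back to the target
    return key in target

stream = [1, 2, 3, 4, 5, 6]
BUFFER = 3                     # rows travel in sets of this size
misses = []
for start in range(0, len(stream), BUFFER):
    batch = stream[start:start + BUFFER]
    for row in batch:
        # row 2's lookup of row 1 happens before the batch is
        # committed, so it misses even though row 1 already "went
        # through" the chain
        if row > 1 and not lookup(row - 1):
            misses.append(row)
    target.update(batch)       # commit the whole buffered set at once
print(misses)                  # → [2, 3, 5, 6]
```

Only rows whose predecessor landed in an earlier, already-committed set (row 4 here) find their lookup; everything else misses.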
- Charter Member
- Posts: 299
- Joined: Wed Nov 13, 2002 5:38 pm
- Location: USA
Since the source and target are both Oracle, is there some reason why you are extracting from Oracle and loading back into Oracle via DataStage rather than doing it in Oracle directly? If the ETL server is a different machine from the source and/or target Oracle server, then your design is really in question. Why would one take data out of a database, move it around the network, and then put it back into the same database? One reason is that you have big hairy transforms that are best written and documented not in PL/SQL but in DataStage. However, that argument would have to be very strong to justify the network performance hit you are probably incurring.
Given that your reasons for using DataStage are valid, have you thought about making the job multi-instance? Instead of using duplicated transformers in a single job via IPC, you would use a single transformer multiple times, as in a multi-instance job.
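The multi-instance idea can be sketched in Python (hypothetical, not DataStage syntax): one shared transformation function invoked N times over disjoint partitions of the source, rather than N copied transformers wired together with IPC stages:

```python
from multiprocessing import Pool

def transform_partition(rows):
    # The single, shared transformer logic (standing in for e.g.
    # reformatting a date column); every instance runs this same code.
    return [str(r) for r in rows]

def run_instances(source, instances=3):
    # Round-robin partitioning, like invoking N instances of the same
    # job with each instance handling its own slice of the key range.
    partitions = [source[i::instances] for i in range(instances)]
    with Pool(instances) as pool:
        return pool.map(transform_partition, partitions)

if __name__ == "__main__":
    print(run_instances(list(range(9))))
```

The maintenance win is the same as with a multi-instance job: the transform logic exists once, so a change is made once instead of in every duplicated branch.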