Hi all,
1) Source ---> Transformer ---> Peek stage. If 1000 records come out of the Transformer, will all 1000 be visible in the job log (output option = job log)? Or, if I set the Peek stage to 10 records per partition, will only the first 40 records be visible when the job runs on 4 nodes?
2) And what is the significance of the "insert array size" property in the OCI stage? Is it related to the commit interval?
3) If I use the OCI stage as target and a Peek stage to collect reject records, and 100 records fail a unique constraint, will the job abort only after all hundred records have been checked? Or will it abort once the first 40 records have been checked, so that only 40 records are visible in the job log?
Thanks in advance.
Peek stage and OCI stage
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
1) Correct.
2) Array size has nothing to do with commit - it's simply the number of rows transmitted at a time. Commit rowcount should be a whole multiple of array size for best performance.
3) Correct.
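To see why a commit rowcount that is a whole multiple of the array size helps, here is a toy Python simulation (not DataStage or OCI code; all names are invented for illustration). A commit boundary that falls mid-array forces a short array to be sent, which adds round trips:

```python
def arrays_per_run(n_rows, array_size, commit_interval):
    """Simulate rows flowing to the database in arrays of
    `array_size`, with a commit forced every `commit_interval`
    rows. Returns the number of round trips (arrays sent); a
    commit boundary mid-array forces a short array."""
    sent = 0
    trips = 0
    while sent < n_rows:
        room_to_commit = commit_interval - (sent % commit_interval)
        batch = min(array_size, room_to_commit, n_rows - sent)
        sent += batch
        trips += 1
    return trips

# 1000 rows, arrays of 100: a 500-row commit interval needs 10
# round trips; a 450-row interval needs 11, because every commit
# boundary forces a 50-row short array.
print(arrays_per_run(1000, 100, 500))  # 10
print(arrays_per_run(1000, 100, 450))  # 11
```

Under this model, any commit interval that is a whole multiple of the array size keeps every array full, which matches the advice above.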
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
3) I did not follow you on that one. If 100 records fail to insert, and the Peek stage is set to 10 rows with 4 nodes, I think you would get 4 Peek entries in the job log with 10 rows each, and you would not see the other 60 rejected rows. If each insert failure is logged as a warning and the job aborts after 50 warnings, then I'm not sure what you would end up with.
I have seen where if I set the Peek value to 100 rows then in the job log for a given node, I may get two Peek entries, one giving say 30 rows and the other entry giving 70.
Choose a job you love, and you will never have to work a day in your life. - Confucius
I didn't claim they're all in the same Peek entry. Information is logged as it arrives, and multiple nodes' information tends to collide. But if you look at all the Peek entries for node 0, for example, you'll find a total of 10 rows (or whatever you set it to). Ditto for node 1, and so on.
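For what it's worth, the per-partition counting described above can be sketched in Python. This is purely illustrative (round-robin partitioning and the counts are assumed, not how the engine actually partitions):

```python
from collections import defaultdict

def peek_log(rows, n_partitions, rows_per_partition):
    """Simulate a Peek stage set to N rows per partition: each
    partition logs only its first rows_per_partition rows,
    regardless of how its entries interleave in the job log."""
    logged = defaultdict(list)
    for i, row in enumerate(rows):
        node = i % n_partitions  # round-robin partitioning (assumed)
        if len(logged[node]) < rows_per_partition:
            logged[node].append(row)
    return logged

# 1000 rows, 4 nodes, 10 rows per partition: 40 rows logged in
# total, 10 per node, however many Peek entries they span.
log = peek_log(range(1000), 4, 10)
print(sum(len(rows) for rows in log.values()))  # 40
```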
Setting array size to 0 logically transfers all records at once, but the network still breaks that mass of data into packets so any gain would be insignificant.
Array size has nothing at all to do with execution mode or read mode.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
dsscholar wrote: "U mean to say, there is no benefit in using array size option?"

"U" didn't say any such thing, nor did Ray. Keep in mind the fact that "U" is one of our posters.
As noted, it strictly controls how many rows are sent across the network "in an array", nothing more and nothing less. There are times when you'll want it set to 1 and others when it should be larger. There is always a sweet spot beyond which increasing the size starts to decrease throughput and performance drops off. There are many factors at work there, as noted - network packet size and ARL being two of the most important. There's a lot more to it than simply sending everything at once.
Note this has nothing to do with DataStage but with how Oracle works. A quick Google for "Oracle array size" should help enlighten you on the topic.
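The "sweet spot" idea can be illustrated with a toy cost model in Python. Every number and parameter name below is invented, not measured; the point is only the shape of the curve, not real Oracle behaviour:

```python
def est_time(n_rows, array_size, per_trip_ms=2.0, per_row_ms=0.01,
             buffer_rows=500, penalty_ms=5.0):
    """Toy cost model: each round trip pays a fixed latency, and
    arrays larger than the transport buffer pay an extra per-trip
    fragmentation penalty, so throughput eventually degrades
    instead of improving. All constants are made up."""
    trips = -(-n_rows // array_size)  # ceiling division
    frag = penalty_ms if array_size > buffer_rows else 0.0
    return trips * (per_trip_ms + frag) + n_rows * per_row_ms

# Cost falls as arrays grow, then rises past the buffer size:
# under this model the minimum is at array_size = 500.
for size in (1, 100, 500, 1000):
    print(size, est_time(10000, size))
```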
-craig
"You can never have too many knives" -- Logan Nine Fingers