partition question

Posted: Sun Aug 02, 2009 9:19 pm
by dnat
Hi,

I have a simple job which reads from a file and updates a particular field in an Oracle database.

The file has around 20 million records and the commit count is 10000. The partition type is Auto (I haven't changed anything).

Now the job has aborted in the middle because the Oracle instance went down. It had updated around 12 million records.

If we re-run it now, all 20 million records would be taken for update. But since the partition type is Auto, we are not sure whether the records already updated are the first 12 million in the file, which we could then remove before running the job again with just the remaining 8 million.

Can anyone comment on this?

Posted: Sun Aug 02, 2009 10:18 pm
by ray.wurlod
If you are reading the file in sequential mode, then you can skip the first 12 million rows (or whatever the number is) with impunity.

If you are reading the file in parallel mode (perhaps more than one reader per node), then it's still possible, but you need to check how many rows have been processed on each node (from Monitor, perhaps, or DSGetLinkInfo() with the "partition row count" option). You may need to back off a little, say to 10 million just to be safe.
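Outside DataStage, the sequential case is easy to prepare: trim the already-processed head of the file before the re-run. A minimal sketch (the function name and paths are illustrative, not anything DataStage provides):

```python
import itertools

def copy_remaining(src_path, dst_path, skip_rows):
    """Write src minus its first skip_rows lines to dst, so the
    re-run only processes the tail that was never committed."""
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        # islice skips the first skip_rows lines lazily,
        # without loading the whole file into memory
        dst.writelines(itertools.islice(src, skip_rows, None))
```

Backing off to a round number below the observed count (10 million rather than 12, as suggested above) just means passing a smaller `skip_rows`; the worst case is re-updating some rows, which is harmless for an idempotent update.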

Posted: Mon Aug 03, 2009 6:45 am
by chulett
And you could only be reading the file 'in parallel mode' if it is a fixed-width file, so my money is on sequential.

Posted: Mon Aug 03, 2009 4:59 pm
by ray.wurlod
Not true any more. Multiple readers per node will work with delimited formats.

Posted: Mon Aug 03, 2009 6:30 pm
by chulett
Not true any more as of when? With which version? :?

Posted: Tue Aug 04, 2009 12:22 am
by ray.wurlod
7.1 at a guess, maybe even a minor 7.0 release.

Obviously it's more difficult with delimited data, but it's certainly possible (locate the percentage point in the file, then scan forward for a line terminator).
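The "locate the percentage point then scan forward" approach can be sketched as follows. This is only an illustration of the technique, not DataStage's actual reader code; `split_points` is a hypothetical name:

```python
import os

def split_points(path, readers):
    """Compute the byte offset at which each reader should start:
    seek to an even fraction of the file size, then scan forward
    to the next newline so no reader begins mid-record."""
    size = os.path.getsize(path)
    offsets = [0]  # the first reader starts at the beginning
    with open(path, "rb") as f:
        for i in range(1, readers):
            f.seek(i * size // readers)   # jump to the percentage point
            f.readline()                  # discard the partial record
            offsets.append(f.tell())      # next record boundary
    return offsets
```

Each reader then processes bytes from its own offset up to the next reader's offset, and every record is read exactly once because the boundaries always fall just after a line terminator.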

Posted: Tue Aug 04, 2009 12:25 am
by ArndW
ray.wurlod wrote: Not true any more. Multiple readers per node will work with delimited formats.
I tried it at 8.0 and it won't.
Sequential_File_33: The multinode option requires fixed length records.