Hi DS folks,
Can some kind soul point me to the fundamentals of how the control file is generated? Is there any way I can control the definitions for some of the columns?
Scenario: I am loading the data in Load-Append mode in an Oracle Enterprise stage (version 7.5).
Scenario 1:
The idea is that I want the column definition in the control file to be "my_sequence.nextval" for a column, instead of the auto-generated one.
I understand that I can use the Write Method "Upsert" and edit the generated SQL. From the definition, upsert means:
Insert if update fails
OR
Update if insert fails
whereas I want to do only the insert.
I can set the Upsert Order to "Insert then update" and change the generated update statement to "UPDATE ... WHERE 1 != 1", which internally stops any update from happening. I also need to change the insert statement to the seq.nextval form.
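As a sketch of that workaround (assuming a sequence called my_sequence; the table and column names below are hypothetical, and the ORCHESTRATE.column placeholders stand for whatever the stage generates for your input columns):

```sql
-- Edited INSERT: take the key from the sequence instead of the input column.
INSERT INTO target_table (target_id, target_name)
VALUES (my_sequence.NEXTVAL, ORCHESTRATE.target_name);

-- Edited UPDATE: the predicate 1 != 1 can never be true,
-- so the update branch of the upsert becomes a no-op.
UPDATE target_table
SET    target_name = ORCHESTRATE.target_name
WHERE  1 != 1;
```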
If I want to force inserts and not use the bulk load process, I will choose the options 'update then insert' or 'insert then update' and afterwards choose 'custom sql' for both; then I'll cut-and-paste one to the other, so that both actually do the update or insert.
But if you only have inserts to do then I would recommend using the faster option!
Arnd, I think giving an insert statement as the Update SQL statement is not logically correct.
I guess there is an option missing from the Write Method list (please see 2. below), and that's why we choose 3. below instead and build a workaround:
1. Load
2. Insert/Update
3. Upsert
4. Delete Rows
If your whole issue is that you want to sequentially assign a column the next available unique number, why don't you manage the process yourself inside the tool?
There are many useful techniques for doing this. All rely on "seeding" the job stream with the starting value from which to assign, then using one of the parallel-friendly techniques for assigning unique values. The method for seeding is either to do a SELECT MAX on the key prior to running the job and pass that in as a parameter, or at the conclusion of processing to store the maximum value used in a control table that is read via a SELECT at the beginning of the next run. The only issue is that gaps will occur in the sequencing, as each partition is independently assigned a unique set of numbers.
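As a sketch of the seeding half of that technique (the table and column names below are hypothetical):

```sql
-- Option 1: seed from the target table itself before the job runs;
-- pass the result in as a job parameter.
SELECT NVL(MAX(target_id), 0) AS max_key
FROM   target_table;

-- Option 2: maintain a high-water mark in a control table, read at the
-- start of each run and updated at the end with the last key used.
SELECT last_key FROM key_control WHERE table_name = 'TARGET_TABLE';
UPDATE key_control SET last_key = :last_key_used WHERE table_name = 'TARGET_TABLE';
```

For the parallel-friendly assignment itself, a commonly used Transformer derivation is along the lines of `seed + @PARTITIONNUM + (@INROWNUM - 1) * @NUMPARTITIONS`, which is unique across partitions at the cost of the gaps described above.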
If you persist in using a database sequence, doesn't that introduce a choke point in your loading paradigm, as direct partition loading would require a semaphore to manage doling out numbers to the parallel threads? Wouldn't you prefer to stay with an optimized loading method, allowing all partitions to load the data simultaneously and without delay using the fast insert-only load method (DIRECT path)?
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
kcbland wrote:
If you persist in using a database sequence, doesn't that introduce a choke point in your loading paradigm, as direct partition loading would require a semaphore to manage doling out numbers to the parallel threads? Wouldn't you prefer to stay with an optimized loading method, allowing all partitions to load the data simultaneously and without delay using the fast insert-only load method (DIRECT path)?
Let me try to understand this. If I understand correctly, the parallel threads are responsible for running the sqlldr control files using the data files generated on the fly.
Do you mean to suggest that Oracle's sequence.nextval will serialize the parallel DS threads?
Example: thread t2 will wait until t1's nextval operation is completed.
I presume nextval is serialized.
Go about 2/3 of the way down to get to the discussions about bulk loading hundreds of thousands of rows simultaneously. Transactional inserts with sequences aren't a big deal; I agree with Tom 100%. (Buy Kyte's Expert One-on-One, it's an excellent book.)
I guess my point is that you have to look at the volumes you're loading. Since parallel streams of data will be accessing partitions simultaneously and you're using a sequence for a unique key, any unique index on that key must be GLOBAL, which is anathema to the things you're supposed to be doing. That index update will cause more issues than the use of sequences.
You'll just have to experiment. You can always generate a file per table partition and manually invoke sqlldr for each, naming the partition instead of the table. You can see whether DIRECT path performance is lower, and you can mitigate it by playing with the sequence cache size.
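As a rough sketch of those two experiments (the file, table, column, and partition names below are hypothetical):

```sql
-- Hypothetical sqlldr control file: load one data file into one named
-- partition; run one sqlldr session per partition with DIRECT=TRUE.
LOAD DATA
INFILE 'target_table.p1.dat'
APPEND
INTO TABLE target_table PARTITION (p1)
FIELDS TERMINATED BY ','
(target_id, target_name)
```

To reduce contention on the sequence among parallel sessions, a larger cache can be set:

```sql
ALTER SEQUENCE my_sequence CACHE 1000;
```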
Forgot to answer directly. Yes, I was trying to say that your partitioned loading sessions will have to cooperate with the sequence, and that there is an amount of serializing that could take place. Tom Kyte seems not to worry much about it, but I couldn't find where the topic of parallel partition loading with a sequence was specifically addressed.