Rejecting all rows which cannot be loaded in ORACLE

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Jamesvanam
Participant
Posts: 40
Joined: Tue May 18, 2004 10:51 pm

Rejecting all rows which cannot be loaded in ORACLE

Post by Jamesvanam »

Hi All,
This has probably been asked here before, but I couldn't find an appropriate answer, so please help.
I want to capture all the rows that fail to load into the target Oracle DB because of any input error (duplicates, NOT NULL violations, or any other error that prevents the row from being loaded into the Oracle DB). I tried the different options under Link Variables > Outputs (REJECTED, REJECTEDCODE, rejectcode = noerror + rejected) and nothing seems to work, because I still get warnings like "input value too long to be inserted" and "full year cannot be 0". I need to capture these errors too and send them to a stored procedure.

any input from anybody is greatly appreciated.
Thanks
James
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

All you really need to do is have a second link defined (with no constraint) and mark it as a Reject Row. Make sure it is the last link when looking at the link ordering and it will capture all database rejects.
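As an aside, the routing described above can be pictured in plain Python. This is purely an analogy, not DataStage or OCI code; the insert function and the exception are illustrative stand-ins:

```python
# Analogy only: an ordered pair of output links where the last link,
# marked as Reject Row, catches whatever the database refuses.
def route_row(row, insert_into_target):
    """Send the row down the target link; on a database reject,
    send it down the (last, unconstrained) reject link instead."""
    try:
        insert_into_target(row)      # normal output link to Oracle
        return "target"
    except ValueError:               # stand-in for any database reject
        return "reject"              # the Reject Row link, ordered last

def fake_oracle_insert(row):
    # Pretend Oracle enforces a NOT NULL constraint on "id".
    if row.get("id") is None:
        raise ValueError("ORA-01400: cannot insert NULL")

rows = [{"id": 1}, {"id": None}, {"id": 2}]
destinations = [route_row(r, fake_oracle_insert) for r in rows]
print(destinations)  # ['target', 'reject', 'target']
```

The key point the sketch mirrors is that the reject link has no constraint of its own and is evaluated last, so it only receives rows the database bounced.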

What is your target stage and array size?
-craig

"You can never have too many knives" -- Logan Nine Fingers
bapajju
Participant
Posts: 82
Joined: Wed Nov 19, 2003 11:58 pm

Re: Rejecting all rows which cannot be loaded in ORACLE

Post by bapajju »

James, you can try this option. We follow a similar pattern.

Have another output link to a sequential file or any target error table. In the constraints for this new link, check the Rejected checkbox. This link will capture all the records that were not loaded to the target due to any error.

Thanks
Jamesvanam
Participant
Posts: 40
Joined: Tue May 18, 2004 10:51 pm

Post by Jamesvanam »

Thanks guys,
I tried that option but I still get warnings (although the records are loaded into both the target and the sequential file) like this one, which I need to send to the sequential file on my second output link.

Transformer_0 (Timestamp) transform error: Invalid timestamp input '<Missing or NULL>'
Final_TestT0013Product..Transformer_0.DSLink3: Row 797: SEGMENT1="57844-5376-52", DESCRIPTION="WILZIN CAPSULES 25MG 250", ATTRIBUTE2=NULL, ATTRIBUTE3=2, ATTRIBUTE5=NULL, ATTRIBUTE6=NULL, ATTRIBUTE7=NULL, ATTRIBUTE8=NULL, ATTRIBUTE9=NULL
These attributes, which are NULL, are being loaded into NOT NULL columns in the target, so they need to go to a reject link.

The array size is 1 and the transaction size is zero, as we are doing the initial load now.
Any suggestions as to how I can avoid these warnings?
Jamesvanam
Participant
Posts: 40
Joined: Tue May 18, 2004 10:51 pm

Post by Jamesvanam »

Hi Craig,
My target is Oracle. One more thing: I get a warning for the rows that are rejected to the sequential file, even though each such record is written to the sequential file on my reject link. I was wondering how to make these warnings disappear.
Thanks
Jamy
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You will always get the warnings. Setting up a reject link doesn't stop the warnings from being logged; it allows you to capture the record that caused them. The record was not 'loaded' to the target: it was sent down the target link to Oracle and then, when the database rejected it, also sent down the reject link.

Good thing your Array Size is 1, that is the only way to ensure the rejected records written out are the actual problem records. Bump it up to something other than one and you never know what will actually be written to your reject file. :?

You avoid these warnings by not writing 'bad' records to Oracle. :wink: FYI - You don't have to wait for Oracle to bounce them back, if you 'know' you've got problem data - like null values for a not null field - reject it yourself.
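The "reject it yourself" idea can be sketched like this. The column names come from the log message earlier in the thread, but the NOT NULL rules and the validation code itself are hypothetical plain Python, not DataStage syntax:

```python
# Illustrative pre-validation: catch known-bad rows before they reach Oracle,
# so the database never has to reject them and never logs a warning for them.
NOT_NULL_COLS = ["SEGMENT1", "DESCRIPTION"]  # assumed NOT NULL in the target

def pre_validate(row):
    """Return a reject reason for rows we already know Oracle would bounce."""
    for col in NOT_NULL_COLS:
        if row.get(col) is None:
            return f"{col} is NULL but the target column is NOT NULL"
    return None  # row is clean

rows = [
    {"SEGMENT1": "57844-5376-52", "DESCRIPTION": "WILZIN CAPSULES 25MG 250"},
    {"SEGMENT1": None, "DESCRIPTION": "MISSING SEGMENT"},
]
clean = [r for r in rows if pre_validate(r) is None]
rejects = [(r, pre_validate(r)) for r in rows if pre_validate(r) is not None]
```

In a real job this split would be done with transformer constraints: the clean stream goes to Oracle, the reject stream goes to the error file with its reason attached.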
-craig

"You can never have too many knives" -- Logan Nine Fingers
Jamesvanam
Participant
Posts: 40
Joined: Tue May 18, 2004 10:51 pm

Post by Jamesvanam »

Oh cool,
Thanks Craig. I didn't know that jobs with warnings wouldn't cause problems in production; that's why I was worried.
Jamy
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Post by ogmios »

But warnings WILL severely slow down your DataStage job, so try to avoid warnings.

Ogmios
Jamesvanam
Participant
Posts: 40
Joined: Tue May 18, 2004 10:51 pm

Post by Jamesvanam »

I spoke to my team leader about that. He doesn't want us rejecting the rows ourselves; he wants the database to reject them and send them to a stored procedure, which makes an entry in a table recording details like why the insertion into the DB failed. He also said it's okay for now if the system is slow; we'll think about it later. We are only dealing with about a million records right now.

James
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

You can avoid the warnings by not marking the link as a Reject Row and instead putting a constraint on it. If you're writing to a database stage (Oracle OCI), you can check LinkName.DBMSCODE (under Link Variables if you're using the expression builder).

I personally don't like that many warnings in my job logs because they tend to mask real problems, so I try to code to avoid them :)
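A minimal sketch of this suggestion, assuming the constraint simply tests whether DBMSCODE came back non-empty. The routing logic is a plain-Python analogy, not the OCI plug-in's actual expression syntax:

```python
# Analogy: a constrained error link that fires only when the database
# returned an error code, instead of an unconstrained Reject Row link.
def route_by_dbms_code(row, dbms_code):
    """Mimic a constraint like LinkName.DBMSCODE <> '' on the error link."""
    if dbms_code:                     # e.g. "ORA-00001" for a duplicate key
        return ("errors", dbms_code)  # goes to the error file / table
    return ("loaded", None)           # row went down the normal target link

ok = route_by_dbms_code({"id": 1}, "")
dup = route_by_dbms_code({"id": 1}, "ORA-00001")
```

Because the error link carries an ordinary constraint rather than the Reject Row flag, rows sent down it are 'logical' rejects and don't add reject warnings to the log.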

Good Luck,

Tony
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

As noted, ideally your jobs can handle anything thrown at them and will never log any warnings. However, when things do get logged, you decide what happens in your Production environment by the way you setup your jobs and your error checking.

If you mark a link as a Reject Link (which can be done with or without a constraint as well) and any rows go down it, you will get a warning in your job. DataStage will log a warning recording the number of records rejected.

You can set the warning level that aborts your jobs such that it takes 1, 10, 50, whatever number of warnings to force your job to abort. We typically have ours set to one, as we should never have errors, and using it in conjunction with a Transaction Size of 0 makes the loads an all-or-nothing affair. None of this 'where do I need to restart my loads from?', although it could be problematic for large loads. Your mileage may vary. :wink:

You can capture 'logical' rejects without causing any warnings to be logged via judicious use of constraints. However, the 'physical' rejects from Oracle will always be logged if they occur... regardless of what you do to capture them.
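The warning-threshold-plus-transaction-size-0 combination behaves roughly like the following plain-Python analogy; the threshold value and the insert function are illustrative, not real DataStage settings or APIs:

```python
# Analogy: with a low warning limit and transaction size 0, one bad row
# aborts the job and the whole load rolls back; otherwise everything
# commits in a single transaction at the end.
def load_all_or_nothing(rows, insert, warning_limit=1):
    staged, warnings = [], 0
    for row in rows:
        try:
            insert(row)
            staged.append(row)     # staged but not yet committed
        except ValueError:
            warnings += 1
            if warnings >= warning_limit:
                return []          # abort: roll back, nothing is committed
    return staged                  # commit everything at once

def strict_insert(row):
    # Stand-in for an Oracle NOT NULL constraint on "id".
    if row.get("id") is None:
        raise ValueError("database reject")
```

With this pattern a clean run commits every row, and a single reject leaves the target untouched, so there is never a partially loaded table to restart from.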
-craig

"You can never have too many knives" -- Logan Nine Fingers
chrisk
Participant
Posts: 37
Joined: Wed Feb 26, 2003 6:20 pm

Post by chrisk »

Hi,

I'm curious about the array size = 1 aspect of this discussion. :?

I don't profess to know why the array size needs to be set to 1, but since the OCI plug-in documentation says it does, this obviously has a negative impact on the performance of the job if a large number of rows needs to be written to an OCI stage.

Apart from mimicking DBMS constraint logic in the transform constraints, how are people dealing with DBMS rejects where they cannot afford to have an array size = 1?

Ta,
C.
Chris Kearns
Altis Consulting Pty Ltd
www.altis.com.au
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Array size 1 is so that the correct warning can be associated with the correct row.

If you send 1000 rows, which one(s) generated the warning(s)?

Better is an approach that avoids warnings altogether. Then the fact that a warning has occurred properly alerts all and sundry that there may be a problem.
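The array-size point can be illustrated with a toy batch insert. This is assumed behavior in plain Python; real OCI array binding differs in detail, but the ambiguity is the same:

```python
# Toy model: with array size > 1 the database reports one failure for the
# whole batch, so the offending row is ambiguous; with array size 1 the
# failing row is pinpointed exactly.
def insert_batch(batch):
    if any(r.get("id") is None for r in batch):
        raise ValueError("batch failed")  # which of the rows? can't tell

rows = [{"id": 1}, {"id": None}, {"id": 3}]

batch_error = None
try:
    insert_batch(rows)                # "array size" 3: one opaque failure
except ValueError as exc:
    batch_error = str(exc)

failed_rows = []
for i, row in enumerate(rows):        # "array size" 1: row at a time
    try:
        insert_batch([row])
    except ValueError:
        failed_rows.append(i)
print(failed_rows)  # [1] -- exactly the NULL row
```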
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

In my experience, I haven't seen all that much of a performance change by bumping the array size up when writing to Oracle. Reading, yes, but writing... not really. So we leave ours at 1.

It does let you know the 'real' error when there is one, but (as Ray noted) a much better approach is to ensure that there are no warnings or errors. Not always feasible, but something you should always strive for.

We also (as a general rule of thumb) keep our warning threshold rather low (two, IIRC) and combine that with a Transaction Size of 0 so that any warnings crater the job and everything rolls back. We don't want anything to 'mostly' run OK. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
chrisk
Participant
Posts: 37
Joined: Wed Feb 26, 2003 6:20 pm

Post by chrisk »

Regardless of the array size when writing to Oracle, I have noticed that DataStage will still log a warning in the log if a DBMS reject occurs (irrespective of the reject checkbox in the constraint being ticked).

Ray, this suggests to me that DataStage might know which row(s) generated the warning(s)? I concede that it might be difficult (or impossible!) for DataStage to marry these DBMS messages back to the offending row. I have asked Ascential the same question but have not received a definitive answer yet.

Craig, since you are using an array size of 1, does that mean that you are relying on the DBMS reject method to identify your rejects? I'm curious as we have decided to check for data problems ourselves rather than rely on DBMS rejects as we find that an array size of 1 does degrade our performance when writing to Oracle.

Ta,
C.
Chris Kearns
Altis Consulting Pty Ltd
www.altis.com.au