Where is the Reject Data of Teradata Enterprise Stage?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
josfar
Participant
Posts: 5
Joined: Wed Jul 11, 2007 1:46 am

Where is the Reject Data of Teradata Enterprise Stage?

Post by josfar »

While using Teradata Enterprise Stage for loading data into Teradata DB, whether there are rows rejected or not , the job can finished ok.

But in our situation, even only one row data is rejected, we should get it and generate a Zero-size file to tell others ," the job is finished with rejected row data, you need to check the error and reload it."

There is no option related with reject in the Teradata Enterprise Stage, so what did the DataStage do to the Rejected Row Data? (Just Reject? or Save in a File? or store in the DB Table? or ???)

Thanks a lot!
JoshGeorge
Participant
Posts: 612
Joined: Thu May 03, 2007 4:59 am
Location: Melbourne

Post by JoshGeorge »

Database table !

Names of the tables are ERR_cookie_1 and ERR_cookie_2. Cookie can be found on the terasync table. In that table, you may use start time and end time (integer) fields in order to find the last ones inserted and you cannot modify these error table names. This should lead you to formulate a solution you are looking for.
Joshy George
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
josfar
Participant
Posts: 5
Joined: Wed Jul 11, 2007 1:46 am

Post by josfar »

JoshGeorge wrote:Database table !

Names of the tables are ERR_cookie_1 and ERR_cookie_2. Cookie can be found on the terasync table. In that table, you may use start time and end time (integer) fields in order to find the last ones inserted and you cannot modify these error table names. This should lead you to formulate a solution you are looking for.
Thank you very much!!! i'll check the table and make some tests.
josfar
Participant
Posts: 5
Joined: Wed Jul 11, 2007 1:46 am

Post by josfar »

josfar wrote:
JoshGeorge wrote:Database table !

Names of the tables are ERR_cookie_1 and ERR_cookie_2. Cookie can be found on the terasync table. In that table, you may use start time and end time (integer) fields in order to find the last ones inserted and you cannot modify these error table names. This should lead you to formulate a solution you are looking for.
Thank you very much!!! i'll check the table and make some tests.
I've checked the table mentioned above and made some test. But the result is not as expected.
Here is the detail:
duplicated row be rejected , but this action was not logged into the terasync table. so there were no ERR Tables generated either.

the DataStage Director Log:

Teradata_Enterprise_8: Orchestrate Teradata Write Operator Summary
===========================================
Data Volume:
------------
Input rows: 5
Input bytes: 35
Input rows rejected: 0
Duplicate rows: 2
Field conversion errors: 0
Duplicate index errors: 0
Rows written: 3
Total rows rejected: 2

The terasync table record:
cookie: 285485663
query: test_1015
recordcount: 5.00
bytecount: 35.00
rejectedcount: 0.00
starttime: 1,192,449,135

Does anybody ever met this situation before? And could u explain in which Condition do the DS Stage log the reject info and generate error table?

Deep Thanks to George and all of u pay attention to my question !!!!
nishadkapadia
Charter Member
Charter Member
Posts: 47
Joined: Fri Mar 18, 2005 5:59 am

Post by nishadkapadia »

Duplicate rows will never be captured in any database table.
Since teradata enterprise stage uses native teradata fastload utility which
has this feature.
At the most, field Conversion errors and Duplicate Index errors will be captured in the error tables which is zero as per your log.
josfar
Participant
Posts: 5
Joined: Wed Jul 11, 2007 1:46 am

Post by josfar »

nishadkapadia wrote:Duplicate rows will never be captured in any database table.
Since teradata enterprise stage uses native teradata fastload utility which
has this feature.
At the most, field Conversion errors and Duplicate Index errors will be captured in the error tables which is zero as per your log.
Thank u! :)
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Funny, I hadn't noticed that U had responded.

The correct English spelling of the second person personal pronoun is "you".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
DS_FocusGroup
Premium Member
Premium Member
Posts: 197
Joined: Sun Jul 15, 2007 11:45 pm
Location: Prague

Post by DS_FocusGroup »

When using Teradata MLOAD stage there are two error tables created one for duplicate rows and other for duplicate index and other row erros, when using Enterprise stage the job do not create error table for duplicate rows, is there any way by which i can enable this option, if not any idea how to move duplicate rows into reject file.
JoshGeorge
Participant
Posts: 612
Joined: Thu May 03, 2007 4:59 am
Location: Melbourne

Post by JoshGeorge »

Pre-empt this situation. Use reject duplicate logic inside the DataStage job to move duplicate rows into reject file.
Joshy George
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
DS_FocusGroup
Premium Member
Premium Member
Posts: 197
Joined: Sun Jul 15, 2007 11:45 pm
Location: Prague

Post by DS_FocusGroup »

I am using Teradata Enterprise stage, Now suppose i have duplicate(complete row) in the incoming data than in the log
Teradata_Enterprise_8: Orchestrate Teradata Write Operator Summary
===========================================
Data Volume:
------------
Input rows:
Input bytes:
Input rows rejected:
Duplicate rows:
Field conversion errors:
Duplicate index errors:
Rows written:
Total rows rejected:

It will show the number of duplicates and put the row in ORCH_WORK_#####, instead of ERROR table, now suppose that we have 3 rows that is duplicate then it will report duplicate as 2 and put one row in the ORCH_WORK_#### table instead of 2 because the ORCH_WORK_### is a set table and it will also suppress duplicate, also in the terasync table it will not report the number of rejects if it is duplicate rows. Now suppose if i run the same job again with the same input data, now the data which don't have duplicate in the incoming data but have a matching row in the target table that will be shown as rejected rows and not duplicate rows. Now my questions are 1) Is it possible to make the ORCH_WORK_### as MULTISET? 2) Is it possible to specify a saparate database for these tables also the name of the error tables? 3) How will i know which target table errors belogs to ORCH_WORK_### becuase the number #### is not the cookie id that is present in the terasync table? 4) I can retrive this log if i use the commad dsjob -logsum ProjectName JobName eventId, but this event_id is different so can't hardcord it, but the Message_id is always the same for the log above is it possible to retirve this log easily through a routine or command so that i cam parse it and put it in a teradata table 5) Previous in some run i do get error tables like ERR_COOKIE_1 etc i tried different case but now it is not working do i have to enable some options? .Below are some test runs Target table has the first column as Unique Primary Index ============================================================================
RUN 1
============================================================================

INPUT DATA:
1 1
2 2
3 3
4 4
5 5
1 1
2 2
3 3
4 4
5 5
TARGET TABLE IS EMPTY



Teradata_Enterprise_121: Orchestrate Teradata Write Operator Summary
===========================================
Data Volume:
------------
Input rows: 10
Input bytes: 80
Input rows rejected: 0
Duplicate rows: 5
Field conversion errors: 0
Duplicate index errors: 0
Rows written: 5
Rows in staging table 5
Rows inserted from staging table: 5
Total rows rejected: 5
Cookie ID = 791891756
Terasync: 10,0 (10 input rows, 0 Rejected)
No ORCH WORK table Created


============================================================================
RUN 2
============================================================================

INPUT DATA:
1 1
2 2
3 3
4 4
5 5
1 1
2 2
3 3
4 4
5 5

TARGET TABLE:
1 1
2 2
3 3
4 4
5 5

Teradata_Enterprise_121: Orchestrate Teradata Write Operator Summary
===========================================
Data Volume:
------------
Input rows: 10
Input bytes: 80
Input rows rejected: 0
Duplicate rows: 5
Field conversion errors: 0
Duplicate index errors: 0
Rows written: 0
Rows in staging table 5
Rows inserted from staging table: 0
Total rows rejected: 10

Cookie ID = 1911042200
Terasync: 10,0 (10 input rows, 0 Rejected)
DD_WORK.ORCH_WORK_2f334f2c


============================================================================
RUN 3
============================================================================
INPUT DATA:
1 10
2 11
3 12
4 13
5 14

TARGET TABLE:
1 1
2 2
3 3
4 4
5 5


Teradata_Enterprise_121: Orchestrate Teradata Write Operator Summary
===========================================
Data Volume:
------------
Input rows: 5
Input bytes: 40
Input rows rejected: 0
Duplicate rows: 0
Field conversion errors: 0
Duplicate index errors: 0
Rows written: 0
Rows in staging table 5
Rows inserted from staging table: 0
Total rows rejected: 5


Cookie ID = 1313817672
Terasync: 5,0 (5 input rows, 0 Rejected)
DD_WORK.ORCH_WORK_4e4f4448
Last edited by DS_FocusGroup on Thu Nov 29, 2007 2:46 am, edited 1 time in total.
DS_FocusGroup
Premium Member
Premium Member
Posts: 197
Joined: Sun Jul 15, 2007 11:45 pm
Location: Prague

Post by DS_FocusGroup »

One other strange behavior is that now if in the incoming data only 1 row is having a problem then the whole input rows are rejected. From the log i see that first enterprise stage create a WORK table name WORK_ORCH and once data is populated in the work table it do
INSERT INTO TARGETTABLE SELECT * FROM WORKTABLE if there is a problem in any row the whole data is rejected becuase it is considered as one transaction. Did anyone had this problem previously.
throbinson
Charter Member
Charter Member
Posts: 299
Joined: Wed Nov 13, 2002 5:38 pm
Location: USA

Post by throbinson »

DS_Focus - Great questions. I can't answer most of them

Use WORKDB=<database> in the Additional Connections Options property of the Teradata Enterprise Stage and this will direct the work and err tables to the chosen database. I have not seen any way to control their naming however outside of a Server job.
The WORK_ORCH table being generated means you are doing an append which means no fastload into the target table. Fastload requires the target to be empty. The stage does a fastload into an empty work table and then select inserts the data into the target table.
DS_FocusGroup
Premium Member
Premium Member
Posts: 197
Joined: Sun Jul 15, 2007 11:45 pm
Location: Prague

Post by DS_FocusGroup »

throbinson thanks for your reply, here are what my finding are, as Enterprise stage is able to load data into a already loading table so it do this as mensioned by throbinson that it create a temp table name ORCH_WORK_####, As ORCH_WORK_### is empty so Enterprise stage load the data through fast load and create two table ERR_COOKIE_1(contraints etc), ERR_COOKIE_2(duplicate index). Then Datastage deletes the tables that are empty(means no error), and in last do the following INSERT INTO FINAL_TARGET SELECT * FROM ORCH_WORK_####. Now consider the following points
1) There are rows in ORCH_WORK_#### which is same as rows in the Target_Table (SET TABLE), no error will occur the duplicate will be suppresed automatically. ORCH_WORK_### will be delated after this.
2) There are rows in ORCH_WORK which has same PI index as Target_Table row. the insert select will fail and no row will be added into the target table due to this ORCH_WORK_### will not be dropped. I am thinking to do something like this now
INSERT INTO TARGET_TABLE SELECT * FROM ORCH_WORK_### WHERE PI_COL NOT IN (SELECT PI_COL FROM TARGET_TABLE); INSERT INTO ERR_COOKIE_2 SELECT * FROM ORCH_WORK_### WHERE PI_COL IN (SELECT PI_COL FROM TARGET_TABLE). The problem with the above is that i am not able to get the name of the worktable, if anyone has solved this before it will be great if the solution is posted.
Post Reply