Returning fewer rows

Post questions here related to DataStage Server Edition, covering areas such as Server job design, DS BASIC, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

shivan
Participant
Posts: 70
Joined: Mon Jul 25, 2005 9:29 am

Returning fewer rows

Post by shivan »

Hi,
I have a job with a Transformer, a Sort and an Aggregator. When I run the job for 200,000 records, I only get 199,233. It says the job is finished. Why am I not getting the rest of the records?

thanks
shivan
pnchowdary
Participant
Posts: 232
Joined: Sat May 07, 2005 2:49 pm
Location: USA

Post by pnchowdary »

Hi Shivan,

Do you have any constraints in your Transformer stage? Also, check the row count on the input and output links of each stage to see where you are losing rows. My suspicion is that a constraint in the Transformer stage is rejecting some rows.
Thanks,
Naveen
shivan
Participant
Posts: 70
Joined: Mon Jul 25, 2005 9:29 am

Post by shivan »

I do have some constraints in the Transformer. It takes 200,000 records into the Transformer, but the Sort then only sends out 192,323 records. The job doesn't give any warning of row rejection. So how would I find out what the problem is?

shivan
pnchowdary wrote:Hi Shivan,

Do you have any constraints in your Transformer stage? Also, check the row count on the input and output links of each stage to see where you are losing rows. My suspicion is that a constraint in the Transformer stage is rejecting some rows.
pnchowdary
Participant
Posts: 232
Joined: Sat May 07, 2005 2:49 pm
Location: USA

Post by pnchowdary »

Hi Shivan,

You need to use the reject-file functionality of the Transformer. Use another file to store the rejects, and make sure you tick the reject row check box on the Transformer constraint. Then you will get a warning in the job log saying how many records were rejected.

In fact, it is good practice to always use reject files whenever you have any kind of constraint in the Transformer.
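
While you are setting that up, you can estimate in advance how many rows the reject link should catch by running the constraint logic against your source table. A rough sketch, assuming a hypothetical constraint of LI_TOTAL_AM > 0 (substitute your real table name and condition):

Code:

-- rows that would pass the hypothetical constraint
SELECT COUNT(*) FROM tablename WHERE LI_TOTAL_AM > 0;
-- rows the reject link should catch (remember that NULLs fail a constraint too)
SELECT COUNT(*) FROM tablename
WHERE NOT (LI_TOTAL_AM > 0) OR LI_TOTAL_AM IS NULL;

If the two counts add up to your 200,000 input rows and the second one matches your shortfall, the missing rows are simply constraint rejects.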
Thanks,
Naveen
shivan
Participant
Posts: 70
Joined: Mon Jul 25, 2005 9:29 am

Post by shivan »

I tried to run the job for 400,000 records. The errors I got are:

CopyOfCopyOfDB2_INVOICE_LINE_ITEM_RPT_part1..AGGREGATOR2.DB2UPDATE: DSP.Close Error -100 in $DSP.Abort.
CopyOfCopyOfDB2_INVOICE_LINE_ITEM_RPT_part1..AGGREGATOR2: [IBM][CLI Driver] CLI0106E Connection is closed. SQLSTATE=08003
SQLTransact: Error committing transaction.
Attempting to Cleanup after ABORT raised in stage CopyOfCopyOfDB2_INVOICE_LINE_ITEM_RPT_part1..AGGREGATOR2
CopyOfCopyOfDB2_INVOICE_LINE_ITEM_RPT_part1..AGGREGATOR2: INV_ID = 000401258032 LF_ID = DMS003 CL_MATTER_ID = OSOS0375Y LI_LEGAL_COST_IN = F LI_TASK_CD = L110 TK_ID = SSW LI_ACTIVITY_CD = A104 LI_TASK_DS = Initial review of transcripts of hearing from June 10, 2003, August 15, 2003 and May 22, 2003 LI_PHASE_DS = Initial review of transcripts of hearing from June 10, 2003, August 15, 2003 and May 22, 2003 LI_CHARGE_IN = U BRANCH_CD = 356 CASE_NO = 3641 SYMBOL = 1 LI_ORIG_AM = 22 LI_BASE_AM = 22 LI_TOTAL_AM = 22 LI_CHARGE_DT = 2004-01-15


shivan

shivan wrote:I do have some constraints in the Transformer. It takes 200,000 records into the Transformer, but the Sort then only sends out 192,323 records. The job doesn't give any warning of row rejection. So how would I find out what the problem is?

shivan
pnchowdary wrote:Hi Shivan,

Do you have any constraints in your Transformer stage? Also, check the row count on the input and output links of each stage to see where you are losing rows. My suspicion is that a constraint in the Transformer stage is rejecting some rows.
pnchowdary
Participant
Posts: 232
Joined: Sat May 07, 2005 2:49 pm
Location: USA

Post by pnchowdary »

Hi Shivan,

Well, that's a different error. It would be nice if you started a new thread, since it is a different issue and it seems your initial problem is solved.

Anyway, it looks like your error is in the Aggregator stage.

Make sure that

1) You are sending sorted data to the Aggregator
2) You are using the same sort keys and sort order in the Aggregator as in the preceding Sort stage (see the sketch below)
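
Together, the Sort and the Aggregator should behave like a single grouped query. A sketch of the idea, borrowing INV_ID and LI_TOTAL_AM from your log output as stand-ins for your real grouping key and amount column:

Code:

SELECT INV_ID, SUM(LI_TOTAL_AM) AS TOTAL_AM
FROM tablename
GROUP BY INV_ID
ORDER BY INV_ID;

If the Aggregator is told its input is sorted on keys the data is not actually ordered by, the groups will come out wrong.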
Thanks,
Naveen
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Wait a second - SQL Commit error? Looks like it is not a sort or aggregate problem but perhaps a timeout or transaction size issue. How long is this process taking?
shivan
Participant
Posts: 70
Joined: Mon Jul 25, 2005 9:29 am

Post by shivan »

There are 1.7 million records. It works fine with 200,000 records, sometimes less, sometimes more. The speed is about 500 rows per second. Then it sorts the data again in the Sort stage, which roughly doubles the run time.

shivan
ArndW wrote:Wait a second - SQL Commit error? Looks like it is not a sort or aggregate problem but perhaps a timeout or transaction size issue. How long is this process taking?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Can you write the output of the sort/aggregate to a sequential file and then link from the sequential file to your database write? I think the database won't get opened (and therefore no timeout can occur) until the write to the sequential file finishes. If you have the disk space, just shove a Sequential File stage into your job and check it out.
shivan
Participant
Posts: 70
Joined: Mon Jul 25, 2005 9:29 am

Post by shivan »

Hi,
the problem is that in the Sequential File stage I have to give an output file name. If I give an output link name there, I get an error:
CopyOfCopyOfDB2_INVOICE_LINE_ITEM_RPT_part1..DB2_UDB_API_31.IDENT1: |CopyOfCopyOfDB2_INVOICE_LINE_ITEM_RPT_part1..Sequential_File_132.INPUT_AGGR: DSD.SEQOpen Failed to open INPUT_AGGR
STATUS = -1.|

shivan
ArndW wrote:Can you write the output of the sort/aggregate to a sequential file and then link from the sequential file to your database write? I think the database won't get opened (and therefore no timeout can occur) until the write to the sequential file finishes. If you have the disk space, just shove a Sequential File stage into your job and check it out.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Re: Returning fewer rows

Post by ray.wurlod »

shivan wrote:Hi,
I have a job with a Transformer, a Sort and an Aggregator. When I run the job for 200,000 records, I only get 199,233. It says the job is finished. Why am I not getting the rest of the records?

thanks
shivan
You expect to get fewer rows out of an Aggregator than went in. That's the whole point of grouping!
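
A tiny made-up illustration: suppose column A holds the five values 1, 1, 2, 3, 3. Then

Code:

SELECT A, COUNT(*) FROM tablename GROUP BY A;

returns exactly three rows, one per distinct key, no matter how many rows went in. Your 199,233 output rows just mean there are 199,233 distinct combinations of your grouping keys among the 200,000 input rows.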
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
shivan
Participant
Posts: 70
Joined: Mon Jul 25, 2005 9:29 am

Re: Returning fewer rows

Post by shivan »

So does it exclude the duplicates? Can you explain to me why we get fewer rows?

shivan
ray.wurlod wrote:
shivan wrote:Hi,
I have a job with a Transformer, a Sort and an Aggregator. When I run the job for 200,000 records, I only get 199,233. It says the job is finished. Why am I not getting the rest of the records?

thanks
shivan
You expect to get fewer rows out of an Aggregator than went in. That's the whole point of grouping!
ukyrvd
Premium Member
Posts: 73
Joined: Thu Feb 10, 2005 10:59 am

Re: Returning fewer rows

Post by ukyrvd »

shivan wrote:So does it exclude the duplicates? Can you explain to me why we get fewer rows?

shivan
To answer that question, one would need to know what exactly you are doing in the Aggregator.

It seems that you have multiple problems, one being that the connection to the DB itself is getting closed too soon.
thank you
- prasad
shivan
Participant
Posts: 70
Joined: Mon Jul 25, 2005 9:29 am

Re: Returning fewer rows

Post by shivan »

In the Aggregator I am doing grouping and a sum.
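
In other words, it is doing the equivalent of this grouped query (a sketch; INV_ID and LI_ORIG_AM are just stand-ins, taken from the log above, for my real grouping key and amount column):

Code:

SELECT INV_ID, SUM(LI_ORIG_AM) AS LI_ORIG_AM_SUM
FROM tablename
GROUP BY INV_ID;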

shivan
ukyrvd wrote:
shivan wrote:So does it exclude the duplicates? Can you explain to me why we get fewer rows?

shivan
To answer that question, one would need to know what exactly you are doing in the Aggregator.

It seems that you have multiple problems, one being that the connection to the DB itself is getting closed too soon.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Try these queries against the database.

Code:

SELECT COUNT(*) FROM tablename;
SELECT COUNT(*) FROM (SELECT A, COUNT(*) AS CNT FROM tablename GROUP BY A) AS T;
What do these tell you about grouping?
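
(Hint: if the first count is 200,000 and the second is 199,233, the 767-row difference is exactly the number of input rows that share a grouping key with an earlier row, and the Aggregator is doing precisely what it should.)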
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.