Compliation Error in lookup

DSDexter · Post by **DSDexter** » Wed Sep 15, 2010 3:56 am

Hi Gurus,

I am getting the following error when i am looking up data from a oracle table.


Output from transformer compilation follows:

##I TFCN 000001 05:50:42(000) <main_program> 
Ascential DataStage(tm) Enterprise Edition 7.5.2
Copyright (c) 2004, 1997-2004 Ascential Software Corporation.
All Rights Reserved


##I TOSH 000002 05:50:42(001) <main_program> orchgeneral: loaded
##I TOSH 000002 05:50:42(002) <main_program> orchsort: loaded
##I TOSH 000002 05:50:42(003) <main_program> orchstats: loaded
##I TFSC 000001 05:50:42(006) <main_program> APT configuration file: /appl/datastage/Ascential/DataStage/Configurations/default.apt
##W TCOS 000049 05:50:42(007) <main_program> Parameter specified but not used in flow: DSPXWorkingDir
##W TFCP 000000 05:50:43(000) <transform> Error when checking composite operator: The number of reject datasets "0" is less than the number of input datasets "1".
##E TFCP 000039 05:50:43(001) <transform> Error when checking composite operator: This field was not found: lFctAttndeesRkey; [line 57, character 23].
##E TFSR 000019 05:50:43(002) <main_program> Could not check all operators because of previous error(s)
##E TCOS 000029 05:50:43(003) <main_program> Creation of a step finished with status = FAILED. (ActCallIdsFctCallReprocessLoad.lkpDeriveSkeys)

*** Internal Generated Transformer Code follows:
0001: //
0002: // Generated file to implement the V0S60_repos_lkpDeriveSkeys transform operator.
0003: //
0004: 
0005: // define our input/output link names
0006: inputname 0 lToFactCallRkeyDset;
0007: tablename 0 lCustSkey;
0008: tablename 1 lTerrId;
0009: tablename 2 lFullDate;
0010: tablename 3 lEmpId;
0011: tablename 4 lFctAttndeesRkey;
0012: outputname 0 lToLkup;
0013: 
0014: initialize {
0015: 	// define our row rejected variable
0016: 	int8 RowRejected0;
0017: 
0018: 	// define our null set variable
0019: 	int8 NullSetVar0;
0020: 
0021: 	// define our force_error call variables
0022: 	string FErrorMsg0;
0023: 	string FErrorLink0;
0024: 
0025: }
0026: 
0027: mainloop {
0028: 	// initialise our row rejected variable
0029: 	RowRejected0 = 1;
0030: 
0031: 	// evaluate condition for link: lCustSkey
0032: 	if (notnull(lToFactCallRkeyDset.ACCOUNT_ID))
0033: 	{
0034: 		lookup(lCustSkey);
0035: 	}
0036: 	do
0037: 	{
0038: 		// no condition for lookup link: lTerrId
0039: 		lookup(lTerrId);
0040: 		// no condition for lookup link: lFullDate
0041: 		lookup(lFullDate);
0042: 		// no condition for lookup link: lEmpId
0043: 		lookup(lEmpId);
0044: 		// no condition for lookup link: lFctAttndeesRkey
0045: 		lookup(lFctAttndeesRkey);
0046: 		if (!is_match(lFctAttndeesRkey)) {
0047: 			FErrorMsg0 = "Job aborting due to lookup failure on link: ";
0048: 			FErrorLink0 = "lFctAttndeesRkey";
0049: 			force_error(FErrorMsg0 + FErrorLink0);
0050: 		};
0051: 		// evaluate columns (no constraints) for link: lToLkup
0052: 		lToLkup.CUST_SKEY = lCustSkey.CUST_SKEY;
0053: 		lToLkup.TERR_SKEY = lTerrId.TERR_SKEY;
0054: 		lToLkup.DAY_SKEY = lFullDate.DAY_SKEY;
0055: 		lToLkup.EMP_SKEY = lEmpId.EMP_SKEY;
0056: 		lToLkup.DATA_SRCE_SKEY = lToFactCallRkeyDset.DATA_SRCE_SKEY;
0057: 		lToLkup.CONTACT_ID = lFctAttndeesRkey.CONTACT_ID;
0058: 		lToLkup.EMP_ID = lToFactCallRkeyDset.EMP_NUM;
0059: 		lToLkup.CREATED_BY = lToFactCallRkeyDset.CREATEDBY;
0060: 		lToLkup.LAST_UPD_BY = lToFactCallRkeyDset.LASTUPDATE_BY;
0061: 		lToLkup.LAST_UPD_DATE = lToFactCallRkeyDset.LASTUPDATE_DATE;
0062: 		lToLkup.NUM_RESIDENTS_SEEN = lToFactCallRkeyDset.NUM_RESIDENTS_SEEN;
0063: 		lToLkup.NUM_FELLOWS_SEEN = lToFactCallRkeyDset.NUM_FELLOWS_SEEN;
0064: 		lToLkup.NUM_NURSES_SEEN = lToFactCallRkeyDset.NUM_NURSES_SEEN;
0065: 		lToLkup.NUM_OTHERS_SEEN = lToFactCallRkeyDset.NUM_OTHERS_SEEN;
0066: 		lToLkup.SYNC_DT = lToFactCallRkeyDset.SYNC_DATE;
0067: 		writerecord 0;
0068: 		RowRejected0 = 0;
0069: 		next_match(lCustSkey);
0070: 	} while (is_match(lCustSkey));
0071: }
0072: 
0073: finish {
0074: }
0075: 
*** End of Internal Generated Transformer Code

I searched a lot in the forum but guess everyone has faced this issue in the transformer stage. But i am facing in the lookup stage.

ray.wurlod · Post by **ray.wurlod** » Wed Sep 15, 2010 4:28 am

Range lookups compile as transform operators. Therefore you really are looking at a "Transformer" stage. Now see if you can figure out the "field not found".

PhilHibbs · Post by **PhilHibbs** » Wed Sep 15, 2010 4:31 am

Error when checking composite operator: The number of reject datasets "0" is less than the number of input datasets "1".

This implies that you have a "reject" option selected on a lookup link, but don't have a rejects link. Double-click on the Condition in the Lookup stage, and check if the "Lookup Failure" option is "Reject", if so then you need a reject link.

Nagin · Post by **Nagin** » Mon Dec 06, 2010 3:46 pm

ray.wurlod wrote:Range lookups compile as transform operators. Therefore you really are looking at a "Transformer" stage. Now see if you can figure out the "field not found".

This is interesting.

So, If Range look ups compile as transform operators. How is the performance going to differ for a regular lookup versus a range look up?

Sounds like the performance should be better.

I need to perform Range look ups on these huge 10+ million row datasets.

If I just needed a look up I would have gone with Merge stage because of the performance. But I need Range look ups. So, looks like I am stuck with Look up stage.

Any ideas as of how I can do Range look ups and still get good performance out of the job?

ray.wurlod · Post by **ray.wurlod** » Mon Dec 06, 2010 4:08 pm

It's still a requirement that the reference data set can fit in memory. Does your 10 million+ row reference data set fit in memory?

Can you push the join back into the database, with a BETWEEN condition in the join specification?

Nagin · Post by **Nagin** » Mon Dec 06, 2010 4:29 pm

ray.wurlod wrote:It's still a requirement that the reference data set can fit in memory. Does your 10 million+ row reference data set fit in memory?

Can you push the join back into the database, with a BETWEEN condition in the join specification?

Well, the 10 million+ row reference data set is a stream input and I am looking up each of these rows against 10 other rows to see if they fit in the range.

Do you think its not a problem because the large data is stream input? Also, since the reference data is so small should I just go with sparse lookup?

I prefer to do it in Data Stage versus the database because of parallelism.

ray.wurlod · Post by **ray.wurlod** » Mon Dec 06, 2010 4:41 pm

With only 10 rows in the reference data set, this is a perfect fit to use a Lookup stage (unless the rows are each 1M in size!)