Surrogate Key - wrong generation
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 151
- Joined: Fri Feb 13, 2009 4:19 pm
Surrogate Key - wrong generation
Hi,
I have a Server job that extracts data from a sequential file and loads it into an Oracle table using the Oracle bulk loader. The text file contains 3.8 million records in total. During the load we generate a surrogate key in the transformer using a stage variable that adds 1 to the previous value.
Now the problem: the job loads 2.5 million records without any issue, but from the 2,500,001st record onward it keeps generating the same key value, 2,500,001.
For example, say I have 10 records in total: the job generates values 1, 2, 3, 4, 5 correctly, but for records 6 through 10 it repeats the last value, so I get 1, 2, 3, 4, 5, 5, 5, 5, 5, 5.
Also, I'm not able to check the logs in Director; when I open the log, the screen freezes.
Can anybody help me out with this?
Thanks
Karthick
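For readers unfamiliar with Server-job stage variables, the counter logic described above amounts to the following. This is only a minimal Python sketch of the technique, not DataStage code; the starting value is assumed to be 0 here, though in the real job it would come from the key source:

```python
def surrogate_keys(start=0):
    """Emulate a transformer stage variable that adds 1 to the
    previous value on every row, yielding one key per record."""
    key = start
    while True:
        key += 1
        yield key

# Generate keys for the first five rows.
keys = surrogate_keys()
first_five = [next(keys) for _ in range(5)]
```

When this works correctly, every row receives a distinct, strictly increasing key; the bug described above is that the sequence stops advancing partway through the load.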
For starters, your log issue is going to be from the zillions of warnings your failed sqlldr load is slamming into the log's hashed file. Best to just clear it. Do you know the job number for this job? From the Administrator, issue a CLEAR.FILE RT_LOGnnn where nnn is the job number in question.
If you aren't sure, from memory the query to get the number would be:
SELECT JOBNO FROM DS_JOBS WHERE NAME = 'YourJobName';
What data type is your surrogate key field? The target, not any stage variable - Server stage variables are not typed.
-craig
"You can never have too many knives" -- Logan Nine Fingers
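The data-type question above matters because a counter can silently stop incrementing when the key passes through a type without enough integer precision. This is only an illustration of the general effect, not a confirmed diagnosis of this job: Python floats are IEEE-754 doubles with a 53-bit significand, so above 2**53 adding 1 rounds straight back to the same value:

```python
# IEEE-754 doubles cannot represent consecutive integers above
# 2**53, so "previous value + 1" silently yields the same value.
counter = float(2**53)
bumped = counter + 1.0
stuck = (bumped == counter)  # True: the increment is lost
```

A key stored in a narrower type (or a target column with too few digits) hits the same wall sooner, which is exactly the plateau symptom described in the original post.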
-
- Premium Member
- Posts: 151
- Joined: Fri Feb 13, 2009 4:19 pm
I used the following to find my job number:
LIST DS_JOBS JOBNO WITH NAME = CustLdCup2OdsCustRef
and this gave me the number 2235.
So my command to clear the log should be:
CLEAR.FILE RT_LOG2235; please correct me if I'm wrong.
Or can I create a copy of this job and try loading with it in the meanwhile?
Thanks
Karthick
-
- Premium Member
- Posts: 151
- Joined: Fri Feb 13, 2009 4:19 pm
Hi chulett,
That was a good idea. I tried to verify the issue, but I really couldn't understand why the key was not being generated properly. However, I found a workaround to get the load done. These are the steps I followed:
1) Increased the rows per transaction to a higher number (initially it was 0).
2) When the job aborted at the 2.5 millionth row, I manually updated the surrogate key table to hold the next key.
3) Divided the staging file into two halves: the first half contains the data that was already loaded and the second half contains the new records.
4) Used the second half to load the remaining records, with corrected keys.
After this the table loaded with no issues. If you have any suggestions, please share them with me.
Thanks
Karthick
-
- Premium Member
- Posts: 151
- Joined: Fri Feb 13, 2009 4:19 pm
Hi,
I'm trying to analyze why this problem exists, and only today was I able to get at the Director log, which says:
Code: Select all
Read caching disabled, file size exceeds cache size
I searched the forum for replies and read in a post that I could increase the "Read Cache Size" (mine was 128 MB), but the same post says that increasing the size might cause other adverse issues, so I'm afraid to do that. I also found this manual online:
http://71.18.93.139/uploads/upload_book ... dskche.pdf
I read it but got a little confused about what I should actually do.
Can anybody help me with this? I also checked the size of the hashed file; the DATA.30 file is only 16 MB.
Note: I'm reading and updating the same hashed file. Any help or suggestions would be a great help.
Thanks
Karthick
Karthick
A couple of points. First, that message is not an error or even a warning, it is strictly informational and means you told it to cache a hashed file lookup and it couldn't fit into memory. It simply reads it from disk in that case as if you hadn't asked it to be cached. Not a problem. And increasing the size can cause other problems so I would not necessarily rush to change that.
When you read from and write to the same hashed file in a single transformer you need to not cache the lookup. Either that or cache and 'lock for update'. Is this hashed file the source for your surrogate key? If so this could be part of your issue.
-craig
"You can never have too many knives" -- Logan Nine Fingers
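The "lock for update" point above can be illustrated outside DataStage: when a key source is read, incremented, and written back, the read and the write must happen as one atomic step, or two consumers can be handed the same key. A minimal Python analogy (not DataStage code) using a lock around the read-and-update:

```python
import threading

class KeySource:
    """A shared next-key counter. The read and the write of the
    current value happen inside one lock, so no key is issued twice."""
    def __init__(self, start=0):
        self._next = start
        self._lock = threading.Lock()

    def next_key(self):
        with self._lock:  # read-and-update as one atomic step
            self._next += 1
            return self._next

# Four concurrent consumers each draw 1000 keys from one source.
source = KeySource()
issued = []

def worker():
    for _ in range(1000):
        issued.append(source.next_key())

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

unique = len(set(issued))  # all 4000 keys distinct, none duplicated
```

Without the lock (the analogue of a cached, unlocked lookup against the same hashed file being written), two consumers could read the same "current" value and both produce the same next key.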