This is one of the interveiw question.
Source tables/files does not contain any key field.
How to implement incremental load in the target tables?
I was guesing using unix shell script commands like compare and other commmands, it be can be possible.
Any other tips/solution would be a great help.
Thanks in advance
Regards,
Rafiq
Incremental load with no key field at source
Moderators: chulett, rschirm, roy
Re: Incremental load with no key field at source
The "key" to understanding whether a incoming record is already present or not are the "key columns". When you don't have any, what it necessarily means is that 1>no primary keys are specified in the table or 2>there are actually no keys. Option 1 gives you the alternate or natural keys based on which you can do a change capture in datastage. Option 2 tells you that all the columns are a key, so you do a change capture on all columns as keys.rafik2k wrote:This is one of the interveiw question.
Source tables/files does not contain any key field.
How to implement incremental load in the target tables?
I was guesing using unix shell script commands like compare and other commmands, it be can be possible.
Any other tips/solution would be a great help.
Thanks in advance
Regards,
Rafiq
In a practical situation, I cannot think of a Logical Datamodel where you have a table which does not have any keys and for which you want to write incremental load ETL. Probably when you find one, you should bust your data modeller.
It took me fifteen years to discover I had no talent for ETL, but I couldn't give it up because by that time I was too famous.
Without a key what identifies the records to compare from one run to the next?
If there is no natural key in the source field then all you can test for is whether their is already a record in the target that is exactly the same, in which case all fields are key fields???
Or is this a trick question? was there a modified time field in the source?
If there is no natural key in the source field then all you can test for is whether their is already a record in the target that is exactly the same, in which case all fields are key fields???
Or is this a trick question? was there a modified time field in the source?
Regards,
Nick.
Nick.
-
- Premium Member
- Posts: 385
- Joined: Wed Jun 16, 2004 12:43 pm
- Location: Virginia, USA
- Contact:
If you trust your load portion of ETL, you can compare yesterday's full load file with today's full load file to identify inserts, updates, and deletes.
Look at the unix "comm" command to compare your load files.
Look at the unix "comm" command to compare your load files.
Chuck Smith
www.anotheritco.com
www.anotheritco.com
Thanks for your tipschucksmith wrote:If you trust your load portion of ETL, you can compare yesterday's full load file with today's full load file to identify inserts, updates, and deletes.
Look at the unix "comm" command to compare your load files.
Though I can indentify changed records using unix utilities, but how can I update or deletes these recorods at target database without any key
I also believe that it is trick question.kduke wrote:It is a trick question. An incremental load is based on only selecting records in the source which have changed. You can do that without keys. All you need is last updated field in the source.
What to update in the target may require keys.
That means if we have last updated field in the source then we can insert fresh records only.
We can't assume all fields are key fields as some field even may contain null values.
So updating existing records in target not possible without key.