This is more of a design question than a DS implementation one. Using a generated key from the beginning ensures that there are no duplicates and fewer "holes" in the key sequence.
I didn't say that these would occur, just that they could occur. If the initial load uses a different method from the recurring one, then there is a chance that the two methods won't stay synchronized (particularly after problems such as job aborts or retries with new data). If you only use one method, then you have removed a potential source of error.
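To make the point concrete, here is a minimal sketch in Python rather than DS. It assumes a single in-memory key generator shared by every load. The names `SurrogateKeyGenerator`, `key_for` and `load` are hypothetical and only illustrate the design, not how any particular DS job does it.

```python
class SurrogateKeyGenerator:
    """Hypothetical single source of generated keys for every load."""

    def __init__(self):
        self._next_key = 1
        self._assigned = {}  # natural key -> generated key

    def key_for(self, natural_key):
        # Reuse an existing key so a retried job cannot create a duplicate.
        if natural_key not in self._assigned:
            self._assigned[natural_key] = self._next_key
            self._next_key += 1
        return self._assigned[natural_key]


def load(rows, keygen):
    """Same routine for the initial load and every recurring load."""
    return {natural_key: keygen.key_for(natural_key) for natural_key in rows}


keygen = SurrogateKeyGenerator()
initial = load(["A", "B", "C"], keygen)  # initial load
retry = load(["B", "C", "D"], keygen)    # later run, or a retry with new data
```

Because both runs go through the same generator, rows seen before keep their keys and only the new row gets the next value, so there is nothing for two methods to disagree about.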