Hi,
I am creating a dataset using OVERWRITE in one job and, in a successor job, APPENDing to the same dataset. I have RCP turned off and the metadata definitions are identical (a colleague and I have both checked). The schema definitions in Data Set Management also match.
I run the sequence of Job 1 and Job 2 twice for separate business areas, so 2 datasets are created to be merged by Job 3:
Job 1 (creates dataset OVERWRITE)
|
Job 2 (appends to same dataset, schema / metadata identical to Job 1)
|
Job 3 (merges the 2 identically formatted datasets produced by Job 1 and Job 2 into 1 using a Funnel stage. RCP is switched on so the job can be re-used, as we have 14 different subject areas to do this for. No metadata is included within the column definitions; I have checked.)
Job 3 - the APPENDed dataset is then used as input to a Funnel stage, where the 2 identically defined datasets are merged into 1. I am running this from a sequence and using RCP.
I am doing this for 14 subject areas.
6 work fine where I am not APPENDing: for datasets that are just written using OVERWRITE with no need to APPEND (so no Job 2), the 3rd job using RCP to funnel the results of 2 datasets into 1 works fine.
However, if the input datasets to Job 3 are APPENDed ones, the RCP Funnel job just hangs after reading a small number of records (a couple of hundred).
I tried re-running the hanging job after re-running Job 1 (OVERWRITE) and omitting Job 2, and it works fine, so it is definitely something to do with APPENDing to the dataset.
1) I have checked the metadata and the schema definitions within Data Set Management and they are identical, both between business areas and between the OVERWRITE and APPEND jobs (Job 1 and Job 2).
2) There are no error messages in the log. The last log entry shows:
main_program: Starting step execution
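For anyone wanting to repeat the schema check programmatically: export the schema text for each dataset (from Data Set Management, or a descriptor dump if your version provides one) and diff the dumps. A minimal sketch; the `record (...)` string below is just an illustration, not our real schema:

```python
# Diff two exported schema dumps, ignoring blank lines and
# leading/trailing whitespace so only real differences show up.
import difflib

def schema_diff(text_a, text_b):
    a = [ln.strip() for ln in text_a.splitlines() if ln.strip()]
    b = [ln.strip() for ln in text_b.splitlines() if ln.strip()]
    return list(difflib.unified_diff(a, b, lineterm=""))

# Identical schemas yield an empty diff.
schema = "record ( id: int32; name: string[10]; )"
print(schema_diff(schema, schema))  # → []
```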
Has anyone else come across this issue? I can get around it by writing to 2 separate datasets and merging them instead of using OVERWRITE and APPEND, but APPEND seems so basic for a dataset that I can't understand why it is causing an issue.
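To make the workaround concrete, the pattern amounts to this (a toy sketch in Python, with plain text files standing in for datasets; nothing here is real DataStage API):

```python
# Plain files stand in for DataStage datasets; paths are illustrative.

def overwrite(path, rows):
    # Job 1: OVERWRITE - recreate the dataset from scratch.
    with open(path, "w") as f:
        f.writelines(r + "\n" for r in rows)

def append(path, rows):
    # Job 2: APPEND - add rows to the existing dataset.
    with open(path, "a") as f:
        f.writelines(r + "\n" for r in rows)

def funnel(paths):
    # Job 3: Funnel - concatenate identically structured inputs.
    merged = []
    for p in paths:
        with open(p) as f:
            merged.extend(line.rstrip("\n") for line in f)
    return merged
```

With the workaround, Job 2 writes its own separate dataset and Job 3 simply funnels 4 inputs instead of 2, so the append step disappears entirely while the merged output stays the same.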
Many Thanks,
Daren
Error when using Dataset that is being appended to
In addition to the above, I amended Job 2 to write to a new (separate) dataset from Job 1, and then amended Job 3 to funnel these using RCP, and it works OK.
This proves that the metadata / schema from Job 1 and Job 2 are identical.
The APPEND to the dataset is obviously causing some sort of issue, but I kinda knew that ;)
Just wanted to add that :)
Daren
Daren,
Not sure why you require 3 jobs. You could have 1 job with the append option run as multiple instances, since all the data ends up in 1 dataset in the 3rd job anyway.
Thanks
Sanjay
sanjay wrote: Not sure why you require 3 jobs; you could have 1 job with the append option run as multiple instances, since all the data ends up in 1 dataset in the 3rd job.

The data is populated into the identical target structure from separate sources, hence the 2 separate jobs. You also cannot use a dataset as both input and output in the same job, so that's why the funnel job is a 3rd (as well as it being generic and using RCP).
Daren,
I really don't understand the statement "You also cannot use a Dataset as input and Output in the same job". You can use a dataset as both input and output in the same job.
Regards
Sanjay