Load multiple tables

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
snt_ds
Premium Member
Premium Member
Posts: 280
Joined: Wed Oct 18, 2006 11:53 am
Location: Pune, India

Load multiple tables

Post by snt_ds »

Hi All,

I have a dataset which looks like the following
Name Value
A 1
B 2
A 45
A 23
C 17
B 12
C 32
A 21
C 65


I need to load A,B and C table. This three table has the same structure.
Please find the A table structure with the data.

Sno Value create_dt end_dt current_flag
1 1 1/31/2005 1/1/9999 Y
2 45 2/1/2005 1/1/9999 Y
3 23 2/2/2005 1/1/9999 Y
4 21 2/3/2005 1/1/9999 Y

I need to create the multiple instance which will read the dataset and pick the Name as invocation id and load the respective table with its values.
This job should populate 3 tables A,B and C

Dataset-------->xfm--------------->target table(A,B,C)

Please let me know is this possible with multiple instance job.

Thanks
Suri[/code]
sud
Premium Member
Premium Member
Posts: 366
Joined: Fri Dec 02, 2005 5:00 am
Location: Here I Am

Re: Load multiple tables

Post by sud »

The approach you have outlined is correct and will work. Parametrize the table name and that's it !!

In situations where say, your table structures(columns) vary, even then you can parametrize the insert/update statements in the target table stage and make do with just a single job. :D
It took me fifteen years to discover I had no talent for ETL, but I couldn't give it up because by that time I was too famous.
csrazdan
Participant
Posts: 127
Joined: Wed May 12, 2004 6:03 pm
Location: Chicago IL

Post by csrazdan »

Is the number of tables fixed, that is 3. If it is why are you not thinking of following job design.

Code: Select all

                          __________________ Target table A  
                         |
                         |
Dataset ------->xfm ---------------------> Target table B 
                         |
                         |__________________ Target table C
Assume everything I say or do is positive
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

As sud already noted, stick to your design. Make the job multiple instance and send the table name as a parameter. Within the transformer use the instance name to filter the records and send the correct records for a particular table.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
csrazdan
Participant
Posts: 127
Joined: Wed May 12, 2004 6:03 pm
Location: Chicago IL

Post by csrazdan »

DSguru2B - Can you elaborate why you think multiple instance job is right way. Don't you think each instance of job is reading the entite DataSet each time?
Assume everything I say or do is positive
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Read is fine, very fast too. A multiple instance job can keep it scalable. Tomorrow if a fourth or even fifth table comes in, no job design needs to be changed. The same can work for x number of tables.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
csrazdan
Participant
Posts: 127
Joined: Wed May 12, 2004 6:03 pm
Location: Chicago IL

Post by csrazdan »

That is what my question was: Is your number of tables fixed?
Assume everything I say or do is positive
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

For now it maybe. As for the future, no one knows. The best design would be to handle that with minimal to no change to the present design. Thats why I advised a multi-instance job. Your method will work too.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
snt_ds
Premium Member
Premium Member
Posts: 280
Joined: Wed Oct 18, 2006 11:53 am
Location: Pune, India

Post by snt_ds »

Hi All,


Thanks a lot for all your suggestions.
I'm using the multiple instance job to load multiple dimensions.

Now I need to extract data from source using parameter table to get the MAX_DATE and MAX_SEQ_NO.
Extract SQL looks like below:

SELECT
*
FROM
#$SCHEMA_NAME#.#SrcTblNm#
WHERE ATTRIBUTE_NAME = '#DSJobInvocationId#'
AND Update_DT >= (select max_date from PARAM_TABLE where Tgt_Table = '#DSJobInvocationId#')

I have PARAM_TABLE from where I get the date and sequence number of the jobs last run.
My job will pick date and sequence number from parameter table and there is a sequence generator in the job which will add one to sequence number and generate the sequence number for all the records for that run.

Here max_date is VARCHAR in PARAM_TABLE.
Update_DT column is date column.

The SQL should look like below but the problem is with TO_DATE function where I give the first arugument in qutotaion.

SELECT
*
FROM
#$SCHEMA_NAME#.#SrcTblNm#
WHERE ATTRIBUTE_NAME = '#DSJobInvocationId#'
AND Update_DT >= TO_DATE('(select max_date from PARAM_TABLE where Tgt_Table = "#DSJobInvocationId#")','DD-MON-YYYY')

So I need to do type conversion on Max_date using to_date function.
Can someone help me how to resolve the above problem in the sql..

Thanks
Suri
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You are converting something that is already a DATE to a DATE. Simplify your SQL:

Code: Select all

SELECT * 
FROM #$SCHEMA_NAME#.#SrcTblNm# 
WHERE ATTRIBUTE_NAME = '#DSJobInvocationId#' 
AND Update_DT >= (select max_date from PARAM_TABLE where Tgt_Table = '#DSJobInvocationId#')
And you need to use single quotes, not doubles.
-craig

"You can never have too many knives" -- Logan Nine Fingers
snt_ds
Premium Member
Premium Member
Posts: 280
Joined: Wed Oct 18, 2006 11:53 am
Location: Pune, India

Post by snt_ds »

Hi

max_date is VARCHAR in PARAM_TABLE.


Thanks
Suri
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Why? :? Ok...

Code: Select all

SELECT * 
FROM #$SCHEMA_NAME#.#SrcTblNm# 
WHERE ATTRIBUTE_NAME = '#DSJobInvocationId#' 
AND Update_DT >= (select TO_DATE(max_date,'DD-MON-YYYY') from PARAM_TABLE where Tgt_Table = '#DSJobInvocationId#')
I'm assuming the date mask is correct from your previous post, you'd need to verify.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply