Slowly Changing Dimension and valid date

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
splayer
Charter Member
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Slowly Changing Dimension and valid date

Post by splayer »

I have a job with a SCD stage in the middle receiving input from a source Ora stage and a lookup Ora stage. I am trying to populate a dimension table and to start off with, the dimension table is empty. Obviously, the lookup from the dimension table returns nothing. The link from the lookup Ora stage to the SCD stage is named lkp_link.

I have a date column, Eff_Date, which I am using as Effective Date (Type 2) purpose. The derivation for this column on Output tab's Dim Update column is:

if IsNull(lkp_Link.Eff_Date) Then
CurrentTimestamp()
Else
if IsValid('Date',lkp_Link.Eff_Date) Then
lkp_Link.Eff_Date
Else
CurrentTimestamp()

My datatype for this column on the Columns tab on the Output tab is Timestamp and on the Columns tab in the Input tab is also Timestamp (which is ready-only).

No matter what I do, I keep getting the error:

SCDOne,0: Data string '**********' does not match format '%yyyy-%mm-%dd %hh:%nn:%ss': an integer was expected to match tag %yyyy.

Any ideas?
jgajardo
Premium Member
Premium Member
Posts: 15
Joined: Thu Sep 23, 2004 9:16 am
Contact:

Re: Slowly Changing Dimension and valid date

Post by jgajardo »

splayer wrote:

Code: Select all

if IsNull(lkp_Link.Eff_Date) Then 
   CurrentTimestamp() 
Else 
    if IsValid('Date',lkp_Link.Eff_Date) Then 
       lkp_Link.Eff_Date 
    Else 
       CurrentTimestamp() 
My datatype for this column on the Columns tab on the Output tab is Timestamp and on the Columns tab in the Input tab is also Timestamp
You seem to be validating that the lkp_Link.Eff_Date is a date.

I see 2 potential problems here:
1) If you want to validate, you should be validating that it's a timestamp and not a date.
But, since your column is already a TimeStamp this is not required.

2) Why is your link called lkp_Link for the effective date?
Shouldn't you get the effective date from your source or a parameter?

You should have something like this instead:

Code: Select all

if IsNull(INPUT_LINK.Eff_Date) Then 
   CurrentTimestamp() 
Else 
    INPUT_LINK.Eff_Date
Incase you haven't already done so,... Check this out :
"IBM InfoSphere DataStage Data Flow and Job Design"
http://www.redbooks.ibm.com/redbooks/pdfs/sg247576.pdf
Section 2.14 Slowly Changing Dimension (pages 113 to 126).
Julio E. Gajardo

Life is what happens to you while your busy making other plans - John Lennon
jgajardo
Premium Member
Premium Member
Posts: 15
Joined: Thu Sep 23, 2004 9:16 am
Contact:

Re: Slowly Changing Dimension and valid date

Post by jgajardo »

splayer wrote:

Code: Select all

if IsNull(lkp_Link.Eff_Date) Then 
   CurrentTimestamp() 
Else 
    if IsValid('Date',lkp_Link.Eff_Date) Then 
       lkp_Link.Eff_Date 
    Else 
       CurrentTimestamp() 
My datatype for this column on the Columns tab on the Output tab is Timestamp and on the Columns tab in the Input tab is also Timestamp
You seem to be validating that the lkp_Link.Eff_Date is a date.

I see 2 potential problems here:
1) If you want to validate, you should be validating that it's a timestamp and not a date.
But, since your column is already a TimeStamp this is not required.

2) Why is your link called lkp_Link for the effective date?
Shouldn't you get the effective date from your source or a parameter?

You should have something like this instead:

Code: Select all

if IsNull(INPUT_LINK.Eff_Date) Then 
   CurrentTimestamp() 
Else 
    INPUT_LINK.Eff_Date
In case you haven't already done so,... Check this out :
"IBM InfoSphere DataStage Data Flow and Job Design"
http://www.redbooks.ibm.com/redbooks/pdfs/sg247576.pdf
Section 2.14 Slowly Changing Dimension (pages 113 to 126).
Julio E. Gajardo

Life is what happens to you while your busy making other plans - John Lennon
splayer
Charter Member
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Post by splayer »

Julio, thank you for your help. I was able to resolve the IsValid date issue.

My main problem is with the SCD stage. I am trying to maintain a dimension. I have From_Date, To_Date, Seq_No and Current_Flag columns.

On the Dim Update tab of the Output tab, I have the following derivations:

Current_Flag:
----------------
Derivation: 'Y'
Purpose: Current Indicator (Type 2)
Expire column value: 'N'

From_Date:
--------------
Derivation: if IsNull(lkp_Link.From_Date) then CurrentTimestamp() Else lkp_Link.From_Date
(NOTE: If there is a business key match with an existing record then I am using existing From_Date)
Purpose: Effective Date (Type 2)
Expire column value: Blank

To_Date:
----------
Derivation: '2099-12-31 00:00:00'
Purpose: Expiration Date (Type 2)
Expire column value: lnk_Link.From_Date

Seq_No:
----------
(NOTE: 1st record inserted in the table would have a Seq_No of 1. If an update do this record is inserted, it would get a Seq_No value of 2 and so on. Thus, the current record would have the max Seq_No value for a business key)

Derivation: if IsNull(lkp_Link.Seq_No) or lkp_Link.Seq_No=0 Then 1 Else lkp_Link.Seq_No + 1
Purpose: Blank
Expire column value: Blank

All other columns:
---------------------
Derivation: comes from Source Ora stage
Purpose: Type 2
---------------------------------------------------------------------------------

QUESTIONS:

1)On an empty dimension table, I run the job. Results:
CURRENT_FLAG: 'Y' (Correct value)
FROM_DATE: 2/26/2010 9:12:20 PM (Correct value)
TO_DATE: 12/31/2099 (Correct value)
SEQ_NO: 84225 (Incorrect value. Should be 1.)

2)I update one of the Purpose=Type 2 columns to a different value. I rerun the job. An additional rows gets inserted as expected. Results:
1st existing row:
-------------------
CURRENT_FLAG: 'Y' (Incorrect value. Should be 'N')
FROM_DATE: 2/26/2010 9:12:20 PM (Correct value)
TO_DATE: 12/31/2099 (Incorrect value. Should have the From_Date of the next new row)

2nd row, new one which got inserted:
-------------------------------------------
CURRENT_FLAG: 'Y' (Correct value)
FROM_DATE: 2/26/2010 9:12:20 PM (Incorrect value. Should be at least a minute later than the previous row's From_Date and same as the previous row's To_Date.)
TO_DATE: 12/31/2099 (Correct value)
SEQ_NO: 84226 (Correct value. Should be previous Seq_No + 1 and it is)
---------------------------------------------------------------------------------

I would appreciate any help in this regard.
jgajardo
Premium Member
Premium Member
Posts: 15
Joined: Thu Sep 23, 2004 9:16 am
Contact:

Post by jgajardo »

I don't have access to test anything before monday...

In version 8.1 (which I've tested with) the SCD stage updates the values
(in memory) of the reference link columns which you defined with a purpose code.

ie: When you reference them, it's too late ... the value has changed ...
and you don't have the original value

If you duplicate the column (in the lookup), and don't set any purpose code to the duplicated column definition, you can then reference the "old" value.

This solved my problem in output map ... it should be the same in the Update Dimension link.

you can read my post on this here
viewtopic.php?t=132394&highlight=SCD+stage

Hope this helps. Let me know if it works!
Julio E. Gajardo

Life is what happens to you while your busy making other plans - John Lennon
splayer
Charter Member
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Post by splayer »

If you set the purpose code on the Output tab/Dim Update tab, the purpose code on the Lookup tab on the reference link gets set by default. If you remove on the lookup tab, it gets removed from the Dim Update tab as well.
jgajardo
Premium Member
Premium Member
Posts: 15
Joined: Thu Sep 23, 2004 9:16 am
Contact:

Post by jgajardo »

Did you duplicate the column in your lookup?

Code: Select all

Type 2   No purpose code
Seq_No   Seq_No_Original
------   ---------------
2        1
The column might not appear on the left side, but you can still access it if you edit (double-click) the derivation and choose input column"[/code]
Julio E. Gajardo

Life is what happens to you while your busy making other plans - John Lennon
splayer
Charter Member
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Post by splayer »

Yes, I did.

I also followed 2 tutorials including the redbook one (Very interestingly, I could not find a single picture with a SCD stage in that whole 350 page project. If you do please let me know the page number).

My main problem is, like I explained in detail earlier, my updates are not happening correctly eventhough I followed the tutorials to a T.
splayer
Charter Member
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Post by splayer »

Julio, thank you very much for trying to help. I have accepted defeat. Since the SCD stage is very new, there are very few posts and examples on this. I think I'll try to implement this using the change capture stage.
Post Reply