Filling date mark in dynamic query

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

RodBarnes
Charter Member
Charter Member
Posts: 182
Joined: Fri Mar 18, 2005 2:10 pm

Filling date mark in dynamic query

Post by RodBarnes »

I am sure someone has already needed to do this and come up with a way to accomplish it.

Here's the situation: We have a need to pull data from a large table with 5M records and growing. We can limit this by adding a where clause that limits based upon a particular timestamp. So, with each run, I would store out a timestamp indicating the mark for the next run. Upon the next run, the mark would be read and used in the query that extracts the data so that only records with a timestamp beyond the mark would be extracted rather than pulling over all 5M records.

I haven't been able to figure out a way to get the mark into the query before it is sent to the server. I thought of using a parameter in some way but haven't had much luck there.

Any one want to share how they accomplished this?
loveojha2
Participant
Posts: 362
Joined: Thu May 26, 2005 12:59 am

Post by loveojha2 »

As you have said that parameter is the way to go about it.

All tables should have a timestamp field. At the end of each run to the job, the max timestamp of the target table is written to a Hashed File with a known key value. In the next run (of the sequencer) the same Hashed File is read from a routine, the routine returns this timestamp, this value is passed to the job as a parameter, which in turn is used in fetching the rows from the source table.
Success consists of getting up just one more time than you fall.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Re: Filling date mark in dynamic query

Post by kcbland »

RodBarnes wrote: I haven't been able to figure out a way to get the mark into the query before it is sent to the server. I thought of using a parameter in some way but haven't had much luck there.
In any of the relational stages, parameter values are available via #parametername# notation. So, a query would have a WHERE condition formatted something like the following, in this example Oracle notation:

Code: Select all

WHERE yourdatacolumn >= TO_DATE('#dateparameter#', 'YYYY-MM-DD HH24:MI:SS')
You must insure that parameter values, which are always text strings, are cast into the appropriate data type.
Last edited by kcbland on Tue Dec 13, 2005 9:26 am, edited 1 time in total.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Small correction as I'm pretty sure it will need to look like this:

Code: Select all

WHERE yourdatacolumn >= TO_DATE('#dateparameter#', 'YYYY-MM-DD HH24:MI:SS')
And it may need to change slightly if you use an actual DATE parameter type as it allows no time portion:

Code: Select all

WHERE trunc(yourdatacolumn) >= TO_DATE('#dateparameter#', 'YYYY-MM-DD')
In any case, food for thought and some combination of what has been posted should get you where you need to be. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Whatchoo talkin about? :oops:
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The single quotes around the parameter, it will get upset without them. Dat's all. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
RodBarnes
Charter Member
Charter Member
Posts: 182
Joined: Fri Mar 18, 2005 2:10 pm

Post by RodBarnes »

Thanks for all the replys. I can successfully replace the parameter in a dynamic query.

Now I am struggling to get a date from a file (or other external, automated source) into the parameter <sigh>. I'm trying to set up a situation where I can read the date of the last good run and use that as a filter on the query for the next run, thus eliminating data that would have already been harvested.

I've tried using the CPSubstituteParams() in the click-pack and though I can get it to test correctly it doesn't seem to have any affect on filling the value of the parameter in the sequence.

Other suggestions of how to populate this parameter?
ameyvaidya
Charter Member
Charter Member
Posts: 166
Joined: Wed Mar 16, 2005 6:52 am
Location: Mumbai, India

Post by ameyvaidya »

You can use the following routine in a routine Activity before the job is invoked:

Code: Select all

* -----------------------------------------------------------------------------
* Arguments:
* -----------------------------------------------------------------------------
* FilePath: The path of the ParameterFile to be processed without the trailing \
* FileName: The ParameterFile name
* 
* -----------------------------------------------------------------------------
* Return Values:
* -----------------------------------------------------------------------------
* -1: Data not found/Empty Arguments.
* ParameterValue: If file pointed to contains a line of text, it will be passed(Trimmed) to output.


* Check if Input Parameters are present
If Trim(FilePath)="" or Trim(FileName)="" Then
   Ans=-1
End
Else
	* Open the parameter file.
	OpenSEQ FilePath:"\":FileName to ParamFile
		Then
			* Read a line or EOF(Empty File).
			ReadSEQ ParamLine from ParamFile
				Then
					* If a line of data is found, return  it to the calling program
					Ans=Trim(ParamLine)
				End
				Else
					* if not, return -1
					Ans=-1
				End
				* Close the File
				CloseSEQ ParamFile
		End
		Else
			* File cannot be opened
			Ans=-1
		End

End
Note you can also add ON ERROR clauses to both OpenSEQ or ReadSEQ to add error handling.
Amey Vaidya<i>
I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.</i>
<i>- Douglas Adams</i>
RodBarnes
Charter Member
Charter Member
Posts: 182
Joined: Fri Mar 18, 2005 2:10 pm

Post by RodBarnes »

Thanks for the help but my problem isn't getting data from the file. I can do that. I am just missing something about how to get a value into a job parameter -- probably something really obvious I am just stumbling over.

I tried simply doing an assignment in a transform:

jpMarkTime = GetParamFromFile(<filepath>, <paramname>)

but that doesn't work. But I know the function works because if I just use the function directly in a transform and then write out the value in a column, it is correct.

So, once I have the value from the file, how do I put that into a job parameter so it becomes available throughout the job?

2nd related question: Is it possible to pass this value upwards so it is available to other jobs? For example, given sequence A that runs job #1 and job #2, I want to set the job parameter in job #1 and then that value be available in job #2.
ameyvaidya
Charter Member
Charter Member
Posts: 166
Joined: Wed Mar 16, 2005 6:52 am
Location: Mumbai, India

Post by ameyvaidya »

Hi Rod,
What version of DataStage are you using? Is it 7.5 and above?
The Sequence can be designed as:

Code: Select all


Job Activity1----------->Routine Activity-------------->Job Activity2
[This Job Writes the     [This routine reads            [Add a new Job parameter
Parameter to file]        the value written by Job1      to this Jobs design 
                          and returns it]                and in this sequence, 
                                                         Open activity properties,
                                                          Select the New Job 
                                                         parameter, Click 
                                                         "Insert Parameter Value";
                                                          in the new window that 
                                                         opens up browse down to the 
                                                         routine activity and select
                                                          Routine_Activity.$ReturnValue]
This 'Routine_Activity.$ReturnValue' can be used to pass values to any job downstream of the routine activity.
Amey Vaidya<i>
I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.</i>
<i>- Douglas Adams</i>
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You can also take advantage of the User Status area each job has to pass a value to a downstream job without the intervening routine.
-craig

"You can never have too many knives" -- Logan Nine Fingers
RodBarnes
Charter Member
Charter Member
Posts: 182
Joined: Fri Mar 18, 2005 2:10 pm

Post by RodBarnes »

ameyvaidya wrote:Hi Rod,
What version of DataStage are you using? Is it 7.5 and above?
The Sequence can be designed as...
Yes, v7.5.

The method you specified was the only way I could come up with, also, and this works. I just thought there may be a more elegant way to actually read the value from the file once and store that in a job parameter which could then be passed around. Sounds like there isn't such an option.

At least you have confirmed that what I came up with isn't unusual. :-)
RodBarnes
Charter Member
Charter Member
Posts: 182
Joined: Fri Mar 18, 2005 2:10 pm

Post by RodBarnes »

chulett wrote:You can also take advantage of the User Status area each job has...
I am not familiar with this and a search through the docs doesn't seem to come up with anything. Can you elaborate?
RodBarnes
Charter Member
Charter Member
Posts: 182
Joined: Fri Mar 18, 2005 2:10 pm

Post by RodBarnes »

I can succesfully read a value from a file and pass that along from within a transform.

I can successfully replace a job parameter in a query (e.g., a date) so that I can have a dynamic query.

What I can't seem to do is take that value I obtain from the file and put it into a job parameter so that it can be used in the where clause of the query.

Maybe there is a completely different way to go about this. I just want to limit the rows extracted by the query based upon a date stored by the previous run of the ETL. Seems like something that would have been needed plenty of times by now.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

See if this post helps. There were several more - you need to search on UserStatus (one word) rather than two.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply