Page 1 of 2

Filling date mark in dynamic query

Posted: Mon Dec 12, 2005 5:48 pm
by RodBarnes
I am sure someone has already needed to do this and come up with a way to accomplish it.

Here's the situation: We have a need to pull data from a large table with 5M records and growing. We can limit this by adding a where clause that limits based upon a particular timestamp. So, with each run, I would store out a timestamp indicating the mark for the next run. Upon the next run, the mark would be read and used in the query that extracts the data so that only records with a timestamp beyond the mark would be extracted rather than pulling over all 5M records.

I haven't been able to figure out a way to get the mark into the query before it is sent to the server. I thought of using a parameter in some way but haven't had much luck there.

Any one want to share how they accomplished this?

Posted: Mon Dec 12, 2005 9:28 pm
by loveojha2
As you have said that parameter is the way to go about it.

All tables should have a timestamp field. At the end of each run to the job, the max timestamp of the target table is written to a Hashed File with a known key value. In the next run (of the sequencer) the same Hashed File is read from a routine, the routine returns this timestamp, this value is passed to the job as a parameter, which in turn is used in fetching the rows from the source table.

Re: Filling date mark in dynamic query

Posted: Tue Dec 13, 2005 8:29 am
by kcbland
RodBarnes wrote: I haven't been able to figure out a way to get the mark into the query before it is sent to the server. I thought of using a parameter in some way but haven't had much luck there.
In any of the relational stages, parameter values are available via #parametername# notation. So, a query would have a WHERE condition formatted something like the following, in this example Oracle notation:

Code: Select all

WHERE yourdatacolumn >= TO_DATE('#dateparameter#', 'YYYY-MM-DD HH24:MI:SS')
You must insure that parameter values, which are always text strings, are cast into the appropriate data type.

Posted: Tue Dec 13, 2005 9:21 am
by chulett
Small correction as I'm pretty sure it will need to look like this:

Code: Select all

WHERE yourdatacolumn >= TO_DATE('#dateparameter#', 'YYYY-MM-DD HH24:MI:SS')
And it may need to change slightly if you use an actual DATE parameter type as it allows no time portion:

Code: Select all

WHERE trunc(yourdatacolumn) >= TO_DATE('#dateparameter#', 'YYYY-MM-DD')
In any case, food for thought and some combination of what has been posted should get you where you need to be. :wink:

Posted: Tue Dec 13, 2005 9:27 am
by kcbland
Whatchoo talkin about? :oops:

Posted: Tue Dec 13, 2005 9:54 am
by chulett
The single quotes around the parameter, it will get upset without them. Dat's all. :wink:

Posted: Tue Dec 13, 2005 6:12 pm
by RodBarnes
Thanks for all the replys. I can successfully replace the parameter in a dynamic query.

Now I am struggling to get a date from a file (or other external, automated source) into the parameter <sigh>. I'm trying to set up a situation where I can read the date of the last good run and use that as a filter on the query for the next run, thus eliminating data that would have already been harvested.

I've tried using the CPSubstituteParams() in the click-pack and though I can get it to test correctly it doesn't seem to have any affect on filling the value of the parameter in the sequence.

Other suggestions of how to populate this parameter?

Posted: Tue Dec 13, 2005 6:35 pm
by ameyvaidya
You can use the following routine in a routine Activity before the job is invoked:

Code: Select all

* -----------------------------------------------------------------------------
* Arguments:
* -----------------------------------------------------------------------------
* FilePath: The path of the ParameterFile to be processed without the trailing \
* FileName: The ParameterFile name
* 
* -----------------------------------------------------------------------------
* Return Values:
* -----------------------------------------------------------------------------
* -1: Data not found/Empty Arguments.
* ParameterValue: If file pointed to contains a line of text, it will be passed(Trimmed) to output.


* Check if Input Parameters are present
If Trim(FilePath)="" or Trim(FileName)="" Then
   Ans=-1
End
Else
	* Open the parameter file.
	OpenSEQ FilePath:"\":FileName to ParamFile
		Then
			* Read a line or EOF(Empty File).
			ReadSEQ ParamLine from ParamFile
				Then
					* If a line of data is found, return  it to the calling program
					Ans=Trim(ParamLine)
				End
				Else
					* if not, return -1
					Ans=-1
				End
				* Close the File
				CloseSEQ ParamFile
		End
		Else
			* File cannot be opened
			Ans=-1
		End

End
Note you can also add ON ERROR clauses to both OpenSEQ or ReadSEQ to add error handling.

Posted: Wed Dec 21, 2005 10:40 am
by RodBarnes
Thanks for the help but my problem isn't getting data from the file. I can do that. I am just missing something about how to get a value into a job parameter -- probably something really obvious I am just stumbling over.

I tried simply doing an assignment in a transform:

jpMarkTime = GetParamFromFile(<filepath>, <paramname>)

but that doesn't work. But I know the function works because if I just use the function directly in a transform and then write out the value in a column, it is correct.

So, once I have the value from the file, how do I put that into a job parameter so it becomes available throughout the job?

2nd related question: Is it possible to pass this value upwards so it is available to other jobs? For example, given sequence A that runs job #1 and job #2, I want to set the job parameter in job #1 and then that value be available in job #2.

Posted: Wed Dec 21, 2005 11:00 am
by ameyvaidya
Hi Rod,
What version of DataStage are you using? Is it 7.5 and above?
The Sequence can be designed as:

Code: Select all


Job Activity1----------->Routine Activity-------------->Job Activity2
[This Job Writes the     [This routine reads            [Add a new Job parameter
Parameter to file]        the value written by Job1      to this Jobs design 
                          and returns it]                and in this sequence, 
                                                         Open activity properties,
                                                          Select the New Job 
                                                         parameter, Click 
                                                         "Insert Parameter Value";
                                                          in the new window that 
                                                         opens up browse down to the 
                                                         routine activity and select
                                                          Routine_Activity.$ReturnValue]
This 'Routine_Activity.$ReturnValue' can be used to pass values to any job downstream of the routine activity.

Posted: Wed Dec 21, 2005 11:09 am
by chulett
You can also take advantage of the User Status area each job has to pass a value to a downstream job without the intervening routine.

Posted: Wed Dec 21, 2005 11:20 am
by RodBarnes
ameyvaidya wrote:Hi Rod,
What version of DataStage are you using? Is it 7.5 and above?
The Sequence can be designed as...
Yes, v7.5.

The method you specified was the only way I could come up with, also, and this works. I just thought there may be a more elegant way to actually read the value from the file once and store that in a job parameter which could then be passed around. Sounds like there isn't such an option.

At least you have confirmed that what I came up with isn't unusual. :-)

Posted: Wed Dec 21, 2005 11:21 am
by RodBarnes
chulett wrote:You can also take advantage of the User Status area each job has...
I am not familiar with this and a search through the docs doesn't seem to come up with anything. Can you elaborate?

Posted: Wed Dec 21, 2005 11:26 am
by RodBarnes
I can succesfully read a value from a file and pass that along from within a transform.

I can successfully replace a job parameter in a query (e.g., a date) so that I can have a dynamic query.

What I can't seem to do is take that value I obtain from the file and put it into a job parameter so that it can be used in the where clause of the query.

Maybe there is a completely different way to go about this. I just want to limit the rows extracted by the query based upon a date stored by the previous run of the ETL. Seems like something that would have been needed plenty of times by now.

Posted: Wed Dec 21, 2005 11:27 am
by chulett
See if this post helps. There were several more - you need to search on UserStatus (one word) rather than two.