Filling date mark in dynamic query
Moderators: chulett, rschirm, roy
Filling date mark in dynamic query
I am sure someone has already needed to do this and come up with a way to accomplish it.
Here's the situation: We have a need to pull data from a large table with 5M records and growing. We can limit this by adding a where clause that limits based upon a particular timestamp. So, with each run, I would store out a timestamp indicating the mark for the next run. Upon the next run, the mark would be read and used in the query that extracts the data so that only records with a timestamp beyond the mark would be extracted rather than pulling over all 5M records.
I haven't been able to figure out a way to get the mark into the query before it is sent to the server. I thought of using a parameter in some way but haven't had much luck there.
Any one want to share how they accomplished this?
Here's the situation: We have a need to pull data from a large table with 5M records and growing. We can limit this by adding a where clause that limits based upon a particular timestamp. So, with each run, I would store out a timestamp indicating the mark for the next run. Upon the next run, the mark would be read and used in the query that extracts the data so that only records with a timestamp beyond the mark would be extracted rather than pulling over all 5M records.
I haven't been able to figure out a way to get the mark into the query before it is sent to the server. I thought of using a parameter in some way but haven't had much luck there.
Any one want to share how they accomplished this?
As you have said that parameter is the way to go about it.
All tables should have a timestamp field. At the end of each run to the job, the max timestamp of the target table is written to a Hashed File with a known key value. In the next run (of the sequencer) the same Hashed File is read from a routine, the routine returns this timestamp, this value is passed to the job as a parameter, which in turn is used in fetching the rows from the source table.
All tables should have a timestamp field. At the end of each run to the job, the max timestamp of the target table is written to a Hashed File with a known key value. In the next run (of the sequencer) the same Hashed File is read from a routine, the routine returns this timestamp, this value is passed to the job as a parameter, which in turn is used in fetching the rows from the source table.
Success consists of getting up just one more time than you fall.
Re: Filling date mark in dynamic query
In any of the relational stages, parameter values are available via #parametername# notation. So, a query would have a WHERE condition formatted something like the following, in this example Oracle notation:RodBarnes wrote: I haven't been able to figure out a way to get the mark into the query before it is sent to the server. I thought of using a parameter in some way but haven't had much luck there.
Code: Select all
WHERE yourdatacolumn >= TO_DATE('#dateparameter#', 'YYYY-MM-DD HH24:MI:SS')
Last edited by kcbland on Tue Dec 13, 2005 9:26 am, edited 1 time in total.
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Small correction as I'm pretty sure it will need to look like this:
And it may need to change slightly if you use an actual DATE parameter type as it allows no time portion:
In any case, food for thought and some combination of what has been posted should get you where you need to be.
Code: Select all
WHERE yourdatacolumn >= TO_DATE('#dateparameter#', 'YYYY-MM-DD HH24:MI:SS')
Code: Select all
WHERE trunc(yourdatacolumn) >= TO_DATE('#dateparameter#', 'YYYY-MM-DD')
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Thanks for all the replys. I can successfully replace the parameter in a dynamic query.
Now I am struggling to get a date from a file (or other external, automated source) into the parameter <sigh>. I'm trying to set up a situation where I can read the date of the last good run and use that as a filter on the query for the next run, thus eliminating data that would have already been harvested.
I've tried using the CPSubstituteParams() in the click-pack and though I can get it to test correctly it doesn't seem to have any affect on filling the value of the parameter in the sequence.
Other suggestions of how to populate this parameter?
Now I am struggling to get a date from a file (or other external, automated source) into the parameter <sigh>. I'm trying to set up a situation where I can read the date of the last good run and use that as a filter on the query for the next run, thus eliminating data that would have already been harvested.
I've tried using the CPSubstituteParams() in the click-pack and though I can get it to test correctly it doesn't seem to have any affect on filling the value of the parameter in the sequence.
Other suggestions of how to populate this parameter?
-
- Charter Member
- Posts: 166
- Joined: Wed Mar 16, 2005 6:52 am
- Location: Mumbai, India
You can use the following routine in a routine Activity before the job is invoked:
Note you can also add ON ERROR clauses to both OpenSEQ or ReadSEQ to add error handling.
Code: Select all
* -----------------------------------------------------------------------------
* Arguments:
* -----------------------------------------------------------------------------
* FilePath: The path of the ParameterFile to be processed without the trailing \
* FileName: The ParameterFile name
*
* -----------------------------------------------------------------------------
* Return Values:
* -----------------------------------------------------------------------------
* -1: Data not found/Empty Arguments.
* ParameterValue: If file pointed to contains a line of text, it will be passed(Trimmed) to output.
* Check if Input Parameters are present
If Trim(FilePath)="" or Trim(FileName)="" Then
Ans=-1
End
Else
* Open the parameter file.
OpenSEQ FilePath:"\":FileName to ParamFile
Then
* Read a line or EOF(Empty File).
ReadSEQ ParamLine from ParamFile
Then
* If a line of data is found, return it to the calling program
Ans=Trim(ParamLine)
End
Else
* if not, return -1
Ans=-1
End
* Close the File
CloseSEQ ParamFile
End
Else
* File cannot be opened
Ans=-1
End
End
Amey Vaidya<i>
I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.</i>
<i>- Douglas Adams</i>
I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.</i>
<i>- Douglas Adams</i>
Thanks for the help but my problem isn't getting data from the file. I can do that. I am just missing something about how to get a value into a job parameter -- probably something really obvious I am just stumbling over.
I tried simply doing an assignment in a transform:
jpMarkTime = GetParamFromFile(<filepath>, <paramname>)
but that doesn't work. But I know the function works because if I just use the function directly in a transform and then write out the value in a column, it is correct.
So, once I have the value from the file, how do I put that into a job parameter so it becomes available throughout the job?
2nd related question: Is it possible to pass this value upwards so it is available to other jobs? For example, given sequence A that runs job #1 and job #2, I want to set the job parameter in job #1 and then that value be available in job #2.
I tried simply doing an assignment in a transform:
jpMarkTime = GetParamFromFile(<filepath>, <paramname>)
but that doesn't work. But I know the function works because if I just use the function directly in a transform and then write out the value in a column, it is correct.
So, once I have the value from the file, how do I put that into a job parameter so it becomes available throughout the job?
2nd related question: Is it possible to pass this value upwards so it is available to other jobs? For example, given sequence A that runs job #1 and job #2, I want to set the job parameter in job #1 and then that value be available in job #2.
-
- Charter Member
- Posts: 166
- Joined: Wed Mar 16, 2005 6:52 am
- Location: Mumbai, India
Hi Rod,
What version of DataStage are you using? Is it 7.5 and above?
The Sequence can be designed as:
This 'Routine_Activity.$ReturnValue' can be used to pass values to any job downstream of the routine activity.
What version of DataStage are you using? Is it 7.5 and above?
The Sequence can be designed as:
Code: Select all
Job Activity1----------->Routine Activity-------------->Job Activity2
[This Job Writes the [This routine reads [Add a new Job parameter
Parameter to file] the value written by Job1 to this Jobs design
and returns it] and in this sequence,
Open activity properties,
Select the New Job
parameter, Click
"Insert Parameter Value";
in the new window that
opens up browse down to the
routine activity and select
Routine_Activity.$ReturnValue]
Amey Vaidya<i>
I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.</i>
<i>- Douglas Adams</i>
I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.</i>
<i>- Douglas Adams</i>
Yes, v7.5.ameyvaidya wrote:Hi Rod,
What version of DataStage are you using? Is it 7.5 and above?
The Sequence can be designed as...
The method you specified was the only way I could come up with, also, and this works. I just thought there may be a more elegant way to actually read the value from the file once and store that in a job parameter which could then be passed around. Sounds like there isn't such an option.
At least you have confirmed that what I came up with isn't unusual.
I can succesfully read a value from a file and pass that along from within a transform.
I can successfully replace a job parameter in a query (e.g., a date) so that I can have a dynamic query.
What I can't seem to do is take that value I obtain from the file and put it into a job parameter so that it can be used in the where clause of the query.
Maybe there is a completely different way to go about this. I just want to limit the rows extracted by the query based upon a date stored by the previous run of the ETL. Seems like something that would have been needed plenty of times by now.
I can successfully replace a job parameter in a query (e.g., a date) so that I can have a dynamic query.
What I can't seem to do is take that value I obtain from the file and put it into a job parameter so that it can be used in the where clause of the query.
Maybe there is a completely different way to go about this. I just want to limit the rows extracted by the query based upon a date stored by the previous run of the ETL. Seems like something that would have been needed plenty of times by now.