hi,
I've got a problem.
I have an Oracle table which contains the connections of users to an application (the fields are uti_id, connexion_time, deconnexion_time).
I must plot the maximum number of connections per day, and the interval of time when that maximum is reached.
I would like to do it with DataStage.
My idea:
1 - sort the table by connexion_time (ascending).
2 - take the first row as a reference and test the connexion_time of all the other rows of the table (but this might be too slow).
So I am listening to all kinds of ideas...
DataStage is not a reporting tool.
Use the correct tool, which is probably as simple as a query against your Oracle database. Use a CASE construct to form your intervals, and group by these.
DataStage can certainly calculate your intervals as part of an ETL process, but it will not generate a report for you; reporting must be done from the target table.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I know that DataStage is not a reporting tool.
My question was not about constructing intervals, but about finding a way in DataStage (using a routine or something else) to generate a file which indicates the number of simultaneous connections.
My first idea was to create a routine and, for each row, test all the other rows of my table, checking whether the deconnexion_time of the tested row is smaller than the connexion_time of the reference row, and incrementing a counter (i = i + 1) each time the condition is verified.
At the end, DataStage gives me a file with the number of simultaneous connections next to each row of my file...
But it turns out to be far too slow, because my table has 30,000 rows; it does not perform at all.
So perhaps someone has another idea that could help me solve my problem, or has already found a solution to the same type of problem.
(Excuse my English, I'm French.)
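The row-against-every-other-row approach is O(n²), which is why it crawls on 30,000 rows. A common alternative is a single sweep over sorted connect/disconnect events. This is a minimal Python sketch of that idea, outside DataStage; the field names follow the post, and the sample times are made up for illustration:

```python
# Sweep-line count of simultaneous connections.
# Instead of comparing every row against every other row (O(n^2)),
# turn each session into two events (+1 at connexion_time, -1 at
# deconnexion_time), sort them once, and sweep: O(n log n).

def max_concurrent(sessions):
    """sessions: list of (connexion_time, deconnexion_time) pairs.
    Returns (peak_count, time_at_which_the_peak_is_first_reached)."""
    events = []
    for start, end in sessions:
        events.append((start, 1))    # a connection opens
        events.append((end, -1))     # a connection closes
    # Sort by time; at equal times, process closes (-1) before opens (+1),
    # so a session ending exactly when another starts does not overlap it.
    events.sort(key=lambda e: (e[0], e[1]))
    current = best = 0
    best_time = None
    for time, delta in events:
        current += delta
        if current > best:
            best = current
            best_time = time
    return best, best_time

# Hypothetical sample data (HH:MM strings sort correctly as text).
sessions = [
    ("08:00", "08:30"),
    ("08:10", "08:40"),
    ("08:20", "08:25"),
    ("09:00", "09:15"),
]
print(max_concurrent(sessions))  # (3, '08:20')
```

Run once per day's rows and you get the daily maximum and the time it is first reached; the same single-pass logic could be expressed as stage variables over a sorted event stream, or as an Oracle analytic query.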
How are your times measured (hours, minutes, seconds)? What interval is used to measure simultaneous connections (all connections existing at this minute of the day)?
My thought is you could load the table into a hash file, adding a new multi-valued (delimited) field with a value for each connection interval. For example:
Start time: 01:01:01
End time: 01:10:13
Interval length: 1 minute
Delimited Intervals: 01:01 @VM 01:02 @VM 01:03 @VM 01:04 @VM 01:05 @VM 01:06 @VM 01:07 @VM 01:08 @VM 01:09 @VM 01:10
Write a routine to build the Delimited Intervals from the Start time and End time.
Next, reading from the hash file, choose to normalize based upon the new multi-valued field.
Now, you have one row for each connection/interval combination. Just sort and count, or aggregate, them, and you should have your results.
Code:
(normalize)
OCI ----> XFR ----> HASH -------------> XFR ----> AGGR ----> SEQ
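Outside DataStage, the interval-expansion idea above (explode each session into one value per minute, then count per minute) can be sketched in a few lines of Python. This is only an illustration of the technique; the one-minute interval length and the sample times are taken from the example above, and the second session is invented:

```python
from collections import Counter

def minute_intervals(start, end):
    """Expand a session into the list of "HH:MM" minutes it spans,
    like the delimited multi-valued field described above
    (e.g. 01:01:01 .. 01:10:13 -> 01:01, 01:02, ..., 01:10)."""
    def to_min(t):
        h, m, _s = (int(x) for x in t.split(":"))
        return h * 60 + m
    return [f"{m // 60:02d}:{m % 60:02d}"
            for m in range(to_min(start), to_min(end) + 1)]

# "Normalize": one (session, interval) pair per minute, then aggregate,
# mirroring the hash file -> normalize -> aggregator job flow.
sessions = [("01:01:01", "01:10:13"), ("01:09:00", "01:12:30")]
counts = Counter(minute
                 for s, e in sessions
                 for minute in minute_intervals(s, e))
peak = max(counts.values())
print(peak, sorted(m for m, c in counts.items() if c == peak))
# prints: 2 ['01:09', '01:10']
```

Note the trade-off: this explodes each row into as many rows as minutes connected, so long-lived sessions multiply the data volume, but the counting step itself becomes a simple group-and-count.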
Chuck Smith
www.anotheritco.com