Implementation problem
Posted: Wed Aug 18, 2010 8:47 pm
I have an implementation/optimization problem.
a. There is a set of weekending dates which comes in a file.
b. There is a set of account ids which comes from a second stream (driver)
c. There is a history table which I have to query :(over 650million records)
1. To get a subset of account ids from table which have the weekending dates given in the file mentioned above.
2. Do a self join to the extracted accounts to find accounts related by a particular field.
Till now, this was being done manually in a single query : creating a temp table of the accounts mentioned in (b) above. Joining this table to the history table(c) and the self join of the history to itself and the date range(a) being hard-coded in the query.
When I try to implement this is Datastage, the history table reads in all the records which is where the problem is. The history table has over 650 million records and it takes several hours till it can first get in all the records.
Is there a way that I can avoid creating temp tables and avoid doing a sql join and have it implemented in Datastage?
a. There is a set of weekending dates which comes in a file.
b. There is a set of account ids which comes from a second stream (driver)
c. There is a history table which I have to query :(over 650million records)
1. To get a subset of account ids from table which have the weekending dates given in the file mentioned above.
2. Do a self join to the extracted accounts to find accounts related by a particular field.
Till now, this was being done manually in a single query : creating a temp table of the accounts mentioned in (b) above. Joining this table to the history table(c) and the self join of the history to itself and the date range(a) being hard-coded in the query.
When I try to implement this is Datastage, the history table reads in all the records which is where the problem is. The history table has over 650 million records and it takes several hours till it can first get in all the records.
Is there a way that I can avoid creating temp tables and avoid doing a sql join and have it implemented in Datastage?