Universe Query -

Post questions here relating to DataStage Server Edition, for such areas as Server job design, DS BASIC, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

sarjushah
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Universe Query -

Post by sarjushah »

Hi,

I have this query in oracle which I would like to convert to Universe.

My problem is how to get the variables I pass to the query back in the output in UniVerse. Just replacing them with ? won't help:

SELECT
p.polcy_id,
p.poldm_id,
p.sc_case_key,
p.sc_cont_no,
:1, :2, :3
FROM policy p
WHERE
p.polcy_id = :1 and
p.STATUS_STRT_DT <= :3 and
p.STATUS_END_DT >= :2
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I'm surprised it works with Oracle.

Parameter markers are not supposed to work in the SELECT clause.

The SELECT clause returns columns that exactly match the metadata. Parameter markers in the WHERE clause (and this has to be done on a reference link) are populated with Key Expression values from the Transformer stage, in order. This is where you use "?" parameter markers if communicating with any database other than Oracle.
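
For example (a sketch only, reusing the column names from the query above), the user-defined SQL on a UV reference link would carry the markers in the WHERE clause alone:

SELECT
p.polcy_id,
p.poldm_id,
p.sc_case_key,
p.sc_cont_no
FROM policy p
WHERE
p.polcy_id = ? and
p.STATUS_STRT_DT <= ? and
p.STATUS_END_DT >= ?

with three key columns defined on the link in that same order, so the Transformer's Key Expressions bind positionally.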
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sarjushah
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Post by sarjushah »

The reason I am returning the variables is that the query is not an equi-join; it is a greater-than / less-than check. So once we get the required result, how do we get a successful lookup without passing the input parameters back to the output?

We do have to define those 3 additional columns, though, in the OCI stage.

Now, when I try to do that in a UniVerse stage, I am not sure what the syntax is. I am looking for a way to convert my query below to work with a UniVerse stage.


SELECT
p.polcy_id,
p.poldm_id,
p.sc_case_key,
p.sc_cont_no,
:1, :2, :3
FROM policy p
WHERE
p.polcy_id = :1 and
p.STATUS_STRT_DT <= :3 and
p.STATUS_END_DT >= :2
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I'd suggest first creating the lookup as a strict equi-join and letting it generate the SQL needed. Then take that and modify it to add in the range logic.
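
For example (a sketch, not actual generated output, assuming key columns polcy_id, STATUS_STRT_DT and STATUS_END_DT), the generated equi-join would come out something like:

SELECT p.poldm_id, p.sc_case_key, p.sc_cont_no
FROM policy p
WHERE p.polcy_id = ? and p.STATUS_STRT_DT = ? and p.STATUS_END_DT = ?

and you would then hand-edit the two date tests into <= ? and >= ? to get the range behaviour.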
-craig

"You can never have too many knives" -- Logan Nine Fingers
sarjushah
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Post by sarjushah »

A modified version of the equi-join that we know works looks like this, but the problem we are facing is getting the input variables back in the output result.

SELECT
p.polcy_id,
p.poldm_id,
p.sc_case_key,
p.sc_cont_no,
:1, :2, :3
FROM policy p
WHERE
p.polcy_id = ? and
p.STATUS_END_DT >= ? and
p.STATUS_STRT_DT <= ?


I scanned through several UniVerse manuals on IBM's site and found no good example yet.

http://www-01.ibm.com/software/data/u2/ ... ry/96univ/

But it is weird to me that you have to make every variable a ? and cannot qualify them the way we do in Oracle with a number (:1, :2, ...) so you know which variable is which. What if we need to use the same variable multiple times in the WHERE clause? I am not sure how that could be achieved in the UniVerse stage.

All this apart, I am going to concentrate on the task at hand: finding a way of passing back the variables.

Any pointers / manuals appreciated.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Every stage other than OCI uses a ? for the bind variable, and they are positional, meaning the first ? binds to the first key column, and so on. This is governed by keys vs. non-keys, i.e. fields in vs. out of the WHERE clause. Therefore you cannot send the same field in more than once; rather, you need to repeat the field value in multiple columns.
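
A sketch of what that looks like in practice (EFF_DT is a hypothetical input column): to test one incoming date against both ends of the range, define two key columns on the reference link and give both the same derivation:

WHERE p.polcy_id = ? and
p.STATUS_STRT_DT <= ? and
p.STATUS_END_DT >= ?

Key Expressions, in order:
in.POLCY_ID (binds to the first ?)
in.EFF_DT (binds to the STATUS_STRT_DT test)
in.EFF_DT (same value again, binds to the STATUS_END_DT test)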
-craig

"You can never have too many knives" -- Logan Nine Fingers
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

No, you can't - and I was wondering if that was why the question was being asked. Note that you use the UV stage, not the UniVerse stage, over the hashed file for this.
-craig

"You can never have too many knives" -- Logan Nine Fingers
sarjushah
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Post by sarjushah »

I thought there was a way to access a hashed file using an ODBC stage, but I did not know how.

If you go to this site, they mention not doing it due to performance issues, but they don't explain how to do it.

Any pointers appreciated on this topic.

Go to this link and search for ODBC. (I need to use the Google cache because the site is blocked from work.)

http://www.google.com/search?q=cache:h1 ... =firefox-a
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You cannot.

They mention those two stages together - UV/ODBC - as they are the only two stages that allow you to return 'multiple row' results rather than a singleton row. And the performance comments were related to direct database reference lookups.
-craig

"You can never have too many knives" -- Logan Nine Fingers
sarjushah
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Post by sarjushah »

Thanks for the info

So I am back to square one.

How do you do a range-based lookup using a hashed file?

I can do it using an OCI stage, but we are processing more than 56 million rows and have 4 range-based lookups, so it kills the performance.

So I started out to see if I could get the data into a hashed file and do a range-based search using a UV stage; that did not go well, as you know, and we cannot use ODBC to read a hashed file, as you mentioned.

So what are my options?

I checked the range-based discussions in the forum.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I've used the technique and routines Ken discusses here and it worked out well for me.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

To answer the original post: why would you even need to see the values that were passed to the key expression when you already have them available? As for the ultimate question about ranged lookups against hashed files for high volumes - forget about using the UV/ODBC interface; it's better than a database ODBC/OCI lookup, but not the best. If you have PX you should use the Join stage there to do this work; otherwise, the best solution is an equi-join with an embedded array of pre-built values.
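
To sketch the "embedded array" idea (layout illustrative only): the hashed file gets one record per business key, with the date ranges and surrogate keys stacked as value-mark-delimited multivalues:

key: polcy_id = 12345
field 1: strt dates = 2004-01-01]2004-07-01]2005-01-01
field 2: end dates = 2004-06-30]2004-12-31]2005-12-31
field 3: poldm_ids = 9001]9002]9003

(the ] here stands for the value mark, @VM). Every lookup is then a pure equi-join on polcy_id, and a routine picks the right position out of the stacks.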
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
sarjushah
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Post by sarjushah »

kcbland wrote:To answer the original post: why would you even need to see the values that were passed to the key expression when you already have them available? As for the ultimate question about ranged lookups against hashed files for high volumes - forget about using the UV/ODBC interface; it's better than a database ODBC/OCI lookup, but not the best. If you have PX you should use the Join stage there to do this work; otherwise, the best solution is an equi-join with an embedded array of pre-built values.
The reason I need the parameters back in the output is that I am not doing an equi-join on the dates, so the dates I get back from the lookup tables will be different from what I passed in. Since they are different, my DataStage lookup will not be successful; so, to get the data from the lookup when a row is found, I need the parameters as part of the output.

I still think that looking the data up from files on the same server has to be faster than going over the network 200 million times for the 4 lookups, but the problem remains: how can I achieve that?
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Huh? 200 million network + database queries are better than a hashed equi-join lookup with preloaded/cached data? Unless you think the PX Join is doing network lookups (sparse lookup), I don't know what you're thinking.

If Server is your only option, the best choice is to find a way to use a hashed file with caching. That entails nesting your data so that you can use an equi-join. You have a business key with variants, each having a surrogate key and an effective date; roll that up using the business key as the primary key in the hashed file. Sort the data in the rollup so that it's ordered in the way that gives you the common answers quicker (descending or ascending). Use a function to parse the array returned on each lookup, choosing the appropriate answer. Craig has pointed you to a way to do this.
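
A rough sketch of such a parsing routine in DS BASIC (untested, names mine; it assumes the stacked dates compare chronologically as strings, e.g. YYYY-MM-DD or internal format):

* RangePick: given the @VM-delimited stacks returned by the hashed
* file lookup, return the surrogate whose range covers TestDate.
* Arguments: StackedStarts, StackedEnds, StackedIds, TestDate
Ans = ''
NumVals = DCOUNT(StackedStarts, @VM)
For I = 1 To NumVals
   If TestDate >= StackedStarts<1,I> And TestDate <= StackedEnds<1,I> Then
      Ans = StackedIds<1,I>
      Exit   ;* first hit wins - the rollup's sort order decides which
   End
Next I

You'd call it from a Transformer derivation along the lines of RangePick(lkp.STRT_DTS, lkp.END_DTS, lkp.POLDM_IDS, in.EFF_DT), with the lookup itself a plain equi-join on the business key.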
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
sarjushah
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Post by sarjushah »

kcbland wrote:Huh? 200 million network + database queries are better than a hashed equi-join lookup with preloaded/cached data? Unless you think the PX Join is doing network lookups (sparse lookup), I don't know what you're thinking.

If Server is your only option, the best choice is to find a way to use a hashed file with caching. That entails nesting your data so that you can use an equi-join. You have a business key with variants, each having a surrogate key and an effective date; roll that up using the business key as the primary key in the hashed file. Sort the data in the rollup so that it's ordered in the way that gives you the common answers quicker (descending or ascending). Use a function to parse the array returned on each lookup, choosing the appropriate answer. Craig has pointed you to a way to do this.

I LIKE THIS ONE.
If I understand it correctly:

1) Basically, create an array of data for each primary key in the hashed file.
2) Read the hashed file using the primary key.
3) Now read the array and get the data you want.

Let me do some research on
(1) creating an array of data for each primary key, and
(2) parsing the array.

Let me search the forum, but if you have it handy please pass it on.

MY NEW APPROACH

Before I got this, I had already started on the approach below; let me know what you think.

Create a function to do the following:
1) Call the function with the primary key and the 2 dates for which I need to do the greater-than / less-than check.

2) Read the hashed file using the key, step through each row that matches it, compare the dates, and if you find a hit, write the row to the output and come out.

I know this would defeat my initial purpose of speed, but it is now bugging me that I cannot do a lookup against a hashed file that is not an equi-join. So I need to crack this one to get a good night's sleep.
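
One wrinkle with that plan: a hashed file holds exactly one record per key, so "step through each row that matches" really means scanning the multivalues within that one record, which lands you back at Ken's layout. A rough sketch of such a routine (untested; the file name and field layout are hypothetical - field 1 = start dates, field 2 = end dates, field 3 = surrogates, all @VM-delimited):

* RangeRead(KeyVal, DateLo, DateHi): read the record for KeyVal from
* the hashed file and scan its multivalues for an overlapping range.
Common /RangeRead/ FileOpened, F.Ranges
Ans = ''
If NOT(FileOpened) Then
   Open 'PolicyRanges' To F.Ranges Else Call DSLogFatal('Cannot open PolicyRanges', 'RangeRead')
   FileOpened = @TRUE
End
Read Rec From F.Ranges, KeyVal Then
   For I = 1 To DCOUNT(Rec<1>, @VM)
      If Rec<1,I> <= DateHi And Rec<2,I> >= DateLo Then
         Ans = Rec<3,I>   ;* the surrogate for the matching range
         Exit
      End
   Next I
End

The Common block keeps the file handle open across the 56 million calls instead of re-opening the file on every row.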