SCD Stage and sparse lookup to Oracle connector

msukkar
Participant
Posts: 7
Joined: Wed Aug 03, 2011 1:30 pm

SCD Stage and sparse lookup to Oracle connector

Post by msukkar »

Is it possible to use a sparse lookup as the reference link to an SCD stage? If so, what configuration is needed?

The volumes we eventually expect are ~20 million rows on both the stream and the reference. Right now the job is failing when the stream is only ~1 million rows and the reference is ~12 million. The SCD stage consumes all available memory when using a normal lookup to Oracle, and also when we replace the connector with a dataset containing the same reference data.

We would like to see whether a sparse lookup will at least get the job to complete until we revisit the whole design. When I changed the lookup type to Sparse, the job compiled successfully, but at run time it aborts right away with this message:

main_program: Data set, file set, or file "TABLE_NAME_Ref_Tab:StagingRef.v": Non-pipe (or non-virtual data set) with .v in its name

TABLE_NAME_Ref_Tab is the Oracle Connector stage name.
StagingRef is the name of the reference link going directly into the SCD stage.
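
For context, a sparse lookup in the connector is driven by a user-defined SELECT that binds the key from the input link through the ORCHESTRATE prefix. A minimal sketch of what such a statement looks like (table and column names are placeholders, not our real ones):

    SELECT CUST_SK, CUST_NAME, EFF_DATE, EXP_DATE
    FROM   DW.CUSTOMER_DIM
    WHERE  CUST_ID = ORCHESTRATE.CUST_ID

One query is issued per incoming stream row, so there is no in-memory reference table, which is why we hoped it would avoid the memory problem.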

When I switch back to a normal lookup, recompile and rerun, the job proceeds normally (until it eventually fails as described above).

One other odd thing happens when I use Sparse: the derivations on the Lookup tab of the SCD stage (fast path 2 of 5) disappear after the job is compiled.

I searched the forum for this kind of error, but I could not find anything related. I hope someone can shed some light on this or point me to a previous post, if any.

Thanks!
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Re: SCD Stage and sparse lookup to Oracle connector

Post by SURA »

I am not sure about the lookup, but I have read that when you have a huge volume of data you can increase the number of nodes and do a hash partition.

In my experience, clear the partitioning in the previous stage and then do the hash partition in the SCD stage itself. It may work!
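
Increasing the nodes means pointing APT_CONFIG_FILE at a configuration file that defines more nodes. A minimal sketch of a two-node file, just to show the shape (the hostname and paths are placeholders; extend the same pattern for more nodes):

    {
        node "node1"
        {
            fastname "etlhost"
            pools ""
            resource disk "/ds/data/node1" {pools ""}
            resource scratchdisk "/ds/scratch/node1" {pools ""}
        }
        node "node2"
        {
            fastname "etlhost"
            pools ""
            resource disk "/ds/data/node2" {pools ""}
            resource scratchdisk "/ds/scratch/node2" {pools ""}
        }
    }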

Give it a go.

DS User
msukkar
Participant
Posts: 7
Joined: Wed Aug 03, 2011 1:30 pm

Post by msukkar »

Thanks, SURA, for replying. We are running on a 4-node AIX configuration.

So do you mean hash partitioning on both the stream and the reference links, or just the stream, or just the reference?

I tried hash partitioning on the stream before and it did not help. When I tried it on the reference, I got a runtime warning that it had to repartition another way; I assume it repartitions as Entire. I then tried Entire partitioning on the reference explicitly, and that did not help either, though it produced no warning.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

SURA means that both inputs should be identically hash partitioned. You can't prevent the alert about being unable to use shared memory; this can legitimately be demoted with a message handler.
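
In the stage, that translates to the same Partitioning settings on both input links of the SCD stage, for example (the key name is a placeholder):

    Stream link:    Partitioning = Hash, key = CUST_ID
    Reference link: Partitioning = Hash, key = CUST_ID   (same keys, in the same order)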
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.