SCD Stage and sparse lookup to Oracle connector

msukkar
Participant
Posts: 7
Joined: Wed Aug 03, 2011 1:30 pm

SCD Stage and sparse lookup to Oracle connector

Post by msukkar »

Is it possible to use a sparse lookup as the reference link to an SCD stage? If so, what configuration is needed?

The volumes we eventually expect are ~20 million rows on both the stream and the reference. Right now the job is failing when the stream is only ~1 million rows and the reference is ~12 million. The SCD stage consumes all available memory when using a normal lookup to Oracle, and also when we replace the connector with a dataset containing the same reference data.

We would like to see whether a sparse lookup will at least get the job to complete until we revisit the whole design. When I changed the lookup type to Sparse, the job compiled successfully, but at run time it aborts right away with this message:

main_program: Data set, file set, or file "TABLE_NAME_Ref_Tab:StagingRef.v": Non-pipe (or non-virtual data set) with .v in its name

TABLE_NAME_Ref_Tab is the Oracle Connector stage name.
StagingRef is the name of the reference link going directly into the SCD stage.
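
For context, a sparse lookup in the connector is driven by a user-defined SELECT that binds the key from the input link through the ORCHESTRATE prefix. A minimal sketch of what such a statement looks like (table and column names are placeholders, not our real ones):

    SELECT CUST_SK, CUST_NAME, EFF_DATE, EXP_DATE
    FROM   DW.CUSTOMER_DIM
    WHERE  CUST_ID = ORCHESTRATE.CUST_ID

One query is issued per incoming stream row, so there is no in-memory reference table, which is why we hoped it would avoid the memory problem.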

When I switch back to a normal lookup, recompile and rerun, the job proceeds normally (until it eventually fails as described above).

One other odd thing happens when I use Sparse: the derivations on the Lookup tab of the SCD stage (fast path 2 of 5) disappear after the job is compiled.

I searched the forum for this kind of error, but I could not find anything related. I hope someone can shed some light on this or point me to a previous post, if any.

Thanks!
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Re: SCD Stage and sparse lookup to Oracle connector

Post by SURA »

I am not sure about the lookup, but I have read that when you have a huge volume of data you can increase the number of nodes and do a hash partition.

In my experience, clear the partitioning in the previous stage and then do the hash partition in the SCD stage itself. It may work!
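
Increasing the nodes means pointing APT_CONFIG_FILE at a configuration file that defines more nodes. A minimal sketch of a two-node file, just to show the shape (the hostname and paths are placeholders; extend the same pattern for more nodes):

    {
        node "node1"
        {
            fastname "etlhost"
            pools ""
            resource disk "/ds/data/node1" {pools ""}
            resource scratchdisk "/ds/scratch/node1" {pools ""}
        }
        node "node2"
        {
            fastname "etlhost"
            pools ""
            resource disk "/ds/data/node2" {pools ""}
            resource scratchdisk "/ds/scratch/node2" {pools ""}
        }
    }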

Give it a go.

DS User
msukkar
Participant
Posts: 7
Joined: Wed Aug 03, 2011 1:30 pm

Post by msukkar »

Thanks, SURA, for replying. We are running on a 4-node AIX configuration.

So do you mean hash partitioning on both the stream and the reference links, or just the stream, or just the reference?

I tried hash partitioning on the stream before and it did not help. When I tried it on the reference, I got a runtime warning that it had to repartition another way; I assume it repartitions as Entire. I then tried Entire partitioning on the reference explicitly, and that did not help either, though it produced no warning.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

SURA means that both inputs should be identically hash partitioned. You can't prevent the alert about being unable to use shared memory; this can legitimately be demoted with a message handler.
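
In the stage, that translates to the same Partitioning settings on both input links of the SCD stage, for example (the key name is a placeholder):

    Stream link:    Partitioning = Hash, key = CUST_ID
    Reference link: Partitioning = Hash, key = CUST_ID   (same keys, in the same order)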
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.