Hello,
I am transitioning from the Server to the Parallel world, so bear with me...
I have two Oracle tables: one is used as the stream and the other as the reference. I need to find out which rows from the stream are NOT in the reference and output them to a target file.
I am using the Oracle connector for both stream and reference (row count on each is under 400) and I am using a Lookup stage with a reject link to capture the failed lookups. I coded for a sparse lookup in the reference Oracle connector.
Since the Lookup stage does not allow me to define only the reject link, I also have an output stream link to a flat file for the lookup hits.
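The result this design is meant to produce can be sketched outside DataStage. The following is a minimal Python illustration, not DataStage code and not a description of how the Lookup stage is implemented: the function name `lookup_with_reject`, the key column `id`, and the sample rows are all hypothetical. It assumes each stream row should end up on exactly one link, either the hits or the rejects, even if the reference holds the same key more than once.

```python
def lookup_with_reject(stream, reference, key):
    """Split stream rows into lookup hits and rejects (failed lookups).

    Each stream row is emitted exactly once, regardless of how many
    reference rows share its key (an assumption of this sketch).
    """
    # Collect the distinct key values present in the reference rows.
    reference_keys = {row[key] for row in reference}

    hits, rejects = [], []
    for row in stream:
        if row[key] in reference_keys:
            hits.append(row)      # would go to the output stream link
        else:
            rejects.append(row)   # would go to the reject link
    return hits, rejects


# Hypothetical sample data; the reference deliberately repeats id 3.
stream = [{"id": 1}, {"id": 2}, {"id": 3}]
reference = [{"id": 1}, {"id": 3}, {"id": 3}]

hits, rejects = lookup_with_reject(stream, reference, "id")
print("hits:", hits)
print("rejects:", rejects)
```

Under these assumptions the number of hits plus the number of rejects always equals the number of stream rows, which is the baseline against which the 200k+ output rows look wrong.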
What I am seeing is that the job outputs 200k+ rows to the output stream link. Looking at this data, I find many duplicates (500+) for each input stream row.
Can you please explain this behavior of the Lookup stage?
Thanks.
Marco
Lookup stage outputs duplicate rows