Loop through all the records in the DB for every record.

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

Loop through all the records in the DB for every record.

Post by Minhajuddin »

Hi there,

Please help me to fix this tricky situation.

I have a sequential file. A DB stage which I used to lookup. And a target file.
I want to compare every input record with all the records in the DB stage(Which is huge 17 million). To add to the complexity I need to parse the values from an input column and compare it with the columns. I don't know how to approach this problem. Can we do it using Buildops. I tried searching for buildop in the DS documentation and didn't find much useful information in it. It would be great if you guys help me out of this.

Thank you.
Minhajuddin

<a href="http://feeds.feedburner.com/~r/MyExperi ... ~6/2"><img src="http://feeds.feedburner.com/MyExperienc ... lrow.3.gif" alt="My experiences with this DLROW" border="0"></a>
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Build stage probably won't help all that much.

What you need first is a specification. You can generate the computed comparison value in an upstream Transformer stage, before attempting the lookup/join/merge.

Is the comparison against one column in every row, or against more than one column? In the latter case you will probably need to supply the comparison value a number of times (generate multiple copies of it from the Transformer stage) so that you can match the search key columns.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

Post by Minhajuddin »

Which stage would work in this situation where I need to compare every record with all the records in the DB. And for every record I may find a dozen matches. I need to send these 12 matched records and the input source record to the output.
Minhajuddin

<a href="http://feeds.feedburner.com/~r/MyExperi ... ~6/2"><img src="http://feeds.feedburner.com/MyExperienc ... lrow.3.gif" alt="My experiences with this DLROW" border="0"></a>
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I think what you're seeking to do is a join, best accomplished with a Join stage. But I'm not sure, which I why I suggested that a specification is the first thing you need - I don't think you're sure either.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

Post by Minhajuddin »

Hi Ray,
I could have used a Join if I just had one condition to compare.
But the scenario I am in, is a little different.

I need to see if condition1 is true OR condition2 is true OR ........and so on.

I have almost 10 conditions. And as far as I know we can't use OR statements as key columns in a Join. I know it sounds a bit confusing. But the following example should make my problem clearer.

Say I have the input data in the following format.

Code: Select all


EmpID		EmpResAddress		     EmpOfficialaddr
===========================================

1		     HN 1:Bakers Street	     Of1:Hogwarts Ville
		
2		     HN 2:Share Street	      Of2:Walt Disney
		
I have similar data in my Database. Now I need to parse all the values and compare the values with all the possible permutations. Say I compare the first field in the office address (Of1) with the first field in the office address of the DB and I compare Of1 with the first field in the Home address of the DB too. And I do the same with the Residential address. So here It all gets complicated. I don't think that we can implement this with a transformer or a join. I think that BuildOp can do this. I don't know how. I need the help of all you experts in this.

Thanks a lot.
:D
Minhajuddin

<a href="http://feeds.feedburner.com/~r/MyExperi ... ~6/2"><img src="http://feeds.feedburner.com/MyExperienc ... lrow.3.gif" alt="My experiences with this DLROW" border="0"></a>
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

See what I mean? A specification makes it clearer what your intent is. A full specification would help you (though not us) even more.

Is the database Oracle or DB2? If so, you can do this using a sparse lookup in combination with user-defined SQL. Or you might pass the ten key values to a stored procedure via an ODBC Enterprise stage and have it return the result set of the SP. Or you could use user-defined SQL to construct the compound join condition.

Don't essay the Build stage route until you've eliminated all the easier possibilities; using a Build stage means that the organization will need to maintain/buy C++ expertise to maintain the code.

However, if it's addresses you're comparing, have you considered using QualityStage? With multiple match passes you could ascertain all discoverable combinations, even those with minor misspellings.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

Post by Minhajuddin »

Hi Ray,

My backend DB is Oracle. I don't know what a specification is.
Can you shed some light on it.
And since I am just a participant. I could not see the whole message.
Anyway thanks for all you help.
Minhajuddin

<a href="http://feeds.feedburner.com/~r/MyExperi ... ~6/2"><img src="http://feeds.feedburner.com/MyExperienc ... lrow.3.gif" alt="My experiences with this DLROW" border="0"></a>
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

Post by Minhajuddin »

Please Help! :shock:
Minhajuddin

<a href="http://feeds.feedburner.com/~r/MyExperi ... ~6/2"><img src="http://feeds.feedburner.com/MyExperienc ... lrow.3.gif" alt="My experiences with this DLROW" border="0"></a>
michaeld
Premium Member
Premium Member
Posts: 88
Joined: Tue Apr 04, 2006 8:42 am
Location: Toronto, Canada

Post by michaeld »

Maybe you can optimize the compare part of this task. That is, do you really need to compare with every record or can you limit it somehow?

You can try doing the parsing in SQL if it helps you get a direct join with the lookup table.

You can do a sparse lookup if you can find a way to limit you result set with an SQL WHERE clause.

I would need more details to give you a less general answer.
Mike
Post Reply