Page 1 of 1

validation on records

Posted: Mon Feb 04, 2008 2:48 am
by basu.ds
Hi Can any one help me that i want to know finding the multiple records match with single reord which is transaction record if matches with multple records in master table how should i find using server joobs.
thanks in advacnce

Posted: Mon Feb 04, 2008 3:41 am
by ray.wurlod
If there are multiple matches how can you call it a master table?

Please take the time to write out a specification (in English, not in "DataStage") about what the rules are. What is your source, where is the master table, on what basis do you match your source records, what (precisely) do you mean by a "transaction record"?

Posted: Mon Feb 04, 2008 3:50 am
by basu.ds
i know that is not called masterdata but i want know if one match with multiple records in with particular table how can i found wether particular record match with multiple record in table.assume that my source is CSV file but i want to validate with table.there is a possiblity that duplicates in my table if that is the case how can i found

Posted: Mon Feb 04, 2008 4:00 am
by ray.wurlod
There are several de-duplication techinques available. The easiest is probably to run a query against the table

Code: Select all

SELECT keycol,COUNT(*) FROM table GROUP BY keycol HAVING COUNT(*) > 1;
If you now want to compare that against the contents of some CSV file, load the SQL result into a hashed file and compare against that; any lookup that does not fail has duplicates in the target table for that key value.
A slower solution would be to do the lookup directly against the target table.

Code: Select all

SELECT keycol,COUNT(*) FROM table WHERE keycol = ? GROUP BY keycol HAVING COUNT(*) > 1;
or leave out the HAVING clause and test the returned count.

Posted: Mon Feb 04, 2008 5:19 am
by basu.ds
Using data satge i want to reject that data if multiple matc occure

Posted: Mon Feb 04, 2008 7:29 am
by sachin1
bellow job design can help.

OCI stage(1)---->transformer1--->aggregator-------> transformer2---->transformer3--->OCI stage(2).

i aggregation stage you find out duplication like

select deptno,count(deptno) from emp group by deptno.

then in transformer 3 apply condition that count > 1 rows should propagate further, just deptno propagated.

in OCI stage(2) write a user defined sql like below

delete from empsb a where rowid > (select min(rowid) from empsb b where a.deptno = b.deptno and b.deptno = :1)

hope this helps you.

Posted: Mon Feb 04, 2008 4:18 pm
by ray.wurlod
sachin1, where did you read in the specification from basu.ds that there is a need to delete anything?

"Reject records" is not the same as "delete records".

Rejecting records is simply a matter of having a constraint on a Transformer stage output link that does not allow records to pass if the count is greater than one (using my second method) or if the lookup succeeds (using my first method). If you want to capture the rejected records, simply add a reject-handling output link to the Transformer stage.

Posted: Mon Feb 04, 2008 10:25 pm
by basu.ds
hi sachin thanks for response but i want to make a look up with table.

Posted: Tue Feb 05, 2008 5:00 am
by sachin1
sorry for wrong interpretation,

i have taken an input file emp.txt with below records
1,sac,10
2,mark,10
3,pras,20
4,sush,20
6,ritu,30
7,ash,40
8,ddd,50

i have dept table with below records


SQL> select deptno from dept;

DEPTNO
----------
10
20
30
40

SQL>


see that your input record is sorted for lookup.

stage variables are used for counting number greater than 0 for matched records, then in next transformer depending upon count variable you can reject records.

Posted: Tue Feb 05, 2008 5:22 am
by basu.ds
no i know this but i want opposite to your answer.like please change your data i mean emp.txt file should be dept table and table should be emp.txt data.if one record match with multiple records in table that record shpuld reject.

Posted: Tue Feb 05, 2008 9:40 am
by sachin1
vice versa works dear !!

Posted: Tue Feb 05, 2008 3:32 pm
by ray.wurlod
Three questions for basu.ds:
1. Please post YOUR input data and lookup table.
2. What output do you require:
(a) if there is no match
(b) if there is exactly one match
(c) if there is more than one match?
3. How have you currently designed this and what's happening in your design?