Page 1 of 1

Compare fields in two different flat files

Posted: Mon Dec 02, 2013 4:37 am
by Arpanakhade2009
We need to compare two different files as below and output should be the record no. and the field name which does not match with the before file.
Also deleted records from before file need to be captured.

All fields are Nvarchar & we cannot sort the fields based on key field col1 as sequence is important and comparison should be done in the sequence order only.

before file:
col1, col2, col3
1,2,3
4,5,6
7,8,9
1,1,1
2,2,2
3,3,3

after file:
col1, col2, col3
1,2,3
4,9,6
7,8,8
5,2,2
3,3,3
4,4,4

expected o/p

record 2 , col2 , update
record 3, col3, update
record 4, col1, delete
record 4, col2, delete
record 4, col3, delete

Tried Join, Difference but no luck. Also tried introducing a new col surr key but then will need to make a new join for every field. Or are we left with use of transformer.

Please suggest a design logic.

Posted: Mon Dec 02, 2013 5:01 am
by ray.wurlod
What does "no luck" mean?

Have you tried Change Capture stage?

Posted: Mon Dec 02, 2013 5:04 am
by Arpanakhade2009
Have tried that but change capture did not give the expected results.

Posted: Mon Dec 02, 2013 7:49 am
by chulett
Well... how did you set up the stage? What results did you get?

Posted: Mon Dec 02, 2013 2:51 pm
by ray.wurlod
How are you partitioning your data? Does it work (whatever that means) if you run on a single node configuration?

Posted: Tue Dec 03, 2013 10:42 am
by bart12872
you are missing information,

you should create a surrogate key in the before file based on the line number
it will be your key

but in your exemple, why do you say "record 4, col1, delete"
I see it as an update
record 4 before 1,1,1 after 5,2,2
the output could be
record 4, col1, update
record 4, col2, update
record 4, col3, update


Let's say you have
before file:
col1, col2, col3
1,2,3
4,5,6
7,8,9
1,1,1
2,2,2
3,3,3

after file:
col1, col2, col3
1,2,3
4,5,6
7,8,9
5,2,2
1,1,1
2,2,2
3,3,3

how do you know that 1,1,1 is an update or a delete