change data capture - How can I preserve certain fields?

pibe86 · Post by **pibe86** » Thu Mar 31, 2016 3:07 pm

Mike · Post by **Mike** » Thu Mar 31, 2016 4:17 pm

Use the join stage with a full outer join. You will have access to all of the columns from the left link as well as all of the columns from the right link. Keep the ones that you want.
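To make the idea concrete outside of DataStage, here is a minimal plain-Python sketch of a full outer join that exposes every column from both links so you can keep the ones you want. The sample rows, the column names (`id`, `desc`, `value`), the `left_`/`right_` prefixes and the helper name `full_outer_join` are all assumptions for illustration, not anything the Join stage generates.

```python
# Hypothetical sample data; column names are assumed for illustration.
left = [
    {"id": 1, "desc": "a", "value": 10},
    {"id": 2, "desc": "b", "value": 20},
]
right = [
    {"id": 2, "desc": "b", "value": 25},
    {"id": 3, "desc": "c", "value": 30},
]


def full_outer_join(left_rows, right_rows, key, columns):
    """Return one row per key found on either side, with both sides' columns."""
    left_by_key = {row[key]: row for row in left_rows}
    right_by_key = {row[key]: row for row in right_rows}
    joined = []
    for k in sorted(set(left_by_key) | set(right_by_key)):
        l_row = left_by_key.get(k, {})
        r_row = right_by_key.get(k, {})
        out = {key: k}
        for col in columns:
            # A missing side yields None, like a NULL from an outer join.
            out["left_" + col] = l_row.get(col)
            out["right_" + col] = r_row.get(col)
        joined.append(out)
    return joined


joined = full_outer_join(left, right, "id", ["desc", "value"])

# Keep only the columns you care about, e.g. the left-hand value.
kept = [{"id": row["id"], "value": row["left_value"]} for row in joined]
```

Rows that exist on only one side come back with `None` in the other side's columns, so whatever logic picks the columns to keep has to allow for that.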

Mike

pibe86 · Post by **pibe86** » Fri Apr 01, 2016 9:06 am

Thanks. I tried it, but it didn't work. I am getting random results: sometimes the output shows the table 1 values, other times the table 2 values.

chulett · Post by **chulett** » Fri Apr 01, 2016 9:17 am

Make sure you run your job on a single node; that should take any 'randomness' out of the picture.

UCDI · Post by **UCDI** » Mon Apr 04, 2016 9:29 am

A full outer join run sequentially sounds slow, but it would work.

If these are large files, you could rename the column in file2 (to value2, for example) so that a regular inner join produces (id, desc, value, value2) in the output, preserving both values. If you then partition both inputs on the join keys, the job can run in parallel as well.
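Roughly, in plain Python rather than DataStage, the rename-then-join idea with key-based partitioning might look like the sketch below. The sample rows, the two-partition count and the helper names (`rename_column`, `inner_join`, `partition_by_key`) are assumptions for illustration. The point is that when both inputs are split on the same key, matching rows always land in the same partition, so joining each partition separately gives the same rows as joining everything at once.

```python
# Hypothetical sample data; column names are assumed for illustration.
file1 = [
    {"id": 1, "desc": "a", "value": 10},
    {"id": 2, "desc": "b", "value": 20},
    {"id": 3, "desc": "c", "value": 30},
]
file2 = [
    {"id": 2, "desc": "b", "value": 25},
    {"id": 3, "desc": "c", "value": 30},
    {"id": 4, "desc": "d", "value": 40},
]


def rename_column(rows, old, new):
    """Return copies of the rows with one column renamed."""
    return [{(new if c == old else c): v for c, v in row.items()} for row in rows]


def inner_join(left_rows, right_rows, key):
    """Keep left rows that have a match and append the right side's value2."""
    right_by_key = {row[key]: row for row in right_rows}
    out = []
    for row in left_rows:
        match = right_by_key.get(row[key])
        if match is not None:
            out.append({**row, "value2": match["value2"]})
    return out


def partition_by_key(rows, key, nodes):
    """Split rows so that equal keys always go to the same partition."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[hash(row[key]) % nodes].append(row)
    return parts


file2_renamed = rename_column(file2, "value", "value2")

# Sequential join over the whole input.
sequential = inner_join(file1, file2_renamed, "id")

# Same join done per partition, as a parallel job would do it.
nodes = 2
parts1 = partition_by_key(file1, "id", nodes)
parts2 = partition_by_key(file2_renamed, "id", nodes)
parallel = []
for p1, p2 in zip(parts1, parts2):
    parallel.extend(inner_join(p1, p2, "id"))
parallel.sort(key=lambda row: row["id"])
```

If the two inputs were partitioned differently (or not on the key at all), a row and its match could end up on different nodes and the join would miss it, which is one way to get results that look random from run to run.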

A transformer on the back end could then decide which column to keep (a custom merge of value and value2), or apply whatever handling you actually want.
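As a sketch of what that decision could look like, here it is in plain Python rather than as a Transformer derivation. The rule shown (keep `value` unless it is missing, otherwise fall back to `value2`) is purely an assumed example, as are the sample rows and the function name `choose_value`; substitute whatever rule fits your data.

```python
def choose_value(row):
    """Hypothetical rule: keep the file1 value unless it is missing."""
    if row["value"] is not None:
        return row["value"]
    return row["value2"]


# Hypothetical joined rows carrying both candidate columns.
rows = [
    {"id": 2, "desc": "b", "value": 20, "value2": 25},
    {"id": 3, "desc": "c", "value": None, "value2": 30},
]

# Collapse value/value2 back into a single output column.
merged = [
    {"id": row["id"], "desc": row["desc"], "value": choose_value(row)}
    for row in rows
]
```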

Or it may simply be that you need to swap the "left" and "right" links in the job as you already have it and try again.