Compare files taken from FTP

ganesh.soundar
Participant
Posts: 9
Joined: Tue Jan 08, 2008 7:21 am
Location: Chennai

Post by ganesh.soundar »

Hi,

I have a requirement to fetch two files from an FTP server and then compute a checksum for each of them. I need to use the checksum values to determine whether the files are identical or different.

I can compute a column-level or record-level checksum using a BASIC Transformer, but I have no idea how to do this comparison for whole files. Please let me know how to do this.

Regards,
Raja
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

"Checksum" is a generic name, not a single fixed algorithm.

You need to ask your business or the team that supplies the file which calculation they mean.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Right, whoever gave you that requirement should also be able to tell you how that 'checksum' value is calculated in your organization. Hopefully it is some sort of command-line utility that is easily scripted.
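For illustration only, here is a minimal sketch of what such a scripted utility might look like. It assumes the organization accepts a standard hash (SHA-256 here, chosen arbitrarily) as its "checksum"; the function names `file_checksum`, `files_match`, and `main` are made up for this example and are not part of DataStage.

```python
import hashlib
import sys


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at end of file
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(path_a, path_b, algorithm="sha256"):
    """Return True when both files produce the same checksum."""
    return file_checksum(path_a, algorithm) == file_checksum(path_b, algorithm)


def main(argv):
    """Return 0 if the two files match, 1 if they differ (assumed convention)."""
    path_a, path_b = argv
    return 0 if files_match(path_a, path_b) else 1


# Only act as a command when exactly two file paths are supplied.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv[1:]))
```

Saved under a hypothetical name such as `compare_checksum.py`, it could be run as `python compare_checksum.py file1 file2` once both files have been fetched, with the exit code telling the caller whether they match. If the source team uses a different algorithm, the `algorithm` argument would need to change accordingly.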
-craig

"You can never have too many knives" -- Logan Nine Fingers
LNarayan
Premium Member
Posts: 23
Joined: Mon Aug 04, 2008 1:58 am

Post by LNarayan »

chulett wrote: Right, whoever gave you that requirement should also be able to tell you how that 'checksum' value is calculated in your organization. Hopefully it is some sort of command-line utility that is easily scripted.
Thanks. Apart from a checksum, is there any stage in DataStage that can identify whether the source files are identical or different?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

At the file level? Not directly, no. This is where you would leverage your operating system, which is easily incorporated into a DataStage job stream using a Sequence job.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Is there any reason you can't use the UNIX command diff to perform the comparison? This could be invoked from an Execute Command activity in a job sequence.
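If diff is not available on the server, or a script is preferred, a rough equivalent can be sketched with Python's standard `filecmp` module. This is only an illustration: the function name `compare_files` is made up for the example, and the return codes simply mirror diff's convention (0 for identical, 1 for different, 2 for trouble) on the assumption that the calling sequence branches on the command's return code.

```python
import filecmp
import sys


def compare_files(path_a, path_b):
    """Return 0 if the files are byte-for-byte identical, 1 if they differ,
    and 2 if either file cannot be read (same convention as diff)."""
    try:
        # shallow=False forces a content comparison instead of trusting
        # the size and modification time reported by os.stat()
        same = filecmp.cmp(path_a, path_b, shallow=False)
    except OSError:
        return 2
    return 0 if same else 1


# Only act as a command when exactly two file paths are supplied.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(compare_files(sys.argv[1], sys.argv[2]))
```

Unlike a checksum, this compares the actual contents, so there is no need to agree on an algorithm first.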
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.