Data Masking Pack

Posted: Wed Mar 27, 2013 3:34 pm
by suryadev
Hello,

We are using the Data Masking Pack in version 8.7 of DataStage. The job with the masking stage takes far too long, about 20 minutes for 100K records; without the masking stage it takes only 2 minutes.

What is the usual runtime for 100K records with only 5 or 6 stages?

Do we need to do anything to increase the performance?
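
For reference, the run times quoted in this post can be turned into throughput figures. The snippet below is only a small sketch of that arithmetic; the function name is illustrative and nothing in it is specific to DataStage.

```python
def rows_per_second(rows: int, minutes: float) -> float:
    """Average throughput for a job that processed `rows` in `minutes`."""
    return rows / (minutes * 60)

# Figures quoted in the post: 100K records, 20 min with masking, 2 min without.
with_masking = rows_per_second(100_000, 20)
without_masking = rows_per_second(100_000, 2)
slowdown = without_masking / with_masking

print(f"with masking:    {with_masking:.0f} rows/s")
print(f"without masking: {without_masking:.0f} rows/s")
print(f"slowdown factor: {slowdown:.0f}x")
```

That is roughly 83 rows per second with the masking stage against roughly 833 without it, a tenfold slowdown.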

Posted: Wed Mar 27, 2013 3:38 pm
by ray.wurlod
How long does it take Optim alone to perform this masking?

DataStage data masking acts as a wrapper that processes the data and invokes Optim data masking. So you'd expect some overhead but probably not a lot.

How complex are the masking algorithms? Keep in mind that you can require that the statistical relationships are preserved in the data, which carries quite a large overhead in its own right.

Posted: Thu Mar 28, 2013 8:26 am
by suryadev
Hi Ray,

We were not using Optim for the masking. Before this pack, we did most of the masking within the DataStage stages.

But now that we have the Data Masking Pack we are using it, and we see this kind of performance.

Also, we are using the repeatable replace function in the Data Masking Pack for only 3 fields.
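
To illustrate what "repeatable" means here (the same input value always maps to the same masked value), below is a minimal Python sketch of the general idea. It is not how the Data Masking Pack or Optim implements the function; the key, the replacement pool, and the function name are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret and replacement pool, for illustration only.
SECRET_KEY = b"example-key"
REPLACEMENT_NAMES = ["Alice", "Bob", "Carol", "Dave", "Erin"]

def repeatable_replace(value: str, pool: list[str], key: bytes = SECRET_KEY) -> str:
    """Map `value` to an entry of `pool`; equal inputs always give equal outputs."""
    # A keyed hash makes the mapping deterministic but hard to reverse without the key.
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]

masked_first = repeatable_replace("John Smith", REPLACEMENT_NAMES)
masked_again = repeatable_replace("John Smith", REPLACEMENT_NAMES)
print(masked_first == masked_again)  # prints True
```

A keyed hash like this costs very little per row, so a simple deterministic lookup on 3 fields would not normally be expected to dominate run time by itself; whether that holds for the pack's own implementation is a separate question.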

Masking Performance

Posted: Thu Mar 28, 2013 8:27 am
by jisantangelo
You may want to consider using DataStage as just the ETL portion of your masking. You can use dedicated masking software, such as DMsuite, which will allow you to perform masking at about 1 million rows per minute and does not require any coding on your part.

Posted: Thu Mar 28, 2013 8:39 am
by suryadev
Hi Jisan,

We looked at DMSuite and some other tools on the market, but because of our unique requirements we can only handle them in DataStage, as it is an ETL tool and includes many functions.

For the fields where we can use the Data Masking Pack, we are using it.

Posted: Thu Mar 28, 2013 9:42 am
by suryadev
Small correction: we are using data masking in 8.5, not 8.7!

Sorry for the mistake.

Also, is there any difference between the Data Masking Pack in 8.5 and 8.7?

Business Logic in Masking Routines

Posted: Mon Apr 01, 2013 4:22 pm
by jisantangelo
I would suggest that you do not place any business logic in your data masking routines. It is generally bad practice; it means that you are creating an entire application just to mask an existing application.

I offer a free consultation to come up with a solution to your requirements. We have been doing masking for 9 years now with many large multi-nationals and we deal with many ETLs including Informatica.

Posted: Tue Apr 02, 2013 10:52 am
by suryadev
Thank you Jisan.
We are not placing business logic in our data masking routines. It is just that we are taking care to implement the business requirements using DataStage.

Most likely all the requirements have been handled; the only issue is performance, where the job takes more time in 8.5 when the Data Masking Pack is used.

Please let me know if there is any performance improvement in the Data Masking Pack in the recent version 9.1, so that we can upgrade.


Thanks again:)

Posted: Tue Apr 02, 2013 11:18 am
by jwiles
I don't believe there are changes in the Data Masking Pack between IS 8.5, 8.7 and 9.1. You can examine the New Features and Release Notes for 9.1 (a link is available here) to see if there are any changes or improvements listed.

If you suspect a performance problem with the pack, I recommend that you contact your official support provider for assistance.

Regards,

Posted: Tue Apr 02, 2013 12:02 pm
by suryadev
Thanks very much to everyone:)

I will reach out to support!