Data Masking using parallel jobs
Moderators: chulett, rschirm, roy
Hi,
Could anyone please explain briefly how masking is done using parallel jobs in DataStage 7.5?
What should a typical masking job look like?
Thanks,
DS.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
A data masking solution for DataStage is due in late 2009. IBM Professional Services have already built one using routines, but they do not share it unless you engage them directly. The main question is whether you really need data masking: most sites secure the ETL server and don't need to mask the data on it.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Masking
ray.wurlod wrote:Welcome aboard.
Can you be more specific about what you mean by "masking"? ...
Masking is just like decoding or encrypting the original records in a table into other values, which may be selected from other files.
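The idea of replacing original values with values selected from another source can be sketched in a few lines. This is only an illustration in Python, not a DataStage routine: the replacement list, the `substitute` function and the sample names are all hypothetical, and in a real job the replacement values might be read from a file.

```python
import hashlib
from typing import Sequence

# Hypothetical replacement values; in practice these might come from a lookup file.
REPLACEMENT_NAMES = ["Alex", "Blair", "Casey", "Drew", "Ellis"]


def substitute(value: str, replacements: Sequence[str]) -> str:
    """Pick a replacement for value; the same input always gets the same replacement."""
    # Hash the original value and use the hash to choose an index into the list.
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(replacements)
    return replacements[index]


originals = ["Smith", "Jones", "Smith"]
masked = [substitute(name, REPLACEMENT_NAMES) for name in originals]
print(masked)
```

Because the choice depends only on the input value, repeated values are masked consistently, but different originals can share a replacement, and an unkeyed hash like this one offers no protection against someone who can guess the original values.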
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
You need to find out what needs to be masked, why it needs to be masked and how robust that masking needs to be. Credit card numbers, for example, need 106 bit encryption or masking that DataStage cannot do, whereas customer names could just be scrambled with an external C++ routine. There are several options: applications that encrypt outside DataStage; masking functions for the Oracle database that you could apply on load; a robust masking solution from IBM, the Optim product, that protects test data; the encode and decode stages; and Java calls and web service calls to third-party encryption and hashing products. If you mask too early you can't transform the data. If unmasked data lands on the DataStage server, it is already exposed and no amount of masking will undo that.
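To make the difference between "scrambling" a name and masking a card number concrete, here is a minimal sketch in Python. It is not one of the products or routines mentioned above and it is not encryption: `pseudonymize`, `mask_card_number` and `SECRET_KEY` are hypothetical names, and the key shown is a placeholder.

```python
import hashlib
import hmac

# Hypothetical placeholder key; a real key would be stored and managed securely.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace a value with a keyed hash token (one-way, same input gives same token)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def mask_card_number(card_number: str, visible: int = 4) -> str:
    """Replace every digit except the last `visible` digits with 'X', keeping separators."""
    total_digits = sum(1 for c in card_number if c.isdigit())
    keep_from = total_digits - visible
    out = []
    seen = 0
    for c in card_number:
        if c.isdigit():
            out.append(c if seen >= keep_from else "X")
            seen += 1
        else:
            out.append(c)
    return "".join(out)


print(pseudonymize("Jane Smith"))
print(mask_card_number("4111 1111 1111 1111"))  # XXXX XXXX XXXX 1111
```

A keyed token is the same for the same input, so masked columns can still be used as join or lookup keys later in a job, whereas any transformation that needs the original value has to happen before the mask is applied.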
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
vmcburney wrote:You need to find out what needs to be masked, why it needs to be masked and how robust that masking needs to be. Credit card numbers, for example, need 106 bit encryption or masking that DataStage cannot do, whereas customer names could just be scrambled with an external C++ routine. There are several options: applications that encrypt outside DataStage; masking functions for the Oracle database that you could apply on load; a robust masking solution from IBM, the Optim product, that protects test data; the encode and decode stages; and Java calls and web service calls to third-party encryption and hashing products. If you mask too early you can't transform the data. If unmasked data lands on the DataStage server, it is already exposed and no amount of masking will undo that.
Thank you,
I understand the idea. I am working on developing masking jobs and hope to find more info.
-
- Participant
- Posts: 64
- Joined: Tue Sep 23, 2008 9:54 am
Data Obfuscation
[quote="ray.wurlod"]Others also have data obfuscation routines available. But is that what you mean by "masking"? ...[/quote]
Where can I find the Data Obfuscation routines?
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
The RoadMap presentation at the IOD 2008 conference mentioned that this is functionality "they" (IBM) are looking at including in a future release. Possibly as early as 8.2, but no promises.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
Right now probably the easiest way to do it is on the way out of the database, using the data masking functions now available in the latest databases, with user-defined SQL or a stored procedure as the source. IBM services offer some DataStage data masking functions, but I've never seen them and I don't know how they are implemented; possibly as routines of some type, C or BASIC. You would still need to apply them to every column. You could also have a look at Optim from IBM, a product that specialises in data masking, which could be used before the data reaches DataStage.
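As a rough illustration of masking "on the way out of the database", the sketch below runs a masking SELECT against an in-memory SQLite database from Python. The `customer` table, its columns and the sample rows are made up for the example, and it uses plain string functions rather than any vendor's built-in masking functions; in a DataStage job, a query of this general shape would be the user-defined SQL on the source stage.

```python
import sqlite3

# Hypothetical source table with sample data, held in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, card_number TEXT)"
)
conn.executemany(
    "INSERT INTO customer (name, card_number) VALUES (?, ?)",
    [("Jane Smith", "4111111111111111"), ("Raj Patel", "5500000000000004")],
)

# The masking happens inside the SELECT, so unmasked values never leave the database.
MASKED_EXTRACT_SQL = """
SELECT id,
       substr(name, 1, 1) || '***'               AS name,
       'XXXXXXXXXXXX' || substr(card_number, -4) AS card_number
FROM customer
ORDER BY id
"""

rows = conn.execute(MASKED_EXTRACT_SQL).fetchall()
for row in rows:
    print(row)
# (1, 'J***', 'XXXXXXXXXXXX1111')
# (2, 'R***', 'XXXXXXXXXXXX0004')
```

Every column that needs protection has to be masked explicitly in the query, which matches the point above that masking still has to be applied column by column.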
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn