Metadata and Dataset

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

wahi80
Participant
Posts: 214
Joined: Thu Feb 07, 2008 4:37 pm

Post by wahi80 »

Hi,

Suppose a dataset is created in Job1 with 20 columns.

In Job2 we read from this dataset, but the metadata is defined for only 10 columns; the remaining 10 columns are never defined in Job2.

The job executes successfully, but will data integrity be maintained? Currently the data looks fine, but I'm not sure about large data loads.

Anyone faced similar issues?

What if I have a Job3 and I define only 5 columns out of 20 in metadata?

Regards
Ankur
nagarjuna
Premium Member
Posts: 533
Joined: Fri Jun 27, 2008 9:11 pm
Location: Chicago

Post by nagarjuna »

Hi,
When you create a dataset, its schema is stored in the descriptor file. In your case, you created the dataset with 20 columns. In another job, even though you specify only 5 columns, you are still able to view the data. But if you change any data type, you won't be able to read the data.
Nag
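
The behaviour described above can be pictured with a small Python model. This is only a sketch, not DataStage code: `Column`, `SchemaMismatch` and `read_subset` are hypothetical names, and the strict type check simply mirrors the claim in the post (the real engine may behave differently, for example by attempting a conversion). It shows columns being matched by name against the stored schema, so undefined columns are skipped and the stored rows are left untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    dtype: type


class SchemaMismatch(Exception):
    """Raised when a requested column is missing or has a different type."""


def read_subset(stored_schema, rows, requested):
    """Project rows onto the requested columns, matching by name.

    stored_schema: columns recorded when the dataset was written (hypothetical).
    rows: list of dicts keyed by column name.
    requested: columns the downstream job defines.
    """
    stored = {col.name: col.dtype for col in stored_schema}
    for col in requested:
        if col.name not in stored:
            raise SchemaMismatch(f"column {col.name!r} is not in the dataset")
        if stored[col.name] is not col.dtype:
            raise SchemaMismatch(
                f"column {col.name!r} is stored as {stored[col.name].__name__}, "
                f"requested as {col.dtype.__name__}"
            )
    names = [col.name for col in requested]
    # Build new dicts; the stored rows themselves are never modified.
    return [{name: row[name] for name in names} for row in rows]


# Job1 writes three columns; Job2 defines only two of them.
schema = [Column("id", int), Column("name", str), Column("amount", float)]
data = [
    {"id": 1, "name": "a", "amount": 1.5},
    {"id": 2, "name": "b", "amount": 2.5},
]
subset = read_subset(schema, data, [Column("id", int), Column("amount", float)])
```

Requesting `Column("id", str)` instead would raise `SchemaMismatch` in this model, which corresponds to the changed-data-type case mentioned in the post.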
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Is Runtime Column Propagation enabled? What (precisely) do you mean by "data integrity" here?

In no stage type (except Sequential File stage*) do you need to read all columns from source.

* Even in Sequential File stage there is a column property "drop on import" available. But you still have to read every byte in the file to get to the next.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
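
The footnote about sequential files can be illustrated with a rough Python analogy. It is not how the Sequential File stage is implemented; `read_keeping` and the sample data are made up for illustration. The point is that with a delimited text file every line still has to be parsed in full, and the unwanted fields can only be discarded after they have been read.

```python
import csv
import io


def read_keeping(text, keep):
    """Parse delimited text, keeping only the columns named in `keep`.

    Every line is still parsed in full; unwanted fields are discarded
    only after they have been read.
    """
    reader = csv.DictReader(io.StringIO(text))
    return [{name: row[name] for name in keep} for row in reader]


# Hypothetical three-column file; only two columns are wanted downstream.
sample = "id,name,amount\n1,a,1.5\n2,b,2.5\n"
kept = read_keeping(sample, ["id", "amount"])
```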
wahi80
Participant
Posts: 214
Joined: Thu Feb 07, 2008 4:37 pm

Post by wahi80 »

Runtime Column Propagation is not enabled. By data integrity I meant whether the data could get corrupted.
Last edited by wahi80 on Mon Jun 08, 2009 8:46 am, edited 1 time in total.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Nothing will happen to data unless YOU program DataStage to make those changes.
ajay.vaidyanathan
Participant
Posts: 53
Joined: Fri Apr 18, 2008 8:13 am
Location: United States

Post by ajay.vaidyanathan »

Hi,

As Nagarjuna rightly said, unless you change the metadata of your dataset in the first job, you can go ahead and read only the columns you need in the subsequent jobs.
Regards
Ajay
sjfearnside
Premium Member
Posts: 278
Joined: Wed Oct 03, 2007 8:45 am

Post by sjfearnside »

If the source data you are reading has mandatory columns, i.e. columns that must hold a valid value, and you drop one or more of them, you have a potential data integrity problem.
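
One way to catch this before it bites is to compare the columns a job keeps against the source's nullability. The sketch below is a generic Python check, not a DataStage feature; `dropped_mandatory` and the `source` mapping of column name to nullable flag are assumptions made for illustration.

```python
def dropped_mandatory(source_columns, kept_names):
    """Return the names of non-nullable source columns that are not kept.

    source_columns: mapping of column name -> nullable flag (hypothetical).
    kept_names: column names defined in the downstream job.
    """
    kept = set(kept_names)
    return sorted(
        name
        for name, nullable in source_columns.items()
        if not nullable and name not in kept
    )


# "id" and "amount" are mandatory; the downstream job keeps "id" and "name".
source = {"id": False, "name": True, "amount": False}
missing = dropped_mandatory(source, ["id", "name"])
```

An empty result means no mandatory column was dropped; anything else is worth reviewing before the data is loaded into a target that requires those columns.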