Information on stages

soumya5891 · Post by **soumya5891** » Wed Feb 11, 2015 11:23 am

I need to identify cross-flow dataset dependencies, as in the following scenario.

Assume a sequence job S1 that contains two parallel jobs, P11 and P12, and another sequence S2 that contains two parallel jobs, P21 and P22.

Now assume a dataset, ds1, is written in job P11 and is then read directly in P21. In this case ds1 is created in one flow and used in another. I need to find all datasets of this type. Can you please suggest the best way to do this?
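To make the question concrete, here is a minimal Python sketch of the check being asked for, assuming the job-to-dataset usage is already known. The names `cross_flow_datasets`, `flow_jobs` and `job_datasets` are hypothetical, and `ds2` and `ds3` are made-up datasets added only to show that single-flow datasets are filtered out.

```python
from collections import defaultdict


def cross_flow_datasets(flow_jobs, job_datasets):
    """Return {dataset: [flows]} for datasets touched by jobs in more than one flow."""
    flows_by_dataset = defaultdict(set)
    for flow, jobs in flow_jobs.items():
        for job in jobs:
            # Jobs with no recorded datasets are simply skipped.
            for dataset in job_datasets.get(job, ()):
                flows_by_dataset[dataset].add(flow)
    return {
        dataset: sorted(flows)
        for dataset, flows in flows_by_dataset.items()
        if len(flows) > 1
    }


# The scenario from the post: S1 runs P11 and P12, S2 runs P21 and P22.
flow_jobs = {"S1": ["P11", "P12"], "S2": ["P21", "P22"]}

# ds1 is from the post; ds2 and ds3 are hypothetical fillers.
job_datasets = {
    "P11": {"ds1"},
    "P12": {"ds2"},
    "P21": {"ds1"},
    "P22": {"ds3"},
}

print(cross_flow_datasets(flow_jobs, job_datasets))  # {'ds1': ['S1', 'S2']}
```

This sketch does not distinguish writers from readers; it only reports datasets that appear in more than one flow.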

ShaneMuir · Post by **ShaneMuir** » Wed Feb 11, 2015 11:48 am

I don't suppose that you have saved the table definitions of the datasets and imported them into the DataSet stage each time a job was created? If yes then you can just do a where used on the table definition.

Otherwise, it would most likely be some sort of export to DSX and a search for the jobs using the specific dataset name (provided it's not parameterised).
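The DSX search could be sketched roughly as below: a single pass over the export that records which dataset names appear inside each job. This is only an illustration under several assumptions that should be checked against a real export: that each job is wrapped in `BEGIN DSJOB` / `END DSJOB` lines, that the first `Identifier "..."` line inside that block is the job name, and that dataset file names end in `.ds`. The function name `datasets_by_job` is hypothetical, parameterised names are reported as written rather than resolved, and the scan does not tell readers from writers.

```python
import re
from collections import defaultdict

JOB_START = "BEGIN DSJOB"  # assumed job block opener in the DSX export
JOB_END = "END DSJOB"      # assumed job block closer
IDENT_RE = re.compile(r'^\s*Identifier\s+"([^"]+)"')
# Assumption: dataset descriptor files end in ".ds" (".dsx" is not matched).
DATASET_RE = re.compile(r'[\w./#$-]+\.ds\b')


def datasets_by_job(lines):
    """Scan DSX lines once and return {job_name: set of dataset names}."""
    usage = defaultdict(set)
    in_job = False
    job = None
    for line in lines:
        stripped = line.strip()
        if stripped == JOB_START:
            in_job, job = True, None
        elif stripped == JOB_END:
            in_job, job = False, None
        elif in_job:
            if job is None:
                # The first Identifier inside the block is taken as the job name.
                match = IDENT_RE.match(line)
                if match:
                    job = match.group(1)
                continue
            for name in DATASET_RE.findall(line):
                usage[job].add(name)
    return dict(usage)


# Example use (the file name is hypothetical):
# with open("project_export.dsx", errors="replace") as handle:
#     usage = datasets_by_job(handle)
```

Because the file is read line by line and never held in memory as a whole, a large export should cost one sequential read. The resulting mapping can then be grouped by sequence to see which dataset names occur in jobs from different flows.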

soumya5891 · Post by **soumya5891** » Wed Feb 11, 2015 11:59 am

Thanks a lot for the reply. You are right, I am not saving the table definitions of the datasets.

I missed part of the requirement, sorry about that. My input is the sequence job name (S2), and from that I need to get the result.

I have tried some shell scripting against the full .dsx export, but because the DSX file is large, it takes a long time to produce the result.

ray.wurlod · Post by **ray.wurlod** » Wed Feb 11, 2015 3:28 pm

The best way is to use Metadata Workbench. It can automatically recognise the situation you describe and report on it.