Dataset Management Utility Doesn't Work

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

ds2000
Premium Member
Posts: 109
Joined: Sun Apr 22, 2007 7:25 pm
Location: ny

Post by ds2000 »

When I try to view a dataset with the Dataset Management utility in Designer, I point it at the data file path given in the config file and get the following error message: "This is not a valid dataset file or format is not currently supported."

How can I run orchadmin from a Windows workstation?
throbinson
Charter Member
Posts: 299
Joined: Wed Nov 13, 2002 5:38 pm
Location: USA

Post by throbinson »

The data file path in the config file? That is not correct. You need to point to the descriptor file. This is the file that contains the config file paths to the resource disks for the dataset. The descriptor file gives orchadmin on the command line, or the Dataset Management tool in Designer, the information needed to view the data, schema, config file, etc. The descriptor file is the filename and path that was defined in the dataset stage of the job that created the dataset.
ds2000
Premium Member
Posts: 109
Joined: Sun Apr 22, 2007 7:25 pm
Location: ny

Post by ds2000 »

throbinson:
When I pointed to the header file, I was able to view the data.

However, I have the following scenario:
In our jobs we create datasets on an external file server
(e.g. \\fileserver\dev\proj\work\test.ds), but in the config file the admin has set the dataset paths as below:

node "node1"
{
fastname "DSS01"
pools "" "node1" "DSS01"
resource disk "d:/DSS01/apps/Ascential/DataStage/Datasets" { pools "" }
resource scratchdisk "d:/DSS01/apps/Ascential/DataStage/Scratch" { pools "" }
}

node "node2"
{
fastname "DSS01"
pools "" "node2" "DSS01"
resource disk "d:/DSS01/apps/Ascential/DataStage/Datasets" { pools "" }
resource scratchdisk "d:/DSS01/apps/Ascential/DataStage/Scratch" { pools "" }
}


Does the admin need to change the config file and put the file server path in it?
throbinson
Charter Member
Posts: 299
Joined: Wed Nov 13, 2002 5:38 pm
Location: USA

Post by throbinson »

Yes. The path where you want the data to go is the path you'll need to put into the resource disk entry of each node in the config file. You've got a 2-node SMP configuration in which both nodes currently write to the same place. As long as there is no contention when both nodes write to the same physical disk, this shouldn't be a concern. But if the two nodes do contend for the same disk, can you write to two different file server locations to get the maximum parallel performance? Another observation: where is your file server physically located? What's the network hop from DSS01 to it? Is it significant? That might affect your read/write performance from/to DSS01. Just some stuff to think about.
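As a rough illustration of the suggestion above, the node entries could give each node its own resource disk location. This is only a sketch: the two file server directories are hypothetical names, and whether the engine on DSS01 can reach a file server path written this way depends on the installation. The scratchdisk entries are copied unchanged from the config posted earlier, and the /* */ comments are there only as annotation.

```
/* Sketch only: both file server directories below are hypothetical. */
node "node1"
{
    fastname "DSS01"
    pools "" "node1" "DSS01"
    /* hypothetical location 1 on the file server */
    resource disk "//fileserver/dev/proj/datasets1" { pools "" }
    resource scratchdisk "d:/DSS01/apps/Ascential/DataStage/Scratch" { pools "" }
}

node "node2"
{
    fastname "DSS01"
    pools "" "node2" "DSS01"
    /* hypothetical location 2 on the file server */
    resource disk "//fileserver/dev/proj/datasets2" { pools "" }
    resource scratchdisk "d:/DSS01/apps/Ascential/DataStage/Scratch" { pools "" }
}
```

With entries like these, the data files for each partition would land under that node's resource disk, while the descriptor file stays at whatever path the job's dataset stage specifies. Two directories only help parallel throughput if they sit on separate physical disks.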
ds2000
Premium Member
Posts: 109
Joined: Sun Apr 22, 2007 7:25 pm
Location: ny

Post by ds2000 »

A question about your question:
"Can you write to two different file server locations to get the maximum parallel performance?"

Can the two locations be on the same server, pointing to different folders?

What are the best practices to follow when defining the resource disk and scratch disk configuration in the config file?
Must the descriptor (header) file be local on the DataStage server, with the data files somewhere else?
ds2000
Premium Member
Posts: 109
Joined: Sun Apr 22, 2007 7:25 pm
Location: ny

Post by ds2000 »

Do the resource disks need to be located on two boxes, or can they be two different folders on the same box? And does that box have to be a DataStage server?
throbinson
Charter Member
Posts: 299
Joined: Wed Nov 13, 2002 5:38 pm
Location: USA

Post by throbinson »

It doesn't have to be the same machine as the one where the DataStage Engine is located, and the nodes don't all have to be on the same box. Resource disk file systems can be anywhere the DataStage Engine can reach, in any combination. Should they be just anywhere? That all depends.