Copy datasets and state files

Posted: Fri Sep 14, 2018 5:00 pm
by rumu
Hi All,

Our UAT and Prod DataStage environments are hosted on the same server. The Unix file system keeps them apart in separate directories.
The requirement is to copy datasets and surrogate key state files from the UAT directory to the Prod directory (on the same server).
Does this copy pose any risk? Is there any possibility of the datasets and state files getting corrupted?

Posted: Fri Sep 14, 2018 7:56 pm
by ray.wurlod
Provided that you use the orchadmin command (or the Data Set Management tool in Designer) to copy the Data Sets you should be fine.
Copying just the descriptor file, however, is a Bad Idea.
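
As a rough illustration of the orchadmin route, here is a minimal Python sketch that builds an `orchadmin copy` command line for one Data Set. It assumes `orchadmin` is on the PATH and that the parallel configuration file is supplied through `APT_CONFIG_FILE`; the paths and the helper names (`build_copy_command`, `copy_dataset`) are hypothetical, and the exact `orchadmin` syntax should be checked against the documentation for your release. It defaults to a dry run so nothing is executed by accident.

```python
import os
import shlex
import subprocess


def build_copy_command(src_descriptor, dst_descriptor):
    """Return the argv for copying one Data Set through orchadmin."""
    # Assumed syntax: orchadmin copy <source descriptor> <target descriptor>
    return ["orchadmin", "copy", src_descriptor, dst_descriptor]


def copy_dataset(src_descriptor, dst_descriptor, config_file, dry_run=True):
    """Copy a Data Set; with dry_run=True only return the command line."""
    cmd = build_copy_command(src_descriptor, dst_descriptor)
    if not dry_run:
        # Pass the configuration file to orchadmin through its environment.
        env = dict(os.environ, APT_CONFIG_FILE=config_file)
        subprocess.run(cmd, env=env, check=True)
    return shlex.join(cmd)


# Hypothetical paths, for illustration only.
print(copy_dataset("/data/uat/ds/customers.ds",
                   "/data/prod/ds/customers.ds",
                   "/opt/config/prod.apt"))
```

Going through orchadmin rather than `cp` matters because orchadmin copies the segment files as well as the descriptor.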

Make sure that the state file is not in use when you copy it. Why do you want to copy the state file? Ought not you to be using a different set of generated keys in the new environment? The sets of keys from the separate state files will quickly cease to be synchronized.

Posted: Mon Sep 17, 2018 6:26 am
by rumu
I am not able to view the entire content as I am not a premium member. Could you please let me know the remaining content?

Posted: Mon Sep 17, 2018 4:20 pm
by ray.wurlod
Could you please get yourself a premium membership? You have posted 200 times, so are clearly deriving a benefit.

Premium membership is the mechanism through which DSXchange is funded. It pays for the hosting and bandwidth costs. (Perhaps you've noticed the lack of advertising on the site?)

Posted: Tue Sep 18, 2018 5:53 am
by rumu
Hi Ray,

I will do so.
By the way, to copy the datasets from one directory to another, is it not sufficient to copy the descriptor files? Do the data files need to be copied as well? And for that, do we need to set up the config files?

Posted: Tue Sep 18, 2018 11:56 pm
by ray.wurlod
If you copy only the descriptor files, then both your prod and your non-prod jobs will be (over)writing the same Data Sets. Last one in wins. This is a Bad Idea.

The descriptor file contains (among other things) the pathnames of the segment (data) files making up the Data Set. A copy of the descriptor file only would point to those same pathnames.
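
The aliasing problem can be shown with a toy model in Python. This is only a sketch: the JSON "descriptor" below is an invented stand-in, not the real Data Set descriptor format, and all file names are hypothetical. It shows only that a copied descriptor still points at the original segment file, so a write made through the "prod" copy changes what the "UAT" descriptor sees.

```python
import json
import shutil
import tempfile
from pathlib import Path


def write_descriptor(path, segment_paths):
    """Write a toy descriptor: a JSON list of segment file pathnames."""
    payload = {"segments": [str(p) for p in segment_paths]}
    Path(path).write_text(json.dumps(payload))


def read_segments(path):
    """Return the segment pathnames recorded in a toy descriptor."""
    return json.loads(Path(path).read_text())["segments"]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("resource", "uat", "prod"):
            (root / name).mkdir()

        # One segment file on the shared resource disk, owned by UAT.
        segment = root / "resource" / "customers.seg0001"
        segment.write_text("uat rows")
        uat_desc = root / "uat" / "customers.ds"
        write_descriptor(uat_desc, [segment])

        # Copy ONLY the descriptor to the prod directory.
        prod_desc = root / "prod" / "customers.ds"
        shutil.copy(uat_desc, prod_desc)
        same_segments = read_segments(uat_desc) == read_segments(prod_desc)

        # A "prod" write through the copied descriptor hits the UAT data.
        Path(read_segments(prod_desc)[0]).write_text("prod rows")
        uat_view = Path(read_segments(uat_desc)[0]).read_text()
        return same_segments, uat_view


print(demo())
```

Both descriptors list the same segment pathname, which is why whichever job writes last determines what both environments read.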

Instead, create the Data Sets afresh in the production environment. That way, even if the resource disk is shared between prod and non-prod, the segment files will have different names. Also specify the directory pathname where the descriptor files are stored through a job parameter, with a different value in the prod and non-prod environments.