Query dataset part key(s) and sort key(s)

bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Query dataset part key(s) and sort key(s)

Post by bcarlson »

We are heading into a technology refresh: our ETL and database servers are being replaced and moved to a new building. We will have both the current and the new systems available side by side for a few months. As part of our testing and migration/cutover, we need to be able to move DataStage datasets from one server to the other.

I believe a simple DataStage (PX) job can read the dataset on one server and write it to the other, specifying two different configuration files: one for the read side and one for the write side. To be honest, I have never tried this, so I am making a big assumption here...
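If it works the way I hope, the configuration might look something like this: a single APT configuration file with nodes on both machines, using node pools to pin the read side to the old server and the write side to the new one (host names and paths below are made up for illustration):

    {
      node "old1" {
        fastname "etl-old.example.com"
        pools "old"
        resource disk "/data/old/datasets" {pools ""}
        resource scratchdisk "/tmp" {pools ""}
      }
      node "new1" {
        fastname "etl-new.example.com"
        pools "new"
        resource disk "/data/new/datasets" {pools ""}
        resource scratchdisk "/tmp" {pools ""}
      }
    }

The Data Set stage reading the old files would then be constrained to the "old" node pool, and the stage writing the new files to the "new" pool.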

The part that is troubling us is how to maintain the correct partitioning key(s) and sort key(s). I cannot find this information using the orchadmin command. Is there any way to query it from the repository?
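For reference, here is roughly what we have tried (the descriptor path is made up, and the exact options vary by version):

    # Shows the record schema and the physical segment layout of the dataset,
    # but not, as far as we can tell, the partitioning or sort keys.
    orchadmin describe /data/datasets/customer.ds

    # Dumps data records; again, no key metadata.
    orchadmin dump /data/datasets/customer.ds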

Thanks!

Brad.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

I believe that neither sort keys nor partition keys are stored in Data Sets. There is no need; the data are stored however they arrived (partitioned and sorted).

Therefore, I believe, you need to track back to the job that created each Data Set and determine, from its log, which APT_CONFIG_FILE value was used and, from the job design, which partitioning and sort algorithms/keys were used.

This information would then be used in your new transfer jobs.
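Something along these lines might pull the configuration file value out of the log from the command line. The project and job names are illustrative, and the exact wording of the log entry varies by release; parallel jobs normally log a "main_program: APT configuration file: ..." event near the start of the run:

    # Hypothetical: list the log summaries for the job that wrote the Data Set
    # and search them for the configuration file entry.
    dsjob -logsum -type INFO -max 200 MYPROJECT LoadCustomerDS | grep -i "configuration file"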
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Bummer. We were hoping for something more dynamic. Grab a dataset, look up its info and write it out.

Oh well. The problem is that we have potentially hundreds out there.

Brad.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

I sense an enhancement request coming soon!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Good idea. I hadn't even thought that far yet (needed more coffee).
John Smith
Charter Member
Posts: 193
Joined: Tue Sep 05, 2006 8:01 pm
Location: Australia

Post by John Smith »

bcarlson wrote:
Bummer. We were hoping for something more dynamic. Grab a dataset, look up its info and write it out.

Oh well. The problem is that we have potentially hundreds out there.

Brad.
I'm not sure of your exact requirements, but I would definitely look at RCP (Runtime Column Propagation).
Have you thought of using RCP in a generic job to move your datasets? That way you don't have to write hundreds of jobs, just one.
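As a rough sketch (the project, job, and parameter names below are made up), the generic job would just be Data Set -> Copy -> Data Set with RCP enabled, and a shell loop could drive it once per dataset:

    # Hypothetical driver: run one generic, RCP-enabled copy job per dataset.
    # GenericDatasetCopy, SRC_DS and TGT_DS are assumed names, not real objects.
    for ds in /data/datasets/*.ds; do
        dsjob -run \
              -param SRC_DS="$ds" \
              -param TGT_DS="/newdata/datasets/$(basename "$ds")" \
              -jobstatus MYPROJECT GenericDatasetCopy
    done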
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Yes, we would definitely be using RCP and creating a generic job. In fact, we use RCP across the board in our environment.

Brad
It is not that I am addicted to coffee, it's just that I need it to survive.