data set

Posted: Wed Oct 19, 2011 5:19 pm
by phanikumar
I have a data set which has some 400,00 records in it. For testing purposes I need to create a copy of the data set containing a sample of its first 200 records. Can somebody let me know how to do this?

Regards

Kumar

Re: data set

Posted: Wed Oct 19, 2011 5:53 pm
by SURA
I am not sure whether it is possible to fetch part of the records from the command line and create a .ds file from them. The simple option is to use that .ds as a source and, in a Transformer (TFM) set to run in sequential mode, use the constraint @OUTROWNUM <= 200. That way you get the first 200 records, which you can again write into a .ds file.
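As a rough illustration of the row-limit idea (plain Python, not DataStage code; the function names here are made up for the example): a limit applied to one sequential stream yields 200 rows in total, while the same limit applied independently to each partition yields up to 200 rows per partition. That is why the stage needs to run sequentially.

```python
from itertools import islice


def first_n(records, n=200):
    """Return at most the first n records of a single sequential stream."""
    return list(islice(records, n))


def first_n_per_partition(partitions, n=200):
    """Apply the same limit independently to each partition (parallel case)."""
    return [list(islice(part, n)) for part in partitions]


if __name__ == "__main__":
    # Sequential: one stream, so the limit applies to the whole data set.
    sample = first_n(range(1000))
    print(len(sample))

    # Parallel: two partitions, each counts its own rows separately.
    per_part = first_n_per_partition([range(0, 500), range(500, 1000)])
    print(sum(len(p) for p in per_part))
```

With two partitions the second call keeps 400 rows rather than 200, since each partition has its own row counter.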

DS User

Posted: Wed Oct 19, 2011 6:03 pm
by chulett
<moved here from the TX forum>

Re: data set

Posted: Wed Oct 19, 2011 6:08 pm
by ray.wurlod
SURA wrote:...write into a .ds file.
Please be aware that data are never written into the .ds file itself. This file is a descriptor that stores the locations of the segment files in which the data are actually stored. There is one segment file per resource disk directory per node, provided that there are sufficient records to make the full distribution worthwhile.
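A toy model of that descriptor/segment split, in plain Python. Everything here is hypothetical: the class, the function, the paths and the segment file naming are invented for illustration and do not reflect the real .ds format or real segment file names. It only shows that the descriptor holds locations, with one entry per resource disk directory per node.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDescriptor:
    """Toy stand-in for a descriptor: it holds locations only, never records."""
    segment_paths: list = field(default_factory=list)


def build_descriptor(name, nodes):
    """Build a toy descriptor.

    nodes maps a node name to its list of resource disk directories.
    The path pattern below is made up for this sketch.
    """
    paths = [
        f"{disk}/{name}.{node}.segment"
        for node, disks in nodes.items()
        for disk in disks
    ]
    return ToyDescriptor(paths)


if __name__ == "__main__":
    # Hypothetical configuration: node1 has two resource disks, node2 has one.
    nodes = {"node1": ["/data/d1", "/data/d2"], "node2": ["/data/d1"]}
    desc = build_descriptor("sample", nodes)
    for path in desc.segment_paths:
        print(path)
```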

Re: data set

Posted: Wed Oct 19, 2011 6:27 pm
by SURA
Ray, I meant using a Data Set stage to write it.

One more question: can't we hack it to read only part of the records?

Don't ask why.

DS User

Posted: Wed Oct 19, 2011 7:13 pm
by ray.wurlod
Do you speak binary? Storage in Data Sets uses DataStage internal format where, among other things, all numeric data are stored in binary format. A Data Set is not intended to be a database (otherwise they would have called it that). A Data Set stores a set of data: a whole set of data.

Posted: Wed Oct 19, 2011 9:41 pm
by SURA
In the Data Set Management output I can see some ORCHESTRATE codes, node details, etc. I thought of hacking into that and using it; that was the aim of my question.

DS User