configuration file, paging standards?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

configuration file, paging standards?

Post by peep »

OS: AIX
DS: 8.7
Question 1:
In the configuration file, are there any standard paging spaces for datasets? While running a job it shows, on 3 nodes:

node1 5001216
node2 0
node3 5001216

This is very consistent, so do you know where I can find these paging standards?

Question 2:
How can I archive dataset files on node1, node2 and node3?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

1) There are no standards for paging. What determines how much data goes to each node is the partitioning algorithm. Your scenario (populating only two nodes out of three) might be seen if the data were partitioned using a key-based partitioning algorithm on a two-valued field: what would be classed as an Indicator if you were profiling it with Information Analyzer.
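The effect of a two-valued partitioning key can be sketched in plain Python (an illustration only, not DataStage code): hash-partitioning 1000 rows on a field that only ever holds "Y" or "N" can populate at most two of three nodes, leaving the third empty.

```python
# Sketch: key-based (hash) partitioning on a two-valued field.
# With only two distinct key values, at most two of the three
# "nodes" can ever receive data.
records = [{"indicator": v} for v in ("Y", "N")] * 500  # 1000 rows, 2 key values

nodes = {0: [], 1: [], 2: []}
for rec in records:
    node = hash(rec["indicator"]) % len(nodes)  # hash-partition on the key
    nodes[node].append(rec)

sizes = [len(rows) for rows in nodes.values()]
print(sizes)  # at most two non-zero counts, e.g. [500, 0, 500]
```

The exact split depends on the hash function, but a third populated node is impossible: two key values can map to at most two partitions.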

2) The best way to archive Data Sets is to use the orchadmin command. This guarantees that every defined data file is picked up. The data, or "segment", files reside in directories identified as disk resources in the configuration file, and the descriptor file (the one with the ".ds" suffix to its name) records the location of these segment files (along with other information, such as the record schema).
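As an illustration of the disk-resource layout described above, a three-node configuration file might look roughly like this (the host name and paths are made up for the example):

```
{
    node "node1"
    {
        fastname "aixhost"
        pools ""
        resource disk "/data/datasets/node1" {pools ""}
        resource scratchdisk "/scratch/node1" {pools ""}
    }
    node "node2"
    {
        fastname "aixhost"
        pools ""
        resource disk "/data/datasets/node2" {pools ""}
        resource scratchdisk "/scratch/node2" {pools ""}
    }
    node "node3"
    {
        fastname "aixhost"
        pools ""
        resource disk "/data/datasets/node3" {pools ""}
        resource scratchdisk "/scratch/node3" {pools ""}
    }
}
```

The segment files for a Data Set land under the "resource disk" directories, one set per node, which is why orchadmin (which reads this file) is the safe way to copy or delete them.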
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

Post by peep »

Can the partitioning algorithm be seen in the config file?

If not, where can I see it?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The partitioning algorithm cannot be seen in the configuration file.

It can be seen in the job design, on the input link of any stage executing in parallel mode.

It can also be seen in the job score.
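If memory serves, the job score is only written to the job log when you ask for it, by setting this environment variable (at the project level in the Administrator client, or as a job parameter):

```
APT_DUMP_SCORE=True
```

With that set, the score in the log shows, among other things, the partitioner inserted on each link and how the operators were mapped to the nodes of the configuration file.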
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

Post by peep »

It can also be seen in the...?
In the job design on the input link, and where else?
peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

Post by peep »

Is it something related to the transport block size environment variable?
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

Are you perhaps inquiring about operating system paging rather than dataset partitioning?
- james wiles


All generalizations are false, including this one - Mark Twain.
peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

Post by peep »

Dataset ... not OS.
But it was just a question.

So where can I find those values?

I see the dataset .ds file sizes as 0, 5001216 and 131072. Are these values set in environment variables via the Administrator client, or are they defaults?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

The file sizes are a result of the data that is put into them. This means that your file 0 has no records, which means that your selected partitioning algorithm isn't distributing the records evenly; you should rethink the algorithm used or the key used.
peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

Post by peep »

wat ever ur reply is can u pls paste it in my inbox..
im nt able to read here
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Dear peep,

1. This is a forum, not an SMS telephone exchange. Please use real words and vowels since many here, including myself and yourself, are not native English speakers and communication is difficult enough with the technical nature of DataStage.

2. While most of the subject matter is freely visible, some is only visible to those who have opted to support the forum by becoming premium members.

The initial part of my earlier message is visible and provides enough information to suggest the path to a solution.
peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

Post by peep »

OK, I am sorry for using short forms.
Where can I see the input link data?
In my research I found that this block value can be defined in job properties, and that it can be determined by finding the largest record on a link of the job and setting the value higher than that largest record size.

We can set this value in the job parameter APT_DEFAULT_TRANSPORT_BLOCK_SIZE.

Here is more info; please correct me if I am wrong.

APT_AUTO_TRANSPORT_BLOCK_SIZE: does this take only the required amount of space to pass the records, rather than using the defined amount of space? By using this variable we can allow the job to use only the required amount of space.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

APT_TRANSPORT_BLOCK_SIZE does not affect disk use; it affects the block size of data transport between stages at runtime.
What are the partitioning algorithm and key for your data set?
Try setting the partitioning to "round robin" and your three partition files will have almost identical sizes.
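The round-robin suggestion can be illustrated in plain Python (again, not DataStage code): rows are dealt out to the partitions in turn, so the three partitions stay within one row of each other regardless of the key values.

```python
# Sketch: round-robin partitioning deals row i to node (i mod 3),
# so partition sizes are balanced no matter what the data contains.
records = list(range(1000))  # 1000 dummy rows

partitions = [[] for _ in range(3)]
for i, rec in enumerate(records):
    partitions[i % 3].append(rec)  # deal row i to node i mod 3

sizes = [len(p) for p in partitions]
print(sizes)  # [334, 333, 333]
```

This is why switching the write to round robin produces three segment files of almost identical size, where a skewed hash key can leave one file empty.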
peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

Post by peep »

Auto partitioning.

Where can I find the key of the datasets?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Go to the job where you write to the dataset and see what the partitioning algorithm and key were set to in the job; the dataset inherits these settings.