Posted: Tue Sep 21, 2010 11:45 am
by mhester
That is actually contrary to what I have seen. I do not use eSame and have not since the first version of this tool. Auto has always worked. Did you see a repartitioner in the score for the job?

If the job that created the dataset used in the scd job partitioned it on columns different from those the CDC job needs, then I would expect Auto to repartition.
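
A minimal sketch of why mismatched keys matter, assuming a simple hash partitioner: rows that are co-located by one set of columns are not necessarily co-located by another, so a stage that needs the second set as keys has to repartition. The function names, the column names, and the use of `zlib.crc32` are illustrative assumptions, not what the parallel framework does internally.

```python
import zlib


def hash_partition(rows, key_columns, num_partitions):
    """Distribute rows so that equal key values land in the same partition."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        key = "|".join(str(row[col]) for col in key_columns)
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index].append(row)
    return partitions


def is_key_partitioned(partitions, key_columns):
    """Return True if no key value is split across two partitions."""
    owner = {}
    for index, partition in enumerate(partitions):
        for row in partition:
            key = tuple(row[col] for col in key_columns)
            # Remember the first partition that held this key value.
            if owner.setdefault(key, index) != index:
                return False
    return True


def needs_repartition(partitions, required_keys):
    """A key-based stage must repartition when its keys are not co-located."""
    return not is_key_partitioned(partitions, required_keys)
```

Data written with `hash_partition(rows, ["cust_id"], 4)` always passes `is_key_partitioned` for `["cust_id"]`, but nothing guarantees it for any other column, which is the situation where a repartitioner would be expected to show up in the score.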

Posted: Wed Sep 22, 2010 12:47 am
by ray.wurlod
I'm sure I've listed the rules here in the past.

(Auto) gives:
  • Hash on the "keys" on inputs of stages that require key-partitioned data.
  • DB2 on the input to a DB2 Enterprise stage (probably also on a DB2 Connector, but I have yet to verify that).
  • Entire on a reference input to a Lookup stage.
  • Round robin in all other cases.
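
The rules listed above can be written down as a small decision function. This is only a sketch of the rules as stated in this post, not the engine's actual logic; the function name, its parameters, and the order in which the rules are checked are assumptions made for illustration.

```python
def auto_partition_method(stage_type, link_role="input", key_partitioned=False):
    """Return the method (Auto) would pick, following the rules listed above.

    stage_type      -- stage name as a plain string (hypothetical labels)
    link_role       -- "input" or "reference"
    key_partitioned -- True if the stage requires key-partitioned data
    """
    if key_partitioned:
        return "Hash"  # hashed on the stage's key columns
    if stage_type == "DB2 Enterprise":
        return "DB2"
    if stage_type == "Lookup" and link_role == "reference":
        return "Entire"
    return "Round robin"
```

For example, `auto_partition_method("Lookup", "reference")` returns `"Entire"`, while the primary input of the same Lookup falls through to `"Round robin"`.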

Posted: Wed Sep 22, 2010 1:44 pm
by asorrell
Ray,

I believe it uses "DB2 Connector" partitioning for the DB2 Connector. That is a new partitioning type in version 8 and, from what I can tell, it uses a slightly different mechanism to determine the number of DB2 partitions and the partitioning key.

Posted: Wed Sep 22, 2010 3:32 pm
by mhester
Partitioned reads and writes were never available in 8.0, only in 8.1 with the DB2 Connector. In 8.0 the DB2 Connector partitioning option was available, but the "guts" were not there.

The connector makes a call to the DPF DB to read the partition map and db2nodes.cfg for the table being loaded or read, and so understands how to partition the data. Auto cannot be used here if you want partitioned reads or writes. Well, it can, but it will be slow.

The following is from a framework developer at IBM whom I have known for many years and who was one of the original developers of the product:
If you set Partitioning to Auto, the framework will determine the best partitioning method needed. If your data has been pre-partitioned and saved to datasets with the preserve-partitioning flag set, and you use the same number of partitions in the data flow that reads the datasets, the data partitioning will be preserved.

It is recommended you set Partitioning to Auto and let the framework handle the rest. You may want to use eSame inside a data flow between two stages when needed, but not between two different data flows.
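
The condition in the quote above, under which Auto leaves existing partitioning alone, comes down to two checks. A minimal sketch, with a hypothetical function name and parameters, assuming those two checks are the only ones involved:

```python
def partitioning_preserved(preserve_flag_set, partitions_written, partitions_read):
    """Per the quote: the flag must be set and the partition counts must match."""
    return preserve_flag_set and partitions_written == partitions_read
```

So a dataset written on 4 partitions with the flag set keeps its partitioning when read on 4 partitions, but not when the reading job runs on 8.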