Input buffer underrun

Gazelle
Premium Member
Posts: 108
Joined: Mon Nov 24, 2003 11:36 pm
Location: Australia (Melbourne)

Input buffer underrun

Post by Gazelle »

Requirement:
We have a job that reads a dataset, makes the required changes, and overwrites the dataset.

To do this, we write to a temporary dataset, and after the job completes successfully, we "rename" the temporary dataset to the proper dataset name (using orchadmin copy).
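
For illustration, the current sequence looks roughly like the sketch below. The dataset names and paths are hypothetical, and the environment (APT_CONFIG_FILE etc.) is assumed to already be set up for orchadmin:

    # DataStage job writes its output to a temporary dataset, e.g.
    #   /project/data/customer_tmp.ds   (hypothetical path)
    # After the job completes successfully, "rename" it over the real dataset:
    orchadmin copy /project/data/customer_tmp.ds /project/data/customer.ds
    # (removal of the temporary dataset would normally follow)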

Problem:
We get an error when trying to read the new version of the dataset in a job:
Input buffer underrun; traversed 3211280 fewer bytes than present in the record buffer.

Trying to read the dataset via orchadmin dump also gets an error:
Premature EOF on node xxx Socket operation on non-socket
...
Internal Error: (0):processmgr/msgport.C: 1299: FAILURE: Impossible code statement reached


Investigation:
Although the dataset column definitions look the same in Designer, they are different when we list the schema definition using orchadmin ls.
Under the covers, DataStage reorders the columns as it sees fit, in the name of "optimisation", as explained in the post "Dataset schema is disordered".
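
One way to see the mismatch (a rough sketch; the dataset names are hypothetical and the exact output format of orchadmin ls varies by version):

    # List the record schema of each dataset and compare them
    orchadmin ls /project/data/customer_tmp.ds > /tmp/schema_new.txt
    orchadmin ls /project/data/customer.ds     > /tmp/schema_old.txt
    diff /tmp/schema_new.txt /tmp/schema_old.txt   # the differing column order shows up here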

The orchadmin copy -help states:
Copy the schema, contents and preserve-partitioning flag of the
specified ORCHESTRATE file dataset. If the preserve-partitioning
flag is set, the copy will have the same number of partitions and
record order as the original. If the target file already exists,
it will be truncated first. If the preserve-partitioning flag of
the source file is set and the target file already exists, it must
have the same number of partitions as the source file.

The copy command has no options. A warning message is issued if
the target does not already exist. This is a bug, not a feature.

It looks like the orchadmin copy command truncates only the data and leaves the target's existing schema definition intact. So if the source dataset has its columns in a different order, the target dataset becomes unreadable.
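
If anyone wants to confirm this on a scratch copy, a before/after check along these lines should show it (again, hypothetical names):

    # Capture the target's schema before and after the copy
    orchadmin ls /project/data/customer.ds > /tmp/schema_before.txt
    orchadmin copy /project/data/customer_tmp.ds /project/data/customer.ds
    orchadmin ls /project/data/customer.ds > /tmp/schema_after.txt
    diff /tmp/schema_before.txt /tmp/schema_after.txt
    # No differences would mean the copy kept the old schema and only replaced the data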

Workaround:
Unless anyone can come up with a better idea, we plan to "rename" the dataset using a DataStage job instead of orchadmin.

By the way, I like the "This is a bug, not a feature" statement in the help text... I think I might use that in my jobs, instead of fixing defects!
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Create the temporary dataset, then use orchadmin to delete the existing dataset, followed by "renaming" the new one.
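
In shell terms, something like this (dataset names are hypothetical, and I'm assuming rm is the delete subcommand on your install - check orchadmin's help to be sure):

    # 1. The job has already written to the temporary dataset
    #    /project/data/customer_tmp.ds   (hypothetical)
    # 2. Delete the existing dataset so copy can't inherit its stale schema
    orchadmin rm /project/data/customer.ds
    # 3. "Rename" the temporary dataset into place
    #    (per the help text you quoted, copy will warn that the target doesn't exist - that's expected)
    orchadmin copy /project/data/customer_tmp.ds /project/data/customer.ds
    # 4. Optionally remove the temporary dataset afterwards
    orchadmin rm /project/data/customer_tmp.ds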

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
Gazelle
Premium Member
Posts: 108
Joined: Mon Nov 24, 2003 11:36 pm
Location: Australia (Melbourne)

Post by Gazelle »

Good idea. Thanks James.