Migration 7.5x2 to 8.0

Raftsman
Premium Member
Posts: 335
Joined: Thu May 26, 2005 8:56 am
Location: Ottawa, Canada

Post by Raftsman »

We migrated to the new version of DataStage and have run into a few issues that we can't seem to resolve.

1. After the migration, none of our DS datasets that we created in 7.5 could be found or read by 8.0. Is there a migration routine for datasets?

2. Smaller jobs consisting of 5 to 10 stages work fine in both 7.5 and 8.0, but jobs with more stages that worked in 7.5 abort in 8.0 with node failure errors. Are there configuration parameters that need modifying? The error messages are very vague and don't tell us where to start looking. Has anyone experienced this?

Thanks
Jim Stewart
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Data Sets don't migrate. Why would you want to migrate them? You normally overwrite them on each run anyway. There was a recent post by kali asking how to migrate Data Sets, but it received no useful reply. I don't know of a reliable way to move them.

Can't help with your second point, as I still have not played with version 8.0.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Best to get in touch with support, as 8.0 is still new on the market and does not yet have many users.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

How much RAM and how many CPUs do you have on the machine? During my Hawk Beta 2 testing, IBM support told me that for large jobs, or a large number of jobs (1,000 or more), you need "at least" 4GB of RAM. Otherwise, your jobs might abort.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

One of the presenters at IOD 2006 also mentioned that your client machine needs 2GB minimum memory.

How much is needed on the server side depends on how you distribute the various server components. For example, you can have the Application Server, the Domain Server and the database server all on one (heavy duty) machine or on multiple machines. Such is the flexibility of service-oriented architecture.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

Use the new performance monitoring reports and graphs to monitor the jobs that fail. Do they fail when run individually, or only when run alongside a lot of other jobs? Compare your current environment and project variables with the values you had pre-migration to find out whether any important settings have changed. Check that your temp paths are the same.
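
One rough way to do that comparison, assuming you can dump the variables from each environment as NAME=value lines. The dump format and the sample values below are assumptions for illustration only, not DataStage output:

```python
def parse_env_dump(text):
    """Parse NAME=value lines into a dict, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        env[name.strip()] = value.strip()
    return env


def diff_env(before, after):
    """Return {name: (old, new)} for every variable whose value differs.

    A value of None means the variable is missing on that side.
    """
    changes = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


if __name__ == "__main__":
    # Hypothetical sample dumps; replace with the real pre- and post-migration values.
    old_dump = "TMPDIR=/tmp\nAPT_CONFIG_FILE=/old/default.apt\n"
    new_dump = "TMPDIR=/var/tmp\nAPT_CONFIG_FILE=/old/default.apt\n"
    for name, (old, new) in diff_env(parse_env_dump(old_dump), parse_env_dump(new_dump)).items():
        print(f"{name}: {old!r} -> {new!r}")
```

Anything that shows up in the diff, especially temp and scratch paths, is a candidate for a closer look.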

Agree with Ray on the datasets. If you have persistent datasets that need to be migrated then your design is wrong. Datasets should be treated as temporary tables.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

By the way, does it mean the version of Dataset has been changed in Ver 8?
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

It doesn't need to change. While the metadata repository has changed, the parallel engine hasn't, so datasets should be the same. It's the cataloguing of datasets that is causing your problems. If you have upgraded, your datasets should still be available through the dataset manager. If they aren't, get in touch with Ascential support.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

kumar_s wrote: By the way, does it mean the version of Dataset has been changed in Ver 8?
Someone with version 8.0 might use the Data Set Management tool (under Tools menu in DataStage/QualityStage Designer, since there is no longer a Manager client) and let us know what the Data Set version number is. In any case, they should be upwards compatible.
Raftsman
Premium Member
Posts: 335
Joined: Thu May 26, 2005 8:56 am
Location: Ottawa, Canada

Post by Raftsman »

Below is the error I receive from the job.

Here's a brief overview of what I have tried in order to determine the problem.

I took the job that keeps aborting and started removing stages to see if I could narrow down the problem. What is left is two Aggregator streams being joined into one dataset. This aborts. I removed the join and wrote each stream to its own dataset. The job ran fine. I then took those datasets, created a new parallel job and joined them together, creating a new dataset. This worked fine.

So in summary, my initial job will not work if I join the aggregates into one dataset. I get the following error:

buffer(1),7: Failure during execution of operator logic.
buffer(1),7: Input 0 consumed 0 records.
buffer(1),7: Output 0 produced 0 records.
buffer(1),7: Fatal Error: Cannot find protocol entry for tcp protocol

Can anyone please interpret what these messages mean?

Thanks
Jim Stewart
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Take a look at the score. This will show you the buffer operators that were inserted to avoid data flow deadlock situations. There are at least two of these (buffer(1) is the second). Based on the virtual Data Sets these are using, you might discern what is happening.

My guess is that the TCP port numbers (by default 10000 and 11000) used by the conductor, section leader and player processes to communicate with each other are blocked by your firewall.

Another possibility is that you're in a multi-machine configuration, and that some form of repartitioning is required. This would employ TCP/IP sockets, but for whatever reason, the TCP protocol has not been set up (or has been disabled or blocked). There are environment variables that specify the default port number used by the APT_Communicator class; you can find these in Chapter 6 of the Parallel Job Advanced Developer's Guide.
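
One quick check that can be run outside DataStage: the wording of the fatal error resembles what a failed protocol-name lookup (the standard getprotobyname call) would report. That reading is an assumption; nothing in the log confirms it. A minimal Python sketch that performs the same kind of lookup on the server, using only the standard library:

```python
import socket


def check_tcp_protocol_entry(name="tcp"):
    """Return the protocol number registered for `name`, or None if the lookup fails.

    The lookup is served by the operating system's protocols database
    (typically /etc/protocols on Unix, or the "protocol" file under
    %SystemRoot%\\System32\\drivers\\etc on Windows).
    """
    try:
        return socket.getprotobyname(name)
    except OSError:
        return None


if __name__ == "__main__":
    number = check_tcp_protocol_entry("tcp")
    if number is None:
        print("No protocol entry found for tcp; check the protocols file on this host.")
    else:
        print(f"tcp protocol entry found (protocol number {number}).")
```

If the lookup fails when run under the same account and environment as the DataStage engine, the protocols file (or access to it) would be the first thing to check. If it succeeds, this line of inquiry can probably be ruled out.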
Raftsman
Premium Member
Posts: 335
Joined: Thu May 26, 2005 8:56 am
Location: Ottawa, Canada

Post by Raftsman »

We do not have a firewall and, during the installation, we accepted the installation defaults. We are running an 8-node Windows server configuration with plenty of memory. We are opening tickets with IBM in order to solve this issue.

After reading through the information, we are still unclear on what is causing the error. The message reports a TCP protocol error, but we think it's more than that. We are having trouble with numerous jobs, and not every job has the same error.

We are contemplating moving back to 7.5x2. At least we could move forward.
Jim Stewart
Raftsman
Premium Member
Posts: 335
Joined: Thu May 26, 2005 8:56 am
Location: Ottawa, Canada

Post by Raftsman »

More information on the issue.

I have been dissecting the job into smaller chunks to help debug the problem.

I have narrowed it down to the following. Within the job there are two joins and 5 aggregate stages. If I remove the final join and create two datasets, the job runs fine. As soon as I put the join back in, with its two inputs coming from aggregate stages, and create one dataset, the job aborts with the TCP protocol error.

Please remember, this job ran fine in 7.5x2. There must be some config setting we have overlooked.

Does anyone have any feedback on this?

Thanks
Jim Stewart
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

7.5x2 was a whole heap of compromises held together by duct tape. Fortunately, duct tape is a very versatile substance.

Version 8.0 is far more likely to be rigorous about such things as correct partitioning and sorting of input links when required. Try inserting Copy stages on the links between the Aggregator stages and the Join stage. You may even benefit from Sort stages set to "don't sort (already sorted)".
Raftsman
Premium Member
Posts: 335
Joined: Thu May 26, 2005 8:56 am
Location: Ottawa, Canada

Post by Raftsman »

Hi all,

Here's an update on my current situation. The failure of jobs that worked in 7.5x2 but not in 8.0 has been identified as a partitioning error. The jobs work using 1 or 2 nodes, and sometimes with 3. As soon as we change the configuration to 4 or more nodes, the jobs abort. The issue is in IBM tech support's hands. It looks like a bug in version 8.0.
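
For anyone who wants to pin down the node count at which a job starts failing, one option is to generate a series of parallel configuration files that differ only in the number of logical nodes, and point APT_CONFIG_FILE at each in turn. A minimal sketch, assuming a single-host layout; the host name, directory paths and output file names are hypothetical placeholders, not values from our installation:

```python
def make_config(num_nodes, fastname, disk_dir, scratch_dir):
    """Build the text of a parallel configuration file with `num_nodes` logical nodes.

    All nodes share one host and the same disk and scratch directories,
    which is an assumption made to keep the sketch short.
    """
    lines = ["{"]
    for i in range(1, num_nodes + 1):
        lines += [
            f'  node "node{i}"',
            "  {",
            f'    fastname "{fastname}"',
            '    pools ""',
            f'    resource disk "{disk_dir}" {{pools ""}}',
            f'    resource scratchdisk "{scratch_dir}" {{pools ""}}',
            "  }",
        ]
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical host name and directories; adjust to the real server layout.
    for n in range(1, 9):
        text = make_config(n, "etlserver", "/data/datasets", "/data/scratch")
        with open(f"test_{n}node.apt", "w") as handle:
            handle.write(text)
```

Running the same job against each file in order shows where the behaviour changes, which is useful detail to attach to a support ticket.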

I will let you know when this issue gets resolved.

Thanks
Jim Stewart