After Migrating to V8.1 Issue with Sort Stage

Vidyut
Participant
Posts: 24
Joined: Wed Oct 13, 2010 12:45 am


Post by Vidyut »

We just migrated our development environment from 7.5.2 to V8.1.
I have done all the configuration I know of.
We have one DataStage server (04) and one database server (05).
Everything runs fine except the Sort stage.
The sort runs fine if I change the Sort Utility from DataStage to UNIX.

We are getting an error on node4, which belongs to the database server.

#### STAGE: Sort_31
## Operator
tsort
## Operator options
-key 'account_name'
-asc
-stable
## General options
[ident('Sort_31'); jobmon_ident('Sort_31')]
## Inputs
0< [] 'DB2_UDB_Enterprise_40:DSLink2.v'
## Outputs
0> [] 'Sort_31:DSLink30.v'
;
#################################################################
#### STAGE: Data_Set_35
## Operator
copy
## General options
[ident('Data_Set_35')]
## Inputs
0< [] 'Sort_31:DSLink30.v'
## Outputs
0>| [ds] '/home/dsadm/practice/pooja/sample.ds'
;
#################################################################
#### STAGE: DB2_UDB_Enterprise_40
## Operator
db2read
## Operator options
-query 'select account_name from dsedw.account_arngmnt_dimn fetch first 10 rows only'
-dbname 'edw_d_d1'
-server 'dsedw'
-client_instance 'dsedw'
-user 'dsadm'
-password '[&__V0S40P1_password]'
## General options
[ident('DB2_UDB_Enterprise_40'); jobmon_ident('DB2_UDB_Enterprise_40')]
## Outputs
0> [modify (
account_name:nullable string[]=account_name;
)] 'DB2_UDB_Enterprise_40:DSLink2.v'
;
# End of OSH code

Item #: 6
Event ID: 80
Timestamp: 2010-10-13 11:29:47
Type: Warning
User Name: dsadm
Message Id: DSTAGE_RUN_W_0005
Message: Parallel debug is turned on (environment variable DS_PXDEBUG is set)

Item #: 7
Event ID: 81
Timestamp: 2010-10-13 11:29:47
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFCN-00001
Message: main_program: IBM WebSphere DataStage Enterprise Edition 8.1.0.5040
Copyright (c) 2001, 2005-2008 IBM Corporation. All rights reserved

Item #: 8
Event ID: 82
Timestamp: 2010-10-13 11:29:47
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFCN-00006
Message: main_program: conductor uname: -s=AIX; -r=3; -v=5; -n=PNBDWH04; -m=00CE74414C00

Item #: 9
Event ID: 83
Timestamp: 2010-10-13 11:29:47
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TOSH-00002
Message: main_program: orchgeneral: loaded
orchsort: loaded
orchstats: loaded

Item #: 10
Event ID: 84
Timestamp: 2010-10-13 11:29:48
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TCOS-00023
Message: main_program: Dump:
{
text="tsort
-key 'account_name'
-asc
-stable
[ident('Sort_31'); jobmon_ident('Sort_31')]
0< [] 'DB2_UDB_Enterprise_40:DSLink2.v'
0> [] 'Sort_31:DSLink30.v'
;
copy
[ident('Data_Set_35')]
0< [] 'Sort_31:DSLink30.v'
0>| [ds] '/home/dsadm/practice/pooja/sample.ds'
;
db2read
-query 'select account_name from dsedw.account_arngmnt_dimn fetch first 10 rows only'
-dbname 'edw_d_d1'
-server 'dsedw'
-client_instance 'dsedw'
-user 'dsadm'
-password '******'
[ident('DB2_UDB_Enterprise_40'); jobmon_ident('DB2_UDB_Enterprise_40')]
0> [modify(account_name:nullable string=account_name;)] 'DB2_UDB_Enterprise_40:DSLink2.v'
;",
line=1, column=1, name="", qualname="",
op={
text="tsort
-key 'account_name'
-asc
-stable
[ident('Sort_31'); jobmon_ident('Sort_31')]
0< [] 'DB2_UDB_Enterprise_40:DSLink2.v'
0> [] 'Sort_31:DSLink30.v'",
line=1, column=1, name=tsort, qualname=Sort_31,
wrapout={},
wrapperfile=tsort, kind=non_wrapper_cdi_op, exec_mode=none,
args="'account_name'-asc'-stable'",
input={ text="
0< [] 'DB2_UDB_Enterprise_40:DSLink2.v'", line=6,
column=1, name="", qualname="Sort_31[i0]",
data="DB2_UDB_Enterprise_40:DSLink2.v"
},
output={ text="
0> [] 'Sort_31:DSLink30.v'", line=7, column=1,
name="", qualname="Sort_31[o0]",
data="/home/dsadm/practice/pooja/sample.ds"
}
},
op={
text="
db2read
-query 'select account_name from dsedw.account_arngmnt_dimn fetch first 10 rows only'
-dbname 'edw_d_d1'
-server 'dsedw'
-client_instance 'dsedw'
-user 'dsadm'
-password '******'
[ident('DB2_UDB_Enterprise_40'); jobmon_ident('DB2_UDB_Enterprise_40')]
0> [modify(account_name:nullable string=account_name;)] 'DB2_UDB_Enterprise_40:DSLink2.v'",
line=14, column=1, name=db2read, qualname=DB2_UDB_Enterprise_40,
wrapout={},
wrapperfile=db2read, kind=non_wrapper_cdi_op, exec_mode=none,
args="'select account_name from dsedw.account_arngmnt_dimn fetch first 10 rows only'-dbname'edw_d_d1'-server'dsedw'-client_instance'dsedw'-user'dsadm'-password'******'",
output={
text="
0> [modify(account_name:nullable string=account_name;)] 'DB2_UDB_Enterprise_40:DSLink2.v'",
line=22, column=1, name="",
qualname="DB2_UDB_Enterprise_40[o0]",
data="DB2_UDB_Enterprise_40:DSLink2.v",
outadapt="account_name:nullable string=account_name;"
}
},
data={ text="
0< [] 'DB2_UDB_Enterprise_40:DSLink2.v'", line=6, column=1,
name="DB2_UDB_Enterprise_40:DSLink2.v",
qualname="DB2_UDB_Enterprise_40:DSLink2.v",
partwrapout={},
collwrapout={},
dir=flow, kind=ds, inrefcount=1, writer=DB2_UDB_Enterprise_40,
reader=Sort_31, pp=none, trunc=default,
ident="DB2_UDB_Enterprise_40:DSLink2.v"
},
data={ text="
0>| [ds] '/home/dsadm/practice/pooja/sample.ds'", line=12,
column=1, name="/home/dsadm/practice/pooja/sample.ds",
qualname="/home/dsadm/practice/pooja/sample.ds",
partwrapout={},
collwrapout={},
dir=output, kind=ds, writer=Sort_31, reader="", pp=none,
trunc=replace, ident="/home/dsadm/practice/pooja/sample.ds"
}
}
.

Item #: 11
Event ID: 85
Timestamp: 2010-10-13 11:29:48
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFSC-00001
Message: main_program: APT configuration file: /home/dsadm/Ascential/DataStage/Configurations/default.apt
{
node "node1"
{
fastname "PNBDWH04"
pools "" "node1"
resource disk "/dsprocessph2/dsprocess/DataStage/Datasets" {pools ""}
resource scratchdisk "/dsprocessph2/dsprocess/DataStage/Scratch" {pools "" "sort"}
}
node "node2"
{
fastname "PNBDWH04"
pools "" "node2"
resource disk "/dsprocessph2/dsprocess/DataStage/Datasets" {pools ""}
resource scratchdisk "/dsprocessph2/dsprocess/DataStage/Scratch" {pools "" "sort"}
}
node "node3"
{
fastname "PNBDWH04"
pools "" "node3"
resource disk "/dsprocessph2/dsprocess/DataStage/Datasets" {pools ""}
resource scratchdisk "/dsprocessph2/dsprocess/DataStage/Scratch" {pools "" "sort"}
}
node "node4"
{
fastname "PNBDWH05"
pools "" "node4"
resource disk "/bmetlscratch2/dsprocess/DataStage/Datasets" {pools ""}
resource scratchdisk "/bmetlscratch2/dsprocess/DataStage/Scratch" {pools "" "sort"}
}
}

Item #: 12
Event ID: 86
Timestamp: 2010-10-13 11:29:50
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFSU-00016
Message: main_program: This step has 2 datasets:
ds0: {op0[1p] (sequential DB2_UDB_Enterprise_40)
eOther(APT_HashPartitioner { key={ value=account_name,
subArgs={ asc }
}
})<>eCollectAny
op1[4p] (parallel Sort_31)}
ds1: {op1[4p] (parallel Sort_31)
[pp] =>
/home/dsadm/practice/pooja/sample.ds}
It has 2 operators:
op0[1p] {(sequential DB2_UDB_Enterprise_40)
on nodes (
node4[op0,p0]
)}
op1[4p] {(parallel Sort_31)
on nodes (
node1[op1,p0]
node2[op1,p1]
node3[op1,p2]
node4[op1,p3]
)}
It runs 5 processes on 4 nodes.

Item #: 13
Event ID: 87
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TUTL-00031
Message: node_node4: The open files limit is 102400; raising to 9223372036854775807.

Item #: 14
Event ID: 88
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFPM-00324
Message: Sort_31,0: Calling runLocally: step=0, node=node1, op=1, ptn=0

Item #: 15
Event ID: 89
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFPM-00324
Message: Sort_31,1: Calling runLocally: step=0, node=node2, op=1, ptn=1

Item #: 16
Event ID: 90
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFPM-00324
Message: Sort_31,2: Calling runLocally: step=0, node=node3, op=1, ptn=2

Item #: 17
Event ID: 91
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFPM-00324
Message: DB2_UDB_Enterprise_40,0: Calling runLocally: step=0, node=node4, op=0, ptn=0

Item #: 18
Event ID: 92
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFPM-00324
Message: Sort_31,3: Calling runLocally: step=0, node=node4, op=1, ptn=3

Item #: 19
Event ID: 93
Timestamp: 2010-10-13 11:29:51
Type: Fatal
User Name: dsadm
Message Id: IIS-DSEE-TFOR-00001
Message: Sort_31,3: Failure during execution of operator logic. [api/operator_rep.C:399]

Item #: 20
Event ID: 94
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFOR-00163
Message: Sort_31,3: Input 0 consumed 0 records.

Item #: 21
Event ID: 95
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFOR-00094
Message: Sort_31,3: Output 0 produced 0 records.

Item #: 22
Event ID: 96
Timestamp: 2010-10-13 11:29:51
Type: Fatal
User Name: dsadm
Message Id: IIS-DSEE-TOSO-00027
Message: Sort_31,3: Fatal Error: Need to be able to open at least 16 files; please check your ulimit setting for number of file descriptors [sort/merger.C:1087]

Item #: 23
Event ID: 97
Timestamp: 2010-10-13 11:29:51
Type: Fatal
User Name: dsadm
Message Id: IIS-DSEE-TFOR-00001
Message: Sort_31,3: Failure during execution of operator logic. [api/operator_rep.C:399]

Item #: 24
Event ID: 98
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFOR-00163
Message: Sort_31,3: Input 0 consumed 0 records.

Item #: 25
Event ID: 99
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFOR-00094
Message: Sort_31,3: Output 0 produced 0 records.

Item #: 26
Event ID: 100
Timestamp: 2010-10-13 11:29:51
Type: Fatal
User Name: dsadm
Message Id: IIS-DSEE-TOSO-00014
Message: Sort_31,3: Fatal Error: Sorter handshake read failed: unexpected EOF [sort/merger.C:418]

Item #: 27
Event ID: 101
Timestamp: 2010-10-13 11:29:51
Type: Fatal
User Name: dsadm
Message Id: IIS-DSEE-TFPM-00192
Message: node_node4: Player 2 terminated unexpectedly. [processmgr/player.C:149]

Item #: 28
Event ID: 102
Timestamp: 2010-10-13 11:29:56
Type: Fatal
User Name: dsadm
Message Id: IIS-DSEE-TFPM-00338
Message: main_program: APT_PMsectionLeader(4, node4), player 2 - Unexpected exit status 1. [processmgr/slprocess.C:368]

Item #: 29
Event ID: 103
Timestamp: 2010-10-13 11:29:56
Type: Fatal
User Name: dsadm
Message Id: IIS-DSEE-TFSC-00011
Message: main_program: Step execution finished with status = FAILED. [sc/sc_api.C:242]

Item #: 30
Event ID: 104
Timestamp: 2010-10-13 11:29:56
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TCOS-00026
Message: main_program: Startup time, 0:08; production run time, 0:00.

Item #: 31
Event ID: 105
Timestamp: 2010-10-13 11:29:57
Type: Control
User Name: dsadm
Message Id: DSTAGE_RUN_I_0075
Message: Job Untitled4 aborted.

Please help.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

I would start with your ulimit settings, at both the user level and the system level; see this message in your log:
...Need to be able to open at least 16 files; please check your ulimit setting for number of file descriptors...
Vidyut

Post by Vidyut »

I have set all the parameters in the ulimit file to unlimited for both root and the dsadm user on both servers, i.e. the DataStage server and the database server.
The problem still exists.
Thanks
ArndW

Post by ArndW »

Edit your job and add a before-job subroutine call to "ExecSH" with the parameter "ulimit -a" to see what the actual runtime value is. I am willing to bet that the value you get is "16" and not unlimited.

Note that your ulimit value is often set or modified in the dsenv file.
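
As a rough sketch of that runtime check, something like the following could be saved as a script and called from ExecSH instead of a bare "ulimit -a". The helper name fd_limit_ok and the MIN_FDS variable are made up for illustration; the value 16 is simply the minimum quoted in the tsort error message. Keep in mind that this only reports the limits of the shell it runs in, which is not necessarily what the player process on node4 sees.

```shell
#!/bin/sh
# Minimum number of file descriptors, taken from the tsort error text in the log.
MIN_FDS=16

# fd_limit_ok LIMIT MIN
# Hypothetical helper (not part of DataStage): succeeds when LIMIT is
# "unlimited" or a plain integer that is at least MIN.
fd_limit_ok() {
    case "$1" in
        unlimited) return 0 ;;
        ''|*[!0-9]*) return 2 ;;   # empty or not a non-negative integer
    esac
    [ "$1" -ge "$2" ]
}

soft=$(ulimit -n)     # soft limit on open files for this shell
hard=$(ulimit -Hn)    # hard limit on open files for this shell

# Print host and user so the output shows where the check actually ran.
echo "host=$(uname -n) user=$(id -un) nofiles soft=$soft hard=$hard"

if fd_limit_ok "$soft" "$MIN_FDS"; then
    echo "nofiles soft limit looks sufficient"
else
    echo "nofiles soft limit is below $MIN_FDS or could not be read"
fi
```

The output lands in the job log, so it can be compared directly with what an interactive login as dsadm reports on each server.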
Vidyut

Post by Vidyut »

Hi,

The issue is now resolved after installing Fix Pack 1 (FP1).

Thanks for all your support.