
After Migrating to V8.1 Issue with Sort Stage

Posted: Wed Oct 13, 2010 1:00 am
by Vidyut
We just migrated our development environment from 7.5.2 to V8.1.
I have done all the configuration known to me.
We have one DataStage server (04) and one database server (05).
Everything is running fine except the Sort stage.
The sort runs fine if I change the Sort Utility from DataStage to UNIX.

We are getting the error on node4, which belongs to the database server.

#### STAGE: Sort_31
## Operator
tsort
## Operator options
-key 'account_name'
-asc
-stable
## General options
[ident('Sort_31'); jobmon_ident('Sort_31')]
## Inputs
0< [] 'DB2_UDB_Enterprise_40:DSLink2.v'
## Outputs
0> [] 'Sort_31:DSLink30.v'
;
#################################################################
#### STAGE: Data_Set_35
## Operator
copy
## General options
[ident('Data_Set_35')]
## Inputs
0< [] 'Sort_31:DSLink30.v'
## Outputs
0>| [ds] '/home/dsadm/practice/pooja/sample.ds'
;
#################################################################
#### STAGE: DB2_UDB_Enterprise_40
## Operator
db2read
## Operator options
-query 'select account_name from dsedw.account_arngmnt_dimn fetch first 10 rows only'
-dbname 'edw_d_d1'
-server 'dsedw'
-client_instance 'dsedw'
-user 'dsadm'
-password '[&__V0S40P1_password]'
## General options
[ident('DB2_UDB_Enterprise_40'); jobmon_ident('DB2_UDB_Enterprise_40')]
## Outputs
0> [modify (
account_name:nullable string[]=account_name;
)] 'DB2_UDB_Enterprise_40:DSLink2.v'
;
# End of OSH code

Item #: 6
Event ID: 80
Timestamp: 2010-10-13 11:29:47
Type: Warning
User Name: dsadm
Message Id: DSTAGE_RUN_W_0005
Message: Parallel debug is turned on (environment variable DS_PXDEBUG is set)

Item #: 7
Event ID: 81
Timestamp: 2010-10-13 11:29:47
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFCN-00001
Message: main_program: IBM WebSphere DataStage Enterprise Edition 8.1.0.5040
Copyright (c) 2001, 2005-2008 IBM Corporation. All rights reserved

Item #: 8
Event ID: 82
Timestamp: 2010-10-13 11:29:47
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFCN-00006
Message: main_program: conductor uname: -s=AIX; -r=3; -v=5; -n=PNBDWH04; -m=00CE74414C00

Item #: 9
Event ID: 83
Timestamp: 2010-10-13 11:29:47
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TOSH-00002
Message: main_program: orchgeneral: loaded
orchsort: loaded
orchstats: loaded

Item #: 10
Event ID: 84
Timestamp: 2010-10-13 11:29:48
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TCOS-00023
Message: main_program: Dump:
{
text="tsort
-key 'account_name'
-asc
-stable
[ident('Sort_31'); jobmon_ident('Sort_31')]
0< [] 'DB2_UDB_Enterprise_40:DSLink2.v'
0> [] 'Sort_31:DSLink30.v'
;
copy
[ident('Data_Set_35')]
0< [] 'Sort_31:DSLink30.v'
0>| [ds] '/home/dsadm/practice/pooja/sample.ds'
;
db2read
-query 'select account_name from dsedw.account_arngmnt_dimn fetch first 10 rows only'
-dbname 'edw_d_d1'
-server 'dsedw'
-client_instance 'dsedw'
-user 'dsadm'
-password '******'
[ident('DB2_UDB_Enterprise_40'); jobmon_ident('DB2_UDB_Enterprise_40')]
0> [modify(account_name:nullable string=account_name;)] 'DB2_UDB_Enterprise_40:DSLink2.v'
;",
line=1, column=1, name="", qualname="",
op={
text="tsort
-key 'account_name'
-asc
-stable
[ident('Sort_31'); jobmon_ident('Sort_31')]
0< [] 'DB2_UDB_Enterprise_40:DSLink2.v'
0> [] 'Sort_31:DSLink30.v'",
line=1, column=1, name=tsort, qualname=Sort_31,
wrapout={},
wrapperfile=tsort, kind=non_wrapper_cdi_op, exec_mode=none,
args="'account_name'-asc'-stable'",
input={ text="
0< [] 'DB2_UDB_Enterprise_40:DSLink2.v'", line=6,
column=1, name="", qualname="Sort_31[i0]",
data="DB2_UDB_Enterprise_40:DSLink2.v"
},
output={ text="
0> [] 'Sort_31:DSLink30.v'", line=7, column=1,
name="", qualname="Sort_31[o0]",
data="/home/dsadm/practice/pooja/sample.ds"
}
},
op={
text="
db2read
-query 'select account_name from dsedw.account_arngmnt_dimn fetch first 10 rows only'
-dbname 'edw_d_d1'
-server 'dsedw'
-client_instance 'dsedw'
-user 'dsadm'
-password '******'
[ident('DB2_UDB_Enterprise_40'); jobmon_ident('DB2_UDB_Enterprise_40')]
0> [modify(account_name:nullable string=account_name;)] 'DB2_UDB_Enterprise_40:DSLink2.v'",
line=14, column=1, name=db2read, qualname=DB2_UDB_Enterprise_40,
wrapout={},
wrapperfile=db2read, kind=non_wrapper_cdi_op, exec_mode=none,
args="'select account_name from dsedw.account_arngmnt_dimn fetch first 10 rows only'-dbname'edw_d_d1'-server'dsedw'-client_instance'dsedw'-user'dsadm'-password'******'",
output={
text="
0> [modify(account_name:nullable string=account_name;)] 'DB2_UDB_Enterprise_40:DSLink2.v'",
line=22, column=1, name="",
qualname="DB2_UDB_Enterprise_40[o0]",
data="DB2_UDB_Enterprise_40:DSLink2.v",
outadapt="account_name:nullable string=account_name;"
}
},
data={ text="
0< [] 'DB2_UDB_Enterprise_40:DSLink2.v'", line=6, column=1,
name="DB2_UDB_Enterprise_40:DSLink2.v",
qualname="DB2_UDB_Enterprise_40:DSLink2.v",
partwrapout={},
collwrapout={},
dir=flow, kind=ds, inrefcount=1, writer=DB2_UDB_Enterprise_40,
reader=Sort_31, pp=none, trunc=default,
ident="DB2_UDB_Enterprise_40:DSLink2.v"
},
data={ text="
0>| [ds] '/home/dsadm/practice/pooja/sample.ds'", line=12,
column=1, name="/home/dsadm/practice/pooja/sample.ds",
qualname="/home/dsadm/practice/pooja/sample.ds",
partwrapout={},
collwrapout={},
dir=output, kind=ds, writer=Sort_31, reader="", pp=none,
trunc=replace, ident="/home/dsadm/practice/pooja/sample.ds"
}
}
.

Item #: 11
Event ID: 85
Timestamp: 2010-10-13 11:29:48
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFSC-00001
Message: main_program: APT configuration file: /home/dsadm/Ascential/DataStage/Configurations/default.apt
{
node "node1"
{
fastname "PNBDWH04"
pools "" "node1"
resource disk "/dsprocessph2/dsprocess/DataStage/Datasets" {pools ""}
resource scratchdisk "/dsprocessph2/dsprocess/DataStage/Scratch" {pools "" "sort"}
}
node "node2"
{
fastname "PNBDWH04"
pools "" "node2"
resource disk "/dsprocessph2/dsprocess/DataStage/Datasets" {pools ""}
resource scratchdisk "/dsprocessph2/dsprocess/DataStage/Scratch" {pools "" "sort"}
}
node "node3"
{
fastname "PNBDWH04"
pools "" "node3"
resource disk "/dsprocessph2/dsprocess/DataStage/Datasets" {pools ""}
resource scratchdisk "/dsprocessph2/dsprocess/DataStage/Scratch" {pools "" "sort"}
}
node "node4"
{
fastname "PNBDWH05"
pools "" "node4"
resource disk "/bmetlscratch2/dsprocess/DataStage/Datasets" {pools ""}
resource scratchdisk "/bmetlscratch2/dsprocess/DataStage/Scratch" {pools "" "sort"}
}
}

Item #: 12
Event ID: 86
Timestamp: 2010-10-13 11:29:50
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFSU-00016
Message: main_program: This step has 2 datasets:
ds0: {op0[1p] (sequential DB2_UDB_Enterprise_40)
eOther(APT_HashPartitioner { key={ value=account_name,
subArgs={ asc }
}
})<>eCollectAny
op1[4p] (parallel Sort_31)}
ds1: {op1[4p] (parallel Sort_31)
[pp] =>
/home/dsadm/practice/pooja/sample.ds}
It has 2 operators:
op0[1p] {(sequential DB2_UDB_Enterprise_40)
on nodes (
node4[op0,p0]
)}
op1[4p] {(parallel Sort_31)
on nodes (
node1[op1,p0]
node2[op1,p1]
node3[op1,p2]
node4[op1,p3]
)}
It runs 5 processes on 4 nodes.

Item #: 13
Event ID: 87
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TUTL-00031
Message: node_node4: The open files limit is 102400; raising to 9223372036854775807.

Item #: 14
Event ID: 88
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFPM-00324
Message: Sort_31,0: Calling runLocally: step=0, node=node1, op=1, ptn=0

Item #: 15
Event ID: 89
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFPM-00324
Message: Sort_31,1: Calling runLocally: step=0, node=node2, op=1, ptn=1

Item #: 16
Event ID: 90
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFPM-00324
Message: Sort_31,2: Calling runLocally: step=0, node=node3, op=1, ptn=2

Item #: 17
Event ID: 91
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFPM-00324
Message: DB2_UDB_Enterprise_40,0: Calling runLocally: step=0, node=node4, op=0, ptn=0

Item #: 18
Event ID: 92
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFPM-00324
Message: Sort_31,3: Calling runLocally: step=0, node=node4, op=1, ptn=3

Item #: 19
Event ID: 93
Timestamp: 2010-10-13 11:29:51
Type: Fatal
User Name: dsadm
Message Id: IIS-DSEE-TFOR-00001
Message: Sort_31,3: Failure during execution of operator logic. [api/operator_rep.C:399]

Item #: 20
Event ID: 94
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFOR-00163
Message: Sort_31,3: Input 0 consumed 0 records.

Item #: 21
Event ID: 95
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFOR-00094
Message: Sort_31,3: Output 0 produced 0 records.

Item #: 22
Event ID: 96
Timestamp: 2010-10-13 11:29:51
Type: Fatal
User Name: dsadm
Message Id: IIS-DSEE-TOSO-00027
Message: Sort_31,3: Fatal Error: Need to be able to open at least 16 files; please check your ulimit setting for number of file descriptors [sort/merger.C:1087]

Item #: 23
Event ID: 97
Timestamp: 2010-10-13 11:29:51
Type: Fatal
User Name: dsadm
Message Id: IIS-DSEE-TFOR-00001
Message: Sort_31,3: Failure during execution of operator logic. [api/operator_rep.C:399]

Item #: 24
Event ID: 98
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFOR-00163
Message: Sort_31,3: Input 0 consumed 0 records.

Item #: 25
Event ID: 99
Timestamp: 2010-10-13 11:29:51
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TFOR-00094
Message: Sort_31,3: Output 0 produced 0 records.

Item #: 26
Event ID: 100
Timestamp: 2010-10-13 11:29:51
Type: Fatal
User Name: dsadm
Message Id: IIS-DSEE-TOSO-00014
Message: Sort_31,3: Fatal Error: Sorter handshake read failed: unexpected EOF [sort/merger.C:418]

Item #: 27
Event ID: 101
Timestamp: 2010-10-13 11:29:51
Type: Fatal
User Name: dsadm
Message Id: IIS-DSEE-TFPM-00192
Message: node_node4: Player 2 terminated unexpectedly. [processmgr/player.C:149]

Item #: 28
Event ID: 102
Timestamp: 2010-10-13 11:29:56
Type: Fatal
User Name: dsadm
Message Id: IIS-DSEE-TFPM-00338
Message: main_program: APT_PMsectionLeader(4, node4), player 2 - Unexpected exit status 1. [processmgr/slprocess.C:368]

Item #: 29
Event ID: 103
Timestamp: 2010-10-13 11:29:56
Type: Fatal
User Name: dsadm
Message Id: IIS-DSEE-TFSC-00011
Message: main_program: Step execution finished with status = FAILED. [sc/sc_api.C:242]

Item #: 30
Event ID: 104
Timestamp: 2010-10-13 11:29:56
Type: Info
User Name: dsadm
Message Id: IIS-DSEE-TCOS-00026
Message: main_program: Startup time, 0:08; production run time, 0:00.

Item #: 31
Event ID: 105
Timestamp: 2010-10-13 11:29:57
Type: Control
User Name: dsadm
Message Id: DSTAGE_RUN_I_0075
Message: Job Untitled4 aborted.

Please help.

Posted: Wed Oct 13, 2010 1:09 am
by ArndW
I would start with your ulimit settings, at both the user level and the system limits level; see this part of the message:
"...Need to be able to open at least 16 files; please check your ulimit setting for number of file descriptors..."

Posted: Wed Oct 13, 2010 1:17 am
by Vidyut
I have set all the parameters in the ulimit file to unlimited for root as well as for the dsadm user on both servers, i.e. DataStage and database.
The problem still exists.
Thanks

Posted: Wed Oct 13, 2010 1:26 am
by ArndW
Edit your job and add a before-job subroutine call to "ExecSH" with the parameter "ulimit -a" and see what the actual runtime value is. I am willing to bet that the value you get is "16" and not unlimited.

Note that your ulimit value is often set or modified in the dsenv file.
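
As a rough sketch of that runtime check, the snippet below prints the soft and hard open-file limits the shell actually sees and compares the soft limit against the 16 descriptors named in the sort error. The helper name check_nofiles is made up for illustration, and the snippet assumes a shell whose ulimit accepts -S, -H and -n (ksh, bash and similar). Since the failing partition runs on node4 (PNBDWH05), the same check would presumably need to be run on that host as well.

```shell
#!/bin/sh
# Sketch only: report the open-file limits seen by this shell at run time.
# check_nofiles is a hypothetical helper, not part of DataStage.

soft=$(ulimit -S -n)   # soft limit: the value a process actually runs with
hard=$(ulimit -H -n)   # hard limit: the ceiling the soft limit can be raised to
echo "soft nofiles: $soft"
echo "hard nofiles: $hard"

# Compare a limit against the minimum of 16 files named in the sort error.
# $1 is a value as printed by ulimit: "unlimited" or a number.
check_nofiles() {
    case "$1" in
        unlimited) echo "ok" ;;
        *)
            if [ "$1" -ge 16 ]; then
                echo "ok"
            else
                echo "too low"
            fi
            ;;
    esac
}

echo "soft limit check: $(check_nofiles "$soft")"
```

Saved as a script, this could be called from the before-job ExecSH in place of the plain "ulimit -a"; the output should then show up in the job log alongside the other entries.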

Posted: Thu Oct 21, 2010 4:03 am
by Vidyut
Hi

The issue is now resolved after installing Fix Pack 1 (FP1).

Thanks for all your support.....