Wrapper - Difference in execution (Sequential vs. Parallel)

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

nishadkapadia
Charter Member
Posts: 47
Joined: Fri Mar 18, 2005 5:59 am

Wrapper - Difference in execution (Sequential vs. Parallel)

Post by nishadkapadia »

Hi,

OS: AIX
PX :7.5.1
Config file:-8 nodes

I have a job which does the following:-
a) Extraction of Header Record
b) Extraction of Tail Record
c) Row Count
d) Delimiter Sanity Check
e) FileName Check

To do this, I have a couple of wrappers and a Transformer in a job (Multiple Instance is not checked). The wrapper is executed in parallel mode within the job.

This job is called through a sequence looping through files. The job fails after processing some files (around 80). If the wrapper is executed in sequential mode, it runs fine.

I have set the APT_DEBUG_SUBPROC environment variable to true, but it gives no information that points to the actual problem. It is possible that I am not looking in the right place.

My understanding is that a wrapper should be executed in sequential mode; otherwise it executes the same command once per node in the configuration file. Please correct me if this understanding is wrong.
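To illustrate the point outside DataStage (the file path and node count below are made up for illustration): a parallel operator runs its wrapped command once per partition, so on an 8-node configuration a wrapped "wc -l" fires eight times instead of once.

```shell
#!/bin/sh
# Illustrative only (file path and node count are invented):
# a parallel wrapper runs the wrapped command once per node,
# so on an 8-node configuration "wc -l" is invoked eight times.
printf 'a\nb\nc\n' > /tmp/sample.txt

# Sequential mode: one invocation, one count
echo "sequential: $(wc -l < /tmp/sample.txt)"

# Parallel mode: one invocation per node in the config file
nodes=8
i=1
while [ "$i" -le "$nodes" ]; do
    echo "node $i: $(wc -l < /tmp/sample.txt)"
    i=$((i + 1))
done
```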

However, I still don't understand why it fails when the wrapper is executed in parallel mode. Has anyone seen this problem, or am I missing something here?

I searched the forum but was unable to find a similar thread. If this has been covered earlier, it would be great if anyone could direct me to it.

Thanks in advance for all your help.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

What is the command being wrapped in the Wrapper stage?
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
nishadkapadia
Charter Member
Posts: 47
Joined: Fri Mar 18, 2005 5:59 am

Post by nishadkapadia »

kumar_s wrote: What is the command being wrapped in the Wrapper stage?
Hi,

It's a simple "wc -l" command, with <<filename>> supplied as an argument (value option) within the Wrapper stage.
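As an aside (the path below is illustrative), the two invocation styles behave differently: passing the filename as an argument makes wc echo the name back alongside the count, while redirecting stdin yields the bare count, which is easier to consume downstream.

```shell
#!/bin/sh
# Sketch, path invented for illustration: two ways to run wc -l.
printf '1\n2\n3\n4\n' > /tmp/data.txt
wc -l /tmp/data.txt      # prints the count followed by the filename
wc -l < /tmp/data.txt    # prints the bare count only
```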
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

What are the symptoms of "failure"?

How many headers do you want? Does that suggest why sequential mode works and parallel mode does not?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
nishadkapadia
Charter Member
Posts: 47
Joined: Fri Mar 18, 2005 5:59 am

Post by nishadkapadia »

Yes, that is the most confusing part.
Without any changes to the job, it works in sequential mode; however, it fails on some random file in parallel mode.

Job Design
From the Sequential File stage I read 4 records (enough for the delimiter check).

For the header I use a Head stage and check in a Transformer based on the DCount of the record, since the Head stage returns a record from each partition and one cannot be sure which partition the header record is in. There may be a better way to do this.

For the row count I use a wrapper with a simple command line, "wc -l".

For the footer record I use a wrapper, since a Tail stage will invariably read the entire file into a virtual dataset before applying the tail operation.
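A plain tail is the kind of command such a footer wrapper could run (the sample file contents below are invented for illustration); it picks off the last record directly rather than streaming every row through a virtual dataset:

```shell
#!/bin/sh
# Sketch with an invented sample file: grab the last (footer) record.
printf 'HDR|20060101\nrow1\nrow2\nTRL|2\n' > /tmp/feed.txt
tail -n 1 /tmp/feed.txt    # emits only the trailer record
```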

Code:

Seq        -->     Copy --> Head Stage --> Transformer --> Seq
(4 rows)                   --> Wrapper ( RowCount)         
                               --> Wrapper ( Footer Record ).
There are no additional messages.
Please suggest whether I need to turn on some extra logging.

Thanks in anticipation.
ray.wurlod wrote:What are the symptoms of "failure"?

How many headers do you want? Does that suggest why sequential mode works and parallel mode does not? ...
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

How do you manage writing to a Sequential File stage as target in parallel mode? Is a Collector present on its input link?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kumar_s
Charter Member
Charter Member
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

What is the error message you are getting? Post the log. If the log is cryptic, try to reset the job; if you find "From the previous run..." entries, post that log as well.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
nishadkapadia
Charter Member
Posts: 47
Joined: Fri Mar 18, 2005 5:59 am

Post by nishadkapadia »

Hi,

Yes, I have a collector on the Sequential File stage.

I am not able to find the error message; maybe I am missing something here.

The log is attached inline. I have replaced the actual filename of my job with '<<ACTUAL_FILE_NAME>>' in the log. This particular file had 1000 records, and I had to kill the job.

I have set the environment variables *DUMP_SCORE, *DUMP, *ECHO, *EXPLAIN and *DEBUG_SUBPROC to True for any possible extra information. Please suggest whether I am missing something here.

I did receive a Lookup warning in the Director; however, I felt that warnings are just flags to look out for. Could it be the reason?
Lkp_GenericFile,0: Ignoring duplicate entry at table record 1; no further warnings will be issued for this table


main_program: Explanation:
Step has 16 operators.
???, invoked with args: -nrecs 1
input port 0 bound to data entity "RowCount_NK_195:Ln_Rd_RowCnt_GenericFile.v"
output port 0 bound to data entity "Hd_GenericFile:Ln_Rd_Hd_Generic_File.v"
???, invoked with args: -schema record {final_delim=end} ( Output_Column: string[max=5000] {quote=none}; ) -file <<FILE NAME>> -rejects fail -reportProgress yes -first 4 -sourceNameField fileNameColumn
output port 0 bound to data entity "Sqfl_GenericFile:Ln_Rd_Sqfl_GenericFile.v"
???
input port 0 bound to data entity "Sqfl_GenericFile:Ln_Rd_Sqfl_GenericFile.v"
output port 0 bound to data entity "Cpy_GenericFile:Ln_Rd_Cpy_GenericFile.v"
output port 1 bound to data entity "Cpy_GenericFile:Ln_Rd_Cpy_GenericFile_Header.v"
output port 2 bound to data entity "Cpy_GenericFile:Ln_Rd_Cpy_Generic_File_Header.v"
output port 3 bound to data entity "Cpy_GenericFile:DSLink7.v"
???, invoked with args: -flag run -name V0S118_Test_Wrapper_DynSchema_Tx_GenericFile
input port 0 bound to data entity "Cpy_GenericFile:Ln_Rd_Cpy_GenericFile_Header.v"
output port 0 bound to data entity "Tx_GenericFile:Ln_Rd_Tx_Delimiter_TransDate.v"
???, invoked with args: -schema record {final_delim=end, delim=",", quote=double} ( Delimiter_Col: string; fileNameColumn: string[max=50]; Action: string[max=50]; ) -file /db2fs1/nishad/Test_File_1.txt -append -rejects continue
input port 0 bound to data entity "Transformer_188:DSLink189.v"
???, invoked with args: -nrecs 1
input port 0 bound to data entity "Cpy_GenericFile:Ln_Rd_Cpy_Generic_File_Header.v"
output port 0 bound to data entity "Hdr_Generic_File:Ln_Rd_Hd_Generic_File.v"
???, invoked with args: -flag run -name V0S128_Test_Wrapper_DynSchema_Trn_Header_FileName
input port 0 bound to data entity "Hdr_Generic_File:Ln_Rd_Hd_Generic_File.v"
output port 0 bound to data entity "Trn_Header_FileName:Ln_Rd_Trn_Header_FileName.v"
???, invoked with args: -schema record {final_delim=end, delim=",", quote=double} ( Header_Record: string {quote=none}; fileNameColumn: string; Trans_Date: string[max=38]; ) -file /db2fs1/nishad/Header_FileName.txt -append -rejects continue
input port 0 bound to data entity "Trn_Header_FileName:Ln_Rd_Trn_Header_FileName.v"
???, invoked with args: -flag run -name V4S2_Test_Wrapper_DynSchema_Trn_Tail_GenericFile -argvalue FILENAME=<<ACTUAL_FILE NAME>>
input port 0 bound to data entity "Tail_NK_194:Ln_Rd_GenericFile.v"
output port 0 bound to data entity "Trn_Tail_GenericFile:DSLink145.v"
???, invoked with args: -flag run -name V0S169_Test_Wrapper_DynSchema_Trn_Generic_File -argvalue FILENAME=<<FILE NAME>>
input port 0 bound to data entity "Hd_GenericFile:Ln_Rd_Hd_Generic_File.v"
output port 0 bound to data entity "Trn_Generic_File:DSLink170.v"
???, invoked with args: -table -key File_Name -keep -ifNotFound fail
input port 0 bound to data entity "Trn_Generic_File:DSLink170.v"
input port 1 bound to data entity "Trn_Tail_GenericFile:DSLink145.v"
output port 0 bound to data entity "Lkp_GenericFile:DSLink177.v"
???, invoked with args: -schema record {final_delim=end, delim=",", quote=double} ( Tail_Column: string; fileName: string; Actual_Row_Count: int32; Seq_Row_Count: string[max=10]; Seq_Extraction_Date: string[max=38]; ) -file /db2fs1/nishad/Tail_FileName.txt -append -rejects continue
input port 0 bound to data entity "Lkp_GenericFile:DSLink177.v"
???, invoked with args: -keep last -key IsChangedVar
input port 0 bound to data entity "Tx_GenericFile:Ln_Rd_Tx_Delimiter_TransDate.v"
output port 0 bound to data entity "Rd_Generic_File:Ln_Rd_Generic_File.v"
???, invoked with args: -flag run -name V0S188_Test_Wrapper_DynSchema_Transformer_188
input port 0 bound to data entity "Rd_Generic_File:Ln_Rd_Generic_File.v"
output port 0 bound to data entity "Transformer_188:DSLink189.v"
Operator "Tail_NK_194" , invoked with args: <<FILE NAME>>
input port 0 bound to data entity "Cpy_GenericFile:DSLink7.v"
output port 0 bound to data entity "Tail_NK_194:Ln_Rd_GenericFile.v"
Operator "RowCount_NK_195" , invoked with args: <<FILE NAME>>
input port 0 bound to data entity "Cpy_GenericFile:Ln_Rd_Cpy_GenericFile.v"
output port 0 bound to data entity "RowCount_NK_195:Ln_Rd_RowCnt_GenericFile.v"
Step has 16 data entities.
Data "RowCount_NK_195:Ln_Rd_RowCnt_GenericFile.v"(an Orchestrate data set)
written by operator "RowCount_NK_195"
read by operator "Hd_GenericFile"
Data "Hd_GenericFile:Ln_Rd_Hd_Generic_File.v"(an Orchestrate data set)
written by operator "Hd_GenericFile"
read by operator "Trn_Generic_File"
Data "Sqfl_GenericFile:Ln_Rd_Sqfl_GenericFile.v"(an Orchestrate data set)
written by operator "Sqfl_GenericFile"
read by operator "Cpy_GenericFile"
Data "Cpy_GenericFile:Ln_Rd_Cpy_GenericFile.v"(an Orchestrate data set)
written by operator "Cpy_GenericFile"
read by operator "RowCount_NK_195"
Data "Cpy_GenericFile:Ln_Rd_Cpy_GenericFile_Header.v"(an Orchestrate data set)
written by operator "Cpy_GenericFile"
read by operator "Tx_GenericFile"
Data "Cpy_GenericFile:Ln_Rd_Cpy_Generic_File_Header.v"(an Orchestrate data set)
written by operator "Cpy_GenericFile"
read by operator "Hdr_Generic_File"
Data "Cpy_GenericFile:DSLink7.v"(an Orchestrate data set)
written by operator "Cpy_GenericFile"
read by operator "Tail_NK_194"
Data "Tx_GenericFile:Ln_Rd_Tx_Delimiter_TransDate.v"(an Orchestrate data set)
written by operator "Tx_GenericFile"
read by operator "Rd_Generic_File"
Data "Transformer_188:DSLink189.v"(an Orchestrate data set)
written by operator "Transformer_188"
read by operator "Sqfl_Rd_TransDate"
Data "Hdr_Generic_File:Ln_Rd_Hd_Generic_File.v"(an Orchestrate data set)
written by operator "Hdr_Generic_File"
read by operator "Trn_Header_FileName"
Data "Trn_Header_FileName:Ln_Rd_Trn_Header_FileName.v"(an Orchestrate data set)
written by operator "Trn_Header_FileName"
read by operator "Sqfl_Header_FileName"
Data "Tail_NK_194:Ln_Rd_GenericFile.v"(an Orchestrate data set)
written by operator "Tail_NK_194"
read by operator "Trn_Tail_GenericFile"
Data "Trn_Tail_GenericFile:DSLink145.v"(an Orchestrate data set)
written by operator "Trn_Tail_GenericFile"
read by operator "Lkp_GenericFile"
Data "Trn_Generic_File:DSLink170.v"(an Orchestrate data set)
written by operator "Trn_Generic_File"
read by operator "Lkp_GenericFile"
Data "Lkp_GenericFile:DSLink177.v"(an Orchestrate data set)
written by operator "Lkp_GenericFile"
read by operator "Sequential_File_178"
Data "Rd_Generic_File:Ln_Rd_Generic_File.v"(an Orchestrate data set)
written by operator "Rd_Generic_File"
read by operator "Transformer_188"


Thanks in anticipation.