
The job is extremely slow after join

Posted: Thu Aug 02, 2007 8:56 pm
by xicheng_my_love
I have two Sequential Files that are joined, then transformed, then upserted into Oracle.
Reading from the Sequential Files runs at about 25000 rows/sec, but after the join the link shows only about 50 rows/sec, which is intolerable.
Then I found that many records are rejected when upserting into Oracle. Maybe my data contains too many spaces, so the values are too long for the datatype (varchar) specified in Oracle, and this makes Oracle reject the records.
I set the environment variable APT_STRING_PADCHAR to 0x00 and used Trim() in the Transformer. This works, and the job runs fast.
But there is another problem: when I use Trim() in the Transformer, some of the trimmed fields are null in some records, so these records are dropped, which does not conform to the business logic.
How can I solve this problem? Thanks.

Posted: Thu Aug 02, 2007 8:58 pm
by ArndW
Put a HandleNull() call in the derivations where you TRIM data that might be null and where you don't want it to be null. Or use a simple check:

Code:

IF IsNull(In.Col) THEN SetNull() ELSE TRIM(In.Col)
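
If a null should instead be replaced with a real value (the HandleNull() idea above), a variant can be sketched as follows. This assumes a parallel Transformer, where NullToValue() is the function available for this purpose. In.Col follows the example above, and 'UNKNOWN' is only a placeholder default, not a value taken from the original job.

Code:

Trim(NullToValue(In.Col, 'UNKNOWN'))

With this form the column is never null when Trim() sees it. Whether a default is acceptable depends on the business logic; if the null must be preserved, the IsNull() test above is the safer choice.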

Posted: Thu Aug 02, 2007 9:17 pm
by xicheng_my_love
I use Derivation Substitution in the Transformer. This is what I entered: if isNull($1) then setNull() else trim($1). But it still does not work. Thanks.

Posted: Thu Aug 02, 2007 10:51 pm
by balajisr
You might be using $1 elsewhere in the transformer. Please check it.

Posted: Fri Aug 03, 2007 4:11 am
by ArndW
I had no idea that $1 works in a transform stage. What value gets used?

Posted: Fri Aug 03, 2007 4:20 am
by ray.wurlod
$1 (and indeed regular expressions) is used in Derivation Substitution, not in derivation expressions themselves. For example, if you have 200 VarChar columns and want to apply a Trim() function to all of them, select them all, then right-click and choose Derivation Substitution from the menu. The rest is explained in the GUI.
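
As a rough illustration of that substitution, here is a sketch which assumes that $1 stands for each selected column's existing derivation. The column names In.CustName and In.City are hypothetical. Entering this replacement text once in the Derivation Substitution dialog:

Code:

If IsNull($1) Then SetNull() Else Trim($1)

would be expected to rewrite the two selected derivations as:

Code:

If IsNull(In.CustName) Then SetNull() Else Trim(In.CustName)
If IsNull(In.City) Then SetNull() Else Trim(In.City)

After the substitution, the derivations contain ordinary column references. A literal $1 left in a derivation expression is not substituted.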