Page 1 of 1

what stage to use?

Posted: Fri Oct 28, 2005 10:37 am
by Jay
Hi All,

My requirement is this.

Split a column value , value1.value2,
to two separate columns containing value1 and value2.

e.x. Input col1=1234.5678 --> Output col1 = 1234 and Output col2=5678

What stage do i use other than the transformer?

DS ver = 7.5 EE(PX)

Thanks in advance
jay

Posted: Fri Oct 28, 2005 10:50 am
by ArndW
I took a look at the functions available to you in a modify stage and there is nothing that will do this for you on a numeric datatype. I would use a transform stage and (if it is a string datatype) using the FIELD function; if it is a decimal field then INT() and ({Original}-INT({Original})*{precision}

Posted: Fri Oct 28, 2005 11:09 am
by Jay
Thanks. I was not getting any ideas on modify stage either.

Re: what stage to use?

Posted: Fri Oct 28, 2005 11:38 am
by Raog
HI,

U dont require any other stage other than transformer. U only require target to have 2 columns to store the value 1 and value 2.

Thx,
Rao.
Jay wrote:Hi All,

My requirement is this.

Split a column value , value1.value2,
to two separate columns containing value1 and value2.

e.x. Input col1=1234.5678 --> Output col1 = 1234 and Output col2=5678

What stage do i use other than the transformer?

DS ver = 7.5 EE(PX)

Thanks in advance
jay

Posted: Fri Oct 28, 2005 12:44 pm
by ArndW
Rao,

Transform stages in Px can and should be avoided where possible. That is why Jay asked if it could be done any other way - the modify and copy stages are the most likely candidates; but the INDEX function isn't built into either of them, and their intrinsic function sets are more geared toward type conversions than content manipulations.

Posted: Fri Oct 28, 2005 1:19 pm
by track_star
Arnd is 110% correct regarding transformers in PX jobs. Avoid them like the plague if you are expecting any kind of performance out of your job (which is the reason we're all using PX in the first place, right?!?!?). Jay, my suggestion is this:

1. Create a shell script that uses sed and/or awk to parse the string and create the two output values, then create a wrapper in the GUI and wrap that script.

2. Create a simple C++ program in a buildop that uses builtin C++ string manipulation functions to parse the string and let the GUI create all of the necessary files and the stage.

Posted: Fri Oct 28, 2005 6:44 pm
by ray.wurlod
What is the source - text file or database table or other?

What is the data type that contains 1234.5678 ?

Posted: Mon Oct 31, 2005 1:36 pm
by Jay
Thanks for the suggestions. Here are more details.

The source will be an Oracle load table with a max of about 25 million records.

The datatype is string, Varchar2(20), to be exact. Its basically 2 ids in 1 column.

Target is also an Oracle table.But the columns all have their meaningful, respective, individual (etc etc) values.

Now, thinking about it, I can also do a substr() in Oracle stage itself.

Please let me know if there are any other ideas.

thanks
jay

Posted: Tue Nov 01, 2005 6:12 pm
by ameyvaidya
Do not know much about the performance aspects (vis-a-vis a transformer) of this, but cant a column import stage be used for this?

Posted: Wed Nov 02, 2005 1:25 am
by ThilSe
Hi,

You can also try using the "Column Import" stage with "." as a delimitter

Regards
Senthil

Posted: Wed Nov 02, 2005 11:42 am
by Jay
thanks guys....i can see which one performs better....i'll keep u posted

my environment has not yet been set up....so might just take some time....

Posted: Thu Nov 03, 2005 3:22 am
by roy
Hi,
Since you say the source is in Oracle DB how about user define sql that will split the column in the select?
Check performance it might be good for you performance wise and eliminate C++ programing.

IHTH,

Posted: Thu Nov 03, 2005 10:03 am
by Jay
Yes. so performance is of vital importance....

also i will selecting from a load table and inserting into a partitioned table...

already there are problems with sub-partitions...it seems there is a bug with Oracle 9i ver 9.2.0.4.0

if you create subpartitions, then the data dictionary gets corrupted...the DBAs suggested going with list partioning on a composite key....

so lets see how many more hurdles arise....it should be fine as long as they are not too high...
jay

Posted: Mon Nov 14, 2005 11:26 am
by Jay
Hi all,

I got the job to run. i have not done performance testing yet as i dont have a big enough test file.

In the Oracle stage, in the query i say:

nvl(to_number(substr(ZONE_ID,6,4),'9999'),'9999') as ZONE_CODE

ZONE_ID is varchar(20). But actual value is like 1234.5678.

Column ZONE_CODE is defined as Integer (4) in the Oracle stage.

i get a warning for implicit conversions.

SRC_ORA: When checking operator: When binding output interface field "ZONE_CODE" to field "ZONE_CODE": Implicit conversion; from source type "decimal[38,10]" to result type "int32": Possible range/precision limitation

Any ideas as to what i can do to avoid these warnings. This warning is in the Oracle stage itself.

How do i use the modify stage to do the implicit conversions or is there any other way?

Thanks
jay

Posted: Mon Nov 14, 2005 2:06 pm
by ameyvaidya
Hi Jay,

Recommend reading the column as a decimal 38,10 and converting it to an Int32 in a modify stage downstream.