what stage to use?

Jay · Post by **Jay** » Fri Oct 28, 2005 10:37 am

Hi All,

My requirement is this.

Split a column value , value1.value2,
to two separate columns containing value1 and value2.

e.x. Input col1=1234.5678 --> Output col1 = 1234 and Output col2=5678

What stage do i use other than the transformer?

DS ver = 7.5 EE(PX)

Thanks in advance
jay

ArndW · Post by **ArndW** » Fri Oct 28, 2005 10:50 am

I took a look at the functions available to you in a modify stage and there is nothing that will do this for you on a numeric datatype. I would use a transform stage and (if it is a string datatype) using the FIELD function; if it is a decimal field then INT() and ({Original}-INT({Original})*{precision}

Jay · Post by **Jay** » Fri Oct 28, 2005 11:09 am

Thanks. I was not getting any ideas on modify stage either.

Raog · Post by **Raog** » Fri Oct 28, 2005 11:38 am

HI,

U dont require any other stage other than transformer. U only require target to have 2 columns to store the value 1 and value 2.

Thx,
Rao.

Jay wrote:Hi All,

My requirement is this.

Split a column value , value1.value2,
to two separate columns containing value1 and value2.

e.x. Input col1=1234.5678 --> Output col1 = 1234 and Output col2=5678

What stage do i use other than the transformer?

DS ver = 7.5 EE(PX)

Thanks in advance
jay

ArndW · Post by **ArndW** » Fri Oct 28, 2005 12:44 pm

Rao,

Transform stages in Px can and should be avoided where possible. That is why Jay asked if it could be done any other way - the modify and copy stages are the most likely candidates; but the INDEX function isn't built into either of them, and their intrinsic function sets are more geared toward type conversions than content manipulations.

track_star · Post by **track_star** » Fri Oct 28, 2005 1:19 pm

Arnd is 110% correct regarding transformers in PX jobs. Avoid them like the plague if you are expecting any kind of performance out of your job (which is the reason we're all using PX in the first place, right?!?!?). Jay, my suggestion is this:

1. Create a shell script that uses sed and/or awk to parse the string and create the two output values, then create a wrapper in the GUI and wrap that script.

2. Create a simple C++ program in a buildop that uses builtin C++ string manipulation functions to parse the string and let the GUI create all of the necessary files and the stage.

ray.wurlod · Post by **ray.wurlod** » Fri Oct 28, 2005 6:44 pm

What is the source - text file or database table or other?

What is the data type that contains 1234.5678 ?

Jay · Post by **Jay** » Mon Oct 31, 2005 1:36 pm

Thanks for the suggestions. Here are more details.

The source will be an Oracle load table with a max of about 25 million records.

The datatype is string, Varchar2(20), to be exact. Its basically 2 ids in 1 column.

Target is also an Oracle table.But the columns all have their meaningful, respective, individual (etc etc) values.

Now, thinking about it, I can also do a substr() in Oracle stage itself.

Please let me know if there are any other ideas.

thanks
jay

ameyvaidya · Post by **ameyvaidya** » Tue Nov 01, 2005 6:12 pm

Do not know much about the performance aspects (vis-a-vis a transformer) of this, but cant a column import stage be used for this?

ThilSe · Post by **ThilSe** » Wed Nov 02, 2005 1:25 am

Hi,

You can also try using the "Column Import" stage with "." as a delimitter

Regards
Senthil

Jay · Post by **Jay** » Wed Nov 02, 2005 11:42 am

thanks guys....i can see which one performs better....i'll keep u posted

my environment has not yet been set up....so might just take some time....

roy · Post by **roy** » Thu Nov 03, 2005 3:22 am

Hi,
Since you say the source is in Oracle DB how about user define sql that will split the column in the select?
Check performance it might be good for you performance wise and eliminate C++ programing.

IHTH,

Jay · Post by **Jay** » Thu Nov 03, 2005 10:03 am

Yes. so performance is of vital importance....

also i will selecting from a load table and inserting into a partitioned table...

already there are problems with sub-partitions...it seems there is a bug with Oracle 9i ver 9.2.0.4.0

if you create subpartitions, then the data dictionary gets corrupted...the DBAs suggested going with list partioning on a composite key....

so lets see how many more hurdles arise....it should be fine as long as they are not too high...
jay

Jay · Post by **Jay** » Mon Nov 14, 2005 11:26 am

Hi all,

I got the job to run. i have not done performance testing yet as i dont have a big enough test file.

In the Oracle stage, in the query i say:

nvl(to_number(substr(ZONE_ID,6,4),'9999'),'9999') as ZONE_CODE

ZONE_ID is varchar(20). But actual value is like 1234.5678.

Column ZONE_CODE is defined as Integer (4) in the Oracle stage.

i get a warning for implicit conversions.

SRC_ORA: When checking operator: When binding output interface field "ZONE_CODE" to field "ZONE_CODE": Implicit conversion; from source type "decimal[38,10]" to result type "int32": Possible range/precision limitation

Any ideas as to what i can do to avoid these warnings. This warning is in the Oracle stage itself.

How do i use the modify stage to do the implicit conversions or is there any other way?

Thanks
jay

ameyvaidya · Post by **ameyvaidya** » Mon Nov 14, 2005 2:06 pm

Hi Jay,

Recommend reading the column as a decimal 38,10 and converting it to an Int32 in a modify stage downstream.

DSXchange

what stage to use?

what stage to use?

Re: what stage to use?