varchar to integer - Moderator

Archive of postings to DataStageUsers@Oliver.com. This forum intended only as a reference and cannot be posted to.

Moderators: chulett, rschirm

Locked
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

varchar to integer - Moderator

Post by admin »

This is a topic for an orphaned message.
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

Clif,

Data Elements are a really nifty idea that never really took of or is used in DataStage. It essentially (or is it Ascentially?) allows you to "tag" what type of content a particular routine/transform to data elements (that would use the routine). This way you could display for example only Date type functions/routines in the drop down in the derivation window when the data element for the column is "date". I realize that this might be a double take on the Data Type but in reality Data Types like Char and VarChars etc. get used for a whole lot of different data values where you would not be able to arbitrarily determine what types of functions/routines
one might wish to apply to it. In addition there are internal techniques
to retrieve the various meta data associated with a particular column (name, width, description, data element etc) which data element can be used to apply a particular type of transformation (for example the XML
reader/Writer stages use a XML data element used in this fashion). The
reality is these are rarely if ever used but are kind of nifty.

Regards,

Allen Spayth


----- Original Message -----
From: "Moderator"
To:
Sent: Wednesday, May 09, 2001 3:48 PM
Subject: Re: varchar to integer


> Which brings up a very fundamental question not explained in the class
> I took. What is the purpose of having both and SQL type and a Data
> Element? Ive used both ways--just defining SQL type of something
> like INT and either using or not using a Data Element of Numeric. I
> couldnt see where it made a difference?
>
> What concept am I missing here?
>
> Regards,
>
> Clif
>
>
> Wednesday, May 09, 2001, 5:48:28 PM, you wrote:
>
> > DataStage is typeless
>
>
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

Allen,

Do you work for Ascential? I am just interested as you seem to know a bit about the internals of the product.

I have also wondered about accessing the metadata for a column, as it would be good to validate the data before putting it into the database and having the database reject/truncate the data by calling a standard function (user written or built-in DataStage - Hint). Obviously this would rely on the metadata being a reflection of the actual target. Arguably I could probably write a function to do this, but it seems that DataStage must have all this information available internally when processing so it would be quicker for it be a built-in function, rather than me having to parse the data in the various DataStage files.


Phil

> -----Original Message-----
> From: Allen Spayth - wwc [SMTP:daspayth@wwc.com]
> Sent: Thursday, May 10, 2001 12:54 PM
> To: informix-datastage@oliver.com
> Subject: Re: varchar to integer - Moderator
>
> Clif,
>
> Data Elements are a really nifty idea that never really took of or is
> used in DataStage. It essentially (or is it Ascentially?) allows you
> to "tag" what type of content a particular routine/transform to data
> elements (that would use the routine). This way you could display for
> example only Date type functions/routines in the drop down in the
> derivation window when the data element for the column is "date". I
> realize that this might be a double take on the Data Type but in
> reality Data Types like Char and VarChars etc. get used for a whole
> lot of different data values where you would not be able to
> arbitrarily determine what types of functions/routines
> one might wish to apply to it. In addition there are internal techniques
> to retrieve the various meta data associated with a particular column
> (name,
> width, description, data element etc) which data element can be used to
> apply a particular type of transformation (for example the XML
> reader/Writer stages use a XML data element used in this fashion). The
> reality is these are rarely if ever used but are kind of nifty.
>
> Regards,
>
> Allen Spayth
>
>
> ----- Original Message -----
> From: "Moderator"
> To:
> Sent: Wednesday, May 09, 2001 3:48 PM
> Subject: Re: varchar to integer
>
>
> > Which brings up a very fundamental question not explained in the
> > class I took. What is the purpose of having both and SQL type and a
> > Data Element? Ive used both ways--just defining SQL type of
> > something like INT and either using or not using a Data Element of
> > Numeric. I couldnt see where it made a difference?
> >
> > What concept am I missing here?
> >
> > Regards,
> >
> > Clif
> >
> >
> > Wednesday, May 09, 2001, 5:48:28 PM, you wrote:
> >
> > > DataStage is typeless
> >
> >
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

Phil,

The plugin documentation talks about the meta data and internal operations that can be taken advantage of. Ive put together a few plugins in the past and played with some of the meta data handling capabilities. And your correct it would be pretty easy to have a "standard validation" stage which would detect things like invalid dates, nulls in not null defined columns,
none numeric in numeric columns etc. As for the database rejects; Ive
thought long and hard about how one might consider things of this nature. If you take for example the not null scenario; If you test every column respectively of every row for a empty string or null this requires significant CPU time and will impact overall throughput. If instead you let the database reject it and then have DataStage handle the exception via the reject handler you can avoid the column tests. However, if a notable amount of data actually has a null condition which would invoke the reject in the database you would also significantly impact the overall performance and would likely be better off doing the test in DataStage. So I believe perhaps the best strategy is to use a tool like Quality Manager (part of the DataStage XE Suite) to analysis the source system such that you can determine the frequency that these anomalies will occur so you can determine the best approach to the ETL process itself. ( i.e. to test for the sake of testing is worthless if 100% will always pass the test).

Regards,


Allen
----- Original Message -----
From: "Walker, Phil (Forest Resources)"
To:
Sent: Wednesday, May 09, 2001 6:29 PM
Subject: RE: varchar to integer - Moderator


> Allen,
>
> Do you work for Ascential? I am just interested as you seem to know a
> bit about the internals of the product.
>
> I have also wondered about accessing the metadata for a column, as it
would
> be good to validate the data before putting it into the database and
having
> the database reject/truncate the data by calling a standard function
> (user written or built-in DataStage - Hint). Obviously this would rely
> on the metadata being a reflection of the actual target. Arguably I
> could
probably
> write a function to do this, but it seems that DataStage must have all
this
> information available internally when processing so it would be
> quicker
for
> it be a built-in function, rather than me having to parse the data in
> the various DataStage files.
>
>
> Phil
>
> > -----Original Message-----
> > From: Allen Spayth - wwc [SMTP:daspayth@wwc.com]
> > Sent: Thursday, May 10, 2001 12:54 PM
> > To: informix-datastage@oliver.com
> > Subject: Re: varchar to integer - Moderator
> >
> > Clif,
> >
> > Data Elements are a really nifty idea that never really took of or
> > is
used
> > in DataStage. It essentially (or is it Ascentially?) allows you to
"tag"
> > what type of content a particular routine/transform to data elements
(that
> > would use the routine). This way you could display for example only
Date
> > type functions/routines in the drop down in the derivation window
> > when
the
> > data element for the column is "date". I realize that this might be
> > a double take on the Data Type but in reality Data Types like Char
> > and VarChars etc. get used for a whole lot of different data
> > values where you would not be able to arbitrarily determine what
> > types of functions/routines
> > one might wish to apply to it. In addition there are internal
techniques
> > to retrieve the various meta data associated with a particular
> > column (name, width, description, data element etc) which data
> > element can be used to apply a particular type of transformation
> > (for example the XML
> > reader/Writer stages use a XML data element used in this fashion). The
> > reality is these are rarely if ever used but are kind of nifty.
> >
> > Regards,
> >
> > Allen Spayth
> >
> >
> > ----- Original Message -----
> > From: "Moderator"
> > To:
> > Sent: Wednesday, May 09, 2001 3:48 PM
> > Subject: Re: varchar to integer
> >
> >
> > > Which brings up a very fundamental question not explained in the
> > > class I took. What is the purpose of having both and SQL type and
> > > a Data Element? Ive used both ways--just defining SQL type of
> > > something like INT and either using or not using a Data Element of
> > > Numeric. I couldnt see where it made a difference?
> > >
> > > What concept am I missing here?
> > >
> > > Regards,
> > >
> > > Clif
> > >
> > >
> > > Wednesday, May 09, 2001, 5:48:28 PM, you wrote:
> > >
> > > > DataStage is typeless
> > >
> > >
Locked