Can we use mail, HTML, or a URL as a source in DataStage?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

pavankatra
Participant
Posts: 86
Joined: Wed Mar 03, 2010 3:09 am

Post by pavankatra »

Hi,
Can we use mail, HTML, or a URL as a source in DataStage? If possible, please tell me how to achieve this.

Thanks in advance.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The short answer is 'no', at least not directly. There may be some 'workarounds' for this but I'll leave that for others to post.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You're getting into the realm of unstructured data here, and that can only easily be done in DataStage using custom components. I've been on a couple of sites where emails of known structure were used as a source, and even that needed careful coding (in a Transformer stage in a server job).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jcthornton
Premium Member
Posts: 79
Joined: Thu Mar 22, 2007 4:58 pm
Location: USA

Post by jcthornton »

I'll readily agree with our most active participants. Those sources do not have standard connection stages because they do not have a fixed format that is easy to handle within DataStage.

Rather than building custom components within DataStage for performing the processing from those sources, I would recommend using a tool better suited to turning unstructured data into structured data. There are multiple paths to go down to do this, but ultimately it comes down to parsing.

So, use your favorite parsing tool to give the data you want to look at a standard structure - store the formatted, structured data in your preferred staging location - and it will become highly usable for anything you would like to do with it in DataStage.
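As a rough illustration of that "parse first, stage second" idea for the email case, here is a minimal Python sketch using only the standard library. The function names (`email_to_row`, `rows_to_csv`), the column names, and the sample message are all made up for this example; a real feed would need its own field mapping and error handling.

```python
import csv
import io
from email import message_from_string
from email.policy import default

# Hypothetical column layout for the staging file.
COLUMNS = ["sender", "subject", "sent", "body"]


def email_to_row(raw_message: str) -> dict:
    """Turn one raw RFC 5322 message into a flat record."""
    msg = message_from_string(raw_message, policy=default)
    # Prefer the plain-text part; fall back to an empty body if none exists.
    part = msg.get_body(preferencelist=("plain",))
    text = part.get_content().strip() if part is not None else ""
    return {
        "sender": str(msg["From"] or ""),
        "subject": str(msg["Subject"] or ""),
        "sent": str(msg["Date"] or ""),
        "body": text,
    }


def rows_to_csv(rows: list) -> str:
    """Serialize records as CSV text that a Sequential File stage could read."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up sample message, for illustration only.
SAMPLE = (
    "From: orders@example.com\n"
    "Subject: Order 1001\n"
    "Date: Mon, 01 Mar 2010 09:00:00 +0000\n"
    "\n"
    "qty=3\n"
)

if __name__ == "__main__":
    print(rows_to_csv([email_to_row(SAMPLE)]))
```

The CSV text would be written to whatever staging location you prefer, and the DataStage job then reads it like any other delimited file.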

If you want to couple it more tightly with DataStage, it is possible to wrap your parser so that it becomes part of the ETL job(s), although that is a route I personally prefer to avoid. Modularize everything, make each module simple, and use well-defined interfaces in between. That makes it easier to understand, easier to troubleshoot, easier to maintain, and easier to replace individual components later.
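For the HTML case, one way to picture a simple module with a well-defined interface is a standalone parser that takes HTML text in and gives delimited rows out. The sketch below assumes the data of interest sits in an ordinary HTML table; the class and function names (`TableRows`, `html_to_delimited`) and the pipe delimiter are choices made for this example only. Fetching the page from a URL is left out; it would be a separate step in front of this one.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_to_delimited(html: str, sep: str = "|") -> str:
    """Interface of the module: HTML text in, one delimited line per table row out."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return "\n".join(sep.join(row) for row in parser.rows)


if __name__ == "__main__":
    page = "<table><tr><th>id</th><th>name</th></tr><tr><td>1</td><td>Ann</td></tr></table>"
    print(html_to_delimited(page))
```

Because the only contract is "delimited text out", the parser can be run on its own ahead of the job, or swapped for a different tool later, without the DataStage side changing.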
Jack Thornton
----------------
Spectacular achievement is always preceded by spectacular preparation - Robert H. Schuller
pavankatra
Participant
Posts: 86
Joined: Wed Mar 03, 2010 3:09 am

Post by pavankatra »

Thanks to all.
Sreenivasulu
Premium Member
Posts: 892
Joined: Thu Oct 16, 2003 5:18 am

Post by Sreenivasulu »

You could keep the format simple: a blank email with only a subject line. Something like that would be workable, but I do not know how to read from the Exchange mail server's port using DataStage.
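To make the subject-only idea concrete, here is a small Python sketch done outside DataStage. It assumes a made-up convention where the subject carries `key=value` pairs separated by semicolons; that convention and the function name `subject_fields` are inventions for this example. Retrieving the message from the mail server is not shown. Python's standard `imaplib` or `poplib` modules might cover that step, but only if the server exposes IMAP or POP3, which is an assumption about the environment.

```python
from email import message_from_string
from email.policy import default


def subject_fields(raw_message: str) -> dict:
    """Read key=value pairs from the Subject header of a raw message.

    Assumes the hypothetical convention 'key=value; key=value' in the subject.
    Parts without an '=' are ignored.
    """
    msg = message_from_string(raw_message, policy=default)
    subject = str(msg["Subject"] or "")
    fields = {}
    for part in subject.split(";"):
        key, sep, value = part.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    # Made-up message with an empty body, for illustration only.
    raw = "From: feed@example.com\nSubject: region=EMEA; load_date=2010-03-01\n\n"
    print(subject_fields(raw))
```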
Regards
Sreeni