hi,
Can we use mail, HTML, or a URL as a source in DataStage? If it is possible, please tell me how to achieve this.
Thanks in advance.
You're getting into the realm of unstructured data here, and that can only easily be done in DataStage using custom components. I've been on a couple of sites where emails of known structure were used as a source, and even that needed careful coding (in a Transformer stage in a server job).
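The "emails of known structure" approach above can be sketched outside DataStage as well. Below is a minimal, illustrative Python sketch (not DataStage code, and not the poster's actual implementation) that extracts fields from a subject line of an assumed fixed layout and emits a delimited record a Sequential File stage could then read. The sender address, subject layout, and field names are all hypothetical assumptions.

```python
# Sketch: turn an email of known (assumed) structure into a flat,
# delimited record suitable for flat-file staging.
import email
from email import policy

# Hypothetical sample message with a pipe-delimited subject line.
RAW_EMAIL = """\
From: orders@example.com
To: etl@example.com
Subject: ORDER|12345|2010-04-01|149.99

Body text is ignored in this sketch.
"""

def email_to_record(raw: str, delimiter: str = ",") -> str:
    """Parse the message, split the pipe-delimited Subject header,
    and return a comma-delimited record for staging."""
    msg = email.message_from_string(raw, policy=policy.default)
    tag, order_id, order_date, amount = msg["Subject"].split("|")
    return delimiter.join([order_id, order_date, amount])

print(email_to_record(RAW_EMAIL))  # 12345,2010-04-01,149.99
```

In a server job the equivalent string handling would live in a Transformer stage, as described above; the point of the sketch is only that a known, fixed structure is what makes the parsing tractable.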
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I'll readily agree with our most active participants. Those sources do not have standard connection stages because they do not have a fixed format that is easy to handle within DataStage.
Rather than building custom components within DataStage for performing the processing from those sources, I would recommend using a tool better suited to turning unstructured data into structured data. There are multiple paths to go down to do this, but ultimately it comes down to parsing.
So, use your favorite parsing tool to give the data you want to look at a standard structure - store the formatted, structured data in your preferred staging location - and it will become highly usable for anything you would like to do with it in DataStage.
If you want to couple it more tightly with DataStage, it is possible to wrap your parser so that it becomes part of the ETL job(s), although that is a route I personally prefer to avoid. Modularize everything, make each module simple, and use well-defined interfaces between them. That makes the system easier to understand, easier to troubleshoot, easier to maintain, and easier to replace individual components of later.
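The "parse first, then stage structured data" advice above can be illustrated with a small sketch. This is an assumption-laden example, not anything from the thread: it uses only the Python standard library to turn an HTML table (the kind you might fetch from a URL) into CSV text that could be written to a staging file and read by a DataStage Sequential File stage. The sample markup and column names are hypothetical.

```python
# Sketch: parse an HTML table into structured CSV for flat-file staging.
import csv
import io
from html.parser import HTMLParser

class TableExtractor(HTMLParser):
    """Collect cell text from <td>/<th> elements, one list per row."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_cell = [], [], False

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row:
            self.rows.append(self._row)
            self._row = []

    def handle_data(self, data):
        if self._in_cell:
            self._row.append(data.strip())

def html_table_to_csv(html: str) -> str:
    """Feed the markup through the parser and emit CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    buf = io.StringIO()
    csv.writer(buf).writerows(parser.rows)
    return buf.getvalue()

SAMPLE = ("<table><tr><th>id</th><th>name</th></tr>"
          "<tr><td>1</td><td>acme</td></tr></table>")
print(html_table_to_csv(SAMPLE))
```

Once the output lands in a staging file (or table), everything downstream is ordinary structured ETL, which is exactly the separation of concerns recommended above.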
Jack Thornton
----------------
Spectacular achievement is always preceded by spectacular preparation - Robert H. Schuller
Thanks to all.