Read a webpage data

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

devidotcom
Participant
Posts: 247
Joined: Thu Apr 27, 2006 6:38 am
Location: Hyderabad

Post by devidotcom »

Hi All,

I have a scenario and am not sure whether it is possible with DataStage.
I have a webpage URL that contains records I need to read and process. Is there a way to read these records? I read about the External Source stage but am not sure whether it works for this.

The webpage would have data that looks like this:

c1 c2 c3 c4
100 10 200 20
300 30 400 40
500 50 600 60

Once I read this webpage, I need to convert columns into rows using a Pivot stage and generate a flat file with the data given below:

c1 c2
100 10
200 20
300 30
400 40
500 50
600 60
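
For reference, the reshaping described above can be sketched outside DataStage. This is only an illustration in Python of the intended result, not a Pivot stage configuration; the function name `pivot_pairs` and the `width` parameter are made up for the example.

```python
def pivot_pairs(rows, width=2):
    """Split each wide row into consecutive chunks of `width` values.

    A 4-column row (c1, c2, c3, c4) becomes two 2-column rows:
    (c1, c2) and (c3, c4), in that order.
    """
    out = []
    for row in rows:
        if len(row) % width != 0:
            raise ValueError(f"row length {len(row)} is not a multiple of {width}")
        # Walk the row in steps of `width` and emit one output row per chunk.
        for i in range(0, len(row), width):
            out.append(list(row[i:i + width]))
    return out


wide = [
    [100, 10, 200, 20],
    [300, 30, 400, 40],
    [500, 50, 600, 60],
]
for c1, c2 in pivot_pairs(wide):
    print(c1, c2)
```

Each input row yields its output rows in order, so the result matches the six rows listed above.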

Any input appreciated. Thanks

Devi
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

If the site hosts a web service that could deliver that data, then yes. Otherwise (AFAIK) you'd need to find some other way to retrieve that and either leverage it via an External Source stage or just land the data and process it from there like 'normal'.
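
As a rough sketch of the "some other way to retrieve that" option: a small script can fetch the page and write its body to standard output, which is, as far as I understand, what an External Source stage reads from the program it calls. The example uses Python's standard `urllib.request`; the function name `fetch_text`, the `opener` parameter, and the URL are assumptions for illustration only.

```python
import sys
import urllib.request


def fetch_text(url, timeout=30, opener=urllib.request.urlopen):
    """Return the body of `url` decoded as text.

    `opener` defaults to urllib.request.urlopen; it is a parameter only so
    the function can be exercised without network access.
    """
    with opener(url, timeout=timeout) as response:
        # Use the charset the server declares, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def main(argv):
    """Write the page body to stdout so a calling job can read it."""
    if len(argv) != 2:
        print("usage: fetch_page.py URL", file=sys.stderr)
        return 2
    sys.stdout.write(fetch_text(argv[1]))
    return 0
```

Calling `main(["fetch_page.py", "http://example.com/data"])` (hypothetical URL) would print the page body. Redirecting that output to a file instead covers the "land the data and process it from there" variant.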
-craig

"You can never have too many knives" -- Logan Nine Fingers
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

I haven't done it myself, but I have talked with folks who have used JavaPack to do it, since Java has fairly common libraries for HTTP work. At least once I also heard of someone wrapping a Perl script that did it, as Perl (apparently?) has decent functions for HTTP activity.

Ernie
Ernie Ostic

blogit!
Open IGC is Here! https://dsrealtime.wordpress.com/2015/0 ... ere/
devidotcom
Participant
Posts: 247
Joined: Thu Apr 27, 2006 6:38 am
Location: Hyderabad

Post by devidotcom »

Thanks for the inputs.
I guess I will try the Perl option and update the post.

One more thing: this webpage could contain more than what I am looking for, such as descriptions, images, etc. I would have to eliminate those and then look for the data I want.
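
A possible way to do that filtering, sketched in Python with the standard `html.parser` module: strip the markup, then keep only lines made of exactly four whole numbers. The four-integer row pattern is an assumption based on the sample data in the first post, and `extract_rows` is a made-up name; a real page may need a different rule.

```python
import re
from html.parser import HTMLParser

# Tags that end a line of visible text, and tags that separate table cells.
_BLOCK_TAGS = {"br", "p", "div", "tr", "li", "pre", "h1", "h2", "h3", "table"}
_CELL_TAGS = {"td", "th"}
# Assumed row shape: exactly four whitespace-separated integers.
_ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


class _TextExtractor(HTMLParser):
    """Collect visible text, ignoring tags, scripts, and styles."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag in _BLOCK_TAGS:
            self.chunks.append("\n")
        elif tag in _CELL_TAGS:
            self.chunks.append(" ")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._skip:
                self._skip -= 1
        elif tag in _BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def extract_rows(html):
    """Return the numeric data rows found in `html` as lists of ints."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    rows = []
    for line in "".join(parser.chunks).splitlines():
        match = _ROW.match(line)
        if match:
            rows.append([int(value) for value in match.groups()])
    return rows
```

Header lines such as `c1 c2 c3 c4`, descriptions, and image tags fail the pattern and are dropped; only the numeric rows are kept for the pivot step.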