URL parsing in Datastage

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
maheshsada
Participant
Posts: 69
Joined: Tue Jan 18, 2005 12:15 am

URL parsing in Datastage

Post by maheshsada »

we are looking to load URL data from files, but should not load any detail information held in the final part of the URL (e.g. personal information, chat messages in the URL parameters).
Effectively what we need to do is load the URL string up to (and including) the final / in the url - this should be unaffected by the number of / in the url.
For example:
http://www.url1.co.uk/message_new.php?s ... Text=Hello World&cNickname=John Smith

should be loaded as http://www.url1.co.uk/

http://www.url2.com/tones/wallpreview.a ... na1Picture

should be loaded as http://www.url2.com/tones/

http://www.url3.com/images/icon/news.gif

should be loaded as http://www.url3.com/images/icon/

http://www.url4.net/mqa/m/_UK18_/-6643847943/9593.3gp

should be loaded as http://www.url4.net/mqa/m/_UK18_/-6643847943/

Is there a facility to do this in Datastage?

Many Thanks for your help

Regards
Magesh S
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

This is an example of the powerful string handling functions available in Server. This solves your question:

Code: Select all

FIELD(In.URL,'/',1,DCOUNT(In.URL,'/')-1)
The DCOUNT() function returns the number of fields delimited by '/' and the FIELD(string,delimiter,from[,to]) function gives you all fields except the last. It's easy and efficient.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

If you want to throw some money at the problem, there is always the dreaded Click Pack option from Ascential. It adds specific URL and weblog stages, from what I recall, plus the ability to use Perl in Server jobs.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

ArndW wrote:This is an example of the powerful string handling functions available in Server. This solves your question:

Code: Select all

FIELD(In.URL,'/',1,DCOUNT(In.URL,'/')-1)
The DCOUNT() function returns the number of fields delimited by '/' and the FIELD(string,delimiter,from[,to]) function gives you all fields except the last. It's easy and efficient.
Slightly more efficient is

Code: Select all

Field(In.URL, "/", 1, Count(In.URL, "/"))
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

The click pack web log reader stage is very efficient and it comes with some server routines that help parse things such as URLS. Are they still charging extra money for it? Thought it would be free by now.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Not positive, but I think it's still a chargable option. Also heard it is no longer being actively supported and will be phased out in the next release. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
maheshsada
Participant
Posts: 69
Joined: Tue Jan 18, 2005 12:15 am

Post by maheshsada »

Hi
I have used Field(DSLink4.col1,"/",1,Count(DSLink4.col1,"/")), iam gettting the following warning

DataStage Job 97 Phantom 9010
Program "JOB.1124520702.DT.1386333214.TRANS2": Line 43, Improper data type.
Attempting to Cleanup after ABORT raised in stage test1..CTransformerStage5
DataStage Phantom Aborting with @ABORT.CODE = 3

but the job gets completed and iam getting the result what i required. If i remove the count and use some number like 2 or 3, the job is getting completed without warning.

any updates

regards
Magesh S
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Most likely you are getting a count() returned value of 0, which the Field() function doesn't like. If you reset the job you should be able to see your exact error in the "from previous run" log entry.
maheshsada
Participant
Posts: 69
Joined: Tue Jan 18, 2005 12:15 am

Post by maheshsada »

Hi

I have checked individually the count function its giving valid values (i.e. count of the "/"). When i put the count in the field function its giving the phantom error.

any updates

regards
Magesh S
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Look closely at your data, as simply adding the count function inside the field function is not the issue. Did you try resetting the job so we can get more specific information about the phantom?

You can also do the count outside of the field function first to determine if there are any delimiters in the string before actually attempting the field function.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply