How to handle multi-page Web Service requests

iq_etl
Premium Member
Posts: 105
Joined: Tue Feb 08, 2011 9:26 am

Post by iq_etl »

I'm relatively new to both DataStage and Web services and would like some general direction for an integration task I'm working on. Our team is working with a new HR management system and attempting to pull data from it via Web services into our existing Oracle database.

I've successfully done this using the Web Services Client and XML Stage. However, when it comes to requests with a large number of results -- we have a maximum of 999 objects per page -- I'm not really sure what to do.

My assumption is that I would make a request and find the total number of pages in the response, then use that to somehow loop my Web Services request, repeating it for each page until I've extracted all the results.

I've not found any documentation on this specifically. I'd appreciate any pointers anyone has to offer.

Thanks.
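
The approach described above (request once, read the page total, then repeat per page) can be sketched in Python outside DataStage. This is only an illustration: `fake_service`, the `Total_Pages` and `rows` keys, and the 2,500-row data set are assumptions standing in for the real Web service.

```python
from typing import Callable, Dict, List

PAGE_SIZE = 999  # maximum objects per page, as stated in the post


def fake_service(page: int, count: int) -> Dict:
    """Hypothetical stand-in for the real Web service call."""
    rows = list(range(2500))  # pretend the service holds 2,500 objects
    total_pages = -(-len(rows) // count)  # ceiling division
    start = (page - 1) * count
    return {"Total_Pages": total_pages, "rows": rows[start:start + count]}


def fetch_all(call: Callable[[int, int], Dict], count: int = PAGE_SIZE) -> List:
    """Request page 1, read the page total, then request the remaining pages."""
    first = call(1, count)
    results = list(first["rows"])
    for page in range(2, first["Total_Pages"] + 1):
        results.extend(call(page, count)["rows"])
    return results


all_rows = fetch_all(fake_service)
print(len(all_rows))
```

The first response does double duty here: it supplies both page 1 of the data and the loop's upper bound, so no separate "count only" request is needed.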
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

What dictates this "maximum number per page"? That sounds like it is something that the author has put into the service?

Very often these kinds of functions are coded by the web service designer: "all" the rows are cached somewhere, and then on the next call, the user identifies themselves (somehow) and gets the "next set" of rows. I've seen batch numbers passed from the client, or just a "number of rows I can consume". SF.com does some of this kind of logic in its retrieval-oriented web services.

No real easy way to fake it, unless you bring down all the data each time and then sift through it yourself after storing it somewhere.

Do pages "matter"?

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
iq_etl
Premium Member
Posts: 105
Joined: Tue Feb 08, 2011 9:26 am

Post by iq_etl »

eostic wrote:What dictates this "maximum number per page"? That sounds like it is something that the author has put into the service?
Yes, it's specified in the Web service. There are parameters for Page (which page to get) and Count (number of rows per page). So if I make a request for page 1 with count 999, I can request "Total_Pages" and "Total_Count" and see how many pages of 999 rows there are, then request specific pages on subsequent requests.
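
The relationship between the total count, the rows-per-page count, and the total pages is plain ceiling division. A small sketch, with hypothetical helper names (the service itself reports "Total_Pages", so this is only useful as a cross-check):

```python
def total_pages(total_count: int, count: int = 999) -> int:
    """Number of pages needed to cover total_count rows (ceiling division)."""
    return -(-total_count // count)


def rows_on_page(page: int, total_count: int, count: int = 999) -> int:
    """How many rows a given 1-based page should contain."""
    remaining = total_count - (page - 1) * count
    return max(0, min(count, remaining))


print(total_pages(2500))      # pages needed for 2,500 rows
print(rows_on_page(3, 2500))  # size of the final, partial page
```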

What do you mean by "matter"? I guess they do, in that I have to be able to specify which rows to return if my total exceeds the maximum.

For such responses, I could certainly get all results by rerunning the job manually. But to automate it, would I need to pass "Total_Pages" to some external script and have it rerun the job, iterating the "Page" parameter? Or can DataStage do that internally?
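
For the external-script option, a rough sketch of what such a script might generate is below. It only builds the command lines and does not run them. The project name, job name, and `Page` parameter name are hypothetical, and the `dsjob` options shown should be checked against the documentation for your DataStage version.

```python
from typing import List


def build_run_commands(project: str, job: str, total_pages: int,
                       page_param: str = "Page") -> List[List[str]]:
    """Build one dsjob invocation per page, setting the page job parameter."""
    return [
        ["dsjob", "-run", "-jobstatus", "-param", f"{page_param}={page}",
         project, job]
        for page in range(1, total_pages + 1)
    ]


# Hypothetical names; the page total would come from the first response.
for cmd in build_run_commands("HRPROJ", "WsPageLoad", 3):
    print(" ".join(cmd))
```

Each command could then be handed to `subprocess.run` one at a time, stopping at the first non-zero exit status.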
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Ernie, question for you - should this discussion be over in the SOA/RTI forum? Happy to move it if that's the case.
-craig

"You can never have too many knives" -- Logan Nine Fingers
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

"Does paging matter" was just a reference to the fact that I thought you had a web service without any special parameters, and 999 was the maximum that you could ever receive, and wanted to chop it up in some other fashion (your own definition of pages).

What you are looking at is a loop. A bit tricky, because you need the number for the page argument to be generated "upstream" from the Web Services Transformer, and then "go back again" to increment things before the next invocation.

I did this years ago in a Server Job, sharing UV/Basic variables that I knew would be recognized since the Stages run in the same process...not sure if there is a way to do it in EE. Will have to leave that to others in the forum.

How many rows are you anticipating, and do you have constraints that require that you use an EE Job (i.e., because you are using QualityStage, for example)?

One quick thought, though it would need lots of testing to know when you hit "the limit of pages", is to just feed the page values via RowGen or via a dummy flat file, and code for the "failure point" when there aren't any pages or any rows (when you have exceeded the number). It might depend on how nice the payload and return code are when you exceed it, and on what options you have for cleanly killing the Job once you have hit that limit.
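
The "run until the failure point" idea can be sketched like this in Python. It assumes the service returns an empty page once you run past the end; a real service might return a fault instead, which would need its own handling. `fake_service` and the `max_pages` safety cap are illustrative assumptions.

```python
from typing import Callable, List


def fetch_until_empty(call: Callable[[int, int], List],
                      page_size: int = 999,
                      max_pages: int = 1000) -> List:
    """Feed page numbers 1, 2, 3, ... and stop at the first empty page."""
    results: List = []
    for page in range(1, max_pages + 1):  # cap guards against a runaway loop
        rows = call(page, page_size)
        if not rows:  # assumed signal that we have run past the last page
            break
        results.extend(rows)
    return results


requested_pages: List[int] = []


def fake_service(page: int, count: int) -> List:
    """Hypothetical service holding 2,500 rows; empty list past the end."""
    requested_pages.append(page)
    data = list(range(2500))
    start = (page - 1) * count
    return data[start:start + count]


rows = fetch_until_empty(fake_service)
print(len(rows), requested_pages)
```

Note the cost of this approach: one extra request beyond the last real page is always made just to discover the end.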

It would be a bummer to have to call the "whole job" again for each page run....

Ernie
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Craig...no..it should stay here.... this is normal DataStage stuff. Web Services, yes, but not in the RTI/ISD sense.

Thanks.

Ernie
iq_etl
Premium Member
Posts: 105
Joined: Tue Feb 08, 2011 9:26 am

Post by iq_etl »

I'm working on employee data at the moment, and might get as high as 50,000-60,000 rows, although I hope we'll ultimately just be calling for updated records and not the whole set. Future phases may involve transactional data that could theoretically be more.

I don't think I'm constrained to EE jobs, but you're running into my DS inexperience there. How could a Server job be preferable?

I'll think about that "run to failure" idea. But it would be nice to have the flexibility to resume the job in case of failure.
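
One way to picture the "resume after failure" requirement is a checkpoint that records the last page that completed. This is a minimal sketch with hypothetical names (`last_page.txt`, `process_page`), not something DataStage provides in this form:

```python
import os
import tempfile


def load_checkpoint(path: str) -> int:
    """Return the last page that completed, or 0 if no checkpoint exists."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return int(f.read().strip() or 0)


def save_checkpoint(path: str, page: int) -> None:
    with open(path, "w") as f:
        f.write(str(page))


def run_pages(path: str, total_pages: int, process_page) -> None:
    """Process pages after the checkpoint; a failure leaves it untouched."""
    for page in range(load_checkpoint(path) + 1, total_pages + 1):
        process_page(page)  # may raise; checkpoint then stays at page - 1
        save_checkpoint(path, page)


checkpoint = os.path.join(tempfile.mkdtemp(), "last_page.txt")
attempts = []


def process_page(page: int) -> None:
    """Hypothetical page load that fails the first time it sees page 3."""
    attempts.append(page)
    if page == 3 and attempts.count(3) == 1:
        raise RuntimeError("simulated failure on page 3")


try:
    run_pages(checkpoint, 5, process_page)
except RuntimeError:
    pass
run_pages(checkpoint, 5, process_page)  # resumes at page 3
print(attempts)
```

Pages 1 and 2 are not requested again on the second run, which is the flexibility a "run to failure" loop lacks.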
iq_etl
Premium Member
Posts: 105
Joined: Tue Feb 08, 2011 9:26 am

Post by iq_etl »

I'm back with a little progress using a sequence and loop stages.

I've created a job that makes a Web service request for a row count and the corresponding total number of pages, then writes that total page value to a parameter set.

Then in the sequence start loop stage I use that parameter as the loop's "To:" value with starting value and step set to 1.

I would like to trigger my Web Service Call job, using the loop's count to specify the page in the Web service request. However, I'm not sure how to pass the loop's current count into an input value in my job. The article referenced below seems to say I can insert it as a variable, but I can't get that to work and it's unclear if the count variable actually passes into the job or just stays in the sequence.

So my current questions are:

How can I best pass a count variable from the sequence into an input argument in my Web Services Client job? Or will I need to use Web Services Transformer?

Should I rather find a way to handle the counts myself by iterating another parameter set variable?

Thanks for any insights.


https://www-01.ibm.com/support/knowledg ... rties.html
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Start Loop activity properties should be able to accept job parameter references enclosed in sharp signs.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
iq_etl
Premium Member
Posts: 105
Joined: Tue Feb 08, 2011 9:26 am

Post by iq_etl »

Yep, I've got the parameter in my start loop, but now I'm trying to get the loop's counter from the sequence and into a job.

I've found the User Variables Activity and have specified a "Count" variable there with the expression "start_page_loop.$Counter".

Now how to get "Count" into an input argument?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The same way - by reference to the upstream activity variable $Counter.
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Assuming that you have successfully imported your WSDL, you should be able to "Load" the arguments for the request directly in the Stage when it is defined as a Source....and then use a Job Parameter there to establish your "fixed" value for that run.

Alternatively, send one "row" downstream from a RowGen, and stick your Job Parameter in a Derivation of a Transformer that precedes the WSTransformer Stage.

Ernie
iq_etl
Premium Member
Posts: 105
Joined: Tue Feb 08, 2011 9:26 am

Post by iq_etl »

At the moment, I'm putting the activity variable, "start_page_loop.$Counter" (but no quotes) into the input argument in the Web Services Client stage in my job.

Result: It appears to be using it literally and I'm getting an xml error for "start_page_loop.$Counter" being an invalid value.

1. Is it an issue of Web Service Client stage just not accepting activity variables as inputs, and do I need to switch to Web Services Transformer, assuming a Transformer stage can take the activity variable and send it into the WSTransformer?

2. If I need to be using the job parameter instead of the activity variable, that will require me to iterate it myself separately, rather than using the sequence loop's count, correct?

Thank you for your help -- I feel like I'm close to getting this to work.
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

I have only ever done this (use Web Client Stage with inputs from elsewhere) via formal DataStage Job Parameters (regular #parms#).

Ernie
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are you loading it using the tool or typing it in manually? If the latter, are you surrounding it with sharp signs?
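
To illustrate why the sharp signs matter: text without them is just a literal string, while a `#Name#` reference is a placeholder that gets replaced with the parameter's value. The Python below is only a toy model of that idea, not DataStage's actual implementation; the `Page` parameter name is hypothetical.

```python
import re
from typing import Dict


def resolve(text: str, params: Dict[str, object]) -> str:
    """Replace #Name# references with values; leave everything else as-is."""
    def substitute(match: "re.Match") -> str:
        name = match.group(1)
        # Unknown names are left untouched rather than guessed at.
        return str(params.get(name, match.group(0)))
    return re.sub(r"#(\w+)#", substitute, text)


params = {"Page": 3}
print(resolve("start_page_loop.$Counter", params))  # no sharp signs: literal
print(resolve("#Page#", params))                    # sharp signs: substituted
```

In this model, the XML error reported earlier in the thread is what you would expect when the first form is sent to the service unchanged.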