Now let’s take a look at the datasource itself. A datasource is essentially a function: it takes arguments and returns variables. The preprocess section builds the POST_DATA variable, which holds the XML SOAP request string, and that variable is then sent in the postdata section. The actual web service URL is stated in the http section.
In the preprocess section, there are also two built-in variables that can be used, LIMIT and OFFSET. These two variables are used to ‘page’ through results, much as a cursor would. In this example, we look at LIMIT to populate a variable called MAXRESULTS, capping it at 10. The MAXRESULTS variable is then used in the COUNT element of the request to bring back up to 10 results per request. If the user needs more, the datasource starts at the next row and retrieves 10 more results.
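To make the paging logic concrete, here is a minimal sketch in Python rather than Buddyscript. The function names clamp_max_results and page_requests are hypothetical and exist only for illustration; the first mirrors the LIMIT check in the preprocess section, and the second shows the sequence of offset and count pairs that successive requests would use.

```python
def clamp_max_results(limit: int, cap: int = 10) -> int:
    """Mirror the preprocess check: use the cap when LIMIT is out of range."""
    if limit > cap or limit <= 0:
        return cap
    return limit


def page_requests(total_wanted: int, limit: int, offset: int = 0):
    """Yield one (offset, count) pair per web service request."""
    count = clamp_max_results(limit)
    while total_wanted > 0:
        # The last request may need fewer rows than a full page.
        yield offset, min(count, total_wanted)
        offset += count
        total_wanted -= count
```

For example, asking for 25 rows with a LIMIT of 10 would produce three requests, starting at offsets 0, 10 and 20.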
The simple xml section is a hierarchical representation of the XML response from the SOAP API, which is flattened into a two-dimensional (row and column) form when the data is retrieved. Indentation is used to signify a parent-child relationship. The {loop=content} statement acts as a loop within the XML, iterating over each repeated element. The end nodes (the leaf entries, such as Title or Year) are the fields that are used to capture information and pass it back to the calling routine. Note that fields can be skipped in the simple xml section if the user does not need them.
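The flattening idea can be sketched in Python with the standard library XML parser. This is only an illustration of the concept, not how Buddyscript implements simple xml. The SAMPLE response is made up and heavily trimmed, and it omits the XML namespaces a real SOAP response would carry.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed response; real SOAP responses use XML namespaces.
SAMPLE = """
<Envelope><Body><SearchResponse><Response><Responses><SourceResponse>
  <Offset>0</Offset><Total>2</Total>
  <Results>
    <Result><Title>First headline</Title><Source>Example Times</Source>
      <DateTime><Year>2008</Year><Month>7</Month></DateTime></Result>
    <Result><Title>Second headline</Title><Source>Example Post</Source>
      <DateTime><Year>2008</Year><Month>6</Month></DateTime></Result>
  </Results>
</SourceResponse></Responses></Response></SearchResponse></Body></Envelope>
"""


def flatten(xml_text):
    """Walk the parent-child path, then emit one flat row per Result node."""
    root = ET.fromstring(xml_text.strip())
    source = root.find("Body/SearchResponse/Response/Responses/SourceResponse")
    rows = []
    for result in source.iterfind("Results/Result"):  # plays the role of {loop=content}
        rows.append({
            "Title": result.findtext("Title"),
            "Source": result.findtext("Source"),
            "Year": result.findtext("DateTime/Year"),
            "Month": result.findtext("DateTime/Month"),
        })
    return int(source.findtext("Offset")), int(source.findtext("Total")), rows
```

Each Result becomes one row, and the nested DateTime children end up as ordinary columns next to Title and Source. Fields that are not listed are simply never read, which is the same effect as skipping them in the simple xml section.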
It should be noted here that simple xml makes no provision for a “dynamic” representation of the XML response. So in essence, you would potentially have to write a different datasource function for each different search type in this case. To generate an advanced datasource whose output differs depending on the search type, you would have to write the datasource in Buddyscript. We’ll cover this in a different blog.
datasource LiveSearchAPI(SEARCH, CULTURE_INFO) => Title, Description, Url, Source, NewsYear, NewsMonth, NewsDay, NewsHour, NewsMinute, NewsSecond {expire="in 1 hour" continue_on_error="true" timeout="15" }
  preprocess
    if LIMIT>10 || LIMIT<=0
      MAXRESULTS = 10
    else
      MAXRESULTS = LIMIT
    FIELDLIST = "Title Description Url Source DateTime"
    POST_DATA = BuildSearchAPIPostData(SEARCH, "News", OFFSET, MAXRESULTS, CULTURE_INFO, FIELDLIST)
  http
    http://soap.search.msn.com:80/webservices.asmx
  header
    Accept: application/soap+xml
  postdata {encode=no}
    POST_DATA
  simple xml
    Envelope
      Body
        SearchResponse
          Response
            Responses
              SourceResponse
                Offset => RESULTOFFSET // Where we're starting from.
                Total => TOTAL // Total number of results.
                Results
                  Result {loop=content}
                    Title
                    Description
                    Url
                    Source
                    DateTime
                      Year
                      Month
                      Day
                      Hour
                      Minute
                      Second
  postprocess
    INFO.Offset = RESULTOFFSET
    INFO.MaxCount = TOTAL
    return INFO
There are other datasource properties that should be considered, either to increase performance or to deal with potential errors in accessing or retrieving information from the datasource.
The first one is the Timeout property. It lengthens or shortens the time the datasource waits before it gives up on the web service. The default value is 10 seconds. In our case, we have it at 15 seconds.
The next property is the continue_on_error property. When this property is enabled (it is set to "true" in our example), execution continues after a failure and the datasource caller can retrieve the error message from the SYS.Data.Error variable. This applies only to those sources that call the ABErrorProc.
The final property is very important. It is the Expire property. This determines how long retrieved data should be considered valid, i.e. kept in the memory cache. Caching retrieved data in memory improves performance when the same information is requested from the datasource again. You should consider these factors:
1) how often does the data change?
2) how often will the same retrieved data be asked again?
3) how large is the retrieved data set?
4) server memory cache size (N/A on hosted applications)
5) how fast does the web service perform?
All of these are considerations. In our case, since news items change frequently, we’ll set it for a relatively short time period, say 1 hour.
Examples of the Expire property:
Expire="never" /* this is the default expiration for most non-Buddyscript datasources */
Expire="in 1 hour"
Expire="now" /* no caching at all, same as "never" */
Expire="tomorrow at 5am" /* Note that this time is the server time, not the client time. In hosted applications, this is in GMT time */
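To illustrate the trade-off behind an expiration such as "in 1 hour", here is a small time-based cache sketched in Python. It is not the Buddyscript cache itself; the ExpiringCache class and its get_or_fetch method are hypothetical names, and the clock parameter exists only so the behavior can be tested without waiting.

```python
import time


class ExpiringCache:
    """Keep fetched values until their time-to-live runs out."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expiry_time, value)

    def get_or_fetch(self, key, fetch):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now < hit[0]:
            return hit[1]  # still valid: skip the web service call
        value = fetch()    # missing or expired: call the web service again
        self._store[key] = (now + self.ttl, value)
        return value
```

With a time-to-live of 3600 seconds, a repeated search for the same term within the hour is answered from memory, and the first request after the hour goes back to the web service. A shorter value keeps fast-changing data such as news fresher at the cost of more requests.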
The postprocess section is important for returning a range of information. For datasources that do not themselves page data using offsets and limits (i.e. simple xml), a missing postprocess section means the QueryServer will process the data coming back from the datasource in its entirety. When the output coming back is a single entity or row, or when the amount of data to be processed is small, the postprocess section is not needed.
Looking at the postprocess section, this section is used to set the offset and total count of rows in a variable. This variable is then used by Buddyscript to control the display of output.
In this case, INFO is the name of an object variable. The variables inside the object are named Offset and MaxCount, and their values are populated from the datasource.
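As a rough illustration of why an offset and a total are enough to control display, here is a Python sketch. How Buddyscript actually uses INFO internally is not shown in this post, so the Info class and the has_more function are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Info:
    offset: int     # stands in for INFO.Offset: where this batch starts
    max_count: int  # stands in for INFO.MaxCount: total rows available


def has_more(info: Info, rows_shown: int) -> bool:
    """True when rows remain beyond the batch that was just displayed."""
    return info.offset + rows_shown < info.max_count
```

With 25 results in total, a first batch of 10 starting at offset 0 leaves more to show, while a batch of 5 starting at offset 20 does not.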
Finally, here is a crude routine to pass a request to the Live Search API, access the web service and display the contents of the data, using Buddyscript code to control the amount of data coming in.
? Tell me some news about STRING=Anything
LOCALE="en-us"
TITLE, DESCRIPTION, LINK, SOURCE, YEAR, MONTH, DAY, HOUR, MINUTE, SECOND = LiveSearchAPI(STRING, LOCALE) show 10
* Here are the results:
- TITLE, SOURCE
* <blank/>
<ifmore>Type "more" for more news.</ifmore>
- Sorry, no news sites were found for your input.
In the input, you could ask a question such as “Tell me some news about Baron Davis” for example, and get back results that look like this (notice that the output only contains 2 of the 10 values returned, Title and Source):
Here are the results:
Baron Davis Going South, San Francisco Gate
The Baron Davis, Gilbert Arenas Switch-a-roo?, San Francisco Gate
Clippers set sights on Baron Davis, Los Angeles Times
Baron Davis on verge of signing with Clippers, Washington Post
NBA: Warriors trying to woo Brand, Newsday
Baron Davis becomes free agent, Chicago Sun-Times
Report: Davis to ditch Warriors for Clippers, FOXSports.com
Logo? Colors? History? Don't mean a thing if you ain't got that team, CBS Sportsline
Davis on verge of joining Clippers, CNN Sports Illustrated
Davis on verge of signing with Clippers, Salon
Type “more” for more news.
Notice that in the pattern routine, there is an option called SHOW 10. This means to output 10 rows at a time. Buddyscript goes to the datasource to retrieve the information, which in this case happens to be exactly 10 rows, since the request was to buffer 10 rows per datasource request. If the user types “more”, another 10 rows are retrieved from the datasource and displayed, and so on. (Note that there is a Buddyscript variable named SYS.Presentation.Maxlength that controls the number of characters that can be displayed on an IM client. Depending on its setting, it can also limit the number of rows displayed.)
The SHOW command gives you a quick way of emulating a forward read-only cursor, i.e. displaying x number of rows of output at one time. The other option would be to put the data into an object and loop through the object, displaying each row, which involves more coding. It’s very possible that for control purposes, the latter method is the right way to go, but for quick coding and display, SHOW is very powerful.
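For readers who think in other languages, the forward read-only cursor idea can be sketched as a Python generator. This is an analogy to SHOW, not its implementation. The names forward_cursor and fake_fetch and the DATA list are hypothetical; fake_fetch stands in for a datasource call that accepts an offset and a limit.

```python
def forward_cursor(fetch_page, page_size=10):
    """Yield one page of rows at a time, moving forward only."""
    offset = 0
    while True:
        rows = fetch_page(offset, page_size)
        if not rows:
            return  # nothing left to show
        yield rows
        if len(rows) < page_size:
            return  # a short page means the data is exhausted
        offset += len(rows)


# Hypothetical stand-in for the web service: 23 made-up headlines.
DATA = [f"headline {i}" for i in range(23)]


def fake_fetch(offset, limit):
    return DATA[offset:offset + limit]
```

Each call to next() on the generator plays the role of the user typing “more”: nothing is fetched until it is asked for, and there is no way to go backwards.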
Hopefully you have gotten a chance to absorb the intricacies of using datasources by accessing a really powerful web service. Thanks for your attention!