Check out this code! (Part I)
Are you the type that forgets to return books on time? Wish you could just keep an open tab at the local library? I know that I do! Instead of waiting for that email notice that arrives a week after the fact, this code can help you keep on top of what's checked out, and what's due.
Arian Kulp
Difficulty: Intermediate
Time Required: 1-3 hours
Cost: Free
Hardware: None
Introduction
I love our local library. For myself, I get countless books that I wouldn't want to buy since I'll never read them again. For my two kids, we have an immense selection of children's books to rotate. The problem is getting them back on time! We don't have one single library day, so our library bookshelf at home often contains books that all go back at different times, and things like movies are always due back sooner anyway. What would be great, I decided, would be an application sitting in the system tray to show me what's coming up or already overdue. Being able to have one view for all library accounts in our family is a great thing!
Part I of this article highlights creating a reusable class library (DLL) to communicate with a specific library back-end server, Horizon Information Portal, in use in hundreds of libraries. A simple test application will show how to use this code to search the card catalog and to see what books you have checked out. Concepts include XML consumption, HTTP requests, and a clean object model abstracted away from any user interface. A very simple test harness application is included.
Part II will take that reusable library and create a desktop application to expose more of the available search features, and allow multiple libraries and patrons to be monitored for upcoming and overdue books. Concepts will include using a notification icon (appearing in the system tray), context menus (right-click), data binding, and predicates.
The source code is available in both C# and VB (the download links are above the introduction). If you didn't know yet, you can program for .NET without paying a single dime! Just download one of the Express editions of Visual Studio, from this link: Visual Studio 2005 Express Edition.
Looking on the Horizon
Our local library uses Horizon Information Portal, thus my choice of back-end. From Google searches, I find hundreds of libraries using the system so it must be pretty popular. I contacted the SirsiDynix Corporation (creators of Horizon) for information on their API, but after a number of transfers and a promised callback that never materialized, I realized that they just don't have that side of things put together yet. They do obviously support it though. I learned that by appending &GetXML=true to a query string, every Horizon request returns well-formed XML. It is this XML that enables an object model to be created.
The first thought with XML-based requests/responses is to use the built-in .NET XML serialization features. I started with this approach, but quickly grew frustrated. Without documentation or schema files, there was a lot of manual tweaking of data types. It was also difficult to generate a schema since certain nodes only appeared in certain situations, so I was never sure that I had everything. Worse still, half-way through my coding a new version was released and that actually changed the name of one of the elements. After more searching, I saw that not all libraries always run the newest version, so data will vary depending on what library you connect to. XML serialization just didn't seem flexible enough.
My solution was to just work with the XML manually. If you haven't worked with XML yet, don't be afraid! The basic concept is a tree of nodes. Each node may or may not have attributes, and will either contain text or more nodes. Accessing any data in an XML file is just figuring out the path to the node.
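To make the "path to the node" idea concrete, here is a minimal sketch using the standard XmlDocument class. The sample XML and the XmlPathDemo/ReadText names are made up for illustration; they are not an actual Horizon response or part of the library.

```csharp
using System;
using System.Xml;

public static class XmlPathDemo
{
    // Hypothetical sample document; NOT a real Horizon response.
    public const string SampleXml =
        "<searchresponse>" +
          "<session>ABC123</session>" +
          "<results>" +
            "<row><title>Learning C#</title><author>Doe, Jane</author></row>" +
            "<row><title>XML Basics</title></row>" +
          "</results>" +
        "</searchresponse>";

    // Returns the text found at the given XPath, or null when no node matches.
    public static string ReadText(XmlNode context, string xpath)
    {
        XmlNode node = context.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText;
    }
}
```

After loading SampleXml with XmlDocument.LoadXml, ReadText(doc, "/searchresponse/results/row[1]/title") walks down the tree to the first title, while asking for the author of the second row simply returns null because that node isn't there. Note that XPath positions start at 1, not 0.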
XML Messages
The Horizon Information Portal XML format relies on elements with text; I haven't come across a single attribute. There are messages for basic search, advanced search, renewing materials, getting account information, and a number of other things. Requests are solely URL-based, and the responses are pure XML (similar to REST, but more monolithic). Requests can get fairly long once they include all of the required parameters and the session ID.
Responses all begin with the same blocks of data. Information regarding the server, client browser, and authentication status comes back with every message. The messages also include the complete set of search options (by title, by author), sort options, and search limitations (by library branch, by collection). This is a large amount of information (10 KB or so), and it comes back with every response. It's not very efficient, but it gets the job done. Amazingly, the XML data also includes presentation information for the toolbar. As it turns out, all of the redundant data is there because the entire dataset is tailored for presentation. Adding GetXML=true to your query string returns XML instead of HTML, but it's the same information. If it weren't for this option, you'd be stuck "scraping" the HTML response, and it would be very fragile indeed. Unfortunately, it's still somewhat fragile. If SirsiDynix does decide to make an API available directly, they'll need to separate their data a little better.
Sending most requests requires parameters called menu and aspect, and sometimes submenu as well. All messages should include a session ID (generated by the server). Once a user is authenticated, this session ID keeps you logged in. Even for unauthenticated requests such as search, it's worth passing a valid one: I suspect that a bogus session ID causes the server to create a new session each time, and there's no reason to make things difficult on the other side!
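As a rough illustration of how such a request URL could be put together, here is a small sketch. The menu, submenu, aspect, and GetXML parameter names come from the discussion above; the "session" parameter name and the HorizonUrlSketch/BuildRequestUrl names are my assumptions for this example, and this is not the code inside HorizonDataAccess.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class HorizonUrlSketch
{
    // Builds a request URL, skipping any parameter whose value is null or empty.
    public static string BuildRequestUrl(string baseUrl, string session,
                                         string menu, string submenu, string aspect)
    {
        List<KeyValuePair<string, string>> parameters = new List<KeyValuePair<string, string>>();
        // "session" is an assumed parameter name; check what your library's server expects.
        parameters.Add(new KeyValuePair<string, string>("session", session));
        parameters.Add(new KeyValuePair<string, string>("menu", menu));
        parameters.Add(new KeyValuePair<string, string>("submenu", submenu));
        parameters.Add(new KeyValuePair<string, string>("aspect", aspect));
        // Ask for XML instead of HTML.
        parameters.Add(new KeyValuePair<string, string>("GetXML", "true"));

        StringBuilder url = new StringBuilder(baseUrl);
        // Start the query string unless the base URL already has one.
        char separator = baseUrl.Contains("?") ? '&' : '?';
        foreach (KeyValuePair<string, string> pair in parameters)
        {
            if (string.IsNullOrEmpty(pair.Value))
            {
                continue;
            }
            url.Append(separator)
               .Append(Uri.EscapeDataString(pair.Key))
               .Append('=')
               .Append(Uri.EscapeDataString(pair.Value));
            separator = '&';
        }
        return url.ToString();
    }
}
```

Escaping each value with Uri.EscapeDataString keeps search terms containing spaces or ampersands from corrupting the query string.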
Sending Requests
To make things easier, I factored much of the communication-related code into the HorizonDataAccess class. This contains methods for formatting the query string and retrieving data via GET and POST operations. I also wrapped a number of methods for converting XML nodes into objects in the HorizonDataMapper. I discovered that due to the presentation nature of the XML data, many fields have no set location in the response. For example, when searching for a title, some libraries display author name, holding title (holding is the cool way of referring to books, CDs, DVDs, and other items the library checks out!), publisher, and publication date, while other libraries may display only the name and author. It seems configurable at the library level, and unfortunately the data is very bound to the display. The worst part is that these customizable fields don't even have fixed node names. You have to determine where fields will be from the itemheader block:
<itemheader>
<col>
<label>Location</label>
</col>
<col>
<label>Collection</label>
</col>
</itemheader>
Another library might use a named XML element for Location or Collection. This lack of consistency is very frustrating! To get around this problem, I read in the names of all dynamically-assigned fields:
Visual Basic
Friend Sub ReadDynamicFieldDefinitions(ByVal doc As XmlDocument)
    ' Clear it out...
    dynamicFields = New Dictionary(Of String, String)()
    ' Find header fields
    Dim labelNodes As XmlNodeList = doc.SelectNodes("//searchresults/header/col/label")
    For i As Integer = 0 To labelNodes.Count - 1
        dynamicFields.Add(labelNodes(i).InnerText, "cell[" & (i + 1) & "]/data/text")
    Next
    ' Find item header fields
    labelNodes = doc.SelectNodes("//searchresults/itemheader/col/label")
    For i As Integer = 0 To labelNodes.Count - 1
        dynamicFields.Add(labelNodes(i).InnerText, "item[0]/cell[" & (i + 1) & "]/data/text")
    Next
End Sub
C# Code
internal void ReadDynamicFieldDefinitions(XmlDocument doc)
{
    // Clear it out...
    dynamicFields = new Dictionary<string, string>();
    // Find header fields
    XmlNodeList labelNodes = doc.SelectNodes("//searchresults/header/col/label");
    for (int i = 0; i < labelNodes.Count; i++)
    {
        dynamicFields.Add(labelNodes[i].InnerText, "cell[" + (i + 1) + "]/data/text");
    }
    // Find item header fields
    labelNodes = doc.SelectNodes("//searchresults/itemheader/col/label");
    for (int i = 0; i < labelNodes.Count; i++)
    {
        dynamicFields.Add(labelNodes[i].InnerText, "item[0]/cell[" + (i + 1) + "]/data/text");
    }
}
It's a mess, I admit! The dynamicFields object contains mappings from field names (like "Location" or "Status" or "Due Date") to XPath queries. If you haven't worked with XPath, it's the scheme for describing the location of an element within an XML document. There's a great tutorial at W3Schools if you want to learn more. Retrieving a field is then just a simple lookup, followed by a call to SelectSingleNode() on the XmlNode object. The bad thing is that because some libraries use this dynamic mapping and others don't, you need to check in two places for some fields. For instance, many libraries expose the due date of a holding in a node named duedate, while other libraries use the dynamic method. Lost yet? I understand! Thankfully you don't need to deal with it. You can just take the classes in this library as a starting point and get it all for free. It's great that SirsiDynix makes everything so configurable, but the data shouldn't change -- just the presentation of it.
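The "check in two places" idea can be sketched roughly like this. It is a simplified illustration, not the actual HorizonDataMapper code: the FieldLookupSketch and ReadField names are hypothetical, and the row layout is assumed to match the relative XPath strings stored in the dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class FieldLookupSketch
{
    // Looks for a fixed element name first (e.g. "duedate"); if the row has no
    // such child, falls back to the XPath registered for the display label
    // (e.g. "Due Date"). Returns null when neither location yields a node.
    public static string ReadField(XmlNode row,
                                   IDictionary<string, string> dynamicFields,
                                   string elementName,
                                   string label)
    {
        XmlNode node = row.SelectSingleNode(elementName);
        if (node == null)
        {
            string xpath;
            if (dynamicFields.TryGetValue(label, out xpath))
            {
                node = row.SelectSingleNode(xpath);
            }
        }
        return node == null ? null : node.InnerText;
    }
}
```

With a dictionary containing "Due Date" mapped to "cell[2]/data/text", calling ReadField(row, dynamicFields, "duedate", "Due Date") works for a row that has a duedate element as well as for one that only has positional cells. Using TryGetValue rather than the indexer means a label that a given library doesn't display just produces null instead of an exception.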
The Object Model
Though the data is all XML, I didn't want to deal with XML everywhere. As you've seen, it's messy, but it's also too much to keep track of in an application. My next step was to create objects to encapsulate a library, a patron, a holding, and a number of other concepts. Objects are set up from the XML data, but the application only sees the end objects. Simplicity!
The starting point is the Library class. This is where you define the URL for the server and give it a descriptive name. Any other actions will start with calls to this class. Create a Library object, set the BaseUrl property, and call the Init() method.
Visual Basic
Dim myLibrary As New Library
myLibrary.Name = "Hometown Library"
myLibrary.BaseUrl = "http://hometown.lib.us/ipac20/ipac.jsp"
' Initialize the library once the URL is set
myLibrary.Init()
C# Code
Library myLibrary = new Library();
myLibrary.Name = "Hometown Library";
myLibrary.BaseUrl = "http://hometown.lib.us/ipac20/ipac.jsp";
// Initialize the library once the URL is set
myLibrary.Init();
Notice that URL. Every single Horizon Information Portal installation that I've come across ends with "ipac20/ipac.jsp". Just entering the host name isn't enough. The next step is to create a ServerRequest object. This contains the various fields that you set to indicate the search term and how to perform the search. Invoking the search is done by calling Search on the Library object. This will return a collection of Holding objects, each being a book, DVD, audiobook, etc.
Figure 1 - Search results from your local library
Visual Basic
Dim req As New ServerRequest("search", "", "basic_search")
req.SearchTerm = "programming"
Dim results As List(Of Holding) = myLibrary.Search(req)
For Each h As Holding In results
    Console.WriteLine(h.Name)
Next
C# Code
ServerRequest req = new ServerRequest("search", "", "basic_search");
req.SearchTerm = "programming";
List<Holding> results = myLibrary.Search(req);
foreach (Holding h in results)
{
    Console.WriteLine(h.Name);
}
For authenticated messages (books checked out, patron name/address) you will need to create an instance of Patron. Invoke the Login method of the Library class, and it will authenticate the user and associate the Library and Patron together into a session (CurrentSession property). You can invoke methods like ReadPatronDetails to fill in the various user fields, items checked out, fines, holds, etc. If you don't call Login, or the session expires, the user will be logged in automatically. Note that none of the libraries I've tested use encryption, so passwords will be sent in plaintext.
Figure 2 - Checked out items
Final Note
This was created by analyzing the available XML, but it's clear that each library heavily customizes its search pages. Since the XML returned reflects the display, not the underlying data, one library might return the title when you search by title, but not by subject, while another library returns all fields in all searches. It can be frustrating, but if you needed to, you could tweak things a bit for a given library. Of course, it would be better to add more resilience by putting any enhancements into the common code. Try not to code for one library. See if there's some flag somewhere that determines the behavior that you see. If so, responding to the flag would allow the enhancement to work for other libraries.
Next Steps
At this point, the code is ready to drop into another project. One thing that occurred to me is that a Web-based application could create RSS feeds of checked out books, overdue items, holds coming in, that sort of thing. It could also offer RSS feeds of matching items for a given search string, which might be useful for monitoring new books added to the library. Maybe someone else has already implemented another library's API and a mega library application could result! My next column will discuss an application to take advantage of this code with data binding, the system tray, and more.
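To give a flavor of the RSS idea, here is a bare-bones sketch that turns a list of checked-out items into an RSS 2.0 document using XmlWriter. The CheckedOutItem class and the RssSketch/BuildFeed names are stand-ins invented for this example; a real version would be fed from the library's Holding objects, whose exact properties I'm not assuming here.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Xml;

// Hypothetical stand-in for a checked-out holding.
public class CheckedOutItem
{
    public string Title;
    public DateTime DueDate;

    public CheckedOutItem(string title, DateTime dueDate)
    {
        Title = title;
        DueDate = dueDate;
    }
}

public static class RssSketch
{
    // Writes a minimal RSS 2.0 feed with one <item> per checked-out title.
    public static string BuildFeed(string feedTitle, string link,
                                   IEnumerable<CheckedOutItem> items)
    {
        StringWriter output = new StringWriter();
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true; // keep the string free of an encoding header

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("rss");
            writer.WriteAttributeString("version", "2.0");
            writer.WriteStartElement("channel");
            writer.WriteElementString("title", feedTitle);
            writer.WriteElementString("link", link);
            writer.WriteElementString("description", "Items currently checked out");

            foreach (CheckedOutItem item in items)
            {
                writer.WriteStartElement("item");
                writer.WriteElementString("title", item.Title);
                writer.WriteElementString("description",
                    "Due " + item.DueDate.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
                writer.WriteEndElement(); // item
            }

            writer.WriteEndElement(); // channel
            writer.WriteEndElement(); // rss
        }
        // The writer is flushed and closed by the using block above.
        return output.ToString();
    }
}
```

Letting XmlWriter produce the markup means titles containing characters like & or < get escaped properly, which is easy to get wrong with plain string concatenation.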
Conclusion
This was a really fun project for me. It's such a feeling of excitement to figure out how to use that XML data for new things! That's the beauty of XML -- once you have structured data you can do so much with it. I hope to hear that this code is put to use in interesting new projects. I'm glad to answer questions about the code or address any (gulp!) bugs that are found! Download Visual Studio Express and get started!
Arian Kulp is an independent software developer and writer working in the Midwest. He has been coding since the fifth grade on various platforms, and also enjoys photography, nature, and spending time with his family. Arian can be reached through his web site at http://www.ariankulp.com.