Thursday, September 08, 2005

Web Service fun with Google

A friend of mine has been reading the book “Google Hacks”. While we were talking, he asked about some of the Perl examples in the book that use web services. I figured this would be a good opportunity to go over the concept and walk through an example of calling a web service.

Web Services, for those who are not familiar, are a Remote Procedure Call mechanism implemented through XML messages passed over HTTP. There are two dominant standards that I have come across, XML-RPC and SOAP. A web service's interface is usually described in an XML file called a WSDL (Web Services Description Language) file. A WSDL file describes the operations a web service supports, the messages they exchange, and the data types they use. An example would be the Google Search API, available at http://api.google.com/GoogleSearch.wsdl. More information can be found at http://api.google.com.
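To make "XML messages passed over HTTP" concrete, here is a minimal Python sketch of the simpler of the two standards, XML-RPC, using only the standard library. The method name "search" and its two arguments are made up for illustration and do not belong to any real service. The sketch only builds and decodes the message body; nothing is sent over the network.

```python
import xmlrpc.client

# Hypothetical method name and arguments, chosen only for illustration.
payload = xmlrpc.client.dumps(("Taosecurity", 10), methodname="search")

# The payload is plain XML text that would travel in the body of an HTTP POST.
print(payload)

# Decoding the message recovers the call: its arguments and the method name.
params, method = xmlrpc.client.loads(payload)
print(method, params)
```

A SOAP call carries the same kind of information (a method name plus typed arguments) wrapped in a more verbose envelope.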

For this example I will use Visual Basic .Net 2005 Express Beta 2, available freely (for the moment) at http://lab.msdn.microsoft.com/express/vbasic/default.aspx. Be aware that this is beta quality software.

The first step is to start VB and create a new project. In the example, I will create a new Windows Application and call it “BlogGoogleSearch”. Add 3 components to Form1: a button called “cmdDoSearch”, a textbox called “txtSearchItem”, and a listbox called “lstResults”. My Form looked something like this:

[Screenshot: Form1 with the txtSearchItem textbox, the cmdDoSearch button, and the lstResults listbox]

From here, we need to make the application aware of the web service we will be using. First, go up to the Project menu and select Add Web Reference.

[Screenshot: Project menu with Add Web Reference selected]

In the dialog box that follows, enter the URL of the WSDL file and click Go:

[Screenshot: Add Web Reference dialog with the WSDL URL entered]

Once we click Add Reference, the exposed web services will be added to the Solution Explorer under Web References. We now have a few classes for performing searches on Google. One thing to note is that you will need to sign up for a Google API account. Google will provide you with a key that you need in order to use this API. For the event handler for cmdDoSearch we will add the following code:

Private Sub cmdDoSearch_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles cmdDoSearch.Click

     'The actual Google search component: the proxy class generated from the
     'WSDL by Add Web Reference
     Dim google_search_component As New com.google.api.GoogleSearchService()

     'results holds the whole response returned by the GoogleSearchService class
     Dim results As com.google.api.GoogleSearchResult = Nothing

     Try
          'Search Google for the item in txtSearchItem. Start at the first
          'result, return 10 items, no duplicate filtering, no restrictions,
          'no safe searching, and use latin1 as the input and output encoding
          results = google_search_component.doGoogleSearch("MyGoogleAPIKey", _
                    txtSearchItem.Text, _
                    0, _
                    10, _
                    False, _
                    "", _
                    False, _
                    "", _
                    "latin1", _
                    "latin1")
     Catch ex As Exception
          'If this fails, tell us why and exit the sub program
          MsgBox(ex.Message)
          Exit Sub
     End Try

     'If we received no results, then tell the user no results were there
     If results Is Nothing OrElse results.resultElements Is Nothing Then
          MsgBox("Error: no results returned", MsgBoxStyle.OKOnly)
          Exit Sub
     End If

     'For each item returned from the query, add the URL to the list box.
     'Each result is one element in the resultElements collection
     For Each result As com.google.api.ResultElement In results.resultElements
          lstResults.Items.Add(result.URL)
     Next
End Sub

The resulting program takes whatever is entered in the textbox, executes the search on Google, and lists the resulting URLs in the listbox. Below is a screenshot of the complete application.

[Screenshot: the complete application]

So now let's use Ethereal to see what the SOAP message passed to Google looks like. Below, I formatted the XML to make it easier to read.

POST http://api.google.com/search/beta2 HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; MS Web Services Client Protocol 2.0.50215.44)
Content-Type: text/xml; charset=utf-8
SOAPAction: "urn:GoogleSearchAction"
Host: api.google.com
Content-Length: 913
Expect: 100-continue
Accept-Encoding: gzip
Proxy-Connection: Keep-Alive

<?xml version="1.0" encoding="utf-8" ?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" xmlns:tns="urn:GoogleSearch" xmlns:types="urn:GoogleSearch/encodedTypes" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <soap:Body soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
    <tns:doGoogleSearch>
      <key xsi:type="xsd:string">MyGoogleAPIKey</key>
      <q xsi:type="xsd:string">Taosecurity</q>
      <start xsi:type="xsd:int">0</start>
      <maxResults xsi:type="xsd:int">10</maxResults>
      <filter xsi:type="xsd:boolean">false</filter>
      <restrict xsi:type="xsd:string" />
      <safeSearch xsi:type="xsd:boolean">false</safeSearch>
      <lr xsi:type="xsd:string" />
      <ie xsi:type="xsd:string">latin1</ie>
      <oe xsi:type="xsd:string">latin1</oe>
    </tns:doGoogleSearch>
  </soap:Body>
</soap:Envelope>
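The proxy class builds this envelope for us, but nothing about it is specific to .Net. As a rough illustration, the Python sketch below assembles an equivalent doGoogleSearch request with the standard library. The element names, types, and namespaces are copied from the capture above; the function name build_search_envelope is my own, and the sketch only builds the bytes, it does not contact Google.

```python
import xml.etree.ElementTree as ET

SOAP = "http://schemas.xmlsoap.org/soap/envelope/"
XSI = "http://www.w3.org/2001/XMLSchema-instance"
TNS = "urn:GoogleSearch"


def build_search_envelope(key, query, start=0, max_results=10):
    """Return a doGoogleSearch SOAP request body as UTF-8 bytes."""
    # Use the same prefixes as the captured message.
    ET.register_namespace("soap", SOAP)
    ET.register_namespace("xsi", XSI)
    ET.register_namespace("tns", TNS)

    envelope = ET.Element(f"{{{SOAP}}}Envelope")
    # The xsd prefix only appears inside attribute values, so declare it by hand.
    envelope.set("xmlns:xsd", "http://www.w3.org/2001/XMLSchema")
    body = ET.SubElement(envelope, f"{{{SOAP}}}Body")
    body.set(f"{{{SOAP}}}encodingStyle",
             "http://schemas.xmlsoap.org/soap/encoding/")
    call = ET.SubElement(body, f"{{{TNS}}}doGoogleSearch")

    # Parameters in the order they appear in the capture.
    fields = [
        ("key", "xsd:string", key),
        ("q", "xsd:string", query),
        ("start", "xsd:int", str(start)),
        ("maxResults", "xsd:int", str(max_results)),
        ("filter", "xsd:boolean", "false"),
        ("restrict", "xsd:string", ""),
        ("safeSearch", "xsd:boolean", "false"),
        ("lr", "xsd:string", ""),
        ("ie", "xsd:string", "latin1"),
        ("oe", "xsd:string", "latin1"),
    ]
    for name, xsd_type, value in fields:
        child = ET.SubElement(call, name)
        child.set(f"{{{XSI}}}type", xsd_type)
        child.text = value

    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


print(build_search_envelope("MyGoogleAPIKey", "Taosecurity").decode("utf-8"))
```

Sending it would be an HTTP POST of these bytes with the Content-Type and SOAPAction headers shown in the capture.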

The results sent back from Google are encoded, most likely gzip-compressed since the request advertises Accept-Encoding: gzip, so the resulting packets look like a garbled mess in the capture. Something in the web service proxy class knows how to decode the return message.
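I did not dig into the exact encoding, but the request does send Accept-Encoding: gzip, so compression is a plausible explanation. Assuming that is the case, the Python sketch below shows why a compressed reply is unreadable in a sniffer and how a client gets the XML back. The response body here is a hand-written stand-in, not a captured Google reply.

```python
import gzip

# Hand-written stand-in for a response body; not a captured Google reply.
response_xml = (
    b"<?xml version='1.0' encoding='UTF-8'?>"
    b"<return><URL>http://www.example.com/</URL></return>"
)

# What would travel on the wire if the server compresses the body.
on_the_wire = gzip.compress(response_xml)
print(on_the_wire[:16])  # binary bytes, unreadable in a packet capture

# What the client library does before handing the XML to the caller.
decoded = gzip.decompress(on_the_wire)
print(decoded.decode("utf-8"))
```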

There are pros and cons to the Web Services approach to RPC. One plus is that it follows the tenet of the Unix Philosophy that messages should be human readable. I believe in this very strongly: human time is much more expensive than machine time, and a human readable message makes it easier to track down issues. The second bonus is that the transport runs over ports that firewalls commonly leave open. By using HTTP as the transport, you do not need to reinvent the wheel by building a message passing mechanism into your programs.

On the con side is the high overhead of processing XML messages. Imagine thousands of messages like the one above being sent just to pass one word to a service that executes a search function. The cost/function ratio is a little out of whack. While human readable is a plus, this inflates RPC calls to an extreme, and there are more efficient ways to achieve the same end at a lower overhead cost. Also, the implementation needs to be thought out carefully by web service providers. I had to replace my key in these examples for security. Plain text transfers, while making it easier for an engineer to read, also make it that much easier for unwanted eavesdropping. So another layer of complexity, a secure transport mechanism such as an SSL tunnel, would be needed.
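To put a rough number on that overhead, the short Python calculation below compares the size of the search term with the Content-Length of the captured request. Both values come from the capture above. Note that the count covers the message body only, so the HTTP headers would make the ratio even worse.

```python
query = "Taosecurity"   # the search term in the captured request
content_length = 913    # Content-Length header from the capture

payload_bytes = len(query.encode("utf-8"))
ratio = payload_bytes / content_length

# Prints: 11 of 913 body bytes carry the search term (1.2%)
print(f"{payload_bytes} of {content_length} body bytes carry the search term ({ratio:.1%})")
```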

Of course this is a topic of hot debate right now, and there are a lot more factors that go into it. My personal opinion is that for Intranet applications, Web Services can be an incredibly powerful asset if implemented correctly. I do feel that if any critical or private transactions are taking place, Web Services should never be exposed on the Internet. While some places, such as Google and some weather services, can offer their web services as a bonus to their customers without exposing them to privacy concerns, banks and other industries should steer clear of this kind of implementation.
