Identa-Rover:
A Land Rover Identifier
Expert System
with
A World Wide Web Interface

by
Jan A. Barglowski

for the California State University Chico
CSCI 223, Artificial Intelligence
Final Project
Fall 1995

Introduction

This report details an expert system that uses the World Wide Web as its interface. The expert system enables a person to classify Land Rovers, a type of four-wheel-drive vehicle similar to a Jeep. It classifies by asking a series of questions about specific details of the vehicle, with the user picking one of several choices for each. Eventually, the list of potential types of Land Rovers is narrowed to one, many, or none by virtue of those answers. The World Wide Web, or WWW, is used to present a graphical front end to the expert system which can also be accessed across the Internet. By using pictures to convey sometimes subtle differences, the expert system's interaction can be made visual and direct, rather than purely descriptive.

I. Design

The Land Rover Identification expert system needed a fairly general and forgiving way of performing its classification. The reasons for this are based on how we envisioned it would be used: for example, a Land Rover would be identified from a photograph or from memory, or identification would be attempted of an unusual, perhaps partially dismantled vehicle. In most cases it is considered unlikely that a person would take their computer out to the vehicle in order to classify it! In addition, the knowledge base should be easy to modify, and new attributes and vehicles should be easy to add.

The expert system functions by using a list of all attributes, a list of all Land Rovers, the knowledge base of Land Rovers, and the rules of execution. Each vehicle in the knowledge base is represented as a list of attribute-value pairs, and there is a corresponding question and set of legal answers for every attribute. The expert system takes the first item in the attribute list, finds the appropriate question, and asks it. After the user chooses one of the legal answers, the expert system compares the user's attribute value with that of each vehicle in the Land Rover list. If a vehicle's attribute does not match, the Land Rover is taken out of the vehicle list. After several questions, the list of matching vehicles shrinks until one of three conditions causes the expert system to stop: 1) one vehicle is left, 2) the list becomes empty, or 3) all questions have been asked. In the first case, a solution has been found. In the second case, the user has either described a vehicle not in the knowledge base or has made a mistake. In the last case, we have a list of matching vehicles that survived the elimination process; one of these vehicles is probably a solution, but we do not have enough information to conclude which.

The expert system can produce case 3 above by allowing "unknown" answers. The mechanism designed into the expert system to facilitate this is the "na" answer, which stands for "Don't know / Can't tell". When the user chooses the "na" answer to any question, the expert system does not prune the list of remaining Land Rovers -- in other words, the "na" answer matches every value of the attribute the question asked about. The expert system can then hopefully find a solution during later questioning. This improves the chances of finding a solution when the knowledge base contains redundant attributes and corresponding questions.

The "na" symbol is also used within the attribute values of the knowledge base itself. If a particular Land Rover contains the value "na" for one of its attributes, that attribute will match any answer the user gives for it. This differs subtly from "na" as an answer: an "na" answer disqualifies the entire question for all remaining vehicles, whereas "na" as an attribute value functions as a "wild card" for a single vehicle's attribute. This value is especially useful in creating the facts for a particular vehicle, where it corresponds to a "don't care" or "doesn't matter" value. Knowledge base construction is made even easier by having every attribute default to "na" unless a value is specified.
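
To make the two uses of "na" concrete, both behaviors can be captured in the constraints of a single pruning rule. The following is only a sketch, written in the CLIPS notation introduced in Section II; the answer template and the direct retraction of the item fact are assumptions for illustration (the real program removes the vehicle from a list instead):

; Sketch only: the answer template is hypothetical, and the real program
; edits a list of remaining Rovers rather than retracting the item fact.
(defrule prune-on-headlamps
  ; an "na" answer never satisfies this pattern, so it never prunes
  (answer (attribute headlamps) (value ?a&~na))
  ; a stored "na" value never satisfies this pattern, so it acts as a wild card
  ?rover <- (item (headlamps ?v&~na&~?a))
  =>
  (retract ?rover))

Because the rule demands an answer other than "na" and a stored value other than "na", both uses of the symbol fall out of the same two pattern constraints.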

II. Implementation

The expert system was implemented in the AI language CLIPS, or C Language Integrated Production System. CLIPS was made specifically for creating expert systems by providing an expert system interpreter. While CLIPS provides many features ideal for AI programming, such as the CLIPS Object-Oriented Language (COOL), only the facts, defrule, and deftemplate constructs were used in the Land Rover ID program. CLIPS also provides a way to take the expert system and create C language source, which can then be compiled into a binary that does not rely on the CLIPS shell for execution. This feature provided an easy way to make the expert system interface with a Common Gateway Interface (CGI) script for display to a Web browser. Lastly, CLIPS is free for student and research use.

The deftemplate construct used in CLIPS is analogous to a struct in the C language. It first defines the name of the deftemplate and then defines a number of attributes, or slots. Each slot is given its own name and, optionally, a default value. In this manner we can describe the template for the Land Rover:

; The master template of our items...
(deftemplate item "Land Rover"
  (slot kind)
  (slot headlamps
    (default na))
  (slot headlamp_rings
    (default na))
...

Notice how the template name is actually item, and not Land Rover. This is an attempt to make the expert system independent of what is being classified, as the execution rules will work on item. Above, the slot kind refers to the kind of Land Rover and does not have a default. All the other slots shown are attribute slots, and their values will default to na if none is given. When the facts are defined in the knowledge base, each Land Rover will be defined as an item.

In creating the knowledge base, special consideration was given to how the knowledge would be programmed. We did not want to create a hierarchical tree structure, because we were starting with a small number of Land Rovers and would want to expand the database at a later time. We would then have to "fit" each new Land Rover into the proper place in the tree, a potentially time-consuming task. In addition, the knowledge base of the Land Rovers does not lend itself to hierarchical organization (see notes). This is because, first, the Land Rover vehicles have attributes that change over time from model to model, but not all attributes change at the same time, which leaves a staggered look in attribute values over time when looking at the entire knowledge base. Second, the attributes often change from one value to another and then back to the first value; e.g., the earliest Land Rovers and the latest Land Rovers have rectangular grilles, but ones in between have other shapes. And third, a hierarchical tree needs intermediate nodes, but often the same question was needed at more than one node to continue the search along the tree, which makes it difficult to determine which node requires which question. This last property makes forward chaining (or simulated backward chaining) difficult to implement for this particular classification problem.

Ultimately, a series of facts was used to define the knowledge base, where each Land Rover is represented as an item deftemplate fact. This makes it very easy to define a new Land Rover in the database -- just create a new item fact with the appropriate attribute values. Of course, the new Land Rover must have a unique set of attribute values, or the expert system will not differentiate it from other Land Rovers with the same values. If it is not possible to uniquely define a new Land Rover, an additional attribute must be added to the item deftemplate, and values for this new attribute must then be given (or not, if the default "na" is acceptable) for all Land Rovers in the knowledge base. Overall, this is not hard to accomplish programmatically, although this is where the "expert" gets consulted quite a bit.
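
For example, two hypothetical entries could be added with a deffacts construct like the one below; the model names and attribute values here are invented for illustration and are not taken from the actual knowledge base:

; Illustrative entries only -- not the project's real facts.
(deffacts known-rovers
  (item (kind series_1_80)
        (headlamps behind_grille)
        (headlamp_rings na))      ; explicit "don't care" value
  (item (kind series_3_88)
        (headlamps on_wings)))    ; unspecified slots default to na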

Even though the knowledge is not contained in rules, the execution of the expert system uses rules. These rules are somewhat analogous to functions in C, but they are not called explicitly; rather, the inference engine invokes them implicitly. This proved to be the most challenging part of creating the expert system. A rule is created with the defrule construct. Each defrule consists of a Left Hand Side (LHS) of preconditions and a Right Hand Side (RHS) of a sequence of commands to be executed if the LHS is satisfied. This turned out to be a large exercise in creating defrules that would fire predictably, and it is very different from my procedural programming background!
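
Schematically, every defrule in the program has the following shape; the facts named here are placeholders rather than the project's own:

; The general shape of a rule: patterns (LHS) before =>, actions (RHS) after.
(defrule example-rule "fires once for each ?x present in both facts"
  (some-fact ?x)
  (another-fact ?x)
  =>
  (printout t "matched " ?x crlf)
  (assert (derived-fact ?x)))

The inference engine decides when, and how many times, such a rule fires, which is exactly what makes the control flow feel so different from procedural code.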

Initially, all of the facts are asserted: the list of items (Rovers), the list of attributes, and all the pre-defined Land Rovers. First, a rule fires that asks a question and lists the legal answers the user can choose from. After the user has entered a choice, the next rule to fire searches through all the items in the item list and removes any whose attribute value does not match; items whose value does match are left alone. Finally, we check whether the next attribute (which the next question would ask about) has a common value across the remaining items; if so, we can remove this attribute and go on to the next one. This process continues, asking the next question, until one of the three final states is encountered: either the item list goes empty, which indicates no matching items; the item list contains only one item, which indicates a solution; or the attribute list becomes empty, which indicates that more than one possible solution remains. Each of these cases is handled by its own rule.
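
As a rough sketch of the terminating rules, and assuming the remaining Rovers are kept in an ordered fact such as (rovers series_1 series_2 ...) -- the actual fact layout is not reproduced in this report -- the first two final states could be written along these lines, with the printout echoing the final keyword used in the protocol of the next section:

; Sketch only: the (rovers ...) fact layout is an assumption.
(defrule no-items-left
  (declare (salience 10))
  (rovers)                       ; the list has gone empty: nothing matches
  =>
  (printout t "final none" crlf)
  (halt))

(defrule one-item-left
  (declare (salience 10))
  (rovers ?only)                 ; exactly one candidate remains: a solution
  =>
  (printout t "final " ?only crlf)
  (halt))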

III. WWW and CGI Interfacing

The Land Rover expert system needed little modification to work as part of the WWW page. In fact, all it needed was a well-defined protocol, or formatting of input and output, in order to interface with the CGI script. The Web server, in turn, invokes the CGI script.

[Figure: Interaction Between the ES, CGI, and Web Server Programs]

The CGI script was written in Tcl, a commonly available scripting language somewhat like the C shell. It handles the Web-specific protocols when invoked by the Web server and runs the expert system interactively.

The Land Rover expert system was made into a binary executable program by using the CLIPS command constructs-to-c, which creates C language source code. On execution, the ES sends out four lines of text:

	<keyword>		<question key> or final
	<question>		the question the expert system is now asking
	<answers>		the legal answers for that question
	<left>			the Land Rovers left after previous answers

The expert system then expects one of the legal answers as input. Once the input is given, the expert system responds either with a new question (the keyword is not final) or with a final answer (the keyword is final), after which it stops execution.
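
For example, one round of the exchange might look like the following; the question key, wording, and answer names here are invented for illustration:

	headlamps
	Where are the headlamps mounted?
	(grille wings na)
	(series_1 series_2 series_2a)

The CGI would then write back one of grille, wings, or na and read the next four lines.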

The Tcl CGI script can now interface easily with the expert system, because it knows it always needs to read four lines and can act according to the keyword. Originally, the CGI used the Tcl extension expect to spawn off the expert system and interact with it. Unfortunately, expect did not seem to capture all of the input from the ES, leaving a process hung. This was a highly undesirable effect, as expect took over 50% of the available CPU time waiting for input that had already gone by! To solve this problem, the Tcl command open was used to invoke an external command (the expert system) and create a bi-directional pipe to it. Now the CGI could read input from and send output to the expert system reliably. The CGI uses the following code to process the output of the expert system:

# Start off the ID expert system...
set pipe [open "| $bin_dir/rover" r+]

set i 0
while {1} {

  # Read the info from the ES...
  gets $pipe key
  gets $pipe question
  gets $pipe answers
  set answers [string trim $answers ()]
  gets $pipe left
  set left [string trim $left ()]


  if {![string compare $key "final"]} {

    #puts stdout "Final page"
    #puts stdout "items left: $left"

    # Kill the pipe...
    catch {close $pipe}

    # Create the final page...
    build_final_page $left $choices $keys

    # That's all for the CGI...
    exit

  } else {

    if {$i < [llength $choices]} {

      # Answer this question...
      set c [lindex $choices $i]
      puts $pipe "$c"
      flush $pipe
      incr i

    } else {

      # Wrap up the ES session...
      catch {close $pipe}

      # Create the question page...
      build_question_page $key $question $answers $left \
          $choices $keys

      # That's all for the CGI...
      exit
    }
  }
}

The procedures build_question_page and build_final_page take the information the expert system gives, along with some "hidden" variables, and create the HyperText Markup Language (HTML) output from which the Web client builds the Web page.

On the other side of the CGI, the Web server invokes the CGI script in response to a request from a Web client. The Web client gives the Web server two hidden variables within an HTML form. One of these variables, choices, keeps track of the previous choices the user has made to reach this point. The other variable, keys, contains the names of the questions that have been asked so far. By using the choices variable, the CGI script can replay, one by one, the answers to each question that the user previously answered. Once the CGI runs out of answers, it builds a new Web page with the latest question on it, as well as updated choices and keys variables.
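
Although build_question_page itself is not reproduced here, the hidden state can be pictured as a couple of lines like these; $cgi_url and the exact way the newest key is appended are assumptions for illustration:

# Sketch only: how the hidden state might be written into the next page.
puts "<FORM METHOD=\"POST\" ACTION=\"$cgi_url\">"
puts "<INPUT TYPE=\"hidden\" NAME=\"choices\" VALUE=\"$choices\">"
puts "<INPUT TYPE=\"hidden\" NAME=\"keys\" VALUE=\"$keys $key\">"

When the form is submitted, the newly chosen answer is appended to choices, and the CGI can replay the whole session from scratch.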

With a bit of ingenuity the entire question Web page can be created from very few variables. The choices variable only keeps track of the previous answers, so it is not used to create the page except to have the newest choice appended to it. The answers variable is a Tcl list; the script loops through it and builds an image name by concatenating the key and the answer, e.g. headlamps.breakfast.jpg. This image is then used to represent that choice in the Web page. The "Back to question" area uses the variable keys, which is the list of questions asked to get to this point. The script loops through keys and creates a link back to each previous question.
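
A loop along the following lines would produce that part of the page; $img_dir and the radio-button layout are assumptions, not the script's actual markup:

# Sketch only: one choice per legal answer, represented by its image.
foreach a $answers {
    # The image name is the question key and the answer joined with a dot,
    # e.g. headlamps.breakfast.jpg
    set img "$key.$a.jpg"
    puts "<INPUT TYPE=\"radio\" NAME=\"choice\" VALUE=\"$a\">"
    puts "<IMG SRC=\"$img_dir/$img\" ALT=\"$a\">"
}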

Great pains were taken to make the Web pages run quickly. Originally, each image was a large JPEG file and loaded very slowly over a dialup connection. These images were substantially reduced by first trimming the size to 160x120 pixels, then saving the file as a GIF image with as few bits per pixel as possible without undue image degradation. Finally, a "preload" page was created in order to read all the images used in the question pages into the cache of the user's Web browser. This sped up execution dramatically, as the images are now local to the user and all that is sent over the network is the text of the page.

IV. Further Development

This project has many features still to implement. The expert system does not remove questions that will not further differentiate the remaining vehicles, so a useless question sometimes gets asked. The knowledge base will also expand -- many other Land Rovers, some unusual, and other kinds such as Range Rovers will be added. Of course, the visual aspect of the Web page will continue to improve in both speed and design as time allows.

Summary

The Identa-Rover World Wide Web page shows how an expert system that is normally confined to its shell can become interactive over the Internet. By using a variety of techniques, the expert system can be invoked through the CGI interface from an HTML form. Hidden variables can carry the Web user's past interaction, and the CGI can supply that interaction back to the expert system at any time. In addition, the expert system benefits from the graphical interface the Web can display, in terms of easier operation and better presentation of the questions.