The telerobots built for the project

This chapter describes the three web telerobots built for this project and the development of the operator interface. It begins with an explanation of what was built and how web browsers and servers, which were designed to provide information, are used to teleoperate a robot. The provision of visual feedback is described and the way the robots are controlled is explained. The problems that had to be overcome to achieve reliable telerobots are discussed. The design of the software is then described, including the technique for generating the operator's interface, the methods adopted for tracking operators, and the records used in the following chapters to analyse operator behaviour. This is followed by a description of the methodology for operator interface development and some of the measurements made of operator reaction to the interface.

The first IRb6/L2-6 telerobot in Perth came online in September 1994. My intention was to provide a teleoperated industrial robot accessible to a large user base by not requiring any special equipment or software at the operator's end. The concept of a telerobot controllable through the World Wide Web is illustrated in Figure 41. 

The World Wide Web telerobot concept. (Taylor and Trevelyan 1995)
Figure 41

The concept has remained unaltered throughout the project; however, the implementation has been constantly modified. There have been three different operating systems in the telerobot server (Windows 3.11, Windows 95 and Windows NT) and three different robots: initially a six axis IRb6/L2-6 in Perth, Australia, later replaced by an ABB1400, and also a five axis IRb6/L2 at Carnegie Science Centre, Pittsburgh, USA. A variety of frame grabbers and imaging techniques have been trialed and the software constantly revised. A time line of telerobot development is shown in Figure 42.

Figure 42

The time line is provided to assist in understanding the different versions of the telerobots discussed and the data sets analysed in the following sections.

To operate the telerobot an operator specifies goal poses in accordance with the supervisory control model, as discussed in section 3.1.4, and the time critical control tasks are localised to the robot controller to avoid instability. Figure 43 shows schematically how the concept is applied to the telerobots built for this research.

Supervisory control as applied to the telerobots built for this project. The time critical control tasks are localised to the robot controller.
Figure 43

One version of the interface on the IRb6/L2-6 Perth telerobot included a computer generated image of the robot showing its current pose. The image was generated by a computer at the Technical University of Braunschweig, Germany, but, as it was not distinguished from the other elements in the operator's page, it was not apparent that it was being generated elsewhere in the world. This facility was abandoned when the Braunschweig robot simulator was taken off line, but the technique by which it was implemented demonstrates one method by which web services can be constructed from components offered by a variety of providers. All communication between the telerobot server and the Braunschweig computer took place indirectly via the operator's browser. That is, the page returned to the operator after each request to the telerobot contained the specifications of the computer generated image, which the browser then requested from the Braunschweig computer as shown in Figure 44.

The IRP Manipulator viewer in Germany produced a model of the robot in its current pose with all communication between the computer in Germany and the robot in Australia taking place indirectly via the operator's browser.
Figure 44

The image specifications included the viewing angle and robot joint angles and had the following format:-

http://romy.rob.cs.tu-bs.de:4711/cgi-bin/newpose?VIEW0=SetName&L000=NumVal& ...&L0rl=NumVal&...&VIEWp&...&Lprl=NumVal&H=NumVal


  • SetName - The symbolic name of the specific setting.
  • NumVal - Character string representing a floating point number. Common formats accepted.
  • Lprl - The numerical value of the link l of the r-th robot in picture p
  • Hprx - The numerical value defining gripper opening.
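
The request format above can be illustrated with a short Python sketch. It is not the code used on the telerobot server; the helper name build_pose_url, the view name "front", the zero-based link numbering and the two-decimal number format are assumptions made for the example, and only a single picture of a single robot is handled.

```python
from urllib.parse import urlencode

# Base address quoted in the text; the viewer has since been taken off line.
VIEWER_URL = "http://romy.rob.cs.tu-bs.de:4711/cgi-bin/newpose"


def build_pose_url(view_name, joint_angles, gripper_opening, picture=0, robot=0):
    """Build a viewer request for one picture of one robot (hypothetical helper)."""
    params = [(f"VIEW{picture}", view_name)]
    for link, angle in enumerate(joint_angles):
        # Label Lprl: picture p, robot r, link l (link numbering from 0 is assumed).
        params.append((f"L{picture}{robot}{link}", f"{angle:.2f}"))
    # Gripper opening, written with the plain "H" label used in the format line.
    params.append(("H", f"{gripper_opening:.2f}"))
    return VIEWER_URL + "?" + urlencode(params)
```

Because the address is embedded in the page returned to the operator, the telerobot server only has to write such a string into the HTML; the operator's browser then fetches the image itself.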


Visual feedback is provided by Pulnix TM6-CN monochrome cameras which collect images of the robot workspace from different viewpoints. The cameras feed these images to a simple video multiplexer built with reed relays controlled from a PC brand, model PC-36 digital I/O board. Initially a Data Translation Quickcapture frame grabber was used. Of the 768 x 512 pixels received from the camera, only the central 512 x 512 field was stored. This restriction reflected the memory limitations of the Windows 3.11 operating system in use at the time, under which an image collection program ran in a DOS shell and converted each image to the GIF format [ ]. A FlashBus Pro - PCI bus Color Video Frame Grabber has been used in the IRb6/L2 Carnegie telerobot and a Matrox Meteor frame grabber in the ABB1400 Perth telerobot. Both frame grabbers include dynamic link libraries for data capture and image compression which make them easy to use. In the ABB1400 telerobot and the IRb6/L2 Carnegie telerobot the GIF image encoding has been replaced by JPEG encoding to minimise image data size.

The image server provides some sophisticated services such as operator controllable image sizes, a software zoom and images from multiple cameras. All images are acquired at maximum resolution and then shrunk to the image size requested. As the gripper position is known and the cameras are calibrated, a region around the gripper can be cut from the image rather than shrinking the whole image. The size of the photographed region around the gripper is controlled by the zoom setting.
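
The software zoom described above amounts to choosing a rectangle around the gripper before the image is resized. The following Python sketch shows one way this could be computed. The function name, the 512 x 512 default image size and the rule that the region shrinks in proportion to the zoom setting are assumptions for illustration, and the gripper is taken to be already projected into pixel coordinates by the camera calibration.

```python
def gripper_crop(gripper_px, zoom, full_size=(512, 512)):
    """Return (left, top, right, bottom) of the region to cut around the gripper.

    gripper_px is the gripper position in image pixels; zoom >= 1, where 1
    means the whole image and larger values give a smaller region.
    """
    if zoom < 1:
        raise ValueError("zoom must be at least 1")
    width, height = full_size
    crop_w = max(1, round(width / zoom))
    crop_h = max(1, round(height / zoom))
    # Centre the region on the gripper, then shift it so it stays inside the image.
    left = min(max(gripper_px[0] - crop_w // 2, 0), width - crop_w)
    top = min(max(gripper_px[1] - crop_h // 2, 0), height - crop_h)
    return left, top, left + crop_w, top + crop_h
```

The returned rectangle would then be cut from the full resolution image and scaled to the image size requested by the operator.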

Other telerobots not requiring variable image sizes, software zoom or images from multiple cameras have been able to simplify imaging by using general purpose web video software. Webcam32 by Kolban (1997) is a good package used for My Toy Robot (Dion 1998) and the Jason Project (Fisher Jason Team 1998). Webcam32 requires a Microsoft Video for Windows compatible camera or frame grabber. Live imaging has been considered, but for most of the period of the project the technology to implement live imaging efficiently has not been available. Atkinson and Ciufo (1998), on a telerobot at the University of Wollongong, have implemented live imaging with server push technology but this does not allow interframe compression. On a fast link, server push is effective as a full sense of motion is achieved but on slower links, this effect is lost as it takes too long to download a frame. It also consumes most of the bandwidth making it slow to send commands to the telerobot. For these reasons, server push has not been used for live imaging on any of the telerobots built for this project.

Robot control

The telerobot server software constrains robot movements to remain within a cubic space above the table which has the blocks placed on it. Positions specified by operators that are outside this cube are adjusted by the software to bring them within the cube. In addition the pose is restricted to tilts of 45 degrees and spins of 90 degrees (see section 4.10 for discussion of spin and tilt) to avoid the robot joints winding up.
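
A minimal Python sketch of this adjustment follows. The workspace limits are invented for the example (the real dimensions are not given here), and the tilt and spin limits are assumed to be symmetric about zero.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    x: float
    y: float
    z: float
    spin: float  # degrees
    tilt: float  # degrees


# Hypothetical workspace cube in millimetres; the real limits are not stated.
WORKSPACE = {"x": (0.0, 500.0), "y": (0.0, 500.0), "z": (0.0, 500.0)}
MAX_TILT = 45.0  # degrees, from the text (assumed to mean +/- 45)
MAX_SPIN = 90.0  # degrees, from the text (assumed to mean +/- 90)


def clamp(value, low, high):
    """Bring value into the closed interval [low, high]."""
    return max(low, min(high, value))


def constrain(pose):
    """Return a copy of pose adjusted to lie inside the permitted space."""
    return Pose(
        x=clamp(pose.x, *WORKSPACE["x"]),
        y=clamp(pose.y, *WORKSPACE["y"]),
        z=clamp(pose.z, *WORKSPACE["z"]),
        spin=clamp(pose.spin, -MAX_SPIN, MAX_SPIN),
        tilt=clamp(pose.tilt, -MAX_TILT, MAX_TILT),
    )
```

Adjusting a request rather than rejecting it means the operator always gets a move, at the nearest permitted position to the one asked for.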

The ASEA series 2 robot controller used on the IRb6/L2 Carnegie and IRb6/L2-6 Perth telerobots can accept remote motion commands through a serial communication link (RS232, 9600 baud) from the telerobot server using a proprietary but published communications protocol. The status and position of the robot can also be obtained from the controller. The functions used to control the robots are:-

  1. Read status and robot position;
  2. Control robot operating mode (standby - operate);
  3. Move robot to programmed pose specified in Cartesian coordinates and quaternions; and
  4. Start/stop a program stored in the controller memory. Two short programs stored in the robot are used to open and close the gripper which is necessary because the controller does not permit direct control of the gripper via the serial communication link.

The S4 controller, which is a later model robot controller required for the ABB1400 robot, was much more difficult to use. In this case the telerobot server communicates with the ABB1400 robot controller using TCP/IP running over a SLIP serial connection. The robot manufacturer does not publish the communications protocol. Instead proprietary software is available of which two different versions have been used. The first operated as a dynamic link library with functions called from telerobot server software to transfer state data and robot control programs to and from the robot controller. The second version, used in 1998, has been an ActiveX control which operates asynchronously. The controlling software subscribes to the events about which it needs to be notified.

Direct commands to the robot cannot be made but can be issued indirectly by downloading a program to the robot controller and running it. In earlier versions, the program to be downloaded was constructed in the telerobot server to match the particular request received from an operator. Alternatively, the variables used in the robot controller programs (for example, target locations) can be downloaded. Currently, a more complex generalised program is stored in the robot controller and only the variables used by the program are downloaded for each request. This lowers the time per request by reducing the data transferred over the serial link.

The robot control software incorporates automatic compensation for errors in the kinematic model of the manipulator as follows:-

  1. Model data is read from a data file. The calibrated kinematic model is generated using a technique developed by Legnani and Trevelyan (1996).
  2. The program calculates the joint angles that are required to attain the specified pose, using the calibrated kinematic model.
  3. The software then calculates a Cartesian position and quaternion which, when sent to the robot’s controller, will result in those joint angles being attained, and hence the required pose.
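
The three steps above can be illustrated with a planar two link arm in Python. This is a simplified sketch, not the Legnani and Trevelyan calibration technique or the telerobot code: the real robots work with a three dimensional position and a quaternion, and the link lengths below are invented. The controller is assumed to use the nominal model while the physical arm follows the calibrated one.

```python
import math

NOMINAL = (0.600, 0.600)     # link lengths the controller assumes (hypothetical)
CALIBRATED = (0.605, 0.597)  # link lengths found by calibration (hypothetical)


def forward(l1, l2, q1, q2):
    """Tip position of a planar two link arm for joint angles q1, q2 (radians)."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y


def inverse(l1, l2, x, y):
    """Joint angles that place the tip at (x, y), taking the solution with q2 >= 0."""
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if abs(c2) > 1:
        raise ValueError("target out of reach")
    q2 = math.acos(c2)
    q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2), l1 + l2 * math.cos(q2))
    return q1, q2


def compensated_command(target):
    """Cartesian command that makes the real arm reach target."""
    q1, q2 = inverse(*CALIBRATED, *target)  # step 2: angles from the calibrated model
    return forward(*NOMINAL, q1, q2)        # step 3: pose the controller maps to those angles


def simulate_robot(command):
    """Controller solves with the nominal model; the real arm moves per the calibrated one."""
    q1, q2 = inverse(*NOMINAL, *command)
    return forward(*CALIBRATED, q1, q2)
```

Sending the compensated command rather than the target itself cancels the model error, because the controller's own inverse solution reproduces the joint angles chosen in step 2.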


The telerobots built for this project have always been heavily used and are expected to be available 24 hours a day. When accessing web telerobots, it is common to find them inoperable, and the difficulty of building a reliable system is frequently underestimated. Several web telerobots have been abandoned because of these problems. As other researchers in the telerobotics field can testify, the combination of an unstructured environment and random movements leads to many unexpected problems: “The ability of the system to enter an unexpected state far exceeded our ability to anticipate this happening” (Stein et al. 1994). Simmons (1998) also refers to the challenge of building a reliable telerobot. To put the problem in perspective, in the first eight months of 1998 the ABB1400 telerobot in Perth received on average more than 1000 requests a day (see Figure 65). A fault that disabled the robot on one request in every thousand would therefore put it out of action, on average, more than once a day, which is too often. As well, some operators are deliberately destructive.

Much of the software adapted for this project was found to contain errors which were tolerable for other applications but had to be removed to achieve a fully reliable system. In the first IRb6/L2-6 telerobot in Perth the control software needed several improvements to avoid an indefinite hang in the serial communication link with the robot. The communications software had used polling for serial communications but this had to be modified to be interrupt driven, as accurate timing was not possible with the multitasking of the Windows 3.11 operating system. Mistakes were also discovered in the kinematics calculations as remote users attempted more ambitious tasks. When changing to the ABB1400 robot, similar problems were encountered, as the proprietary communications libraries provided produced occasional errors. This highlights the need for higher levels of reliability than are required for many other robotics applications. In addition, the Windows 3.11 and Windows 95 operating systems generate occasional faults that render them inoperable. Observation and correction of problems can achieve reliability in locally written software, but for operating systems and third party products this method is not available.

To overcome instability in the IRb6/L2-6 Perth telerobot and the ABB1400 telerobot when it ran under the Windows 3.11 operating system, a watchdog timer was implemented by running a short program to toggle two relays every thirty seconds. The array of relays originally installed to switch alternative cameras to the frame grabber was used. A PLC monitored this array and re-booted the computer if the relays stopped oscillating. The watchdog timer was also implemented on the IRb6/L2 Carnegie telerobot server which runs the Windows 95 operating system. The watchdog for the IRb6/L2 Carnegie telerobot was implemented without a separate PLC by using a one shot timer on the digital I/O card used to control the relay array. The one shot timer also closed a relay to re-boot the computer but was continually reset by a small program before it timed out. A watchdog timer was not used for the ABB1400 telerobot server which runs the Windows NT operating system.
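
The logic of such a watchdog can be sketched in a few lines of Python. The class below models only the decision made by the PLC or one shot timer; the 90 second timeout is an assumption (the text gives only the thirty second toggle interval), and the hardware actions of toggling relays and re-booting are left out.

```python
class Watchdog:
    """Decides when the monitored computer should be re-booted (illustrative only)."""

    def __init__(self, timeout_s, now):
        self.timeout_s = timeout_s
        self.last_kick = now  # time the monitored program last showed signs of life

    def kick(self, now):
        """Called each time the monitored program toggles the relays."""
        self.last_kick = now

    def expired(self, now):
        """True when no toggle has been seen for longer than the timeout."""
        return now - self.last_kick > self.timeout_s
```

With a thirty second toggle interval, a timeout of a few intervals tolerates an occasional late toggle while still restarting a hung machine within minutes.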

Allowing interaction with the environment presents a slightly different set of reliability problems. The robot has physical limits such as joint limits, and singularities. Physical objects in the workspace also limit it. Currently the robot stops when one of these limits is encountered, and no attempt is made to continue execution. The operator is informed why the required task was not completed, so they learn how to avoid the problem on subsequent moves. To minimise the need for these messages the workspace is restricted to avoid all the above limits where possible.


The software written for the telerobots resides in a PC telerobot server and communicates with a telerobot operator indirectly via a World Wide Web server. The server software initially selected was Denny’s (1995) Win-httpd running under the Windows 3.11 operating system. In 1994, Win-httpd was unreliable but it has since evolved into Website, which is reliable and still used. The web server connects to the Internet through a winsock. The first winsock used was Trumpet (Tattam 1994) and several others were trialed including Chameleon (Chamelon 1994). Winsocks are now built into the Windows 95 and NT operating systems and, unlike the versions available in 1994, are reliable and easy to use.

A web server will supply a static file to a browser or start a program, usually termed a script, in response to a client request. The latter is called a common gateway interface (CGI) request and on completion of the script the web server will return a temporary file that was output by the script. The telerobot operator communicates their commands to the script by means of HTML forms which contain fields that accept operator inputs. A variety of form elements are available including hidden fields, radio buttons, drop down boxes, text fields, and images that return the coordinates of a mouse click. Forms are submitted from the operator's browser with either a "GET" or a "POST" type request, the difference lying in the CGI protocol for transmitting the form data to the script. With a "GET" type request the form data is passed to the script as a command line argument. The form data also appears in the address field of the web browser with the page returned and is recorded in the web server log file as part of the address. The "POST" type request stores the form data in a separate temporary file and passes the name of this file to the script as a command line argument.

The HTTP protocol of the web is stateless but a web telerobot requires information to be preserved from request to request so that an operator can be identified. This is achieved by placing the data in a form field so that the data is returned to the web server when the form is submitted. When the data is of no interest to an operator, it can be stored in the "hidden" HTML field type which is not displayed by the browser, though it can be seen by viewing the HTML source.
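
As an illustration of the mechanism, the Python sketch below decodes the form data of a "GET" type request and writes a value back into a hidden field so that it returns with the next submission. It is not the telerobot script, which was written in C and C++; the field names X, Y and SessionId and both helper names are assumptions.

```python
from html import escape
from urllib.parse import parse_qs


def read_form(query_string):
    """Decode GET form data such as 'X=120&Y=45' into a dict of single values."""
    fields = parse_qs(query_string, keep_blank_values=True)
    return {name: values[0] for name, values in fields.items()}


def hidden_field(name, value):
    """HTML for a hidden form field that carries state to the next request."""
    return f'<INPUT TYPE="HIDDEN" NAME="{escape(name)}" VALUE="{escape(str(value))}">'
```

For example, read_form("X=120&Y=45&SessionId=8675309") yields the three fields as strings, and hidden_field("SessionId", "8675309") produces the tag that would be written into the returned page.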

In the first version of the software, when the web form was submitted the web server launched a batch file and two scripts were launched from the batch file that executed within DOS shells. One script controlled the robot and the other generated images and the HTML output. However, it soon became apparent that communication was required between DOS shells to implement file locking and to ensure only a single user was issuing instructions to the robot at any time. The Windows 3.11 operating system provided no facility for this. Instead it was achieved by running a small terminate and stay resident program from the autoexec.bat file to reserve a few bytes in memory and reading and writing from these using absolute addressing to record and monitor file and telerobot statistics.

Web data is normally static. Therefore, to save data transmission and time, web browser programs store each web page retrieved by a user in a cache on the user’s computer. Thus, if the user wants to return to that page, it is retrieved from the cache rather than the server. However, an image used for telerobot control or for monitoring a changing scene is dynamic so a fresh version needs to be retrieved from the server each time it is accessed. Therefore, a unique number is included as part of the name of each image to force the browser into retrieving a new image each time.
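
One simple way to obtain such names is a counter that never repeats, as in the Python sketch below. How the number was actually generated on the telerobots is not stated here, so the counter, the file name pattern and the function name are assumptions.

```python
import itertools

# Counter shared by all images; each call to image_name consumes one value.
_image_counter = itertools.count(1)


def image_name(camera, extension="jpg"):
    """Return a file name that has not been used before for this camera image."""
    return f"cam{camera}_{next(_image_counter):06d}.{extension}"
```

Because the browser has never seen the new name, it cannot satisfy the request from its cache and must fetch the fresh image from the server.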

During May 1995, I added the image illustrated in Figure 45, showing six wire frame models of the robot from different viewing angles.

Clickable wire frame images that were one method of specifying a location to which the telerobot should move. A request was submitted with a single click on any of the orthogonal wire frame views. This would specify two dimensions in 3D space and the third was kept constant
Figure 45

The wire frame image was only 4.5 kilobytes in size. It showed the robot from above, side on, end on and one other direction with the view from above and side on repeated at higher resolution. The circles indicate rotational joints. The wire frame image was included on the page as an HTML component of type “ISMAP”. Clicking on this element caused the HTML form to be submitted along with the image coordinates clicked and this was used to specify a goal position for the telerobot. A single click would provide only two parameters but three are required to define a position in space and three more to additionally specify the orientation. Therefore, orientation and the third component would be maintained during the move. For example, x and y may be altered but z will remain constant.
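
The conversion from a click to a goal position can be sketched in Python as follows. The assignment of axes to views, the scale of 2 mm per pixel and the pixel position of the origin are all invented for the example; only the principle that a click sets two coordinates and leaves the third unchanged comes from the text.

```python
# Which two workspace axes each orthogonal view shows (assumed layout).
VIEW_AXES = {
    "top": ("x", "y"),
    "side": ("x", "z"),
    "end": ("y", "z"),
}


def goal_from_click(view, click, current, mm_per_pixel=2.0, origin=(0, 200)):
    """Return a new goal position from a click on one wire frame view.

    click is (column, row) in image pixels; current is a dict with keys
    x, y and z in millimetres. Image rows grow downwards, so the vertical
    axis is inverted relative to the workspace axis.
    """
    horizontal, vertical = VIEW_AXES[view]
    goal = dict(current)  # the axis not shown in this view keeps its value
    goal[horizontal] = (click[0] - origin[0]) * mm_per_pixel
    goal[vertical] = (origin[1] - click[1]) * mm_per_pixel
    return goal
```

A click on the view from above therefore changes x and y while z keeps the value it had before the move.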

I wrote the first version of the telerobot software with assistance from Peter Murphy for the imaging and I have continued its development for most of the period of the project. I also adapted the software for the IRb6/L2 Carnegie telerobot which uses the Windows 95 operating system. Bradley Saracik provided assistance with development from December 1994 for a few months. From November 1995 until February 1996 Stephen Lepage and Shalini Cooray assisted with the changeover to the ABB1400 robot which required substantial software modifications including converting it to run as a Windows application. Barney Dalton joined the project in March 1996 and contributed to the changeover to the ABB1400 telerobot. Much of the development from then was carried out in collaboration with Barney who has also modified the software to run under the NT operating system. Gintaras Radzavinas and Stephen LePage contributed to development from December 1997. Peter Murphy contributed a Java applet showing a wire frame model of the robot in its current pose that was included in a version of the operator interface and Harald Friz has developed a Java applet interface for robot control. Barney Dalton is continuing to develop the software with the aim of making it easy to implement a distributed processing environment where some of the processing is performed by Java applets running on the operator's computer.

A structured programming model in C was used initially and existing modules developed for other applications were used wherever possible, which included adapting kinematics and control software developed by James Trevelyan. With the changeover to the Windows operating environment, the software became event driven and Barney Dalton has more recently been gradually converting it to an object-oriented model implemented in C++. A number of peripheral applications such as user registration software have been added. The system design is shown in Figure 46.

The current telerobot system. On a request from an operator, the HTTPD server launches a CGI script that communicates with the robot and image servers to perform the request and obtain new pictures (Taylor and Dalton 1997).
Figure 46

The operator’s browser submits the form details as a CGI request to the web server, which receives the request and launches a CGI script. One copy of the script is launched for each request and several copies can be running simultaneously servicing the operator and observers. The script determines whether the request came from an operator and, if so, communicates using TCP/IP sockets with a server to operate the robot. It also communicates with another program to capture the images and the relationship between programs is shown in Figure 46. The script then generates the HTML page which will be returned to the operator.

Once a request is complete, new images are taken of the workspace and a new form with the latest images and telerobot position is returned via CGI to the user. Only one person can control the telerobot at a time. Other users trying to gain access receive an observer page while the telerobot is busy (deemed as up to three minutes from the last operator's request). There are three classes of operators: developers, who can take control of the robot at any time; registered operators, who must log on with a personal identification number (PIN) and who can take control from guests; and guests, who have the lowest priority. More than one move can be specified in a given request using a command script language, allowing faster operation for experienced operators. Telerobot operators submit requests in three different ways: by filling in fields on an HTML form, by clicking images of the workspace, or by specifying multiple moves in a script form. Options for changing the setup such as altering image sizes, software zoom or switching images off are included in the same page along with links to more information and a chat applet. One of the many interfaces trialed is shown in Figure 47.

One of the interface variations on the ABB1400 telerobot.
Figure 47
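
The rules for gaining control described above (three operator classes and a three minute busy period) can be summarised in a short Python sketch. The names are hypothetical and the handling of cases the text does not cover, such as one developer displacing another, follows the literal reading that developers can take control at any time.

```python
from enum import IntEnum


class OperatorClass(IntEnum):
    GUEST = 1
    REGISTERED = 2
    DEVELOPER = 3


BUSY_SECONDS = 180  # telerobot is deemed busy for three minutes after a request


def may_take_control(requester, holder, seconds_since_last_request):
    """Decide whether requester may take over from holder (None if nobody holds control)."""
    if holder is None or seconds_since_last_request > BUSY_SECONDS:
        return True  # telerobot is free
    if requester == OperatorClass.DEVELOPER:
        return True  # developers can take control at any time
    # Registered operators can displace guests; guests can displace nobody.
    return requester == OperatorClass.REGISTERED and holder == OperatorClass.GUEST
```

A requester for whom this returns False would be sent the observer page instead of the operator page.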

JavaScript has been used to circumvent some of the limitations of CGI. For instance, when specifying location by clicking images, a point in two different images is required to specify a location in three-dimensional space. Under normal CGI each image click results in a request being sent to the web server. JavaScript is therefore used to allow specification of a telerobot request with a single submission to the web server. However, this method suffers from being non-standard, and only the Netscape browser supports the functions used.

Interface template

The telerobot CGI script generates the pages returned to telerobot operators and observers from template files. This allows the interface layout to be altered by modifying the template file rather than changing the program code. The template file contains the HTML code that describes the page sent with additional tags (listed in Table 6) used to mark the location of variables.

Labels in the template

Label                    What gets substituted in

Operator details
  HostNameVal            The host name of the current controller
  OperatorIdVal          A unique number that is necessary to determine who has authority to control the telerobot
  SessionIdVal           For logging purposes, to keep track of a single session

Position & orientation details
  XVal                   Telerobot's current X-coordinate value
  YVal                   Telerobot's current Y-coordinate value
  ZVal                   Telerobot's current Z-coordinate value
  SpinVal                Telerobot's current Spin value
  TiltVal                Telerobot's current Tilt value
  GripperCheck           Whether gripper should be open
  MoveTypeCheck          Telerobot's movement method

Movement details
  MoveCountVal           Number of requests made by current controller
  StartTimeVal           Time when operator first gained control of the telerobot
  LastTimeVal            Time when operator made their last request

Image details
  Im1FileVal-Im4FileVal  File names of the 4 image files
  Im1SizeVal-Im4SizeVal  Dimensions of the 4 images
                         Whether or not particular images should be displayed
  Im1ZoomVal-Im2ZoomVal  Magnification level in cameras 1 & 2

Message details
  ErrorVal               If an error occurred, the message is displayed here

Interface details
  InterfaceSelection     Interface choices available
  UserTemplate           If using a personalised template, carry it over from previous requests
  CheckedVal             Is the user trying to gain control or just watch

A list of the labels used in the telerobot interface templates.
Table 6

For example, one line of an operator's template is: <INPUT TYPE="TEXT" SIZE = 4 MAXLENGTH = 4 NAME="X" VALUE="XVal">. This is standard HTML format for a form field that allows a maximum of four characters to be entered with the default text being "XVal". When the form is submitted a variable named X will be assigned the value of whatever text is entered and the variable can be read by the telerobot CGI script. This HTML is rendered by the browser as shown encircled in Figure 48.

An operator's page template viewed with a web browser. The blue oval highlights the rendering of the HTML code: <INPUT TYPE="TEXT" SIZE = 4 MAXLENGTH = 4 NAME="X" VALUE="XVal">
Figure 48

However, the string "XVal" is not seen by the telerobot operator as it is the name of a variable which is replaced by the current Cartesian x coordinate of the robot gripper. That is, the software reads the template file, replaces the tags in the file with the current value of the variable represented by each tag and then writes the result to disk. The transmission of the HTML file to the user is then carried out by the web server. The template files reside in a directory that is accessible from the World Wide Web so that they can be viewed by template designers with the tags rather than the variables, as shown in Figure 48.
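
The substitution step can be sketched in Python as below. This is an illustration rather than the telerobot code, which was written in C and C++; in particular, the way tags are recognised (plain text matching, longest label first) is an assumption.

```python
import re


def fill_template(template, values):
    """Replace each label found in template with its current value.

    values maps label names such as 'XVal' to the value to substitute.
    """
    if not values:
        return template
    # Try longer labels first so that one label cannot match inside another.
    labels = sorted(values, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(label) for label in labels))
    return pattern.sub(lambda match: str(values[match.group(0)]), template)
```

Applied to the template line quoted above with the value 350 for XVal, the result is the same INPUT element with VALUE="350", which is what the operator's browser receives.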

There are four different interface templates:

  • New Operator - Seen only once when an operator first gains control of the telerobot.
  • Operator - Seen when operating the telerobot.
  • Clicked - Seen after clicking on the first image and before clicking on the second when moving the telerobot by image clicking.
  • Observer - Seen when not in control of the telerobot.

There are several critical components in a template including the hidden fields that specify the OperatorID and SessionID. These numbers identify an operator so they can remain in control of the telerobot. When a new operator first gains control, they are assigned an OperatorID and the telerobot program uses this ID to determine whether it should execute a submitted request. If any of the form fields are missing, default values will be used by the telerobot program and this will restrict the commands and options available to an operator. If labels are omitted, details concerning the position of the telerobot and other useful information will not be displayed.

A facility is provided to enable users to upload and use their own templates to control the interface layout. The user creates a template file with an editor on their own computer then uploads that file with the HTTPD upload procedure. A small script on the telerobot server moves the uploaded file into the appropriate template directory and the operator or observer then uses this template by entering the template name in a field on the operator's or observer's page. If a set of template files is uploaded with the same name, that set will remain in use as the person moves from observer to operator status or vice versa. An operator selects a standard template from a drop down list on the operator's page or, where an operator has submitted their own template, selects it by filling in the template name in a text box.

Tracking telerobot operators

The HTTP protocol by which operators communicate with the telerobot is stateless and a method is required to determine whether requests are made by a current operator or someone classed as an observer. As well, by tracking individual visitors to the telerobot one can ascertain how many visitors there are, how often people come back, how long they stay and what they do. This is not a straightforward task and it is a problem shared by all web services that seek to track individuals.

There are four methods that can be used with the telerobot:-

  1. Tracking internet addresses. These are the numbers that are used to route data through the Internet and are necessarily provided to each web server to enable requested information to be returned. The numbers are usually associated with a domain name with which they can be easily interchanged by referring to a domain name service.
  2. Session Identifiers. Whenever an operator first gets control of the telerobot a unique identifier is generated and stored in a hidden field on the returned page. This value is returned every time an operator makes a request and will remain unchanged until the person loses control of the telerobot to another operator.
  3. Logging on. People are offered the opportunity to register. A password is automatically emailed to them and they can be tracked by their login name.
  4. Cookies. These are short messages left on the client computer that can only be read by the web server which placed them.

The first method, tracking internet addresses, is available to all web sites as web servers routinely log this data. The other methods require more effort to implement. Each has some advantages and disadvantages which make it suitable for different circumstances. It is also necessary to understand the sources of error in each and the levels of accuracy that can be achieved.

Tracking internet addresses

This is commonly used by researchers, for example Saucy and Mondada (1998), as it is always available and requires no action by the telerobot operator. An internet address also known as an Internet Protocol (IP) number identifies a computer, but not always uniquely. Many people access the Web through a proxy server, which means a request for a web page is always made to the one proxy server regardless of the server that holds the page requested. The proxy server then gets the page from the web server where it resides and forwards it on to the computer from which it was requested. In this case, the internet address of the proxy server is recorded and different people using the one proxy will have the same internet address. In addition, many people use dynamic addressing. In this case, every time the computer is connected to the Internet a different internet address is assigned to it. This is common with dial up connections to commercial internet service providers as it allows a small pool of internet addresses to be used by a large number of people.

Session identifiers

A session identifier is generated on the first request of a session and is based on the time and date in order to guarantee uniqueness. The same technique is used to identify the current operator every time a request is made to the telerobot. Each request received by the telerobot server is classified as being from an operator if the correct session identifier is present. This method is absolutely reliable and has been used in this research for identifying requests as being in a single session rather than tracking internet addresses.
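
A minimal Python sketch of this scheme follows. The identifier format (date and time to the second) and the field name SessionId are assumptions; the text states only that the identifier is based on the time and date and is returned with every request.

```python
from datetime import datetime


def new_session_id(now):
    """Create an identifier from the date and time a session starts (assumed format)."""
    return now.strftime("%Y%m%d%H%M%S")


def is_operator(request_fields, current_session_id):
    """A request comes from the operator only if it carries the current identifier."""
    if current_session_id is None:
        return False  # nobody is in control at present
    return request_fields.get("SessionId") == current_session_id
```

Since only one operator holds control at a time, an identifier that changes whenever control changes hands is enough to tell the operator's requests from those of observers.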

User registration and logging on

A person can register by completing a form which allows some information to be collected. They are emailed a PIN number which, together with a logon name, is entered before gaining control of the telerobot. This has been the method used in this research for identifying individuals across sessions. There is no cost in registering so it seems unlikely that a person would share their account. The incentive to register is that registered users have preference over unregistered users for control of the telerobot. It seems quite likely that many registered users will have played with the telerobot prior to registration and there is no certainty that every time a registered user operates the telerobot they will log on. There is a high degree of assurance that the data identified by a user name is from a single person but less certainty that all the actions of that person are identified.


Cookies

Cookies are short messages left on the client computer which can only be read by the web server from which they came. They identify computers, not people, so those who use more than one computer or those who share a computer will cause statistical errors. A further disadvantage with cookies is that they are sometimes disabled by operators and do not work with all browsers, although they do work with Netscape and Microsoft browsers, which most people use. Cookies have only been used on the telerobot for counting unique visitors by the Webside Story (1998c) counting service discussed in section 5.4.

Operator records

Analysis of web server log files was the first method used to monitor operator behaviour. The web server logs contain:-

  1. Internet address of visitor;
  2. Date and time of request;
  3. Item requested;
  4. Page transmission time. However the value recorded is subject to error as the winsock (see section 4.4 regarding winsocks) buffers the data to be sent and this figure only measures the interval from the time the request has been received to the time that the data has been transferred to the winsock buffer;
  5. Referring site, which is the address of the web site from which a hyperlink was followed to the telerobot.

When the “Get” method of HTML form submission (see section 4.4 Software) was used, the server log also included the query string. To make use of the query string recorded in the server log, it was necessary to write software that could decode it and write it in a format that was suitable for importing into a database. This method of logging telerobot activity had some limitations. Tracking operator activity becomes quite complex as requests from people with "observer" status are intermingled with operator requests and there is no record of the response to requests. Determining which requests come from an operator and which come from an observer requires applying the control algorithm to the log file data.
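A minimal sketch of this kind of decoding is given below. It is not the software written for the project. The request paths and the form field names (X, Y, Z, Gripper) are hypothetical, and comma-separated text is assumed as the database import format.

```python
import csv
import io
from urllib.parse import parse_qsl, urlsplit

# Hypothetical form field names; the real query string fields are not listed here.
FIELDS = ["X", "Y", "Z", "Gripper"]


def decode_request(item: str) -> dict:
    """Decode the query string of a logged request into field/value pairs."""
    query = urlsplit(item).query
    return dict(parse_qsl(query, keep_blank_values=True))


def to_csv(items, fields=FIELDS) -> str:
    """Write the decoded requests as comma-separated rows for database import."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore", restval="")
    writer.writeheader()
    for item in items:
        writer.writerow(decode_request(item))
    return out.getvalue()


# Hypothetical "item requested" entries taken from a server log.
sample = [
    "/robot?X=100&Y=250&Z=40&Gripper=1",
    "/robot?X=120&Y=250&Z=40&Gripper=0&note=a%20b",
]
table = to_csv(sample)
```

Fields outside the chosen column list are ignored and missing fields are left blank, so every row has the same columns.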

After changing to "Post" HTML form submission the method of monitoring operators was changed. An operator log file was created with data appended by the telerobot program after each request. The data recorded has varied from time to time but that recorded in 1998 is shown in Table 7. Requests to gain control of the telerobot by non-operators are recorded in a separate file.

Field Description
HostName Domain name of the computer accessing the telerobot.
SessionID A unique identifier is created for the first request in a session and recorded for each request in the current session. A session remains current until another operator takes control of the telerobot. The number is included as a hidden field in the form returned to the operator.
MoveCount The number of requests received for this session.
TimeBuffer The time when processing this request was started.
TimeTaken The time taken since processing the previous request was started.
InterfaceVal The file name of the template used to generate the html page returned.
X_Val X coordinate after the request is executed.
Y_Val Y coordinate after the request is executed.
Z_Val Z coordinate after the request is executed.
Spin The spin angle after the request is executed in degrees.
Tilt The tilt angle after the request is executed in degrees.
Gripper 0 for closed and 1 for open.
MoveSuccess A value recording the success of a request. This could indicate a successful request, robot offline, emergency stop due to collision, or joint limit exceeded.
Error Message The error message that appears on the operator's page when a request has been unsuccessful.
Cx_OnOff There are four cameras and this field is repeated for each with X replaced by the camera number. 0 indicates operator switched this camera on, 1 indicates operator switched this camera off.
Cx_Size Image size requested by operator.
Cx_GScale This was the number of shades of grey in the image when gif image coding was used but has been changed to the jpeg image quality expressed as a percentage.
UserName The login name of the registered user and "guest" if not being operated by a registered user.
MoveScript The series of moves requested for a multiple move request.

Data recorded for each request from an operator to the telerobot
Table 7

The data recorded includes all aspects of an operator's request, the robot's response and fields that make it easy to identify sessions and the sequence of requests in a session. This makes more complete interpretation of operator behaviour easier than was possible when relying on the web server logs.
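The operator log described above can be sketched as a record appended once per request. The field names below are a subset of those in Table 7. The field types, the comma-separated layout and the function name are assumptions, and an in-memory stream stands in for the log file.

```python
import csv
import io
from dataclasses import asdict, dataclass


@dataclass
class OperatorRecord:
    """A subset of the Table 7 fields; the types are assumptions."""
    HostName: str
    SessionID: str
    MoveCount: int
    X_Val: float
    Y_Val: float
    Z_Val: float
    Spin: float
    Tilt: float
    Gripper: int        # 0 for closed and 1 for open
    MoveSuccess: int    # success code; the actual coding is not given here
    UserName: str = "guest"


def append_record(stream, record: OperatorRecord) -> None:
    """Append one request as a comma-separated line."""
    csv.writer(stream).writerow(asdict(record).values())


log = io.StringIO()  # stands in for the operator log file
append_record(log, OperatorRecord("host.example", "19980301143005", 1,
                                  100.0, 250.0, 40.0, 0.0, 0.0, 1, 0))
append_record(log, OperatorRecord("host.example", "19980301143005", 2,
                                  120.0, 250.0, 40.0, 45.0, 0.0, 0, 0))
rows = [line.split(",") for line in log.getvalue().splitlines()]
```

Because SessionID and MoveCount are written with every request, the requests of one session can be selected and ordered without reference to the web server log.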

Approaches to interface design

With the supervisory control scheme employed on a web telerobot, a human is an intimate part of the control loop. Human-robot interactions then have a higher importance than is usually the case with conventional robots which are often expected to operate autonomously for most of their life. Therefore, the operator interface design is an issue deserving of attention. With the introduction of Java applets and JavaScript, there is now considerable flexibility available within the web browser framework and from observation it is apparent that small changes in layout can affect operator behaviour.

Interface development has been highly iterative, as it was usually easier to understand operator reaction to different interface designs in retrospect than to anticipate them. This is partly because those involved in system development are unable to view the interface from the same perspective as telerobot operators. The dilemma has been recognised by Lewis and Rieman who state: “you may read our examples of problem interfaces and say to yourself, ‘Well, that's an obvious problem. No one would really design something like that. Certainly I wouldn't.’” (Lewis and Rieman 1993b:5). Then after giving several examples they go on to say “the flip side of the ‘it's obvious’ coin is that most actions used to accomplish tasks with an interface are quite obvious to people who know them, including, of course, the software designer. But the actions are often not obvious to the first-time user.”

The following sections provide an overview of the cognitive framework for interface design, its significance for the web telerobot interface and the interface interaction styles available. This is followed by a review of the methods available for interface development and the reasoning for the particular methods adopted for this research.

Cognitive frameworks for human computer interaction

The dominant intellectual framework that has characterised human computer interaction (HCI) theory has been cognitive (Preece et al. 1994). In general, cognition refers to the processes by which we become acquainted with things or how we gain knowledge. The major paradigm used to describe human cognition has been to characterise humans as information processors whereby information enters and exits the human mind through a series of ordered processing stages (Lindsay and Norman 1977), as shown in Figure 49.

Model of human information processing stages showing the basic and extended version. Adapted from Barber (1988).
Figure 49

The four stages involved are encoding, comparison, response selection and response execution. Extensions of the basic information processing model include the processes of attention and memory. The human processor model provides a means of characterising the various cognitive processes that are assumed to underlie the performance of a task. From this conceptual model, other models have been abstracted, often using the computer as a metaphor. Concepts such as buffers, memory stores and storage systems have provided psychologists with a means of developing more advanced models of information processing. According to Preece et al (1994:63) there has, since the 1980s, been a move away from the information processing framework, the main problem being that it oversimplifies human behaviour and cannot therefore be used to predict responses. Preece et al (1994:67) argue that “it is becoming increasingly recognised that a cognitive perspective of the individual user performing various tasks at the interface is an inadequate conceptual framework for HCI.” In view of this, Landauer (1987) feels there is no sense in which we can study cognition meaningfully divorced from the task context in which it finds itself and Preece et al (1994:67) take the view that a central focus of research in HCI must be the environment in which users carry out their tasks. A similar view is provided by Winograd and Flores (1987) who state that: “it is clear (and has been widely recognised) that one cannot understand a technology without having a functional understanding of how it is used. Furthermore, that understanding must incorporate a holistic view of the network of technologies and activities into which it fits, rather than treating the technological devices in isolation.”

Therefore, there is no strong cognitive theory for the web telerobot interface which provides a prescriptive prediction of the ideal interface design or which provides a mechanistic model in the sense that Newtonian mechanics does in the field of dynamics. Ideas can be drawn from interface designs for other applications but they are unlikely to be directly transferable to the web telerobot interface. In addition, controlling a teleoperable device is so different from using the web for reading that lessons learned with web interfaces generally are unlikely to be directly transferable. The preceding paragraphs indicate that there is no satisfactory alternative to developing the interface with an actual web telerobot and, sadly, they imply that it is difficult to apply the lessons learned from these web telerobots to others performing different tasks.

Interaction styles

The most basic form of human computer interaction is the command-driven application. Here a computer is instructed from a command line with no indication, prior to entering the command, of what the response will be. A user must know what commands are available and what they will do. This provides a general interface that works for a wide range of applications, for example, the DOS or Unix computer operating systems.

The next level of sophistication is the menu and form fill approach intended initially for clerical workers and mimicking paper forms. A menu and form fill interface provides guidance to users as to what is required but, being less general than the command line style, requires redesign for each task. According to Preece et al (1994:118) one of the most well established findings in memory research is the fact that we can recognise material far more easily than we can recall it from memory.

The capacity to recognise is the principal advantage of menus and form fills in interfaces. The user has only to select from the options offered by a menu and is prompted for the required information with a form fill. Menus and form fills are usually used together. Menus are ideal when a user must select from a limited number of options. The advantage of form fills is that they prompt for the required information and allow for a large range of values to be entered.

Natural language provides another method of human computer interaction where a user types a command in natural language. However, the system needs to cope with vagueness, ambiguity and multiple methods of conveying the same information. Such systems will usually require more typing than a command system and they are complex to develop and therefore not often used. Engelberger had intended to provide the HelpMate robotic nursing assistant (discussed in section 2.4.10) with a voice recognition natural language interface but was not able to do so.

The final type of human computer interaction is direct manipulation which involves replacement of a command language by direct manipulation of the object of interest. Usually this means manipulating representations of objects on the screen, which can be moved around and manipulated with a mouse, but for telerobots, it would include master slave systems. The Apple Macintosh and the more recent Microsoft Windows operating systems are examples of direct manipulation interfaces which are based on the metaphor of a desktop. Files are kept in folders, information is copied to a clipboard, etc. Direct manipulation interfaces are frequently used in computer aided design (CAD) systems and robot simulation systems. The success of video games demonstrates the enthusiasm many people have for this type of interface. Shneiderman (1983) believes this success is due to:

  • Novices being able to learn the basic functionality quickly;
  • Experienced users being able to work extremely rapidly to carry out a wide range of tasks, even defining new functions and features;
  • Knowledgeable intermittent users being able to retain operational concepts;
  • Error messages rarely being needed;
  • Users being able to see immediately if their actions are furthering their goals and, if they are not, being able to simply change the direction of their activity;
  • Users experiencing less anxiety because the system is comprehensible.

A major complication of using direct manipulation interfaces with web telerobotics, and one that is shared with three dimensional CAD systems, is representing and manipulating objects in three dimensional space on a two dimensional computer screen.

Heuristic analysis

Heuristics, also called guidelines, are general principles or rules of thumb that can guide design decisions. In the opinion of Lewis and Rieman no list of heuristics has been very effective “in improving the design process, although they're usually effective for critiquing favourite bad examples of someone else's design” (Lewis and Rieman 1993c:25). The list of general heuristics developed by Molich and Nielsen (1990) was used in developing the telerobot interface iterations and is shown in Table 8.

The list of general heuristics developed by Nielsen and Molich (1990).
Table 8

Nielsen and Molich also provide a technique for evaluating a design with respect to these criteria and have been able to show that it can be used to identify 75 percent of the heuristic problems. However, they point out that many interface problems are not heuristically identifiable. While the concept of controlling a device is quite different from acquiring information through a web page, the heuristics for web page design are also likely to be relevant. Nielsen (1997a) provides three guidelines for web pages which are:-

  • Be succinct: write no more than 50% of the text you would have used in a hard copy publication.
  • Write for scanability: don't require users to read long continuous blocks of text.
  • Use hypertext to split long information into multiple pages.

According to Nielsen, reading from a web page takes about 25% longer than reading from paper and therefore a succinct writing style is necessary. Scanability is important because usability studies have shown this is the way the web is used. As well, avoid the need to scroll because “we know from several user studies that users don't scroll” (Nielsen 1996).

Ratner et al recommend against giving too much credence to HTML style guides as they have found “little consistency among the 21 HTML style guides assessed, with 75% of recommendations appearing in only one style guide” (Ratner et al. 1996). Ratner et al argue that the web “represents a unique HCI environment” (Ratner et al. 1996:368) but point out that HTML style guides address web information content pages, largely ignoring issues such as help, message boxes and data entry that are important with web-based applications.

Development of the interface

An approach to interface development recommended by Preece et al (1994) is user-centred design. The underlying principles are:-

  • Involve users as much as possible, so that they can influence the design;
  • Integrate knowledge and expertise from the different disciplines that contribute to HCI design;
  • Be highly iterative, so that testing can be done to check that the design does indeed meet users' requirements.

Lewis and Rieman (1993a) propose a variation on user-centred design which they term task-centred design. This method was adopted, with some variations, for the interface development of the web telerobots in this research. The steps involved in the task-centred design process and how they were applied in the course of this project are described below. This list was taken from Lewis and Rieman (1993a).
  1.  Figure out who's going to use the system to do what.
At the time of project commencement (May 1994), there were no web devices and therefore there was no information available regarding who would use the system or what they would do. These issues were addressed during the course of the research and are considered in Chapter 5 including the demographics of operators and what operators build with the telerobots.
  2.  Choose representative tasks for task-centred design.
The task chosen was the manipulation of wooden blocks.
  3.  Plagiarise.
There was little potential for plagiarism as there were no devices with web interfaces when the project began. However, some ideas were gained by reviewing operator stations for non-web telerobots. Later web telerobot implementations have been able to copy elements of their interface from the web telerobots built for this project.
  4.  Rough out a design.
  5.  Think about it.
  6.  Create a mock-up or prototype.
This step was considered unnecessary as the interface could be easily modified. It also seemed advantageous to test the interface with operators as early in the development process as possible, and such testing was easy to do.
  7.  Test it with users.
No testing was done prior to implementation.
  8.  Iterate.
There were no iterations prior to implementation.
  9.  Build it.
  10.  Track it.
Observing the actions of operators is enough to provide considerable insight into the changes that need to be made. For example, in the first interface version the robot gripper could frequently be seen opening and closing in front of (with respect to the camera) or behind blocks indicating that operators could not perceive depth with the feedback provided. As the interface developed, operator behaviour was analysed and quantitative data was also used to develop the interface design.
  11.  Change it.
The interface has gone through many revisions during the course of the project and steps 10 and 11 repeated many times. The heuristic guidelines discussed in section 4.8.3 were used to evaluate proposed changes at this point, for example, a procedure was added to automatically reset the robot after collisions rather than to improve the error messages advising of collision. An additional step not mentioned by Lewis and Rieman was added to the interface development procedure as well. This step was to seek interface ideas from others, including University of Western Australia students and the telerobot user population.

The menu/form fill interaction style was chosen from the interaction styles discussed in section 4.8.2 as it is the easiest for novice operators to understand and web browsers include the functionality required to implement menus and form fills. As well, web users are familiar with web forms so have less to learn to understand the interface than is required with other techniques. By 1998, the advent of Java had provided opportunities to develop direct manipulation interfaces but this was not available when the project commenced in 1994.

Having outlined the procedure adopted for interface development and the reasons for using it, the following sections describe issues which emerged during the actual development and provide some quantitative data that was used to develop the interface design.

Cues for location

Providing cues for an operator to form an understanding of the robot arm position and block positions has required considerable development time. The problem is to represent three dimensional space on a two dimensional screen with the additional restriction of the limited bandwidth provided by the Internet. This restriction requires an emphasis on maximum information for minimum data.

The cues provided for location are:-

  1. A grid marked on the table with 100 millimetre squares.
  2. Four camera views in black and white, three from fixed cameras and one from a camera slung under the robot arm.
  3. Gripper location in x,y,z coordinates.
  4. Gripper orientation as spin and tilt.

Initially there was only one camera and fluorescent lighting which provided no shadows. This made the perception of depth very difficult and operators had great difficulty in locating blocks. As a result, lighting which caused shadows was added and this improved depth perception. The addition of a second camera in December 1994 with a view from a direction at 90 degrees to the first has provided the best perception of depth. Discussion with operators and my own observation suggested that, with the second view available, the best lighting then became fluorescent lighting without shadows. However, while the perception of depth was excellent with this layout it became difficult to recognise the same object in both images and this complicated scene interpretation. Moving the cameras closer together made scene interpretation easier but reduced the depth information. The current layout of four cameras is shown in Figure 50.

An illustration of the current camera positions. Reducing the angle between cameras 1 and 2 makes recognising the same objects in either view easier but reduces depth information.
Figure 50

The optimum angle remains undetermined but operators prefer an angle between cameras of less than 90 degrees. This can be seen from Figure 51, which shows the cameras that operators chose to switch off. The advantage of switching off a camera is faster page updates but most operators will accept the default settings. The least popular camera was the camera positioned orthogonally, followed by that slung under the robot arm.

Operators are able to turn off cameras for faster page updates. This shows the cameras turned off with the camera layout of Figure 50 for a sample up to 15 December 1996.
Figure 51

A similar consideration applies in the vertical plane. Cameras aligned horizontally provide excellent height information but no plan view. Again the optimum is somewhere between a plan and horizontal view. Cameras 1 and 2 are set at different heights to provide a variety of height versus plan views as can be seen in Figure 52.

Figure 52

Another approach adopted to improve perception of pose was to remove some of the unnecessary system functionality. Initially operators were able to specify position and orientation in the full six degrees of freedom. Orientation was specified as roll, pitch and yaw. It was noted by a colleague, Barney Dalton, that for all useful block manipulations only two orientation specifications were required, termed spin and tilt. Spin is defined as rotation about the table's Z axis and tilt is the angle the gripper makes with the Z axis. This ensures that a line drawn through the gripper end points is always parallel to the XY plane as shown in Figure 53.

Spin and Tilt orientations for the same coordinate position. Note that the jaws of the gripper are in a parallel plane to the table at all times.
Figure 53

Therefore, in subsequent versions of the interface for the ABB1400 telerobot, roll, pitch and yaw were replaced by spin and tilt, which are much easier for the operator to understand and simplify the use of the telerobot without losing any useful functionality. Neither roll, pitch and yaw nor spin and tilt could be used with the IRb6/L2 Carnegie telerobot as it is a five axis machine unable to adopt the required poses. The solution adopted for the IRb6/L2 Carnegie telerobot was lean, defined as a rotation from vertical about the fourth axis, and spin, defined as rotation about the axis of the gripper. Spin only provides useful orientations for block manipulations when tilt is zero.
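The spin and tilt definition used for the ABB1400 telerobot can be illustrated with a small geometric sketch. The zero pose (jaw line along the table's X axis, gripper pointing straight down), the order of rotations and the sign conventions are assumptions made for illustration. The sketch only shows that, under such a definition, the line through the gripper end points stays parallel to the XY plane.

```python
import math


def gripper_axes(spin_deg: float, tilt_deg: float):
    """Return (jaw, approach) unit vectors in table coordinates.

    Assumed conventions: at spin = tilt = 0 the jaw line lies along the
    table X axis and the gripper points straight down (-Z). Tilt is applied
    about the jaw line, then spin is applied about the table Z axis.
    """
    s = math.radians(spin_deg)
    t = math.radians(tilt_deg)
    # Line through the gripper end points: rotated about Z only.
    jaw = (math.cos(s), math.sin(s), 0.0)
    # Direction in which the gripper points after tilt and spin.
    approach = (-math.sin(t) * math.sin(s),
                math.sin(t) * math.cos(s),
                -math.cos(t))
    return jaw, approach


jaw, approach = gripper_axes(spin_deg=30.0, tilt_deg=45.0)
# Angle between the gripper direction and the downward Z direction.
tilt_back = math.degrees(math.acos(-approach[2]))
```

Whatever values of spin and tilt are chosen, the Z component of the jaw vector is zero, which is the property shown in Figure 53.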

Camera image quality

Initially images were provided in GIF format and the operator controlled the quality by means of adjusting the number of shades of grey and the resolution. Some data was collected on operator imaging preferences which at the time were altered by clicking a button on the telerobot control page labelled “Change Image”. The page returned in response to the click allowed the operator to select three aspects of the image quality; size, resolution and number of shades of grey (See Figure 54).

Image quality options.
Figure 54

The defaults were: 16 shades of grey, 256 x 256 pixels, full resolution. With GIF coding reducing the number of shades of grey or the image size provides far greater savings in image data than reducing the resolution. For example, reducing the image size by half reduces the number of unique pixels by three quarters and this reduces the image data size by 60%. Reducing the resolution by half also reduces the number of unique pixels by three quarters but the reduction in image data size is only 3%.
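The pixel arithmetic in this example can be checked with a short calculation, assuming square images and a halving of the 256 pixel default. The function name is illustrative. The reductions in image data size (60% and 3%) are measured values quoted above and are not derived by this calculation.

```python
def pixel_reduction(side_before: int, side_after: int) -> float:
    """Fraction by which the pixel count falls when a square image is resized."""
    return 1 - (side_after ** 2) / (side_before ** 2)


# Halving the default 256 x 256 image to 128 x 128.
reduction = pixel_reduction(256, 128)
```

The pixel count falls from 65,536 to 16,384, a reduction of three quarters, while the measured saving in GIF data is smaller because the compression does not scale linearly with the number of pixels.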

Operator image selections for a sample of 30,800 requests between 16 March 1995 and 14 May 1995 are shown in Table 9.

Choice of image quality by operators. 92% of requests to the telerobot specified the default image specification. * Default values.
Table 9

It can be seen that 93.1% of requests were made with 16 shades of grey, 97.1% with an image size of 256 pixels, 99.4% at full resolution and that overall 92% of requests were made with all of the default image specifications. It was anticipated that most operators would adjust the images to match their preference and that the preferences selected could be used to determine operator requirements. However, operators overwhelmingly accepted the defaults. Where selections were made they generally favoured increasing image quality and as far as can be determined the selections did seem rational. For example, 32 shades of grey significantly improves image quality over 16 shades but 64 shades provides only a marginal improvement over 32 shades and on some displays no improvement. Very few requests were made with settings larger than 32 shades. As well, reducing resolution caused a significant loss of image quality with only a small saving in image data size and only 0.5% of requests were made with reduced resolution.

It is apparent that willingness to accept image defaults is dependent on interface complexity and layout. Many operators complained of aspects of the image quality that they could control and a few had even suggested that image quality should be adjustable. These users were obviously unaware of the capacity to control image quality. When changing from the IRb6/L2-6 Perth robot to the ABB1400 telerobot, image quality selections were moved to the bottom of the operator's normal page. For the Carnegie IRb6/L2 telerobot colour images were provided with only the size being adjustable. After changing to the Windows NT operating system, imaging on the ABB1400 telerobot was also converted to JPEG format with the image quality, expressed as a percentage, being controlled by the telerobot operator.

Telerobot commands

Having established a session an operator may select a number of different commands. Most of the commands available for the IRb6/L2-6 telerobot in Perth are shown in Figure 55.

Telerobot controls on the IRb6/L2-6 Perth telerobot. Roll, pitch and yaw were later replaced by spin and tilt.
Figure 55

The possible commands were:-

  • Move Telerobot
    • Relative to the current position.
    • To an absolute position.
    • Absolutely, by clicking on a wire frame image as described in section 4.4 and illustrated in Figure 45. In this case an operator is limited to maintaining the gripper orientation vertical and varying any two of the three spatial coordinates.
  • Change the image specifications.
  • Reset the robot. If the robot was brought down on top of an object it would overload the motors and trip out. It could not be moved again until it was reset.
  • Access Telerobot. This is a bid to control the telerobot or the first page returned after gaining control.

Requests by operators were analysed to determine which commands operators were using and which method of moving the robot was most favoured. The number of times each command was used and each command's percentage of the total requests from 16 March 1995 to 1 September 1995 are shown in Table 10.

Commands issued by operators to the telerobot. The different methods of moving are expressed as a percentage of the total requests and the sum of the percentages is greater than 100 as gripper operations can be performed in the same request as a move.
Table 10

This data was acquired while the interface was being developed in response to observation of operator reaction, and the statistics were gathered to quantify operator behaviour as a guide to further development. The data in Table 10 reveals a problem which operators had in using the telerobot. The number of reset requests was only 1 percent of all requests. Since 53% of requests are to take control of the telerobot, this equates to roughly 2 percent of requests made after an operator is in control of the telerobot. From watching operators interacting with the telerobot it was observed that an operator is likely to trip out the robot every 5 to 10 requests depending on the experience of the operator. Therefore if reset requests were in the correct proportion, they should be in the range of 4 to 7 percent of all requests. This means many move requests went unsatisfied because the robot required resetting. To eliminate this problem the reset function was automated. The software was modified to test the state of the robot prior to issuing a command to move it and to automatically reset the robot if the motors had been overloaded on the previous request. The information that the robot has tripped out is conveyed by a text message in the page returned to the operator associated with the request that caused the overload.
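The automated reset can be sketched as a check made before every move. The class below is a hypothetical stand-in for the robot controller, and its method names and the wording of the message are assumptions; it is not the telerobot software.

```python
class SimulatedRobot:
    """Hypothetical stand-in for the robot controller."""

    def __init__(self):
        self.tripped = False
        self.position = (0, 0, 0)

    def is_tripped(self) -> bool:
        return self.tripped

    def reset(self) -> None:
        self.tripped = False

    def move_to(self, position, collides: bool = False) -> bool:
        """Move unless tripped; a collision overloads the motors and trips out."""
        if self.tripped:
            raise RuntimeError("robot must be reset before it can move")
        if collides:
            self.tripped = True
            return False
        self.position = position
        return True


def execute_move(robot, position, collides: bool = False):
    """Test the robot state first and reset it if the previous request tripped it."""
    if robot.is_tripped():
        robot.reset()
    ok = robot.move_to(position, collides)
    # The message is returned with the request that caused the overload.
    message = "" if ok else "Motors overloaded: the robot has tripped out."
    return ok, message


robot = SimulatedRobot()
first = execute_move(robot, (100, 250, 0), collides=True)   # trips the robot
second = execute_move(robot, (100, 250, 40))                # reset, then move
```

In this sketch the operator never issues a reset: the request after an overload resets the robot and then carries out the move.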

The method of moving by clicking on a wire frame model was not available for the full period covered by the data in Table 10. Table 11 gives the move requests from 10 June 1995 to 1 September 1995, when all methods of moving the telerobot were available and the default was move absolute rather than move relative.

Table 11

The data in Table 11 shows a strong preference for moves to an absolute location rather than moves relative to the current location and also shows very little enthusiasm for moving the telerobot by clicking on the wire frame. This is despite clicking on the wire frame being a quicker method of specifying a move than calculating the move values from the grid in the image and entering the coordinates.

To speed up requests, it is possible to “check a box” to not update either of the two images or to not show the wire frame model. It is most practical not to update the images when the telerobot is not altering the environment, for example when an operator is lifting the gripper from a block placed in a previous request. This is because the wire frame and numerical feedback provide the robot's position, this being the only aspect of the existing images that has changed. Even when modifying the environment the effect can usually be judged from the primary image, so an operator could get a faster result by electing not to update the secondary image. Operators' enthusiasm for switching off images to speed up response was examined. From 30,800 requests from users in control of the telerobot from 10 June 1995 to 1 September 1995 the percentage of image combinations requested is given in Table 12.

Percentage of 30,800 requests where options to not update an image or not show the wire frame were selected. In 0.5% of cases all of these feedback options were turned off.
Table 12

The default is to display all images, and in 0.5% of cases operators chose to switch off all imaging, which left only the telerobot position information in text. Almost no requests were made with the primary image switched off, and in only about 4.5% of requests was the secondary image switched off. However, the wire frame image was turned off in nearly half of all requests, which indicates a strong operator dislike of the wire frame model, particularly given that for most options operators accepted the defaults. This is despite the data content of the default camera images being about two and a half times the size of the wire frame. The wire frame contains no information on the blocks, but the six views of the wire frame provide a comprehensive description of the telerobot state.
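The kind of tally behind Table 12 can be sketched as follows. The record layout (`FeedbackOptions` with three boolean flags) and the sample records are hypothetical, invented purely to show the calculation; they are not the logged telerobot data, and the real request records may be structured differently.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class FeedbackOptions(NamedTuple):
    """Hypothetical per-request record of which feedback was switched off."""
    primary_off: bool
    secondary_off: bool
    wireframe_off: bool


KEYS = ("primary_off", "secondary_off", "wireframe_off", "all_off")


def option_percentages(requests: Iterable[FeedbackOptions]) -> dict:
    """Return the percentage of requests in which each option was selected."""
    counts = Counter()
    total = 0
    for req in requests:
        total += 1
        counts["primary_off"] += int(req.primary_off)
        counts["secondary_off"] += int(req.secondary_off)
        counts["wireframe_off"] += int(req.wireframe_off)
        counts["all_off"] += int(all(req))  # every feedback option turned off
    if total == 0:
        return {key: 0.0 for key in KEYS}
    return {key: 100.0 * counts[key] / total for key in KEYS}


# Made-up sample records, for illustration only.
sample = [
    FeedbackOptions(False, False, False),
    FeedbackOptions(False, False, True),
    FeedbackOptions(False, True, True),
    FeedbackOptions(True, True, True),
]
print(option_percentages(sample))
```

Counting each option independently, plus a separate count for requests with everything switched off, mirrors the way the caption of Table 12 reports the all-off case alongside the individual options.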

Many operators express difficulty in interpreting the wire frame, so perhaps an alternative layout would improve its usefulness. Most operators are inexperienced, so another factor influencing operator preference may be that the camera images are more familiar. The strategy of using the wire frame and not updating one or both camera images was almost never adopted, though a few users were content to work from the primary image only. For this reason, the wire frame method of specifying moves was abandoned in later versions of the interface.

Seeking interface ideas

The templates provide a method of modifying the interface layout and were used to seek ideas for further interface designs from students and telerobot operators, so that more changes could be made in accordance with step 11 of Lewis and Rieman's (1993a) task-centred design process described in section 4.9. The Perth telerobot alone was being operated for an average of thirteen hours twenty minutes per day (as measured over a seven-week period to 10 August 1997), so a considerable intellectual effort is expended on it. As well, telerobot operators frequently contributed suggestions on how the interface could be improved. It was therefore thought that enabling operators to develop, upload and use their own interface designs would give them an opportunity to contribute further to the project. Contrary to expectations this facility, described in section 4.5, did not prove popular with operators. Writing templates seemed an easy process to those involved in developing the telerobot, but it became apparent from teaching the process to students and from telerobot operators' comments that it was not simple for those unfamiliar with the technique. Some of the ideas that emerged from the students' reports (Taylor 1998) were:

  1. An inverse pyramid style, where the immediately attractive and most used functions are placed at the top of the page to reduce scrolling.
  2. Short text at the top of the page is hyperlinked to more comprehensive descriptions placed further down the page.
  3. Minimising page content to achieve response times as short as possible.
  4. The use of frames for non-dynamic information.
  5. Consistent designs for each template.
  6. Hypertext links to important information relevant to that template.

Some of the designs developed by students are shown in Figure 56 and Figure 57.

The telerobot operator's page developed by Simeon Bartley and Melissa Aspinall.
Figure 56

The telerobot operator's page developed by Clayton Deegan, Haytham El-Ansary and Scott Fillery.
Figure 57

An interface design competition was run with a prize of US$100. The competition was open to all users other than University of Western Australia students. The prize was to be awarded for the best ideas presented, and later entrants were encouraged to look at, and build on, previously submitted designs. The competition did not prove popular. Quite a few people attempted a design but only three actually submitted entries. The low number of entries is attributed to the difficulty people had in understanding the process for designing a template and the time required to develop one. The entry judged to be the best is shown in Figure 58 and the other entries in Figure 59 and Figure 60.

The winning operator's interface design by Bernat Plana.
Figure 58

An entry in the interface design competition submitted by Joe Petsche.
Figure 59

An entry in the interface design competition submitted by Kevin Christ.
Figure 60

Many of the ideas that emerged in these exercises were subsequently used in other versions of the operator's interface.