The telerobots built for the project
This chapter describes the three web telerobots built for this
project and the development of the operator interface. It begins with an
explanation of what was built and how web browsers and servers that
were designed to provide information are used to teleoperate a robot.
The provision of visual feedback is described and how the robots are
controlled is explained. The problems that had to be overcome to achieve
reliable telerobots are discussed. The design of the software is
described, including the technique for generating the operator's
interface. The methods adopted for tracking operators are then discussed,
and the records used in the following chapters to analyse operator
behaviour are described. This is followed by a description of the methodology for
operator interface development and some of the measurements made of
operator reaction to the interface.
The first IRb6/L2-6 telerobot in Perth came online in September
1994. My intention was to provide a teleoperated industrial robot
accessible to a large user base by not requiring any special equipment
or software at the operator's end. The concept of a telerobot
controllable through the World Wide Web is illustrated in Figure 41.
The World Wide Web telerobot concept. (Taylor and Trevelyan 1995)
The concept has remained unaltered throughout the project; however, the
implementation has been constantly modified. There have been three
different operating systems in the telerobot server (Windows 3.11,
Windows 95 and Windows NT) and three different robots: initially a six
axis IRb6/L2-6 in Perth, Australia, later replaced by an ABB1400, and also a
five axis IRb6/L2 at Carnegie Science Centre, Pittsburgh, USA. A variety
of frame grabbers and imaging techniques have been trialed and the
software constantly revised. A time line of telerobot development is
shown in Figure 42.
The time line is provided to assist in understanding the different
versions of the telerobots discussed and the data sets analysed in the
To operate the telerobot an operator specifies goal poses in
accordance with the supervisory control model, as discussed in section
3.1.4, and the time critical control tasks are localised to the robot
controller to avoid instability. Figure 43 shows schematically how the
concept is applied to the telerobots built for this research.
Supervisory control as applied to the telerobots built for this project.
The time critical control tasks are localised to the robot controller.
One version of the interface on the IRb6/L2-6 Perth telerobot included a
computer generated image of the robot showing its current pose. The
image was generated by a computer at the Technical University of
Braunschweig, Germany, but, as it was not distinguished from the other
elements in the operator's page, it was not apparent that it was being
generated elsewhere in the world. This facility was abandoned when the
Braunschweig robot simulator was taken off line, but the technique by
which it was implemented demonstrates one method by which web services
can be constructed from components offered by a variety of providers. All
communication between the telerobot server and the Braunschweig computer
took place indirectly via the operator's browser. That is, the page
returned to the operator after each request to the telerobot contained
the specifications of the computer generated image which the browser
then requested from the Braunschweig computer as shown in Figure 44.
The IRP Manipulator viewer in Germany produced a model of the robot in
its current pose, with all communication between the computer in Germany
and the robot in Australia taking place indirectly via the operator's
browser.
The image specifications included the viewing angle and robot joint
angles and had the following format:-
- SetName - The symbolic name of the specific setting.
- NumVal - Character string representing a floating point
number. Common formats accepted.
- Lprl - The numerical value of the link l of the r-th robot in
- Hprx - The numerical value defining gripper opening.
Visual feedback is provided by Pulnix TM6-CN monochrome cameras which
collect images of the robot workspace from different viewpoints. The
cameras feed these images to a simple video multiplexer built with reed
relays controlled from a PC brand, model PC-36 digital I/O board.
Initially a Data Translation Quickcapture frame grabber was used. Of the
768 x 512 pixels received from the camera, only the central 512 x 512
field was stored. This restriction reflected the memory limitations of
the Windows 3.11 operating system in use at the time, under which an image
collection program ran in a DOS shell to convert each image to the GIF
format. A FlashBus Pro - PCI bus Color Video Frame Grabber has been
used in the IRb6/L2 Carnegie telerobot and a Matrox Meteor Frame grabber
in the ABB1400 Perth telerobot. Both frame grabbers include dynamic
link libraries for data capture and image compression which make them
easy to use. In the ABB1400 telerobot and the IRb6/L2 Carnegie telerobot
the GIF image encoding has been replaced by JPEG encoding to minimise
image data size. The image server provides some sophisticated services
such as operator controllable image sizes, a software zoom and images
from multiple cameras. All images are acquired at maximum resolution and
then shrunk to the image size requested. As the gripper position is
known and the cameras are calibrated, a region around the gripper can be
cut from the image rather than shrinking the whole image. The size of
the photographed region around the gripper is controlled by the zoom
Other telerobots not requiring variable image sizes, software
zoom or images from multiple cameras have been able to simplify imaging
by using general purpose web video software. Webcam32 by Kolban (1997)
is a good package used for My Toy Robot (Dion 1998) and the Jason
Project (Fisher Jason Team 1998). Webcam32 requires a Microsoft Video
for Windows compatible camera or frame grabber. Live imaging has been
considered, but for most of the period of the project the technology to
implement live imaging efficiently has not been available. Atkinson and
Ciufo (1998), on a telerobot at the University of Wollongong, have
implemented live imaging with server push technology but this does not
allow interframe compression. On a fast link, server push is effective
as a full sense of motion is achieved but on slower links, this effect
is lost as it takes too long to download a frame. It also consumes most
of the bandwidth making it slow to send commands to the telerobot. For
these reasons, server push has not been used for live imaging on any of
the telerobots built for this project.
The telerobot server software constrains robot movements to remain
within a cubic space above the table which has the blocks placed on it.
Positions specified by operators that are outside this cube are adjusted
by the software to bring them within the cube. In addition the pose is
restricted to tilts of 45 degrees and spins of 90 degrees (see section
4.10 for discussion of spin and tilt) to avoid the robot joints winding
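The constraint described above amounts to clamping each requested value to a permitted range. The following is a minimal sketch in Python, not the telerobot server code. The cube dimensions are hypothetical, since they are not given here, and the tilt and spin restrictions are assumed to apply symmetrically about zero. The names `WORKSPACE`, `Pose` and `clamp_pose` are illustrative only.

```python
from dataclasses import dataclass

# Hypothetical limits in millimetres; the real cube dimensions are not stated.
WORKSPACE = {"x": (0.0, 500.0), "y": (0.0, 500.0), "z": (0.0, 500.0)}
MAX_TILT_DEG = 45.0  # assumed to apply symmetrically about zero
MAX_SPIN_DEG = 90.0  # assumed to apply symmetrically about zero


@dataclass
class Pose:
    x: float
    y: float
    z: float
    spin: float
    tilt: float


def clamp(value, low, high):
    """Return value limited to the closed interval [low, high]."""
    return max(low, min(high, value))


def clamp_pose(requested):
    """Adjust a requested pose so that it lies inside the permitted ranges."""
    return Pose(
        x=clamp(requested.x, *WORKSPACE["x"]),
        y=clamp(requested.y, *WORKSPACE["y"]),
        z=clamp(requested.z, *WORKSPACE["z"]),
        spin=clamp(requested.spin, -MAX_SPIN_DEG, MAX_SPIN_DEG),
        tilt=clamp(requested.tilt, -MAX_TILT_DEG, MAX_TILT_DEG),
    )
```

A request outside the cube is therefore moved to the nearest face of the cube rather than rejected, which matches the adjustment behaviour described in the text.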
The ASEA series 2 robot controller used on the IRb6/L2 Carnegie
and IRb6/L2-6 Perth telerobots can accept remote motion commands through
a serial communication link (RS232, 9600 baud) from the telerobot
server using a proprietary but published communications protocol. The
status and position of the robot can also be obtained from the
The functions used to control the robots are:-
- Read status and robot position;
- Control robot operating mode (standby - operate);
- Move robot to programmed pose specified in Cartesian
coordinates and quaternions; and
- Start/stop a program stored in the controller memory. Two short
programs stored in the robot are used to open and close the gripper
which is necessary because the controller does not permit direct control
of the gripper via the serial communication link.
The S4 controller, which is a later model robot controller required
for the ABB1400 robot, was much more difficult to use. In this case the
telerobot server communicates with the ABB1400 robot controller using
TCP/IP running over a SLIP serial connection. The robot manufacturer
does not publish the communications protocol. Instead proprietary
software is available of which two different versions have been used.
The first operated as a dynamic link library with functions called from
telerobot server software to transfer state data and robot control
programs to and from the robot controller. The second version, used in
1998, has been an ActiveX control which operates asynchronously. The
controlling software subscribes to the events about which it needs to be
Commands cannot be sent to the robot directly but can be issued
indirectly by downloading a program to the robot controller and running
it. In earlier versions, the program to be downloaded was constructed in
the telerobot server to match the particular request received from an
operator. Alternatively, the variables used in the robot controller
programs, for example target locations, can be downloaded. Currently, a
more complex generalised program is stored in the robot controller and
only variables used by the program are downloaded for each request. This
lowers the time per request by lessening the data transferred over the
The robot control software incorporates automatic compensation
for errors in the kinematic model of the manipulator as follows:-
- Model data is read from a data file. The calibrated kinematic
model is generated using a technique developed by Legnani and Trevelyan
- The program calculates the joint angles that are required to
attain the specified pose, using the calibrated kinematic model.
- The software then calculates a Cartesian position and
quaternion which, when sent to the robot’s controller, will result in
those joint angles being attained, and hence the required pose.
The telerobots built for this project have always been heavily used
and are expected to be available 24 hours a day. Web telerobots are
commonly found to be inoperable when accessed, and the difficulty of
building a reliable system is frequently underestimated. Several web
telerobots have been abandoned because of these problems. As other
researchers in the telerobotics field can testify, the combination of an
unstructured environment and random movements leads to many unexpected
problems; “The ability of the system to enter an unexpected state far
exceeded our ability to anticipate this happening” (Stein et al. 1994)
and Simmons (1998) refers to the challenge of building a reliable
telerobot. To put the problem in perspective, in the first eight months
of 1998 the ABB1400 telerobot in Perth received on average more than
1000 requests a day (see Figure 65). A fault that occurred on only one
request in every thousand would therefore disable the robot on
average more than once a day, which is too often. As well, some
operators are deliberately destructive.
Much of the software adapted for this project was found to
contain errors which were tolerable for other applications but had to be
removed to achieve a fully reliable system. In the first IRb6/L2-6
telerobot in Perth the control software needed several improvements to
avoid an indefinite hang up in the serial communication link with the
robot. The communications software had used polling for serial
communications but this had to be modified to be interrupt driven, as
accurate timing was not possible with the multi tasking of the Windows
3.11 operating system. Mistakes were also discovered in the kinematics
calculations as remote users attempted more ambitious tasks. When
changing to the ABB1400 robot, similar problems were encountered, as the
proprietary communications libraries provided contained occasional errors.
This highlights the need for higher levels of reliability than are
required for many other robotics applications. As well, the Windows 3.11
and Windows 95 operating systems generate occasional faults that render
them inoperable. Observation and correction of problems can achieve
local software reliability, but for operating systems and third party
products, this method is not available.
To overcome instability in the IRb6/L2-6 Perth telerobot and the
ABB1400 telerobot when it ran under the Windows 3.11 operating system a
watchdog timer was implemented by running a short program to toggle two
relays every thirty seconds. The array of relays originally installed to
switch alternative cameras to the frame grabber was used. A PLC
monitored this array and re-booted the computer if the relays stopped
oscillating. The watchdog timer was also implemented on the IRb6/L2
Carnegie telerobot server which runs the Windows 95 operating system.
The watchdog for the IRb6/L2 Carnegie telerobot was implemented without a
separate PLC by using a one shot timer on the digital I/O card used to
control the relay array. The one shot timer also closed a relay to
re-boot the computer but was continually reset by a small program before
it timed out. A watchdog timer was not used for the ABB1400 telerobot
server which runs the Windows NT operating system.
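The one shot timer arrangement can be modelled in a few lines. The sketch below is illustrative Python and not the code used on the telerobot servers. The timeout value and the class name `OneShotWatchdog` are assumptions, and time is passed in explicitly so that the logic can be checked without waiting.

```python
class OneShotWatchdog:
    """Software model of a one shot timer that must be reset before it times out."""

    def __init__(self, timeout_s, now=0.0):
        self.timeout_s = timeout_s  # hypothetical value; not given in the text
        self.last_reset = now

    def reset(self, now):
        """Called periodically by the small keep-alive program."""
        self.last_reset = now

    def should_reboot(self, now):
        """True once the timer has run for timeout_s without being reset."""
        return (now - self.last_reset) >= self.timeout_s


if __name__ == "__main__":
    watchdog = OneShotWatchdog(timeout_s=90.0)
    # A healthy server resets the timer at regular intervals.
    for t in (30.0, 60.0, 90.0):
        watchdog.reset(t)
    print(watchdog.should_reboot(120.0))  # timer was reset recently
    print(watchdog.should_reboot(200.0))  # resets have stopped, so reboot
```

The useful property of this design is that the reboot is triggered by the absence of activity, so it still works when the operating system has hung and no software on the server can run.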
Allowing interaction with the environment presents a slightly
different set of reliability problems. The robot has physical limits
such as joint limits, and singularities. Physical objects in the
workspace also limit it. Currently the robot stops when one of these
limits is encountered, and no attempt is made to continue execution. The
operator is informed why the required task was not completed, so they
learn how to avoid the problem on subsequent moves. To minimise the need
for these messages the workspace is restricted to avoid all the above
limits where possible.
The software written for the telerobots resides in a PC telerobot
server and communicates with a telerobot operator indirectly via a World
Wide Web server. The server software initially selected was Denny’s
(1995) Win-httpd running under the Windows 3.11 operating system. In
1994, Win-httpd was unreliable but it has since evolved into Website
which is reliable and still used. The web server connects to the
Internet through a winsock. The first winsock used was Trumpet (Tattam
1994) and several others were trialed including Chameleon (Chamelon
1994). Winsocks are now built into the Windows 95 and NT operating
systems and, unlike the versions available in 1994, are also reliable and
easy to use.
A web server will supply a static file to a browser or start a
program, usually termed a script, in response to a client request. This
is called a common gateway interface (CGI) request and on completion of
the script the web server will return a temporary file that was output
by the script. The telerobot operator communicates their commands to the
script by means of HTML forms which contain fields that accept operator
inputs. A variety of form elements are available including hidden
fields, radio buttons, drop down boxes, text fields, and images that
return the coordinates of a mouse click. Forms are submitted from the
operator's browser with either a "GET" or a "POST" type request, the
difference lying in the CGI protocol for transmitting the form data to
the script. With a "GET" type request the form data is passed to the
script as a command line argument. The form data also appears in the
address field of the web browser with the page returned and is recorded
in the web server log file as part of the address. The "POST" type
request stores the form data in a separate temporary file and passes the
name of this file to the script as a command line argument. The HTTP
protocol of the web is stateless but a web telerobot requires
information to be preserved from request to request so that an operator
can be identified. This is achieved by placing the data in a form field
so that the data is returned to the web server when the form is
submitted. When the data is of no interest to an operator, it can be
stored in the "hidden" HTML field type which is not displayed by the
browser though it can be seen by viewing the HTML source.
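The two mechanisms described above, decoding the submitted form data and carrying state forward in a hidden field, can be sketched as follows. This is an illustrative Python example using the standard library, not the original script. The field names `X` and `OperatorID` appear elsewhere in this chapter, but the sample values and the function names are assumptions.

```python
from html import escape
from urllib.parse import parse_qs


def parse_form_data(query_string):
    """Decode URL-encoded form data into a dict holding one value per field."""
    return {name: values[0] for name, values in parse_qs(query_string).items()}


def hidden_field(name, value):
    """Render a hidden HTML form field that carries state to the next request."""
    return '<INPUT TYPE="HIDDEN" NAME="{}" VALUE="{}">'.format(
        escape(name, quote=True), escape(str(value), quote=True)
    )


if __name__ == "__main__":
    # Form data as it would arrive with a "GET" type request.
    form = parse_form_data("X=100&Y=250&OperatorID=8731")
    print(form["X"], form["OperatorID"])
    # The identifier is written back into the next page so that it is returned
    # with the following request, despite the protocol being stateless.
    print(hidden_field("OperatorID", form["OperatorID"]))
```

Because the hidden field is part of the page source, it identifies an operator but should not be treated as a secret.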
In the first version of the software, when the web form was
submitted the web server launched a batch file and two scripts were
launched from the batch file that executed within DOS shells. One script
controlled the robot and the other generated images and the HTML
output. However, it soon became apparent that communication was required
between DOS shells to implement file locking and to ensure only a
single user was issuing instructions to the robot at any time. The
Windows 3.11 operating system provided no facility for this. Instead it
was achieved by running a small terminate and stay resident program from
the autoexec.bat file to reserve a few bytes in memory and reading and
writing from these using absolute addressing to record and monitor file
and telerobot statistics.
Web data is normally static. Therefore, to save data transmission
and time, web browser programs store each web page retrieved by a user
in a cache on the user’s computer. Thus, if the user wants to return to
that page, it is retrieved from the cache rather than the server.
However, an image used for telerobot control or for monitoring a
changing scene is dynamic so a fresh version needs to be retrieved from
the server each time it is accessed. Therefore, a unique number is
included as part of the name of each image to force the browser into
retrieving a new image each time.
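A minimal Python sketch of this naming technique is given below. How the unique number was actually generated is not stated, so a simple counter is assumed here, and the file name pattern and the name `make_image_namer` are hypothetical.

```python
import itertools


def make_image_namer(prefix="image", extension="jpg"):
    """Return a function that yields a different image file name on every call."""
    counter = itertools.count(1)  # assumed source of the unique number

    def next_name():
        return "{}{:06d}.{}".format(prefix, next(counter), extension)

    return next_name


if __name__ == "__main__":
    namer = make_image_namer("cam1_")
    # Each page refers to a name the browser has never seen, so its cached
    # copy of an earlier image cannot be reused.
    print(namer())
    print(namer())
```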
During May 1995, I added the image illustrated in Figure 45,
showing six wire frame models of the robot from different viewing
Clickable wire frame images that were one method of specifying a
location to which the telerobot should move. A request was submitted
with a single click on any of the orthogonal wire frame views. This
would specify two dimensions in 3D space and the third was kept constant
The wire frame image was only 4.5 kilobytes in size. It showed the
robot from above, side on, end on and one other direction with the view
from above and side on repeated at higher resolution. The circles
indicate rotational joints. The wire frame image was included on the
page as an HTML component of type “ISMAP”. Clicking on this element
caused the HTML form to be submitted along with the image coordinates
clicked and this was used to specify a goal position for the telerobot. A
single click would provide only two parameters but three are required
to define a position in space and three more to additionally specify the
orientation. Therefore, orientation and the third component would be
maintained during the move. For example, x and y may be altered but z
will remain constant.
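The way a single click updates two coordinates while the third is held can be sketched as below. This is illustrative Python only. It assumes a linear mapping between pixels and workspace coordinates for each orthogonal view, and the origin pixel, scale factor and function name `goal_from_click` are all hypothetical.

```python
def goal_from_click(current, view_axes, pixel, origin_px, mm_per_px):
    """Return a new goal position from a click on one orthogonal view.

    current    -- dict with the current "x", "y" and "z" values
    view_axes  -- the two axes drawn horizontally and vertically in the view
    pixel      -- (column, row) of the click within the image
    origin_px  -- (column, row) at which both displayed axes are zero (assumed)
    mm_per_px  -- assumed uniform scale of the view
    """
    goal = dict(current)  # the axis not shown in this view keeps its value
    h_axis, v_axis = view_axes
    goal[h_axis] = (pixel[0] - origin_px[0]) * mm_per_px
    # Image rows increase downwards, so the vertical axis is flipped.
    goal[v_axis] = (origin_px[1] - pixel[1]) * mm_per_px
    return goal


if __name__ == "__main__":
    current = {"x": 100.0, "y": 200.0, "z": 50.0}
    # A click on the view from above changes x and y; z is left as it was.
    print(goal_from_click(current, ("x", "y"), (150, 80), (50, 180), 2.0))
```

Orientation is not represented in this sketch; in line with the text, it would simply be copied unchanged from the current pose.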
I wrote the first version of the telerobot software with
assistance from Peter Murphy for the imaging and I have continued its
development for most of the period of the project. I also adapted the
software for the IRb6/L2 Carnegie telerobot which uses the Windows 95
operating system. Bradley Saracik provided assistance with development
from December 1994 for a few months. From November 1995 until February
1996 Stephen Lepage and Shalini Cooray assisted with the changeover to
the ABB1400 robot which required substantial software modifications
including converting it to run as a Windows application. Barney Dalton
joined the project in March 1996 and contributed to the changeover to
the ABB1400 telerobot. Much of the development from then was carried out
in collaboration with Barney who has also modified the software to run
under the NT operating system. Gintaras Radzavinas and Stephen LePage
contributed to development from December 1997. Peter Murphy contributed a
Java applet showing a wire frame model of the robot in its current
pose that was included in a version of the operator interface and Harald
Friz has developed a Java applet interface for robot control. Barney
Dalton is continuing to develop the software with the aim of making it
easy to implement a distributed processing environment where some of the
processing is performed by Java applets running on the operator's
computer. A structured programming model in C was used initially and
existing modules developed for other applications were used wherever
possible which included adapting kinematics and control software
developed by James Trevelyan. With the changeover to the Windows
operating environment, the software became event driven and Barney
Dalton has more recently been gradually converting it to an
object-oriented model implemented in C++. A number of peripheral
applications such as user registration software have been added. The
system design is shown in Figure 46.
The current telerobot system. On a request from an operator, the HTTPD
server launches a CGI script that communicates with the robot and image
servers to perform the request and obtain new pictures (Taylor and
The operator’s browser submits the form details as a CGI request to
the web server, which receives the request and launches a CGI script.
One copy of the script is launched for each request and several copies
can be running simultaneously servicing the operator and observers. The
script determines whether the request came from an operator and, if so,
communicates using TCP/IP sockets with a server to operate the robot. It
also communicates with another program to capture the images; the
relationship between the programs is shown in Figure 46. The script then
generates the HTML page which will be returned to the operator.
Once a request is complete new images are taken of the workspace
and a new form with the latest images and telerobot position is returned
via CGI to the user. Only one person can control the telerobot at a
time. Other users trying to gain access receive an observer page while
the telerobot is busy (deemed as up to three minutes from the last
operator's request). There are three classes of operators. They are:
developers who can take control of the robot at any time; registered
operators who must log on with a personal identification number (PIN)
and who can take control from guests and finally guests who have the
lowest priority. More than one move can be specified in a given request
using a command script language, allowing faster operation for
experienced operators. Telerobot operators submit requests in three
different ways. They are: by filling in fields on an HTML form, by
clicking images of the workspace, or by specifying multiple moves in a
script form. Options for changing the setup such as altering image
sizes, software zoom or switching images off are included in the same
page along with links to more information and a chat applet. One of the
many interfaces trialed is shown in Figure 47.
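The rules for gaining control can be summarised in a short decision function. The sketch below is illustrative Python rather than the control algorithm used by the telerobot script. The text does not say what happens when two users of the same class compete, so it is assumed here that only developers can displace an operator of equal class. The names `UserClass` and `may_take_control` are hypothetical.

```python
from enum import IntEnum


class UserClass(IntEnum):
    GUEST = 0
    REGISTERED = 1
    DEVELOPER = 2


BUSY_TIMEOUT_S = 180  # the telerobot is deemed busy for up to three minutes


def may_take_control(requester, current, seconds_since_last_request):
    """Decide whether a new user may take over from the current operator.

    current is None when nobody has control of the telerobot.
    """
    if current is None or seconds_since_last_request > BUSY_TIMEOUT_S:
        return True  # the telerobot is free
    if requester == UserClass.DEVELOPER:
        return True  # developers can take control at any time
    # Registered operators can take control from guests only.
    return requester == UserClass.REGISTERED and current == UserClass.GUEST
```

A user for whom this function returns false would receive the observer page instead of the operator page.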
One of the interface variations on the ABB1400 telerobot.
CGI. For instance when specifying location by clicking images, a point
in two different images is required to specify a location in
three-dimensional space. Under normal CGI each image click results in a
allow specification of a telerobot request with a single submission to
the web server. However, this method suffers from being non-standard,
and only the Netscape browser supports the functions used.
The telerobot CGI script generates the pages returned to telerobot
operators and observers from template files. This allows the interface
layout to be altered by modifying the template file rather than changing
the program code. The template file contains the HTML code that
describes the page sent with additional tags (listed in Table 6) used to
mark the location of variables.
Labels in the template and what gets substituted in:
- The host name of the current controller
- A unique number that is necessary to determine who has authority to
control the telerobot
- For logging purposes, to keep track of a single session
Position & orientation details
- Telerobot's current X-coordinate value
- Telerobot's current Y-coordinate value
- Telerobot's current Z-coordinate value
- Telerobot's current Spin value
- Telerobot's current Tilt value
- Whether gripper should be open
- Telerobot's movement method
- Number of requests made by current controller
- Time when operator first gained control of the telerobot
- Time when operator made their last request
- File names of the 4 image files
- Dimensions of the 4 images
- Whether or not particular images should be displayed
- Magnification level in cameras 1 & 2
- If an error occurred, the message is displayed here
- Interface choices available
- If using a personalised template, carry it over from previous
- Is the user trying to gain control or just watch
A list of the labels used in the telerobot interface
For example one line of an operator's template is: <INPUT TYPE="TEXT"
SIZE = 4 MAXLENGTH = 4 NAME="X" VALUE="XVal">. This is standard HTML
format for a form field that allows a maximum of four characters to be
entered with the default text being "XVal". When the form is submitted a
variable named X will be assigned the value of whatever text is entered
and the variable can be read by the telerobot CGI script. This HTML is
rendered by the browser as shown encircled in Figure 48.
An operator's page template viewed with a web browser. The blue oval
highlights the rendering of the HTML code: <INPUT TYPE="TEXT" SIZE = 4
MAXLENGTH = 4 NAME="X" VALUE="XVal">
However, the string "XVal" is not seen by the telerobot operator as
it is the name of a variable which is replaced by the current Cartesian x
coordinate of the robot gripper. That is, the software reads the
template file, replaces the tags in the file with the current value of
the variable represented by the tag and then writes the result to disk.
The transmission of the HTML file to the user is then carried out by the
web server. The template files reside in a directory that is accessible
from the World Wide Web so that they can be viewed by template
designers with the tags rather than the variables as shown in Figure 48.
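The substitution step can be illustrated with a short Python sketch. It is not the original program: the function name `fill_template` is hypothetical, only the tag "XVal" is taken from the text, and the result is returned as a string rather than written to disk.

```python
def fill_template(template, values):
    """Replace each tag in the template with the current value of its variable."""
    page = template
    # Longer tags are replaced first so that a tag which happens to be a
    # prefix of another tag cannot corrupt it.
    for tag in sorted(values, key=len, reverse=True):
        page = page.replace(tag, str(values[tag]))
    return page


if __name__ == "__main__":
    line = '<INPUT TYPE="TEXT" SIZE = 4 MAXLENGTH = 4 NAME="X" VALUE="XVal">'
    # The operator sees the gripper's current x coordinate, not the tag.
    print(fill_template(line, {"XVal": 120}))
```

Plain text replacement keeps the template a valid HTML file that can be previewed in a browser, at the cost of requiring tag names that do not occur elsewhere in the page.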
There are four different interface templates:
- New Operator - Seen only once when an operator first gains
control of the telerobot.
- Operator - Seen when operating the telerobot.
- Clicked - Seen after clicking on the first image and before
clicking on the second when moving the telerobot by image clicking.
- Observer - Seen when not in control of the telerobot.
There are several critical components in a template including the
hidden fields that specify the OperatorID and SessionID. These numbers
identify an operator so they can remain in control of the telerobot.
When a new operator first gains control, they are assigned an OperatorID
and the telerobot program uses this ID to determine whether it should
execute a submitted request. If any of the form fields are missing,
default values will be used by the telerobot program and this will
restrict the commands and options available to an operator. If labels
are omitted, details concerning the position of the telerobot and other
useful information will not be displayed.
A facility is provided to enable users to upload and use their
own templates to control the interface layout. The user creates a
template file with an editor on his or her own computer then uploads
that file with the HTTPD upload procedure. A small script on the
telerobot server moves the uploaded file into the appropriate template
directory and the operator or observer then uses this template by
entering the template name in a field on the operator's or observer's
page. If a set of template files is uploaded with the same name, that
set will remain in use as the person moves from observer to operator
status or vice versa. An operator selects a standard template from a
drop down list on the operator's page, or where an operator has submitted
their own template, selects it by filling in the template name in a
Tracking telerobot operators
The HTTP protocol by which operators communicate with the telerobot
is stateless and a method is required to determine whether requests are
made by a current operator or someone classed as an observer. As well,
by tracking individual visitors to the telerobot one can ascertain how
many visitors there are, how often people come back, how long they stay
and what they do. This is not a straightforward task and it is a problem
shared by all web services that seek to track individuals.
There are four methods that can be used with the telerobot:-
- Tracking internet addresses. These are the numbers that are used
to route data through the Internet and are necessarily provided to each
web server to enable requested information to be returned. The numbers
are usually associated with a domain name with which they can be easily
interchanged by referring to a domain name service.
- Session Identifiers. Whenever an operator first gets control of
the telerobot a unique identifier is generated and stored in a hidden
field on the returned page. This value is returned every time an
operator makes a request and will remain unchanged until the person
loses control of the telerobot to another operator.
- Logging on. People are offered the opportunity to register. A
password is automatically emailed to them and they can be tracked by
their login name.
- Cookies. These are short messages left on the client computer
that can only be read by the web server which placed them.
The first method, tracking internet addresses, is available to all
web sites as web servers routinely log this data. The other methods
require more effort to implement. Each has some advantages and
disadvantages which make it suitable for different circumstances. It is
also necessary to understand the sources of error in each and the levels
of accuracy that can be achieved.
Tracking internet addresses
This is commonly used by researchers, for example Saucy and Mondada
(1998), as it is always available and requires no action by the
telerobot operator. An internet address, also known as an Internet
Protocol (IP) number, identifies a computer, but not always uniquely.
Many people access the Web through a proxy server, which means a request
for a web page is always made to the one proxy server regardless of the
server that holds the page requested. The proxy server then gets the
page from the web server where it resides and forwards it on to the
computer from which it was requested. In this case, the internet
address of the proxy server is recorded and different people using the
one proxy will have the same internet address. In addition, many people
use dynamic addressing. In this case, every time the computer is
connected to the Internet a different internet address is assigned to
it. This is common with dial up connections to commercial internet
service providers as it allows a small pool of internet addresses to be
used by a large number of people.
A session identifier is generated on the first request of a session
and is based on the time and date in order to guarantee uniqueness. The
same technique is used to identify the current operator every time a
request is made to the telerobot. Each request received by the telerobot
server is classified as being from an operator if the correct session
identifier is present. This method is absolutely reliable and has been
used in this research for identifying requests as being in a single
session rather than tracking internet addresses.
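A minimal Python sketch of this scheme follows. The exact format of the identifier is not given, so a timestamp with one second resolution is assumed, which is unique only if no two sessions begin within the same second. The function names are hypothetical; the hidden field name `SessionID` is the one mentioned in the discussion of templates.

```python
from datetime import datetime


def make_session_id(now):
    """Build a session identifier from the date and time (assumed format)."""
    return now.strftime("%Y%m%d%H%M%S")


def is_operator_request(form, current_session_id):
    """Classify a request as coming from the operator if the identifier matches."""
    if current_session_id is None:
        return False  # nobody is in control, so no request can match
    return form.get("SessionID") == current_session_id


if __name__ == "__main__":
    session_id = make_session_id(datetime(1998, 8, 1, 14, 30, 5))
    print(session_id)
    print(is_operator_request({"SessionID": session_id}, session_id))
    print(is_operator_request({}, session_id))
```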
User registration and
A person can register by completing a form which allows some
information to be collected. They are emailed a PIN which,
together with a logon name, is entered before gaining control of the
telerobot. This has been the method used in this research for
identifying individuals across sessions. There is no cost in registering
so it seems unlikely that a person would share their account. The
incentive to register is that registered users have preference over
unregistered users for control of the telerobot. It seems quite likely
that many registered users will have played with the telerobot prior to
registration and there is no certainty that every time a registered user
operates the telerobot they will log on. There is a high degree of
assurance that the data identified by a user name is from a single
person but less certainty that all the actions of that person are
Cookies are short messages left on the client computer which can only
be read by the web server from which they came. They identify
computers, not people, so those who use more than one computer or those
who share a computer will cause statistical errors. A further
disadvantage with cookies is that they are sometimes disabled by
operators and do not work with all browsers, although they do work with
Netscape and Microsoft browsers, which most people use. Cookies have
only been used on the telerobot for counting unique visitors by the Webside
Story (1998c) counting service discussed in section 5.4.
Analysis of web server log files was the first method used to monitor
operator behaviour. The web server logs contain:-
- Internet address of visitor;
- Date and time of request;
- Item requested;
- Page transmission time. However the value recorded is subject
to error as the winsock (see section 4.4 regarding winsocks) buffers the
data to be sent and this figure only measures the interval from the
time the request has been received to the time that the data has been
transferred to the winsock buffer;
- Referring site, which is the address of the web site from which
a hyperlink was followed to the telerobot.
When the “Get” method of HTML form submission (see section 4.4
Software) was used, the server log also included the query string. To
make use of the query string recorded in the server log, it was
necessary to write software that could decode it and write it in a format
that was suitable for importing into a database. This method of logging
telerobot activity had some limitations. Tracking operator activity
becomes quite complex as requests from people with "observer" status are
intermingled with operator requests and there is no record of the
response to requests. Determining which requests come from an operator
and which come from an observer requires applying the control algorithm
to the log file data.
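The kind of decoding involved can be sketched as follows. This is an illustration in Python using the standard library, not the software written for the project; the request path and the field names `x`, `y`, `grip` and `note` are hypothetical, and the layout of the real server log is not reproduced.

```python
import csv
import io
from typing import Dict, List
from urllib.parse import parse_qsl, urlsplit


def decode_request(item: str) -> Dict[str, str]:
    """Split a logged request such as '/cgi-bin/robot?x=100&y=200'
    into its decoded field/value pairs."""
    query = urlsplit(item).query
    return dict(parse_qsl(query, keep_blank_values=True))


def to_csv(items: List[str], fields: List[str]) -> str:
    """Write one CSV row per logged request, one column per field,
    in a form that a database can import."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, restval="",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow(decode_request(item))
    return out.getvalue()
```

A fixed list of columns is used so that requests which omit a field still produce rows of equal width.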
After changing to "Post" HTML form submission the method of
monitoring operators was changed. An operator log file was created with
data appended by the telerobot program after each request. The data
recorded has varied from time to time but that recorded in 1998 is shown
in Table 7. Requests to gain control of the telerobot by non-operators
are recorded in a separate file.
||Domain name of the computer accessing the telerobot.
||A unique identifier is created for the first request in a
session and recorded for each request in the current session. A session
remains current until another operator takes control of the telerobot.
The number is included as a hidden field in the form returned to the
||The number of requests received for this session.
||The time when processing this request was started.
||The time taken since processing the previous request was
||The file name of the template used to generate the html page
||X coordinate after the request is executed.
||Y coordinate after the request is executed.
||Z coordinate after the request is executed.
||The spin angle after the request is executed in degrees.
||The tilt angle after the request is executed in degrees.
||0 for closed and 1 for open.
||A value recording the success of a request. This could indicate
a successful request, robot offline, emergency stop due to collision,
or joint limit exceeded.
||The error message that appears on the operator's page when a
request has been unsuccessful.
||There are four cameras and this field is repeated for each with
X replaced by the camera number. 0 indicates operator switched this
camera on, 1 indicates operator switched this camera off.
||Image size requested by operator.
||This was the number of shades of grey in the image when GIF
image coding was used but has been changed to the JPEG image quality
expressed as a percentage.
||The login name of the registered user and "guest" if not being
operated by a registered user.
||The series of moves requested for a multiple move request.
Data recorded for each request from an operator to the
The data recorded includes all aspects of an operator's request, the
robot's response and fields that make it easy to identify sessions and
the sequence of requests in a session. This makes more complete
interpretation of operator behaviour easier than was possible when
relying on the web server logs.
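A sketch of how such a record might be represented and used is given below. All field names are hypothetical and only a subset of the fields described in Table 7 is included; the tab-separated layout is also an assumption. The point illustrated is that the session identifier and request number alone are enough to reassemble each session in order.

```python
from collections import defaultdict
from dataclasses import astuple, dataclass
from typing import Dict, List


@dataclass
class RequestRecord:
    # Hypothetical names; the meanings follow the Table 7 descriptions.
    domain: str
    session_id: str
    request_number: int
    start_time: str
    template: str
    x: float
    y: float
    z: float
    spin: float
    tilt: float
    gripper_open: int   # 0 for closed, 1 for open
    status: int
    user: str = "guest"


def format_record(rec: RequestRecord) -> str:
    """One tab-separated line per request, ready to append to the log."""
    return "\t".join(str(v) for v in astuple(rec)) + "\n"


def group_by_session(records: List[RequestRecord]) -> Dict[str, List[RequestRecord]]:
    """Collect records by session and order them by request number."""
    sessions = defaultdict(list)
    for rec in records:
        sessions[rec.session_id].append(rec)
    for recs in sessions.values():
        recs.sort(key=lambda r: r.request_number)
    return dict(sessions)
```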
Approaches to interface
With the supervisory control scheme employed on a web telerobot, a
human is an intimate part of the control loop. Human-robot interactions
then have a higher importance than is usually the case with conventional
robots which are often expected to operate autonomously for most of
their life. Therefore, the operator interface design is an issue
deserving of attention. With the introduction of Java applets and
web browser framework and from observation it is apparent that small
changes in layout can affect operator behaviour.
Interface development has been highly iterative, as it was
usually easier to understand operator reaction to different interface
designs in retrospect than to anticipate them. This is partly because
those involved in system development are unable to view the interface
from the same perspective as telerobot operators. The dilemma has been
recognised by Lewis and Rieman who state: “you may read our examples of
problem interfaces and say to yourself, ‘Well, that's an obvious
problem. No one would really design something like that. Certainly I
wouldn't.’ (Lewis and Rieman 1993b:5)”. Then, after giving several
examples they go on to say “the flip side of the ‘it's obvious’ coin is
that most actions used to accomplish tasks with an interface are quite
obvious to people who know them, including, of course, the software
designer. But the actions are often not obvious to the first-time
The following sections provide an overview of the cognitive
framework for interface design, its significance for the web telerobot
interface and the interface interaction styles available. This is
followed by a review of the methods available for interface development
and the reasoning for the particular methods adopted for this research.
frameworks for human computer interaction
The dominant intellectual framework that has characterised human
computer interaction (HCI) theory has been cognitive (Preece et al.
1994). In general, cognition refers to the processes by which we become
acquainted with things or how we gain knowledge. The major paradigm used
to describe human cognition has been to characterise humans as
information processors whereby information enters and exits the human
mind through a series of ordered processing stages (Lindsay and Norman
1977), as shown in Figure 49.
Model of human information processing stages showing the basic and
extended version. Adapted from Barber (1988).
The four stages involved are encoding, comparison, response selection
and response execution. Extensions of the basic information processing
model include the processes of attention and memory. The human processor
model provides a means of characterising the various cognitive
processes that are assumed to underlie the performance of a task. From
this conceptual model, other models have been abstracted, often using
the computer as a metaphor. Concepts such as buffers, memory stores and
storage systems have provided psychologists with a means of developing
more advanced models of information processing. According to Preece et
al. (1994:63) there has, since the 1980s, been a move away from the
information processing framework, the main problem being that it
oversimplifies human behaviour and cannot therefore be used to predict
responses. Preece et al. (1994:67) argue that “it is becoming
increasingly recognised that a cognitive perspective of the individual
user performing various tasks at the interface is an inadequate
conceptual framework for HCI.” In view of this, Landauer (1987) feels
there is no sense in which we can study cognition meaningfully divorced
from the task context in which it finds itself and Preece et al
(1994:67) take the view that a central focus of research in HCI must be
the environment in which users carry out their tasks. A similar view is
provided by Winograd and Flores (1987) who state that: “it is clear
(and has been widely recognised) that one cannot understand a
technology without having a functional understanding of how it is used.
Furthermore, that understanding must incorporate a holistic view of the
network of technologies and activities into which it fits, rather than
treating the technological devices in isolation.”
Therefore, there is no strong cognitive theory for the web
telerobot interface which provides a prescriptive prediction of the
ideal interface design or which provides a mechanistic model in the
sense that Newtonian mechanics does in the field of dynamics. Ideas can
be drawn from interface designs for other applications but they are
unlikely to be directly transferable to the web telerobot interface. In
addition, controlling a teleoperable device is so different from using
the web for reading that lessons learned with web interfaces generally
are unlikely to be directly transferable. The preceding paragraphs
indicate that there is no satisfactory alternative to developing the
interface with an actual web telerobot and, sadly, they imply that it is
difficult to apply the lessons learned from these web telerobots to
others performing different tasks.
The most basic form of human computer interaction is the
command-driven application. Here a computer is instructed from a command
line with no indication, prior to entering the command, of what the
response will be. A user must know what commands are available and what
they will do. This provides a general interface that works for a wide
range of applications, for example, the DOS or Unix computer operating
The next level of sophistication is the menu and form fill
approach intended initially for clerical workers and mimicking paper
forms. A menu and form fill interface provides guidance to users as to
what is required but is less general than the command line style, since
it requires redesign for each task. According to Preece et al. (1994:118)
one of the most well established findings in memory research is the fact
that we can recognise material far more easily than we can recall it
The capacity to recognise is the principal advantage of menus and
form fills in interfaces. The user has only to select from the options
offered by a menu and is prompted for the required information with a
form fill. Menus and form fills are usually used together. Menus are
ideal when a user must select from a limited number of options. The
advantage of form fills is that they prompt for the required information
and allow for a large range of values to be entered.
Natural language provides another method of human computer
interaction where a user types a command in natural language. However,
the system needs to cope with vagueness, ambiguity and multiple methods
of conveying the same information. Such systems will usually require
more typing than a command system and they are complex to develop and
therefore not often used. Engelberger had intended to provide the
HelpMate robotic nursing assistant (discussed in section 2.4.10) with a
voice recognition natural language interface but was not able to do so.
The final type of human computer interaction is direct
manipulation which involves replacement of a command language by direct
manipulation of the object of interest. Usually this means manipulating
representations of objects on the screen, which can be moved around and
manipulated with a mouse, but for telerobots, it would include master
slave systems. The Apple Macintosh and the more recent Microsoft Windows
operating systems are examples of direct manipulation interfaces which
are based on the metaphor of a desktop. Files are kept in folders,
information is copied to a clipboard, etc. Direct manipulation
interfaces are frequently used in computer aided drawing (CAD) systems
and robot simulation systems. The success of video games demonstrates
the enthusiasm many people have for this type of interface. Shneiderman
(1983) believes this success is due to:
- Novices being able to learn the basic functionality quickly;
- Experienced users being able to work extremely rapidly to carry
out a wide range of tasks, even defining new functions and features;
- Knowledgeable intermittent users being able to retain
- Error messages rarely being needed;
- Users being able to see immediately if their actions are
furthering their goals and, if they are not, being able to simply change
the direction of their activity;
- Users experiencing less anxiety because the system is
A major complication of using direct manipulation interfaces with web
telerobotics, and one that is shared with three dimensional CAD
systems, is representing and manipulating objects in three dimensional
space on a two dimensional computer screen.
Heuristics, also called guidelines, are general principles or rules
of thumb that can guide design decisions. In the opinion of Lewis and
Rieman no list of heuristics has been very effective “in improving the
design process, although they're usually effective for critiquing
favourite bad examples of someone else's design (Lewis and Rieman
1993c:25)”. The list of general heuristics developed by Molich and
Nielsen (1990) was used in developing the telerobot interface
iterations and is shown in Table 8.
The list of general heuristics developed by Nielsen
and Molich (1990).
Nielsen and Molich also provide a technique for evaluating a design
with respect to these criteria and have been able to show that it can be
used to identify 75 percent of the heuristic problems. However, they
point out that many interface problems are not heuristically
identifiable. While the concept of controlling a device is quite
different from acquiring information through a web page, the heuristics
for web page design are also likely to be relevant. Nielsen (1997a)
provides three guidelines for web pages which are:-
- Be succinct: write no more than 50% of the text you would have
used in a hard copy publication.
- Write for scannability: don't require users to read long
continuous blocks of text.
- Use hypertext to split long information into multiple pages.
According to Nielsen, reading from a web page takes about 25% longer
than reading from paper and therefore a succinct writing style is
necessary. Scannability is important because usability studies have
shown this is the way the web is used. As well, avoid the need to scroll
because “we know from several user studies that users don't scroll”
Ratner et al recommend against giving too much credence to HTML
style guides as they have found “little consistency among the 21 HTML
style guides assessed, with 75% of recommendations appearing in only one
style guide (Ratner et al. 1996)”. Ratner et al. argue that the web
“represents a unique HCI environment (Ratner et al. 1996:368)” but point
out that HTML style guides address web information content pages,
largely ignoring issues such as help, message boxes and data entry that
are important with web-based applications.
Development of the interface
An approach to interface development recommended by Preece et al
(1994) is user-centred design. The underlying principles are:-
- Involve users as much as possible, so that they can influence
- Integrate knowledge and expertise from the different
disciplines that contribute to HCI design;
- Be highly iterative, so that testing can be done to check that
the design does indeed meet users' requirements.
Lewis and Rieman (1993a) propose a variation on user-centred design
which they term task-centred design. This method was adopted, with
some variations, for the interface development of the web telerobots in
this research. The steps involved in the task-centred design
process and how they were applied in the course of this project are
described below. This list was taken from Lewis and Rieman (1993a).
1. Figure out who's going to use the system to do what.
At the time of project commencement (May 1994), there were no web
devices and therefore there was no information available regarding who
would use the system or what they would do. These issues were addressed
during the course of the research and are considered in Chapter 5
including the demographics of operators and what operators build with
2. Choose representative tasks for task-centred design.
The task chosen was the manipulation of wooden blocks.
3. Plagiarize.
There was little potential for plagiarisation as there were no devices with
web interfaces when the project began. However, some ideas were gained
by reviewing operator stations for non-web telerobots. Later web
telerobot implementations have been able to copy elements of their
interface from the web telerobots built for this project.
4. Rough out a design.
5. Think about it.
6. Create a mock-up or prototype.
This step was considered unnecessary as the interface could be easily
modified. It seemed advantageous to test it with operators as early in
the development process as possible, and testing it with operators was
easy to do.
7. Test it with users.
No testing was done prior to implementation.
8. Iterate.
There were no iterations prior to implementation.
9. Build it.
10. Track it.
Observing the actions of operators is enough to provide considerable
insight into the changes that need to be made. For example, in the first
interface version the robot gripper could frequently be seen opening
and closing in front of (with respect to the camera) or behind blocks
indicating that operators could not perceive depth with the feedback
provided. As the interface developed, operator behaviour was analysed
and quantitative data was also used to develop the interface design.
11. Change it.
The interface has gone through many revisions during the course of the
project, with steps 10 and 11 repeated many times. The heuristic
guidelines discussed in section 4.8.3 were used to evaluate proposed
changes at this point, for example, a procedure was added to
automatically reset the robot after collisions rather than to improve
the error messages advising of collision. An additional step not
mentioned by Lewis and Rieman was added to the interface development
procedure as well. This step was to seek interface ideas from others,
including University of Western Australia students and the telerobot
The menu/form fill interaction style was chosen from the
interaction styles discussed in section 4.8.2 as it is the easiest for
novice operators to understand and web browsers include the
functionality required to implement menus and form fills. As well, web
users are familiar with web forms so have less to learn to understand
the interface than is required with other techniques. In 1998, the
advent of Java has provided opportunities to develop direct manipulation
interfaces but this was not available when the project commenced in
Having outlined the procedure adopted for interface development
and the reasons for using it, the following sections describe issues
which emerged during the actual development and provide some
quantitative data that was used to develop the interface design.
Cues for location
Providing cues for an operator to form an understanding of the robot
arm position and block positions has required considerable development
time. The problem is to represent three dimensional space on a two
dimensional screen with the additional restriction of the limited
bandwidth provided by the Internet. This restriction requires an
emphasis on maximum information for minimum data.
The cues provided for location are:-
- A grid marked on the table with 100 millimetre squares.
- Four camera views in black and white, three from fixed cameras
and one from a camera slung under the robot arm.
- Gripper location in x,y,z coordinates.
- Gripper orientation as spin and tilt.
Initially there was only one camera and fluorescent lighting which
provided no shadows. This made the perception of depth very difficult
and operators had great difficulty in locating blocks. As a result,
lighting which caused shadows was added and this improved depth
perception. The addition of a second camera in December 1994 with a view
from a direction at 90 degrees to the first has provided the best
perception of depth. Discussion with operators and my own observation
suggested that, with the second view available, the best lighting then
became fluorescent lighting without shadows. However, while the
perception of depth was excellent with this layout it became difficult
to recognise the same object in both images and this complicated scene
interpretation. Moving the cameras closer together made scene
interpretation easier but reduced the depth information. The current
layout of four cameras is shown in Figure 50.
An illustration of the current camera positions. Reducing the angle
between cameras 1 and 2 makes recognising the same objects in either
view easier but reduces depth information.
The optimum angle remains undetermined but operators prefer an angle
between cameras of less than 90 degrees. This can be seen from Figure
51, which shows the cameras that operators chose to switch off. The
advantage of switching off a camera is faster page updates but most
operators will accept the default settings. The least popular camera was
the camera positioned orthogonally, followed by that slung under the
Operators are able to turn off cameras for faster page updates. This
shows the cameras turned off with the camera layout of Figure 50 for a
sample up to 15 December 1996.
A similar consideration applies in the vertical plane. Cameras aligned
horizontally provide excellent height information but no plan view.
Again the optimum is somewhere between a plan and horizontal view.
Cameras 1 and 2 are set at different heights to provide a variety of
height versus plan views as can be seen in Figure 52.
Another approach adopted to improve perception of pose was to remove
some of the unnecessary system functionality. Initially operators were
able to specify position and orientation in the full six degrees of
freedom. Orientation was specified as roll, pitch and yaw. It was noted
by a colleague, Barney Dalton, that for all useful block manipulations
only two orientation specifications were required, termed spin and
tilt. Spin is defined as rotation about the table's Z axis and tilt is
the angle the gripper makes with the Z-axis. This ensures that a line
drawn through the gripper end points is always parallel to the XY plane
as shown in Figure 53.
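The geometry of spin and tilt can be checked with a short sketch. The convention used here is an assumption made for illustration and is not taken from the telerobot software: with spin and tilt both zero the gripper points straight down and the line through the jaw tips lies along the table's X axis; tilt rotates the gripper about that jaw line, and spin then rotates the whole gripper about the table's Z axis.

```python
import math
from typing import Tuple

Vector = Tuple[float, float, float]


def gripper_axes(spin_deg: float, tilt_deg: float) -> Tuple[Vector, Vector]:
    """Return (jaw_line, approach) unit vectors in table coordinates.

    Assumed convention: approach = Rz(spin) * Rx(tilt) * (0, 0, -1) and
    jaw_line = Rz(spin) * (1, 0, 0), so tilt never lifts the jaw line.
    """
    s = math.radians(spin_deg)
    t = math.radians(tilt_deg)
    # The jaw line is unaffected by tilt, so its Z component is zero.
    jaw_line = (math.cos(s), math.sin(s), 0.0)
    # Direction in which the gripper points.
    approach = (-math.sin(s) * math.sin(t),
                math.cos(s) * math.sin(t),
                -math.cos(t))
    return jaw_line, approach
```

Under this convention the jaw line always has a zero Z component, which is the property described above, and the angle between the gripper and the vertical equals the tilt value.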
Spin and Tilt orientations for the same coordinate position. Note that
the jaws of the gripper are in a parallel plane to the table at all
Therefore, in subsequent versions of the interface for the ABB1400
telerobot, roll, pitch and yaw were replaced by spin and tilt, which are
much easier for the operator to understand and simplify the use of the
telerobot without losing any useful functionality. Neither roll, pitch
and yaw nor spin and tilt could be used with the IRb6/L2 Carnegie
telerobot as it is a five axis machine unable to adopt the required
poses. The solution adopted for the IRb6/L2 Carnegie telerobot was lean,
defined as a rotation from vertical about the fourth axis, and spin,
defined as rotation about the axis of the gripper. Spin only provides
useful orientations for block manipulations when tilt is zero.
Camera image quality
Initially images were provided in GIF format and the operator
controlled the quality by means of adjusting the number of shades of
grey and the resolution. Some data was collected on operator imaging
preferences which at the time were altered by clicking a button on the
telerobot control page labelled “Change Image”. The page returned in
response to the click allowed the operator to select three aspects of
the image quality; size, resolution and number of shades of grey (See
Image quality options.
The defaults were: 16 shades of grey, 256 x 256 pixels, full
resolution. With GIF coding reducing the number of shades of grey or the
image size provides far greater savings in image data than reducing the
resolution. For example, reducing the image size by half reduces the
number of unique pixels by three quarters and this reduces the image
data size by 60%. Reducing the resolution by half also reduces the
number of unique pixels by three quarters but the reduction in image
data size is only 3%.
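The pixel counts behind these figures can be worked through as follows. The sketch is illustrative only: the function names are hypothetical and the byte counts are for uncompressed data, so they do not reproduce the 60% and 3% savings quoted above, which depend on how well GIF compression handles each kind of image.

```python
def unique_pixels(size: int, resolution_divisor: int = 1) -> int:
    """Pixels carrying distinct data in a size x size image when each
    block of divisor x divisor pixels shares one value."""
    side = size // resolution_divisor
    return side * side


def raw_bytes(size: int, shades: int) -> int:
    """Uncompressed size, using the number of bits needed to index the
    grey levels. GIF compression will change this figure."""
    bits_per_pixel = (shades - 1).bit_length()
    return size * size * bits_per_pixel // 8


full = unique_pixels(256)                # default image
half_size = unique_pixels(128)           # image size halved
half_res = unique_pixels(256, 2)         # resolution halved
reduction = 1 - half_size / full         # fraction of unique pixels removed
```

Both halving the size and halving the resolution leave a quarter of the unique pixels, so the large difference in measured savings comes from the image coding rather than from the pixel count.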
Operator image selections for a sample of 30,800 requests between
16 March 1995 and 14 May 1995 are shown in Table 9.
Choice of image quality by operators. 92% of requests
to the telerobot specified the default image specification. * Default
It can be seen that 93.1% of requests were made with 16 shades of
grey, 97.1% with an image size of 256 pixels, 99.4% at full resolution
and that overall 92% of requests were made with all of the default image
specifications. It was anticipated that most operators would adjust the
images to match their preference and that the preferences selected
could be used to determine operator requirements. However, operators
overwhelmingly accepted the defaults. Where selections were made they
generally favoured increasing image quality and as far as can be
determined the selections did seem rational. For example, 32 shades of
grey significantly improves image quality over 16 shades but 64 shades
provides only a marginal improvement over 32 shades and on some displays
no improvement. Very few requests were made with settings larger than
32 shades. As well, reducing resolution caused a significant loss of
image quality with only a small saving in image data size, and only about
0.5% of requests were made with reduced resolution.
It is apparent that willingness to accept image defaults is
dependent on interface complexity and layout. Many operators complained
of aspects of the image quality that they could control and a few had
even suggested that image quality should be adjustable. These users were
obviously unaware of the capacity to control image quality. When
changing from the IRb6/L2-6 Perth robot to the ABB1400 telerobot, image
quality selections were moved to the bottom of the operator's normal page.
For the Carnegie IRb6/L2 telerobot, colour images were provided with only
the size being adjustable and, after changing to the Windows NT
operating system, imaging on the ABB1400 telerobot was also converted to
JPEG format with the image quality, expressed as a percentage, being
controlled by the telerobot operator.
Having established a session an operator may select a number of
different commands. Most of the commands available for the IRb6/L2-6
telerobot in Perth are shown in Figure 55.
Telerobot controls on the IRb6/L2-6 Perth telerobot. Roll, pitch and yaw
were later replaced by spin and tilt.
The possible commands were:-
- Move Telerobot
  - Relative to the current position.
  - To an absolute position.
  - Absolutely by clicking on a wire frame image as described in
section 4.4 and illustrated in Figure 45. In this case an operator is
limited to maintaining the gripper orientation vertical and varying any
two of the three spatial coordinates.
- Change the image specifications.
- Reset the robot. If the robot was brought down on top of an
object it would overload the motors and trip out. It could not be moved
again until it was reset.
- Access Telerobot. This is a bid to control the telerobot or the
first page returned after gaining control.
Requests by operators were analysed to determine which commands
operators were using and which method of moving the robot was most
favoured. The number of times each command was used and each command's
percentage of the total requests from 16 March 1995 to 1 September
1995 are shown in Table 10.
Commands issued by operators to the telerobot. The different
methods of moving are expressed as a percentage of the total requests
and the sum of the percentages is greater than 100 as gripper operations
can be performed in the same request as a move.
The interface was being developed in response to observation of
operator reaction while this data was acquired, and these statistics
were collected to quantify operator behaviour as a guide to further
development. The data in Table 10 reveals a problem which operators had
in using the telerobot. The number of reset requests was only 1 percent
of all requests. Since 53% of requests are to take control of the
telerobot, this equates to roughly 2 percent of requests made after an
operator is in control of the telerobot. From watching operators
interacting with the telerobot it was observed that an operator is
likely to trip out the robot every 5 to 10 requests depending on the
experience of the operator. Therefore if reset requests were in the
correct proportion, they should be in the range of 4 to 7 percent of all
requests. This means many move requests went unsatisfied because the
robot required resetting. To eliminate this problem the reset function
was automated. The software was modified to test the state of the robot
prior to issuing a command to move it and to automatically reset the
robot if the motors had been overloaded on the previous request. The
information that the robot has tripped out is conveyed by a text message
in the page returned to the operator associated with the request that
caused the overload.
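The automated reset can be sketched as below. The `Robot` class is a purely hypothetical stand-in for the robot controller, including its crude collision rule and the wording of the message; only the order of operations (test the state, reset if tripped, then move and report) follows the description above.

```python
from typing import Tuple

Position = Tuple[float, float, float]


class Robot:
    """Hypothetical stand-in for the robot controller."""

    def __init__(self) -> None:
        self.tripped = False
        self.position: Position = (0.0, 0.0, 300.0)
        self.resets = 0

    def reset(self) -> None:
        self.tripped = False
        self.resets += 1

    def move_to(self, target: Position, obstacle_height: float = 0.0) -> bool:
        # Simulate a motor overload when driven down onto an object.
        if target[2] < obstacle_height:
            self.tripped = True
            return False
        self.position = target
        return True


def handle_move(robot: Robot, target: Position,
                obstacle_height: float = 0.0) -> Tuple[bool, str]:
    """Reset a tripped robot before moving, then report the outcome."""
    if robot.tripped:
        robot.reset()  # automatic reset; no separate operator request
    ok = robot.move_to(target, obstacle_height)
    # The text message is returned with the request that caused the trip.
    message = "" if ok else "Robot tripped out: motors overloaded."
    return ok, message
```

With this ordering an operator who causes a trip sees the message once, and the next move request proceeds without a separate reset request.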
The method of moving by clicking on a wire frame model was not
available for the full period covered by the data in Table 10. Table 11
gives the move requests from 10 June 1995 to 1 September 1995, when all
methods of moving the telerobot were available and the default was move
absolute rather than move relative.
The data in Table 11 shows a strong preference for moves to an
absolute location rather than moves relative to the current location and
also shows very little enthusiasm for moving the telerobot by clicking
on the wire frame. This is despite clicking on the wire frame being a
quicker method of specifying a move than calculating the move values
from the grid in the image and entering the coordinates.
To speed up requests, it is possible to “check a box” to not
update either of the two images or to not show the wire frame model. It
is most practical not to update the images when the telerobot is not
altering the environment, for example when an operator is lifting the
gripper from a block placed in a previous request. This is because the
wire frame and numerical feedback provide the robot's position, this
being the only aspect of the existing images that has changed. Even when
modifying the environment the effect can usually be judged from the
primary image so an operator could get a faster result if he elected not
to update the secondary image. Operators' enthusiasm for switching off
images to speed up response was examined. From 30,800 requests from
users in control of the telerobot from 10 June 1995 to 1 September 1995
the percentage of image combinations requested is given in Table 12.
Percentage of 30,800 requests where options to not
update an image or not show the wire frame were selected. In 0.5% of
cases all of these feedback options were turned off.
The default is to display all images and in 0.5% of cases operators
chose to switch off all imaging which left only the telerobot position
information in text. Almost no requests were made with the primary image
switched off and in only about 4.5% of requests was the secondary image
switched off. However, the wire frame image was turned off in nearly
half of all requests, which indicates a strong operator dislike of the
wire frame model, particularly given that for most options, operators
accepted the defaults. This is despite the data content of the default
camera images being about two and a half times the size of the wire
frame. The wire frame contains no information on the blocks but the six
views of the wire frame provide a comprehensive description of the
Many operators express difficulty in interpreting the wire frame,
so perhaps an alternative layout would improve its usefulness. Most
operators are inexperienced, so another factor influencing operator
preference may be that the camera images are more familiar. The strategy
of using the wire frame and not updating one or both camera images was
almost never adopted, though a few users were content to work from the
primary image only. For this reason, the wire frame method of specifying
moves was abandoned in later versions of the interface.
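As an illustration only, the kind of tally behind percentages such as
those in Table 12 can be sketched in Python. The record structure and
names used here (FeedbackOptions, update_primary, update_secondary,
show_wire_frame, switched_off_percentages) are hypothetical and are not
taken from the telerobot software, and the sample requests are invented
and do not reproduce the logged data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedbackOptions:
    """Feedback choices sent with one request (hypothetical field names)."""
    update_primary: bool = True    # default: refresh the primary camera image
    update_secondary: bool = True  # default: refresh the secondary camera image
    show_wire_frame: bool = True   # default: draw the wire frame model


def switched_off_percentages(requests):
    """Return, per option, the percentage of requests with it switched off."""
    counts = {
        "primary image off": 0,
        "secondary image off": 0,
        "wire frame off": 0,
        "all feedback off": 0,
    }
    for r in requests:
        if not r.update_primary:
            counts["primary image off"] += 1
        if not r.update_secondary:
            counts["secondary image off"] += 1
        if not r.show_wire_frame:
            counts["wire frame off"] += 1
        if not (r.update_primary or r.update_secondary or r.show_wire_frame):
            counts["all feedback off"] += 1
    total = len(requests)
    if total == 0:
        # No requests logged: report zero rather than divide by zero.
        return {name: 0.0 for name in counts}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Invented sample log, for demonstration only.
sample = (
    [FeedbackOptions()] * 5
    + [FeedbackOptions(show_wire_frame=False)] * 4
    + [FeedbackOptions(update_secondary=False)]
)
for name, pct in switched_off_percentages(sample).items():
    print(f"{name}: {pct:.1f}%")
```

Note that a request with every option switched off is counted under each
individual option as well as under "all feedback off", so the figures
need not sum to 100%.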
Seeking interface ideas
The templates provide a method of modifying interface layout and were
used to seek out ideas for further interface designs from students and
telerobot operators, so that more changes could be made in accordance
with step 11 of Lewis and Rieman's (1993a) task-centred design process
described in section 4.9. The Perth telerobot alone was being operated
for an average of thirteen hours twenty minutes per day (as measured
over a seven-week period to 10 August 1997), so a considerable
intellectual effort is expended on it. Telerobot operators also
frequently contributed suggestions on how the interface could be
improved. It was therefore thought that enabling operators to develop,
upload and use their own interface designs would provide an opportunity
for them to contribute further to the project. Contrary to
expectations, this facility, described in section 4.5, did not prove
popular with operators. Writing templates seemed an easy process to
those involved in developing the telerobot, but it became apparent from
teaching this process to students and from telerobot operators' comments
that it was not simple for those unfamiliar with the technique. Some of
the ideas which emerged from the students' Student Reports (Taylor 1998)
were:
- An inverse pyramid style, where the immediately attractive and
most used functions are placed at the top of the page to reduce
- Short text at the top of the page is hyperlinked to more
comprehensive descriptions placed further down the page.
- Minimising page content to achieve response times as short as
possible.
- The use of frames for non-dynamic information.
- Consistent designs for each template.
- Hypertext links to important information relevant to that
Some of the designs developed by students are shown in Figure 56 and
Figure 57.
The telerobot operators page developed by Simeon Bartley and Melissa
The telerobot operators page developed by Clayton Deegan, Haytham
El-Ansary and Scott Fillery.
An interface design competition was run with a prize of US$100. The
competition was open to all users other than University of Western
Australia students. The prize was awarded for the best ideas presented,
and later entrants were encouraged to look at, and build on, previously
submitted designs. The competition did not prove popular. Quite a few
people attempted a design but only three actually submitted entries.
The low number of entries is attributed to the difficulty people had in
understanding the process for designing a template and the time
required to develop one. The entry judged to be the best is shown in
Figure 58 and the other entries in Figure 59 and Figure 60.
The winning operator's interface design by Bernat Plana.
An entry in the interface design competition submitted by
An entry in the interface design competition submitted by
Many of the ideas that emerged in these exercises were subsequently
used in other versions of the operator's interface.