AN INVESTIGATION OF CURRENT
VIRTUAL REALITY INTERFACES
Christopher M. Smith
Lynellen D.S.P. Smith
A Final Project
Submitted to the Faculty of
Mississippi State University
in Partial Fulfillment of the Requirements
in the Department of Computer Science
Mississippi State, Mississippi
April 25, 1995
Abstract
Introduction
Visual Aspects of Virtual Reality
     Types of Visual Display
          LCD Flicker Lens
          Head Mounted Displays
          LCD display HMD
          Projected HMD
          Small CRT HMD
          Single Column LED HMD
          Binocular Omni-Orientation Monitor (BOOM)
     Advantages and Disadvantages
Graphic Techniques
     Depth Cueing
     Lighting Models
     Shading
     Radiosity
     Ray Casting
3-D Audio
     Evolution of 3D Sound
     Realistic Sound
Tactile and Force Feedback
     Force Feedback
          Motion Platforms
          Gloves
          Exoskeletons
          Butlers
     Tactile Feedback
          Texture
Navigation
     Tracking Devices
          Mechanical Trackers
          Electromagnetic Trackers
          Ultrasonic Trackers
          Infrared Trackers
          Inertial Trackers
     Interaction Devices
          Gloves
          3D Mice
          Joysticks
Human Factors in Virtual Environments
Conclusion
References
Abstract
Virtual reality hype is becoming a large part of
everyday life. This paper explores the components of actual
virtual reality systems, critiquing each in terms of human
factors. The hardware and software of visual, aural, and
haptic input and feedback are considered. Technical and
human factor difficulties are discussed and some potential
solutions are offered.
Introduction
Virtual reality is a new technology for simulation,
design, entertainment, and many other pursuits. Simulation
applications range from testing a non-existent object before
it is created, to military training, to practicing a
maneuver for the next shuttle mission. The purpose of our
paper is to identify weaknesses in virtual reality
interfaces. To accomplish this task, we have divided the
typical virtual reality interface into four specific areas:
audio, visual, tactile, and navigation. We will point out
the limitations of current solutions to problems in these
areas, possible areas of improvement, and those problems
that remain completely unsolved at this point.
Visual Aspects of Virtual Reality
The main tradeoffs in this area are image detail versus
rendering speed, and monoscopic versus stereoscopic vision.
In most applications of virtual reality, visual feedback is
required. In fact, visual cues are perhaps the most
important feedback required in the virtual reality system.
To achieve realism, the pictures sent to the display must be
rendered in real time to avoid discontinuity. Therefore, the
trade-off between rendering time and graphic resolution, for
both 3-dimensional and 2-dimensional scenes, is investigated
from both the software and hardware perspectives.
Types of Visual Display
LCD Flicker Lens
The LCD (liquid crystal display) flicker lens has the
appearance of a pair of glasses. A photosensor is mounted
on these LCD glasses with the sole purpose of reading a
signal from the computer. This signal tells the LCD glasses
whether to allow light to pass through the left lens or the
right lens. When light is allowed to pass through the left
lens, the computer screen will be showing a left eye scene,
which corresponds to what the user will see through his left
eye. When light passes through the right lens, the scene on
the computer screen is a slightly offset version of the view
from the left eye. The glasses switch between the two
lenses at 60 Hertz, which causes the user to perceive a
continual 3D view via the mechanics of parallax (Blanchard).
Head Mounted Displays
Head mounted displays place a display screen in front
of each of the viewer's eyes at all times. The view, the
segment of the virtual environment generated and displayed,
is controlled by orientation sensors mounted on the
"helmet". Head movement is recognized by the computer, and
a new perspective of the scene is generated. In most cases,
a set of optical lenses and mirrors is used to enlarge the
view to fill the field of view and to direct the scene to
the eyes (Lane). Four types of Head Mounted Displays (HMDs)
will be discussed below.
LCD display HMD
This type of HMD uses LCD technology to display the
scene. When a liquid crystal pixel is activated, it blocks
the passage of light through it. Thousands of these pixels
are arranged in a two dimensional matrix for each display.
Since the crystals block light rather than emit it, a
backlight must be shone from behind the LCD matrix toward
the eye to provide brightness for the scene
(Aukstakalnis and Blatner 1992).
Projected HMD
This type of HMD uses fiber optic cables to transmit
the scene to the screen. The screen is similar to a cathode
ray tube (CRT) except the phosphor is illuminated by the
light transmitted through fiber optic cables. Ideally, each
fiber would control one pixel. But due to the limitation in
cost and manufacturing, each fiber controls a honeycomb
section of pixels (Lane).
Small CRT HMD
This type of HMD uses two CRTs that are positioned on
the side of the HMD. Mirrors are used to direct the scene
to the viewer's eyes. Unlike the projected HMD where the
phosphor is illuminated by fiber optic cables, here the
phosphor is illuminated by an electron gun as usual (Lane).
Single Column LED HMD
This type of HMD uses one column of 280 LEDs. A mirror
rapidly oscillates opposite from the LEDs, reflecting the
image to the user's eye. The LEDs are updated 720 times per
oscillation of the mirror. As the LED column updates for
each column of the virtual screen, the mirror redirects the
light to the viewer's eye, one column at a time, to form the
image of the entire virtual screen (Aukstakalnis and Blatner
1992).
Binocular Omni-Orientation Monitor (BOOM)
The binocular omni-orientation monitor is mounted on a
jointed mechanical arm with tracking sensors located at the
joints. A counterbalance is used to stabilize the monitor,
so that when the user releases the monitor, it remains in
place. To view the virtual environment, the user must take
hold of the monitor and put her face up to it. The computer
will generate an appropriate scene based on the position and
orientation of the joints on the mechanical arm
(Aukstakalnis and Blatner 1992).
Advantages and Disadvantages
LCD flicker lenses are lightweight and cordless. These
two features make them easy to wear and to remove.
Unfortunately, the user must face the computer screen at all
times in order to see the 3-D scene. Since the field of
view is limited to the size of the computer screen, the
surrounding real-world environment remains visible. This
does not provide an immersive effect.
The LCD display HMD is lighter than most HMDs. As with
most HMDs, it does provide an immersive effect, but the
resolution and the contrast are low. The problem associated
with low resolution is inability to identify objects and
inability to locate the exact position of objects. Since
the crystals are polarized to control the color of a pixel,
the actual polarizing of the crystal creates a small delay
while forming the image on the screen. Such a delay may
cause the viewer to misjudge the position of objects (Bolas
1994).
The projected HMD provides better resolution and
contrast than LCD displays. This HMD is also light weight.
Higher resolution and contrast means that the viewer is able
to see an image with greater detail. The downside of this
type of HMD is that it is expensive and difficult to
manufacture (Bolas 1994).
The CRT HMD is in many ways similar to the projected
HMD. This type of HMD is heavier than most other types
because of added electronic components. These electronic
parts also generate large amounts of heat. The user wearing
this type of HMD may feel discomfort due to the heat and the
weight of the HMD (Bolas 1994).
Single Column LED HMDs allow the user to interact with
a virtual world and the real world simultaneously. This
type of display can be used to create a virtual screen that
seems to float in the real world.
One of the common problems of HMDs is that the cable
connecting the HMD and a computer restricts the mobility of
the user. The user can only move as far as the cable
allows. If the cable is not properly managed, one could
trip over it or become entangled in it. Another problem
occurs when switching frequently between a virtual world and
the real world. Continually wearing and removing the HMD is
tedious and tiresome.
Some of the problems associated with HMDs can be solved
by using a BOOM display. The user does not have to wear a
BOOM display as in the case of an HMD. This means that
crossing the boundary between a virtual world and the real
world is simply a matter of moving the eyes away from the
monitor.
Graphic Techniques
Depth Cueing
Depth cueing is a technique to provide a 3-D
perspective of a scene. Basically, it adds depth to a 2-D
picture. There are two ways to add such perspective. The
first method is to change the color of pixels. For example,
a light source located farther away looks dimmer than one
located nearer the viewer. When the scene is generated, the
red, green, and blue color values of the light source
located farther away can be reduced by the same amount to
represent its distance. The second method is to add color
to the scene. For example, fog may exist between a light
source and the viewpoint. The fog's color accumulates along
the line of sight from the viewpoint: the deeper a light
source sits in the fog, the whiter its light appears. In
this manner the user can judge the depth of
the scene and navigate through the virtual world with less
effort (Foley et al. 1992).
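Both depth-cueing methods above amount to blending a source's
color toward a fog color as distance grows. A minimal sketch
in Python; the exponential fog model and its constants are
illustrative assumptions, not taken from the sources cited:

```python
import math

def depth_cue(color, distance, fog_density=0.1, fog_color=(255, 255, 255)):
    """Blend an object's RGB color toward the fog color based on depth.

    The attenuation factor follows a common exponential fog model:
    1.0 at the viewpoint, approaching 0 far away.
    """
    f = math.exp(-fog_density * distance)
    return tuple(round(f * c + (1.0 - f) * fc)
                 for c, fc in zip(color, fog_color))

# A red light source grows whiter as it recedes into the fog:
near = depth_cue((255, 0, 0), distance=1.0)
far = depth_cue((255, 0, 0), distance=20.0)
```

Dimming a distant source (the first method) is the same blend
with a black "fog" color.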
Lighting Models
Lighting is essential in creating a realistic 3-D
image. In an unlit atmosphere, a sphere looks the same as a
disk. In other words, a 3-D scene becomes a 2-D scene. As
a light shines on a sphere, some of the light is reflected
diffusely and thereby gives an indication of three-
dimensionality. The other reflected light allows the viewer
to see the sphere's specular reflection and shadow, which
provide further spatial information. Due to the complexity
of the lighting formula, computing the intensity of light at
each pixel reduces the screen update rate (Foley et al.
1992). When the update rate falls too low, flickers are
noticeable and can cause visually induced motion sickness.
Motion sickness occurs whenever the brain receives
conflicting information from the eyes and the inner ear
(Bolas 1994).
Shading
Shading is used in conjunction with lighting. As the
lighting model is computed, the intensity of the light at
each vertex of a polygon is calculated. Two types of
shading can be used to render a scene. The first is flat
shading, where the entire polygon takes on the color of one
of its vertices. The second is Gouraud
shading, where pixel color within the polygon is linearly
interpolated from the color at each vertex to create a
smoothly varying shade. Since linear interpolation requires
little computational power, it is often used to render
realistic scenes quickly (Foley et al. 1992).
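The two shading modes can be contrasted in a few lines. This
sketch interpolates colors across a single span of pixels; a
real rasterizer would interpolate along polygon edges and
then across each scanline (the function names are ours, for
illustration):

```python
def flat_span(color, n):
    """Flat shading: every pixel in the span takes the same color."""
    return [color] * n

def gouraud_span(c0, c1, n):
    """Gouraud shading: linearly interpolate RGB between two vertex
    colors so the shade varies smoothly across the span."""
    return [tuple(round(a + (b - a) * i / (n - 1)) for a, b in zip(c0, c1))
            for i in range(n)]

flat = flat_span((120, 120, 120), 5)          # one uniform shade
smooth = gouraud_span((0, 0, 0), (240, 120, 60), 5)  # a smooth ramp
```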
Radiosity
Radiosity is most often used for rendering a static
scene. This technique pre-calculates the light interaction
among the objects in the 3-D environment. The calculation
of light interaction for a complex environment usually takes
a long time. However, upon completion of the calculation,
the user can view this environment from any angle, on the
fly, with no further exhaustive computation (Foley et al.
1992).
Ray Casting
Ray casting is a real-time technique for rendering a
scene. A ray is traced from the viewpoint through a pixel
of the screen to intersect an object defined in the virtual
space. The color of the closest object intersected is
recorded for that pixel of the screen. If an object is
moved, ray casting can trace the new position of the object,
thus creating a dynamic scene (Foley et al. 1992).
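The ray-casting step described above can be sketched for
spheres, the simplest case: intersect the eye ray with each
object and keep the color of the nearest hit. This is a
bare-bones illustration with no lighting; the quadratic comes
from substituting the ray into the sphere equation:

```python
import math

def cast_ray(origin, direction, spheres, background="black"):
    """Return the color of the nearest sphere the ray hits, or the
    background color. Each sphere is a (center, radius, color) tuple."""
    ox, oy, oz = origin
    dx, dy, dz = direction
    nearest_t, color = float("inf"), background
    for (cx, cy, cz), r, c in spheres:
        # Substitute p = origin + t*direction into |p - center|^2 = r^2
        # and solve the resulting quadratic a*t^2 + b*t + k = 0 for t.
        lx, ly, lz = ox - cx, oy - cy, oz - cz
        a = dx * dx + dy * dy + dz * dz
        b = 2 * (lx * dx + ly * dy + lz * dz)
        k = lx * lx + ly * ly + lz * lz - r * r
        disc = b * b - 4 * a * k
        if disc < 0:
            continue  # ray misses this sphere entirely
        t = (-b - math.sqrt(disc)) / (2 * a)  # nearer of the two roots
        if 0 < t < nearest_t:
            nearest_t, color = t, c  # closest object so far wins
    return color
```

Moving a sphere's center between frames and casting the rays
again yields the dynamic scene the text describes.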
Distortion correction is needed in most HMDs. The
optic lenses used in an HMD magnify the image displayed
on the screen. This magnification distorts the image
radially. Since this distorted image is not what one would
see in the real world, the viewer may experience nausea
in the virtual world (Watson and Hodges).
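One common software remedy, which we sketch here as an
assumption rather than something the cited sources specify,
is to pre-distort each frame with a radial polynomial so that
the optics' distortion cancels it:

```python
def predistort(x, y, k1=-0.15):
    """Radially warp a normalized screen point: x' = x * (1 + k1*r^2).
    A negative k1 pulls points toward the center (barrel distortion),
    compensating for magnifying optics that push them outward. The
    single-coefficient model and its value are illustrative."""
    r2 = x * x + y * y  # squared distance from the screen center
    scale = 1.0 + k1 * r2
    return x * scale, y * scale

center = predistort(0.0, 0.0)  # the center is unchanged
corner = predistort(0.5, 0.5)  # a corner point is pulled inward
```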
3-D Audio
The main research area in audio is the simulation of
sound origin. "It has been demonstrated that using sound to
supply alternative or supplementary information to a
computer user can greatly increase the amount of information
they can ingest" (Aukstakalnis and Blatner 1992). This is
no less true in the virtual world. Although a virtual
environment provides a multitude of visual cues, a person
can't see what is behind him/her. However, in reality
someone can hear what is behind them and it should be so in
a virtual world.
The main problem with producing sound is that it is
impossible to replay previously recorded sound in a manner
that moves a sound from behind you to in front of you as you
turn your head. Crystal River Engineering has developed a
process for producing a sound such that it seems like it is
coming from a particular direction. Since these sounds are
computed and produced in real time, there is no problem with
the sound field tracking the listener's head movements.
In addition to visual output, a complete virtual world
must incorporate a three dimensional sound field that
reflects the conditions modeled in the virtual environment.
This sound field has to react to walls, multiple sound
sources, and background noise, as well as the absence of
them. This requires massive computational power and speed
because hearing is a complex system which uses the shape of
the outer ear and microsecond delays in the arrival of sound
to the two ears to determine position and location of the
source of the sound.
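The microsecond interaural delays mentioned above can be
estimated with Woodworth's classical spherical-head formula;
the head radius and speed of sound below are typical assumed
values, not measurements from the paper:

```python
import math

def interaural_time_difference(azimuth_deg, head_radius=0.0875, c=343.0):
    """Woodworth's approximation of the extra time (in seconds) sound
    takes to reach the far ear: ITD = (r/c) * (sin(theta) + theta),
    where theta is the source azimuth (0 = straight ahead, 90 = directly
    to one side). Head radius in meters, speed of sound in m/s."""
    theta = math.radians(azimuth_deg)
    return (head_radius / c) * (math.sin(theta) + theta)
```

A source straight ahead gives no delay; one directly to the
side gives roughly 0.65 milliseconds, a difference the
auditory system resolves easily.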
Evolution of 3D Sound
The evolution of 3D sound can be traced through time
and technology, starting with monophonic sound. "Mono",
from the Greek for "single", sends one signal to every
speaker. It
appears that all of the sounds of the environment are coming
from each individual speaker. If there is only one speaker,
then all sounds seem to be coming from that point.
Stereophonic sound is the next step in the development
of realistic sound production. It allows for the sounds to
seem as if they are coming from anywhere between two
speakers. This is accomplished by delaying the signals
between the two speakers by a few microseconds. The smaller
the delay, the closer to the center the source appears to
be.
Surround sound, used in many theaters, uses the idea of
stereo but with more speakers. The delays can be set so
that a sound can seem to move from behind the listener to in
front of the listener. An example problem with this system
is that a plane taking off behind the listener will appear
to go by the listener's elbow instead of overhead (Crystal
River Engineering 1994).
A solution to the problem of creating a three
dimensional sound field comes from production of sound which
is tuned to an individual's head. When sound reaches the
outer ear, the outer ear bends the sound wavefront and
channels it down the ear canal. The sound that actually
reaches the eardrum is different for each person
(Aukstakalnis and Blatner 1992). To resolve this problem,
the computer must create a sound that is custom designed for
a particular user. This is done by placing small
microphones inside the ear canal, then creating reference
sounds from various locations around the listener. Then the
computer solves a set of mathematical relationships that
describe how the sound is changed from being produced to
being received inside the ear canal. These mathematical
relationships are called Head Related Transfer Functions
(HRTFs) (Crystal River Engineering 1994).
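Once a pair of HRTFs (or, in the time domain, head-related
impulse responses) has been measured, applying them is a
convolution of the dry source with each ear's response. A
minimal sketch, assuming the impulse responses come from a
measurement procedure like the one just described:

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of a signal with an impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def spatialize(mono, hrir_left, hrir_right):
    """Render a mono source at the position where the HRIR pair was
    measured by filtering it through each ear's impulse response."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)
```

Real systems do this filtering with digital signal processors
so the output keeps up with head motion.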
To simulate a virtual sound environment, a computer
must first determine the position of the source relative to
the listener. It also must calculate the effects of the
environment. For example, to simulate an echo due to a
wall, the computer must first determine the source's
location relative to the subject and the wall, then place
another source at the appropriate distance and location
on the opposite side of the wall (Tonnesen and Steinmetz).
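Placing that mirrored source is simple geometry. For an
axis-aligned wall the reflection is one coordinate flip; this
sketch (our illustration of the image-source idea, not code
from the cited work) handles a wall that is a plane of
constant x:

```python
def mirror_source(source, wall_x):
    """Reflect a sound source across a wall at x = wall_x. The direct
    path from the listener to this image source has the same length
    and direction as the true echo path off the wall."""
    x, y, z = source
    return (2 * wall_x - x, y, z)

# A source 1 m in front of a wall at x = 4 images to 1 m behind it:
image = mirror_source((3.0, 0.0, 0.0), 4.0)
```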
An additional computational burden is the production of
background noise. This is very important if the person in
the virtual environment wishes to be immersed in a
"believable" world. However, since the noise is background,
it does not need to take advantage of 3D sound technology.
This limits the interactivity of the user with the virtual
environment. In the real world, a person can pick out
sounds from the background. This ability is commonly called
the "cocktail party effect", because of the ability of a
person to focus on different conversations from the
background noise. This can only be done in a 3D acoustic
field (Steinmetz and Lee) and background noise in the
virtual world does not use a 3D field.
Some researchers suggest the use of prerecorded sounds
so that all computational power is devoted to determining
the location and direction of the source. This, however,
cannot produce an interactive 3D sound field. Although the
sounds can be accurately placed in a 3-D field, the listener
cannot interact with the environment--he/she can only observe
it. In a 3D acoustic field played through headphones, when
the listener turns around, the sounds that were behind the
listener should now be in front. However, with
prerecorded/playback methods, the sounds that were behind
the listener are still behind the listener (Aukstakalnis and
Blatner 1992).
HRTF measurements cannot accurately simulate the
acoustical environment when used alone. The problem lies in
trying to nonintrusively make measurements. When the
microphone is placed in the ear canal, it changes the
acoustical track, thereby changing the HRTF. Also, this
method does not attempt to take into consideration the
middle or inner ear (Steinmetz and Lee).
A realistic sound environment has great potential to be
an interface for visually impaired or blind people. For
example, a virtual environment can be created where the
objects in it are application software. Then the users can
learn their way around the environment, much like they can
learn their way from home to the store without ever having
seen the route.
Tactile and Force Feedback
One of the biggest complaints about top-of-the-line
virtual environment packages is the "lack of tangibility."
Although the area of tactile feedback is only a few years
old, it has produced some impressive results. These
solutions are critiqued below. There is no interface
currently built that will simulate the interactions of
shape, texture, temperature, firmness, and force.
Being able to produce a realistic interface means
having to produce tactile and force feedback to correspond
to the objects in the virtual world. Dr. Fred Brooks of the
University of North Carolina at Chapel Hill is noted for
introducing the problem of "shin-knockers" (Brooks 1995).
This was originally in reference to modeling of a submarine.
"How are you going to let the person know when he knocks his
shin on a pipe that is sticking out in his way?"
The area of touch has been broken down into two
different areas. Force feedback deals with how the virtual
environment affects a user. For example, walls should stop
someone instead of letting him/her pass through, and pipes
should knock a user in the shin to let him/her know that
they are there. Tactile feedback deals with how a virtual
object feels. Temperature, size, shape, firmness, and
texture are some of the bits of information gained through
the sense of touch.
Force Feedback
There are several different types of devices that allow
a user to "feel" certain aspects of the virtual environment.
Motion platforms for simulators and simulated rides, force
feedback gloves, exoskeletons, and butlers are all forms of
force feedback.
Motion Platforms
The motion platform was originally designed for use in
flight simulators which train pilots. A platform is bolted
to a set of hydraulic lift arms. As the motion from a
visual display changes, the platform tilts and moves in a
synchronous path to give the user a "feeling" that he/she is
actually flying. This has a serious limitation: the
platform's range of motion is finite. If the user's visual
cues are that the
plane is upside down, the hydraulics can not simulate this.
However, it does give the middle ear sensations that
correspond to the visual scene, making the simulation more
realistic.
Gloves
For interaction with small objects in a virtual world,
the user can use one of several gloves designed to give
feedback on the characteristics of the object. This can be
done with pneumatic pistons which are mounted on the palm of
the glove, as in the Rutgers Master II (Gomez, Burden and
Langrana 1995). When a virtual object is placed in the
virtual hand, the user's hand can close around it. When the
fingers would meet resistance from the object in reality,
the pressure in the pistons is increased, giving the
sensation of resistance from the virtual object.
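A common way to drive such pistons, sketched here as an
assumption since the paper does not give the Rutgers Master
II's actual control law, is a penalty model: pressure grows
in proportion to how far the fingertip would have penetrated
the virtual surface:

```python
def piston_force(finger_pos, surface_pos, stiffness=2.5):
    """Penalty-style resistance for a force-feedback glove: zero force
    until the finger would pass the virtual surface, then a restoring
    force proportional to the penetration depth (a virtual spring).
    Positions are along the finger's closing direction, in cm."""
    penetration = finger_pos - surface_pos
    return stiffness * penetration if penetration > 0 else 0.0
```

The spring-like ramp avoids an abrupt jolt at the moment of
contact while still feeling firm as the grip tightens.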
Exoskeletons
Exoskeletons are also employed to simulate the
resistance of objects in a virtual world. An exoskeleton is
basically a robotic arm strapped to a person. At the
University of Utah, researchers have developed a robotic arm
which has 10 degrees of freedom. The robot continuously
updates the force to each of its ten joints, and can make it
appear that the 50-pound arm is weightless. "However, when
the operator touches something, the virtual forces become
actual forces felt through the exoskeleton" (Lane and
Smith). This would make the operator's arm stop when it hit
a virtual wall or feel the weight of a virtual object.
Butlers
The butler is a robot that basically gets in the way
whenever you try to move through an object. If a user
reaches out his/her hand to touch a wall, desk, or any other
virtual object, the butler robot will place a real object at
the location where the virtual object is supposed to be.
This technique is currently being researched at the
University of Tokyo by Susumu Tachi (Tachi 1995). The
butler being worked on "provides mechanical impedance of the
environment, i.e., inertia, viscosity and stiffness" (Tachi
1995). The major drawback of the butler robot is that it
can only present these properties for a single point at a
time.
Tactile Feedback
The butler robot under development can give the
impression of stiffness and viscosity, but it can't present
the information needed by a human to know what the object
feels like. The temperature and texture are totally unknown
to the user. It is possible to convey temperature with
heat-resistive wires sewn into the lining of a glove.
Texture
The texture of a surface is probably the hardest
feature of tactile feedback to simulate. The closest
documented attempt is the Sandpaper system. This system,
developed by a research group which includes members from
MIT and UNC, can accurately simulate several different
grades of sandpaper (Aukstakalnis and Blatner 1992). Other
systems, like the Teletact Commander, use either air-filled
bladders sewn into a glove, or piezo-electric transducers to
provide the sensation of pressure or vibrations. These
systems have problems with the unreliability of compressors
and interference between the piezo-electric transducer
electromagnetic fields and the electromagnetic field used by
a Polhemus tracking system (Stone 1993).
Any attempt to model the texture of a surface faces
tremendous challenges because of the way the human haptic
system functions. There are several types of nerves which
serve different functions, including: temperature sensors,
pressure sensors, rapid-varying pressure sensors, sensors to
detect force exerted by muscles, and sensors to detect hair
movements on the skin. All of these human factors must be
taken into consideration when attempting to develop a
tactile human-machine interface.
Navigation
Tracking Devices
The purpose of a tracking device is to determine the x,
y, and z position, and the orientation (yaw, pitch, and
roll) of some part of the user's body in reference to a
fixed point. Most types of virtual reality interaction
devices will have a tracker on them. HMDs need a tracker so
that the view can be updated for the current orientation of
the user's head. Datagloves and flying joysticks also
usually have trackers so that the virtual "hand" icon will
follow the position and orientation changes of the user's
real hand. Full body datasuits will have several trackers
on them so that virtual feet, waist, hands, and head are all
slaved to the human user.
When designing or evaluating a virtual reality system
that will receive tracking information, it is important to
pay attention to the latency (lag), update rate, resolution,
and accuracy of the tracking system. Latency is the "delay
between the change of the position and orientation of the
target being tracked and the report of the change to the
computer" (Baratoff and Blanksteen). If the latency is
greater than 50 milliseconds, it will be noticed by the user
and can even cause nausea or vertigo. Update rate is the
rate at which the tracker reports data to the computer, and
is typically between 30 and 60 updates per second.
Resolution will depend on the type of tracker used, and
accuracy will usually decrease as the user moves farther
from the fixed reference point (Baratoff and Blanksteen).
Six-degree-of-freedom tracking devices come in several basic
types of technology: mechanical, electromagnetic,
ultrasonic, infra-red, and inertial.
Mechanical Trackers
A mechanical tracker is similar to a robot arm and
consists of a jointed structure with rigid links, a
supporting base, and an "active end" which is attached to
the body part being tracked (Sowizral 1995), often the hand.
This type of tracker is fast, accurate, and is not
susceptible to jitter. However, it also tends to encumber
the movement of the user, has a restricted area of
operation, and makes tracking the head and both hands at
the same time technically difficult.
Electromagnetic Trackers
An electromagnetic tracker allows several body parts to
be tracked simultaneously and will function correctly if
objects come between the source and the detector. In this
type of tracker, the source produces three electromagnetic
fields each of which is perpendicular to the others. The
detector on the user's body then measures field attenuation
(the strength and direction of the electromagnetic field)
and sends this information back to a computer. The computer
triangulates the distance and orientation of the three
perpendicular axes in the detector relative to the three
electromagnetic fields produced by the source (Baratoff and
Blanksteen).
Electromagnetic trackers are popular, but they are
inaccurate. They suffer from latency problems, distortion
of data, and they can be thrown off by large amounts of
metal in the surrounding work area or by other
electromagnetic fields, such as those from other pieces of
large computer equipment. In addition, the detector must be
within a restricted range from the source or it will not be
able to send back accurate information (Sowizral 1995), so
the user has a limited working volume.
Ultrasonic Trackers
Ultrasonic tracking devices consist of three high
frequency sound wave emitters in a rigid formation that form
the source for three receivers that are also in a rigid
arrangement on the user. There are two ways to calculate
position and orientation using acoustic trackers. The first
is called "phase coherence". Position and orientation is
detected by computing the difference in the phases of the
soundwaves that reach the receivers from the emitters as
compared to soundwaves produced by the receiver. "As long
as the distance traveled by the target is less than one
wavelength between updates, the system can update the
position of the target" (Baratoff and Blanksteen). The
second method is "time-of-flight", which measures the time
for sound, emitted by the transmitters at known moments, to
reach the sensors. Only one transmitter is needed to
calculate position, but the calculation of orientation
requires finding the differences between three sensors
(Baratoff and Blanksteen).
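The time-of-flight computation reduces to trilateration: each
measured time gives a distance (d = c * t), and subtracting
the resulting circle equations pairwise leaves linear
equations in the unknown position. A 2-D sketch of the idea;
a real tracker solves the 3-D analogue:

```python
def trilaterate_2d(beacons, times, c=343.0):
    """Recover a 2-D position from sound time-of-flight to three fixed
    receivers at known positions. Squaring the distance equations and
    subtracting the first from the other two cancels the x^2 + y^2
    terms, leaving two linear equations solved here by Cramer's rule."""
    (x1, y1), (x2, y2), (x3, y3) = beacons
    d1, d2, d3 = (c * t for t in times)  # distance = speed * time
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)
```

Orientation then follows from comparing the positions of the
three rigidly mounted receivers, as the text notes.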
Unlike electromagnetic trackers that are affected by
large amounts of metal, ultrasonic trackers do not suffer
from this problem. However, ultrasonic trackers also have a
restricted workspace volume and, worse, must have a direct
line-of-sight from the emitter to the detector. Time-of-
flight trackers usually have a low update rate, and phase-
coherence trackers are subject to error accumulation over
time (Baratoff and Blanksteen). Additionally, both types
are affected by temperature and pressure changes (Sowizral
1995), and the humidity level of the work environment
(Baratoff and Blanksteen).
Infrared Trackers
Infrared (optical) trackers utilize several emitters
fixed in a rigid arrangement while cameras or "quad cells"
receive the IR light. To fix the position of the tracker, a
computer must triangulate a position based on the data from
the cameras. This type of tracker is not affected by large
amounts of metal, has a high update rate, and low latency
(Baratoff and Blanksteen). However, the emitters must be
directly in the line-of-sight of the cameras or quad cells.
In addition, any other sources of infrared light, high-
intensity light, or other glare will affect the correctness
of the measurement (Sowizral 1995).
Inertial Trackers
Finally, there are several types of inertial tracking
devices which allow the user to move about in a
comparatively large working volume because there is no
hardware or cabling between a computer and the tracker.
Inertial trackers apply the principle of conservation of
angular momentum (Baratoff and Blanksteen). Miniature
gyroscopes can be attached to HMDs, but they tend to drift
(up to 10 degrees per minute) and to be sensitive to
vibration. Yaw, pitch, and roll are calculated by measuring
the resistance of the gyroscope to a change in orientation.
If tracking of position is desired, an additional type of
tracker must be used (Baratoff and Blanksteen).
Accelerometers are another option, but they also drift and
their output is distorted by the gravity field (Sowizral
1995).
Interaction Devices
Virtual reality and virtual environments go far beyond
typical interfaces in the realism of the visual metaphor.
Point and click with a table-top mouse is wonderful in some
situations, but not nearly sufficient for an immersive
environment. So instead of a keyboard and mouse,
researchers are developing gloves, 3D mice, floating
joysticks, and voice recognition. This paper will not
attempt to cover voice recognition, because it is such a
broad topic in its own right.
For sensing the flexion of the fingers, three types of
glove technology have arisen: optical fiber sensors,
mechanical measurement, and strain gauges. The Dataglove
(originally developed by VPL Research) is a neoprene fabric
glove with two fiber optic loops on each finger. Each loop
is dedicated to one knuckle, and this can be a problem: if a
user has unusually large or small hands, the loops will not
correspond well to the actual knuckle positions and the
user will not be able to produce very accurate gestures. At
one end of each loop is an LED and at the other end is a
photosensor. The fiber optic cable has small cuts along its
length. When the user bends a finger, light escapes from
the fiber optic cable through these cuts. The amount of
light reaching the photosensor is measured and converted
into a measure of how much the finger is bent (Aukstakalnis
and Blatner 1992). The Dataglove requires recalibration for
each user (Hsu). "The implications for longer term use of
devices such as the Dataglove--fatigue effects,
recalibration during a session--remain to be explored"
(Wilson and Conway 1991).
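The conversion from a raw photosensor reading to a joint angle, and the per-user recalibration the Dataglove requires, can be sketched as follows. The linear model and the specific readings are illustrative assumptions, not VPL's actual algorithm:

```python
def calibrate(flat_reading, fist_reading):
    """Record the photosensor output with the finger straight and
    fully bent, and return a function mapping raw readings onto a
    0-90 degree joint angle. Repeating this per user is the kind
    of recalibration the text describes, since loop placement
    varies with hand size."""
    span = flat_reading - fist_reading  # less light escapes when straight
    def to_angle(reading):
        frac = (flat_reading - reading) / span
        frac = min(max(frac, 0.0), 1.0)  # clamp out-of-range readings
        return 90.0 * frac
    return to_angle
```

For example, with a hypothetical sensor reading 1.00 flat and 0.40 at a full fist, `calibrate(1.00, 0.40)` yields a function that maps a reading of 0.70 to roughly 45 degrees.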
The Powerglove was originally sold by Mattel for the
Nintendo Entertainment System but, due to its low price, has
been widely used in research (Aukstakalnis and Blatner
1992). The Powerglove is less accurate than the Dataglove
and also needs recalibration for each user, but it is more
rugged. The Powerglove uses
strain gauges to measure the flexion of each finger.
A small strip of mylar plastic is coated with an
electrically conductive ink and placed along the length
of each finger. When the fingers are kept straight, a
small electrical current passing through the ink
remains stable. When a finger is bent, the computer
can measure the change in the ink's electrical
resistance (Aukstakalnis and Blatner 1992).
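The measurement the quotation describes amounts to reading a variable resistance. A hedged sketch of how a controller might do so, assuming a simple voltage-divider circuit; the circuit and all component values are invented for illustration:

```python
def divider_resistance(v_supply, v_out, r_fixed):
    """Infer the ink strip's resistance from the voltage measured
    across it in a voltage divider: one plausible way a controller
    could read the Powerglove's sensors."""
    # v_out = v_supply * r_ink / (r_fixed + r_ink), solved for r_ink.
    return v_out * r_fixed / (v_supply - v_out)

def bend_fraction(r_ink, r_straight, r_full_bend):
    """Map the measured resistance linearly between the straight
    and fully-bent calibration values (0.0 = straight, 1.0 = bent)."""
    return (r_ink - r_straight) / (r_full_bend - r_straight)
```

With a hypothetical 5 V supply and 10 kOhm fixed resistor, a 2.5 V reading implies a 10 kOhm strip; if calibration found 5 kOhm straight and 15 kOhm fully bent, that reading corresponds to a half-bent finger.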
The dexterous hand master (DHM) is not exactly a glove
but an exoskeleton that attaches to the fingers with Velcro
straps. A mechanical sensor measures the flexion of the
finger. Unlike the Dataglove and Powerglove, which measure
only finger flexion, the DHM can also detect and measure the
side-to-side movement of a finger. The DHM is more accurate
than either glove and less sensitive to the user's hand
size, but it can be awkward to work with (Hsu).
The main strength of the various types of gloves is
that they provide a more intuitive interaction device than a
mouse or a joystick. This is because the gloves allow the
computer to read and represent hand gestures. Objects in
the environment can therefore be "grasped" and manipulated,
the user can point in the direction of desired movement,
windows can be dismissed, and so on (Wilson and Conway 1991).
"Gestures should be natural and intuitive in the particular
virtual environment. Actions should be represented visually
and be incremental, immediate, and reversible to give a
person the impression of acting directly in an environment"
(Dennehy). Wilson and Conway (1991) say that a basic set of
command gestures for gloves has been developed, but that
more work is needed to expand the set beyond the current
simple mapping. Another area for improvement is feedback:
visual feedback to aid hand-eye coordination, and
proprioceptive feedback to let a user know when an object
has been successfully grasped (Wilson and Conway 1991).
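The "simple mapping" from glove postures to commands can be pictured as a lookup table keyed on which fingers are bent. Everything below (the threshold, the gesture names, the table entries) is an invented illustration of the idea, not any published gesture set:

```python
BENT = 60.0  # degrees of flexion above which a finger counts as bent

# Postures keyed as (thumb, index, middle, ring, pinky) bent-flags.
GESTURES = {
    (False, False, False, False, False): "open_hand",  # e.g. stop
    (True,  True,  True,  True,  True):  "fist",       # e.g. grasp
    (True,  False, True,  True,  True):  "point",      # index extended
}

def classify(flexion_degrees):
    """Reduce five joint angles (thumb through pinky) to a named
    gesture, or None if the posture is not in the table."""
    key = tuple(angle > BENT for angle in flexion_degrees)
    return GESTURES.get(key)
```

A posture with only the index finger extended, such as `classify([80, 10, 75, 70, 85])`, resolves to the pointing gesture; unlisted postures return None, which is one reason the text calls for expanding the gesture set.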
There are several brands of 3D mice available, all with
basically the same technology: a mouse or trackball has been
modified to include a position and orientation tracker of
some kind (Aukstakalnis and Blatner 1992). This modified
mouse is fairly familiar and intuitive to users--simply push
the mouse in the direction you want to move. However, these
mice are not very useful for interactions other than
navigation and selection of objects (Hsu).
The final category of interaction device is the wand or
floating joystick. Basically, this device works exactly the
same as a conventional joystick, but it is not attached to a
base that sits on a table top. Instead, the joystick is
equipped with an orientation tracker so the user simply
holds it in their hand and tilts it. Most floating joysticks
also have some buttons on the stick for "clicking" or
selecting, similar to a mouse (Hsu).
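Reading such a device reduces to mapping its tilt angles onto motion. A minimal sketch, assuming an orientation tracker that reports pitch and roll in degrees; the dead zone and scaling are illustrative design choices, not a particular product's behavior:

```python
from math import radians, sin

def tilt_to_velocity(pitch_deg, roll_deg, max_speed=1.0, dead_zone=5.0):
    """Turn the floating joystick's tilt, as reported by its
    orientation tracker, into a (forward, sideways) velocity pair.
    The dead zone keeps small hand tremors from causing drift."""
    def axis(angle):
        if abs(angle) < dead_zone:
            return 0.0
        return max_speed * sin(radians(angle))
    return axis(pitch_deg), axis(roll_deg)
```

Tilting the stick 30 degrees forward while holding it nearly level side to side, as in `tilt_to_velocity(30.0, 2.0)`, produces forward motion at half speed and no sideways drift.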
Human Factors in Virtual Environments
Kay Stanney (1995) has written an excellent critique of
the areas that still need to be researched in order to make
virtual environments a safe place to work. These include
health concerns such as "flicker vertigo" which can induce a
seizure, auditory and inner ear damage from high volume
audio, prolonged repetitive movements which cause overuse
injuries (for example, carpal tunnel syndrome), and head,
neck, and spine damage from the weight or position of HMDs.
Safety factors also need to be considered. For example,
when a user's vision is restricted by an HMD, they are
likely to trip and fall over cables or other real world
objects. Also, how safe is the user from harm in the event
of system failure? Hands and arms might be pinched or
overextended if a haptic feedback device fails; the user
might be disoriented or harmed if the computer crashes and
suddenly dumps the user back into reality, disrupting the
sense of presence.
Virtual reality holds the promise of being the "ultimate
human-computer interface": a natural, intuitive interface
between a human and a machine-generated work environment.
An intuitive interface between man and machine is
one which requires little training . . . and proffers a
working style most like that used by the human being to
interact with environments and objects in his day-to-
day life. In other words, the human interacts with
elements of this task by looking, holding,
manipulating, speaking, listening, and moving, using as
many of his natural skills as are appropriate, or can
reasonably be expected to be applied to a task (Stone 1993).
This paper has given an overview of the technology
currently in use and of the open areas of research. Visual
display devices, graphics display techniques, 3D audio,
haptic feedback, navigation, and interaction devices all
need further development. Large concerns remain about the
health and safety of the user, not to mention the unsolved
technical problems standing in the way of an intuitive
immersive environment. As the public market
for virtual reality grows in the coming years, more money
will be spent on quality interface improvements and some of
these problems may be solved.
Aukstakalnis, Steve, and David Blatner. 1992. Silicon
Mirage: The Art and Science of Virtual Reality.
Berkeley, California: Peachpit Press, Inc.
Baratoff, Gregory, and Scott Blanksteen. Unknown.
Encyclopedia of Virtual Environments. World Wide Web URL
Blanchard, Jim, and Reiko Tsuneto. Unknown. Stereoscopic
viewing. Encyclopedia of Virtual Environments. World
Wide Web URL:
Bolas, Mark T. 1994. Human factors in the design of an
immersive display. IEEE Computer Graphics and
Applications January: 55-59.
Brooks, Fred. 1995. Panel Discussion on March 13, 1995 at
the IEEE Virtual Reality Annual International Symposium
in Research Triangle Park, North Carolina.
Dennehy, Michael. Unknown. Encyclopedia of Virtual
Environments. World Wide Web URL:
Foley, J., A. van Dam, S. Feiner, and J. Hughes. 1992.
Computer Graphics: Principles and Practice, second
edition. Reading, Massachusetts: Addison-Wesley.
Gomez, Daniel, Grigore Burdea, and Noshir Langrana. 1995.
Integration of the Rutgers Master II in a Virtual
Reality Simulation. In IEEE: Proceedings of the Virtual
Reality Annual International Symposium in Research
Triangle Park, NC, March 11-15, 1995. Washington: IEEE
Computer Society Press,
Hsu, Jack. Unknown. Encyclopedia of Virtual Environments.
World Wide Web URL:
Lane, Corde, and Jerry Smith. Unknown. Encyclopedia of
Virtual Environments. World Wide Web URL:
Lane, Corde. Unknown. Displays. Encyclopedia of Virtual
Environments. World Wide Web URL:
Sowizral, Henry. 1995. Tutorial: An Introduction to Virtual
Reality. Virtual Reality Annual International Symposium.
Stanney, Kay. 1995. Realizing the Full Potential of Virtual
Reality: Human Factors Issues That Could Stand in the
Way. In IEEE: Proceedings of the Virtual Reality Annual
International Symposium in Research Triangle Park, NC,
March 11-15, 1995. Washington: IEEE Computer Society
Press, 28-34.
Steinmetz, Joe, and Glen Lee. Unknown. Encyclopedia of
Virtual Environments. World Wide Web URL:
Stereo is dead: The AudioReality story. Palo Alto,
California: Crystal River Engineering.
Stone, Robert J. 1993. Virtual reality systems. Edited by
R.A. Earnshaw, M.A. Gigante, and H. Jones. Virtual
Reality: A tool for telepresence and human factors
research. London: Academic Press.
Tachi, Susumu. 1995. Whither Force Feedback? In IEEE:
Proceedings of the Virtual Reality Annual International
Symposium in Research Triangle Park, NC, March 11-15,
1995. Washington: IEEE Computer Society Press, 227.
Tonnesen, Cindy, and Joe Steinmetz. Unknown. Encyclopedia of
Virtual Environments. World Wide Web URL:
Watson, Benjamin A., and Larry F. Hodges. Unknown. Using
Texture Maps to Correct for Optical Distortion in
Head-Mounted Displays. World Wide Web URL:
Wilson, Michael, and Anthony Conway. 1991. Enhanced
Interaction Styles for User Interfaces. IEEE Computer
Graphics and Applications March: 79-89.