






Testing the Usability of Synchronous Computer-Supported
Cooperative Work Products
Lynellen Perry
July 25, 1994
Dr. Carter
CS 9253 Topics in Software Engineering: Usability
10 week Summer Term
"The needs of a group using a tool collaboratively, are
different from those of an individual user" John C. Tang
"The needs of the many out-weigh the needs of the few, or
the one..." Star Trek II: The Wrath of Khan
Introduction
Computer-supported Cooperative Work (CSCW) goes by many
names: groupware, computer-supported collaboration, workflow,
group decision-support systems (Palmer, 15), electronic
meeting systems (Valacich, 261), and probably several others.
There are nearly as many definitions of CSCW as there are
authors on the subject; a few of them follow. Palmer and
Fields define CSCW as "people working together on a product,
research area, topic, or scholarly endeavor with help from
computers" (Palmer, 15), but also as "A system that integrates
information processing and communications activities to help
individuals work together as a group" in the same paper
(Palmer, 16). Palmer does not make a distinction between the
term "CSCW" and any of the other terms mentioned above.
Greenberg, however, states that 'groupware' is merely
software that "supports and augments group work" while 'CSCW'
is "the scientific discipline that motivates and validates
groupware design . . . the study and theory of how people work
together, and how the computer and related technologies affect
group behavior" (Greenberg, 133). In this view, CSCW collects
research from scientists in the Computer Science, Cognitive
Science, Psychology, Sociology, Anthropology, Ethnography,
Management, and Management Information Systems fields.
Many software products can fit into the 'groupware'
concept: email, bulletin boards, asynchronous conferencing,
group schedulers, group decision support systems,
collaborative authoring tools, screen-sharing software,
computer equivalents to whiteboards, video conferencing
(Greenberg, 133), multigroup decision-support systems (Palmer,
16), computer-assisted design/computer-assisted manufacturing
(CAD/CAM), computer-assisted software engineering (CASE),
concurrent engineering, workflow management, distance
learning, telemedicine, real-time network conferences (MUDs
and MUSHs) (Grudin, 20), and even spreadsheet programs (Nardi,
161).
Each of the software types above fits its users into one
of the space/time categories which are shown in the table
below (Grudin, 25). There are many research papers describing
the testing of distributed systems, where the users are not in
the same place and/or not working at the same time. However,
this research review focuses on products where the intended
users are in the same place (same room), working together at
the same time on the same project. This is called synchronous
work.

Grudin's Space/Time Categories

                   Same time           Different and       Different but
                                       predictable time    unpredictable time

Same place         same time;          different and       different but
                   same place          predictable time;   unpredictable
                                       same place          time; same place

Different and      same time;          different and       different but
predictable        different and       predictable time;   unpredictable
place              predictable place   different and       time; different
                                       predictable place   and predictable
                                                           place

Different but      same time;          different and       different but
unpredictable      different but       predictable time;   unpredictable
place              unpredictable       different but       place and time
                   place               unpredictable
                                       place
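
As a minimal sketch (not taken from Grudin's paper), the nine
space/time categories can be enumerated as the cross product of
three time values and three place values. The Python names used
here (TIMES, PLACES, space_time_categories, and
is_focus_of_this_review) are hypothetical and chosen only for
illustration.

```python
from itertools import product

# Assumed labels: the three-way split on each axis follows the
# space/time categories; the identifiers are illustrative only.
TIMES = (
    "same time",
    "different and predictable time",
    "different but unpredictable time",
)
PLACES = (
    "same place",
    "different and predictable place",
    "different but unpredictable place",
)


def space_time_categories():
    """Return every (time, place) pair, one row of places at a time."""
    return [(time, place) for place, time in product(PLACES, TIMES)]


def is_focus_of_this_review(time, place):
    """True only for the same time / same place cell."""
    return time == "same time" and place == "same place"


categories = space_time_categories()
focus = [c for c in categories if is_focus_of_this_review(*c)]
```

Of the nine pairs in categories, only the same time and same
place pair passes the filter, which is the one category this
review covers.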
Research on the Usability of Synchronous CSCW Systems
Now that some of the terms in the title of this paper
have been defined, let us turn to the usability testing
aspect. Systematic, formal, scientific usability testing is
still a rather new research area. The basic methods of
testing usability include heuristic evaluation, user testing,
and cognitive walkthroughs. Nielsen describes five usability
attributes that testing should measure: learnability,
efficiency, memorability, errors, and satisfaction (Nielsen,
26). From the literature, it appears that user testing is the
most widely used form of usability testing and that efficiency
and satisfaction are the usability attributes most often used
as measures. All papers reviewed below implement some variety
of a user test to study the usability of various groupware
products.
Cognoter
Tatar et al. performed usability tests on Cognoter, a
software package designed to aid small work groups (two to
five people) in the creation of a plan or outline. To test
the usability of this software, Tatar and colleagues ran
several user tests. These took the form of a series of
two-hour working sessions with two groups, each consisting of
three users who were experienced with computers.
The test room contained a workstation for each test user,
and a large screen which was configured to display to the
group the shared work area available in the software. The
Cognoter software is divided into a private editing area and
a shared work area. Any individual may type in a brief note
in the private area and then release it to the shared area.
These notes appear randomly in the shared area as icons with
a keyword. Any icon may be clicked upon to reveal the full
annotation that was entered.
The experiment was run twice. In the first run, a pilot
test, the experimenters had severe problems observing
the user groups as they worked. Given the way the experiment
was set up, it was impossible for the observers to see the
details of work because each user had a separate machine.
Also, the observers were people who were very familiar with
the performance characteristics of the Cognoter software, and
tended to compensate for any problems the users had, thus
biasing the results.
So for the actual experiment, Tatar et al. videotaped
each test session and also logged all messages sent between
machines. This solved the problem of not being able to
observe everything that went on during the test, but the other
problem mentioned above was not corrected. The three users
chosen to be tested in each group were expert users of
Cognoter. They were long-term collaborators who were familiar
with the editor, window system, and mouse conventions used in
the Cognoter software. Three developers were available to
help the test users with any problems that arose.
The tasks for each of the two groups tested were not the
same. Each group was asked to use the Cognoter software to
brainstorm about a subject of their own choosing that would be
useful for their own work. Because the task was not held
constant across the two groups, another variable was
introduced into the experiment, yet the authors did not
comment on this fact or analyze what effect it had on the
test. There were no usability goals set for this test, and no
methods of measuring usability were described. The goal
appears to have been simply the discovery of usability
problems, though the problems found were not rated by severity
after the test.
The results reported were that users expressed "extreme
frustration and reduced efficiency" (as compared to working
with traditional paper and/or whiteboard). Though classed as
"experts", neither group understood the software well enough
to use its full potential. In Group A, everyone first worked
on their own, using the private editing area but not looking
at each other's work or talking or sharing ideas in any way.
They then left the computer and worked together on paper. The
effect of this was that the software was used as a sort of
word processor, not as a tool for group interaction.
Group B, on the other hand, figured out how to use the
video capability of the software so that the whole group could
see the work of whoever was typing. They did not understand,
though, that there could be more than one typist at a time.
Despite this bit of success in using the software for group
interaction, Group B also expressed their frustration visibly,
with people putting their head in their hands, raising their
voices, and threatening to walk out.
This rather disastrous user test of the Cognoter
software indicated that the users were experiencing two types
of problems. First, users wanted to see things in the
workspace that the system would not let them see. Second,
users mistook references in one another's speech or actions
(pointing to an individual screen and saying "there", "this",
"that", etc.) and could not resolve the difficulty
satisfactorily.
These problems led the authors to conclude that eight
design decisions were at fault for the user difficulties
(Tatar, 198).
1. Separate screens -- gaze and gestures were missed by
group members because each was busy looking at their own
screen.
2. Lack of sequentiality -- there was no way to know
where the next icon would appear, or to know in what order the
icons had been created.
3. Short labels on the icons limited the information the
group could see.
4. Anonymity.
5. Private editing allowed someone to change a previous
contribution, thus losing information.
6. There was an unpredictable delay between the release of
a privately edited item and the time it appeared on other
users' screens.
7. Private moving of icons caused an icon to change
position suddenly on other users' screens and thus lose its
identifiable position.
8. Individually tailorable windows caused confusion when
attempting to reference a particular item on the screen.
These design decisions "made Cognoter items more
difficult both to create and to use than whiteboard objects"
(Tatar, 203). In a redesign of the software, only the last
four of the above design decisions were changed.
In addition to the above software usability problems,
Tatar's paper also discusses what they learned about groups
and modes of conversation. These topics, while interesting and
necessary to understanding groups in order to build worthwhile
software to support groups, are beyond the scope of this
research review.
GroupSystems
Another electronic meeting support (EMS) system,
GroupSystems, is described in Valacich et al. EMS systems can
be used to support distributed groups. However, the
University of Arizona (where this research was conducted) has
focused on face-to-face (synchronous) meetings. Tasks that
can be accomplished with the GroupSystems facilities at the
University of Arizona include "communication, planning, idea
generation, negotiation, conflict resolution, systems analysis
and design, and collaborative group activities such as
document preparation and sharing" (Valacich, 261).
Valacich et al. measure the usability of the facility in
terms of the productivity of the meeting, as manifested by the
reduction or elimination of the "dysfunctions of the group
interaction (i.e. process losses), so that a group reaches or
exceeds (i.e. process gains) its task potential" (Valacich,
262). There are many process losses, but Appendix A discusses
the ones relevant to the GroupSystems environment at the
University of Arizona. Variables researched that affect the
productivity of a group include group size, group task,
anonymity, and proximity.
The hardware setup at the GroupSystems facility is as
follows: each participant has a work area, all of which are
arranged to focus on the front of the room. Each work area
has a separate color graphics microcomputer networked to the
others. There is a facilitator's console to control the EMS,
at least one large screen video display, and other audio-
visual support such as white boards and overhead projectors at
the front of the room. A control room next door to the
meeting room has a laser printer and a copier.
Valacich et al. summarize seven laboratory studies
conducted in the GroupSystems environment where the task for
the group was idea generation. They also review two
laboratory studies where the group task was decision making.
Six GroupSystems field and case studies are then presented.
All of these studies were conducted via user testing.
Results of these user tests validated the high usability
of the GroupSystems software and hardware environment.
Participants in the studies, which included several groups
from real-world corporations, stated that meetings supported
by GroupSystems were much more satisfying, effective, and
productive than traditional meetings. IBM has even installed
more than 36 electronic meeting rooms around the world, using
the software and hardware environment of GroupSystems.
In addition to validating the usability of GroupSystems
for synchronous group work, these studies uncovered and
confirmed data about how groups work. Again, details on this
subject are interesting but are beyond the scope of this
paper. Briefly, the studies indicate that CSCW software
should support large groups (9+ people) over small groups (2-5
people), and that group members should be anonymous for the
highest productivity and satisfaction ratings.
Amsterdam Conversation Environment
Dykstra and Carasik discuss the "theory and concepts in
designing a synchronous shared workspace to support human
interaction" (Dykstra, 419). They describe an implementation
of such a system, the Amsterdam Conversation Environment, or
ACE. In agreement with Palmer's definition of CSCW (above),
the authors feel that technology should support groups rather
than replace or automate activities. To this end, ACE is not
task specific. It is meant to provide users with "a common
workspace through which they can share and manipulate
individual products, where the focus is on stimulating
interaction rather than on producing a product" (Dykstra,
420).
Dykstra describes how ACE evolved via several iterations
of user tests. The current prototype of ACE runs on a network
of Macintoshes, with a main server maintaining links between
objects, keeping track of users, and controlling the
simultaneous update of the users' screens. However, it was
first designed and user tested with a physical model, and then
with overhead transparencies. If done carefully, these can be
cheap ways of performing user testing because no software has
been written at this point. Mistakes and redirections are
much cheaper when they occur in the design phase rather than
after code has been written.
The authors were still developing testing procedures for
the ACE software prototype at the time they wrote the paper,
so no results are available on that phase of user testing.
Dykstra does mention, though, the usability attributes they
wish to measure. These include user satisfaction (especially
in light of the fact that the software provides so few
restraints on group process), productivity (as indicated by
the amount of "process paralysis" that occurs during use), and
learnability (they hope to avoid the need for a specially
trained facilitator and to keep the necessary documentation
minimal) (Dykstra, 433).
Spreadsheets
Nardi and Miller observe that spreadsheets are actually
developed by the cooperative work of several people most of
the time. They use a simple definition of cooperative work,
"multiple persons working together to produce a product or
service" (Nardi, 162), and state that there are two forms of
cooperative work central to CSCW that have not received much
attention. Most CSCW research, they argue, is focused on
computer systems that encourage communication between group
members. By having this focus, researchers have overlooked
the fact that collaboration in programming itself is very
common. The sharing of programming expertise and the sharing
of domain knowledge are obvious in real-world uses of
spreadsheets.
Though spreadsheets are usually considered to be a
single-user application, Nardi and Miller have found that
"spreadsheet co-development is the rule, not the exception",
and thus include spreadsheets in the category of synchronous
groupware. As further justification of this categorization,
they note that spreadsheet users:
"1) share programming expertise through exchanges of
code;
2) transfer domain knowledge via spreadsheet templates
and the direct editing of spreadsheets;
3) debug spreadsheets cooperatively;
4) use spreadsheets for cooperative work in meetings and
other group settings; and
5) train each other in new spreadsheet techniques"
(Nardi, 163).
Nardi and Miller did not perform usability testing of
synchronously developed spreadsheets in a traditional way.
They did not set up their own experiment, find test users, and
then evaluate the results. Instead, they tape-recorded
interviews with experienced spreadsheet users in their own
offices and homes. In these interviews, they asked a fixed
set of open-ended questions in whatever order they came up
during the conversation.
Through this process, they found that spreadsheet
programs are easier to learn cooperatively (spreadsheet
experts share their programming knowledge with novices),
spreadsheets are developed more efficiently when developed
cooperatively (domain knowledge is shared), and there are
fewer errors in cooperatively developed spreadsheets (the
collaborator can often spot a programming or logic mistake
faster than the author). In addition, novice spreadsheet
programmers feel more satisfied when they develop spreadsheets
cooperatively. These results cover four of the five
attributes that Nielsen says are a part of usability. Using
this rather unorthodox method of experimenting, Nardi and
Miller thus found that a "single-user" application could be
used in a groupware manner, and they give a few suggestions to
software developers of "single-user" applications to help make
those products capable of being used effectively by small
groups as well as individual users.
Summary and Conclusions
Each of the research papers reviewed contains a slightly
different idea of the term CSCW, and uses different methods in
testing the usability of products that support groups. CSCW
and usability testing are still relatively new fields of
research, so the vocabulary and the methods of testing have
not yet solidified. In addition, groupware products can be
divided into nine space/time areas, making the research arena
rather large.
This research review has focused on research that tests
the usability of synchronous computer-supported cooperative
work products. It appears that the most popular method of
conducting usability tests is the user test. User tests have
been used to validate the usability of rather different types
of groupware products, from spreadsheet applications
originally intended for use by a single user, to electronic
meeting rooms that provide both a software and hardware
environment to support decision making, brainstorming, and
other typical group activities.
In all of the above research, the authors show that
usability testing of CSCW or groupware products has many side
benefits besides the discovery of usability problems of a
particular implementation. We do not yet fully understand how
groups work and how computers can best support group
processes. Researchers are trying to understand, among many
other variables, how group members communicate among
themselves, how shared work areas (computer-supported and
traditional) are used, how important gestures and other
visual and social cues are to getting work done, and how
proximity and anonymity affect the productivity and
satisfaction of the group. Usability studies have provided,
and probably will continue to provide, many insights into this
area at the same time that they show developers specific
problems with particular packages.

Appendix A -- Valacich's Process Losses
Production blocking. "Refers to the fact that only one member
of a group can speak at a time during verbal communication"
(Valacich, 262). This has three effects on the meeting:
1. others waiting to speak may forget or suppress their
ideas because they eventually seem less relevant or original
(attenuation blocking);
2. while waiting to speak, group members may not be truly
paying attention to the speaker because they are focusing on
trying to remember their own idea (concentration blocking); and
3. while listening to a speaker, other group members are
not generating their own new ideas (attention blocking).
Unequal air time. As groups get larger, the amount of time
that each person could possibly use for verbal communication
gets smaller.
Evaluation apprehension. Individuals may shy away from
sharing ideas and comments for fear of negative evaluation by
the others present.
Free-riding. Individuals may try to make the other group
members accomplish the task without any contributions from
themselves. Free-riding may be caused by social loafing, but
can also be greater when individuals think that their
contributions are not as necessary for the success of the
group (i.e. it is a large group, so surely someone else will
think of the same things that I would have).
Cognitive inertia. This is the "tendency of discussions to
move along one line of thought without deviating from the
current topic" (Valacich, 263).
Socializing. Chatting, drinking coffee, eating refreshments,
and other non-task related activities.
Domination. "Occurs when some group member(s) exercise(s)
undue influence or monopolize(s) the group's time in an
inefficient manner" (Valacich, 263).
Failure to remember. Individuals do not pay attention to
and/or remember comments that others have made.
Incomplete analysis. This occurs when the group does not use
all the information available to it, or fails to challenge
assumptions.
WORKS CITED
Dykstra, E. A., and R. P. Carasik. 1991. Structure and
Support in Cooperative Environments: the Amsterdam
Conversation Environment. International Journal of Man-
Machine Studies 34:419-434.
Greenberg, S. 1991. Computer-Supported Cooperative Work and
Groupware: An Introduction to the Special Issues.
International Journal of Man-Machine Studies 34:133-141.
Grudin, J. 1994. Computer-Supported Cooperative Work: History
and Focus. Computer 27:19-26.
Nardi, B. A., and J. R. Miller. 1991. Twinkling Lights and
Nested Loops: Distributed Problem Solving and Spreadsheet
Development. International Journal of Man-Machine Studies
34:161-183.
Nielsen, J. 1993. Usability Engineering. Boston: AP
Professional.
Palmer, J. D., and N. A. Fields. 1994. Computer-Supported
Cooperative Work. Computer 27:15-16.
Tang, J. C. 1991. Findings from Observational Studies of
Collaborative Work. International Journal of Man-Machine
Studies 34:143-160.
Tatar, D. G., G. Foster, and D. Bobrow. 1991. Design for
Conversation: Lessons from Cognoter. International
Journal of Man-Machine Studies 34:185-209.
Valacich, J. S., A. R. Dennis, and J. F. Nunamaker, Jr. 1991.
Electronic Meeting Support: the GroupSystems Concept.
International Journal of Man-Machine Studies 34:261-279.