AC303/AC507 Notes 7
Evaluation And Evaluation Techniques
Evaluation
What Is Evaluation?
Evaluation is the process of determining or
judging the quality of a system, taking into consideration the system
hardware, the functionality of the design and the user interface design.
It involves gathering information about the usability, or potential
usability, of the system.
Why Evaluate?
As user interface designers, we evaluate the interface to improve it, to
compare it and to ensure that it is usable in real-life settings. We also
need to make sure it is adequate for the purpose it has been designed for
and that it meets the required standards.
When To Evaluate?
Evaluation can occur at many different points in the design and
development cycle. There are two kinds of evaluation - formative and
summative.
Formative evaluation goes on throughout development and
is used to monitor and change the design before the end. This type of
evaluation is highly specific.
Summative evaluation is more general
and is done at the end of stages of development. It records, describes and
summarises each stage.
Who Does Evaluation?
The evaluation process can be carried out by the user, the
designer or an outsider. It can be expensive to get the user involved in
the evaluation process, but the designer can only go so far. It is
unlikely that he or she will be a typical user. An outsider brought in to
do an evaluation can be a domain expert (someone who knows a lot about the
subject) or a professional evaluator (someone who knows how to carry out
an evaluation from a technical point of view).
Measuring
What Do We Measure?
When evaluating interface
design, we need to measure things about the user environment and the user.
In the environment, we can measure the presence and availability of
information.
Examples:
Colleagues, relevant professionals, manuals, on-line help, other pieces of software.
We can also measure the speed of response, the size of the screen,
ergonomic factors and compliance with guidelines.
When it comes to
the user, we can measure things like their knowledge, attitudes and
values, as well as their long- and short-term goals. We can also take
useful measurements of how they use the system.
Examples:
In computing, we can measure keystrokes, commands used, actions done, how long it takes to perform tasks, how quickly each action is learned (and retained) and how many errors occur.
We can also get an estimate from the user of how they think they
performed.
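As a minimal sketch of how usage measurements like these might be summarised, assume each session has been logged as a record with the hypothetical fields `task_seconds`, `keystrokes` and `errors`. The field names and the sample values are illustrative only and are not part of these notes.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Session:
    """One user's attempt at a task (hypothetical log format)."""
    user: str
    task_seconds: float  # how long the task took
    keystrokes: int      # keys pressed during the task
    errors: int          # errors made during the task


def summarise(sessions):
    """Return the average time, keystrokes and errors per session."""
    return {
        "mean_seconds": mean(s.task_seconds for s in sessions),
        "mean_keystrokes": mean(s.keystrokes for s in sessions),
        "errors_per_session": mean(s.errors for s in sessions),
    }


# Illustrative data only, not real measurements.
sessions = [
    Session("u1", 120.0, 300, 4),
    Session("u2", 90.0, 250, 2),
    Session("u3", 150.0, 350, 3),
]
summary = summarise(sessions)
print(summary)
```

The user's own estimate of how they performed could be stored as a further field and compared with these logged figures.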
What Questions Do We Ask?
Typical
evaluation questions to gain measurements are:
· How easy is the
system to learn and remember?
· How well do users accomplish
task goals when using the system?
· How satisfied are users with
the operation of the system?
· How well does the system measure
up against current knowledge about design and performance?
How Do We Measure?
There are 3 ways we can take
measurements. We can get a device to do it, we can bring in an observer or
we can get the users to do it themselves.
Who Do We Measure?
We can measure users who rarely use the
system, intermittent users and expert users. Within these categories the
users may be new or experienced.
What Type Of Measurements?
There are two types of measurement that can
be obtained: qualitative and quantitative. Qualitative measurement concerns
opinions and the quality of a product, and is expressed in words. This kind
of measuring takes place early on in the development cycle and helps the
designer get at what's really going on between the user and the
machine. It also helps identify the relevant quantitative measures.
Quantitative measuring provides data in a consistent way. This
allows performances to be compared and statistical analysis to take place.
Gathering quantitative data requires getting the user to tell you what you
want to know. You can then manipulate conditions and measure performance
changes directly.
Methods For Collecting Data - Observation
Collecting data by observing people using
the system can be done in a lab or in the field, i.e. the place where the
system will actually be used. Observing users in their normal environment
is an unobtrusive way of seeing how the system is operated in a real-life
setting, and these observations can be recorded. However, it is
important that the observer tries to blend into the background so that the
user is as unaware as possible of being monitored.
Strengths Of Observation Method
As well as being unobtrusive
(hopefully) the great strength of the observation method for collecting
data is that extraneous factors can be independently recorded in a
realistic setting.
Examples:
Factors such as: what the environment is like; how many interruptions the user has to put up with; what the workplace atmosphere is like (noisy, stressful or calm); what other systems are in use; what other distractions there are; and what the management style is like.
Limitations Of The Observation Method
It may be
difficult for the observer to be as unobtrusive as we would like. Users
may feel they are being spied on. A further limitation of this method is
that there are unspoken things that are factors for the user but which are
missed.
Examples:
This method doesn't record the intentions or feelings of the user. If there is a sense of satisfaction with all or part of the system, this is also missed.
This method is good for recording what is happening, but not why.
Methods For Collecting Data - Think-Aloud Protocol
Think-aloud protocol involves getting the user to make spoken
observations while using the system. Again, this can be done in the lab or
the field and can be used to obtain a much wider range of information than
observation alone.
Strengths Of Think-Aloud Protocol
This method of collecting data is the closest we can get to
measuring what's going on in the user's head. It enables us to
record many things which could otherwise be missed.
Examples:
We can record: the user's intentions; the nature of the puzzles encountered; why errors occurred; why wrong selections were made; and how much the user has understood.
This method is immediate and good for finding problems. It is
particularly effective when used in the early stages and with new users.
Limitations Of Think-Aloud Protocol
Using the
think-aloud protocol method is unavoidably obtrusive because the user has
to speak out and so is constantly aware that performance is being
monitored. Therefore, this method is most likely to change the behaviour
of the user (known as the Hawthorne Effect). Another limitation is that
users often find it difficult to put thoughts into words while trying to
solve a difficult problem. Finally, as with observation, this method doesn't
give answers to pre-determined questions.
Example Recording Sheet
· What does the subject notice?
· What is the subject thinking now?
· What is the subject trying to do (goal)?
· How is the subject trying to do it (plan)?
· Is this achieved?
· Problems encountered.
· Suggestions made by the subject
Methods For Collecting Data - Feature Checklist
Creating a features checklist involves listing every possible
function that is available within the system which is being evaluated. The
users then tick off each feature as and when they use it. This lets us see
which functions are being used. And, if they're not being used, are
they difficult to find or just not needed?
Strengths Of Feature Checklist Method
This method produces a
quantitative estimate of features used. This enables us to check the
knowledge of the user and how much the various features are needed.
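The quantitative estimate can be sketched as a simple tally. The example below assumes each user's ticked features have been collected into a list. The feature names, user labels and the `feature_usage` helper are hypothetical and serve only as illustration.

```python
from collections import Counter


def feature_usage(features, checklists):
    """Return the fraction of users who ticked each feature at least once."""
    n_users = len(checklists)
    if n_users == 0:
        return {f: 0.0 for f in features}
    counts = Counter()
    for ticked in checklists.values():
        # Count each feature once per user and ignore unknown entries.
        counts.update(set(ticked) & set(features))
    return {f: counts[f] / n_users for f in features}


# Illustrative checklist data only.
features = ["open", "save", "print", "mail merge"]
checklists = {
    "u1": ["open", "save"],
    "u2": ["open", "save", "print"],
    "u3": ["open"],
    "u4": ["open", "save"],
}
usage = feature_usage(features, checklists)
unused = [f for f, share in usage.items() if share == 0]
print(usage)
print(unused)
```

A feature that nobody ticks, such as the last one in this made-up data, still needs follow-up with users, because the tally alone cannot say whether it is hard to find or simply not needed.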
Limitations Of Feature Checklist Method
Unfortunately,
creating and using a feature checklist can be long and cumbersome. Users can
find it very boring (and obtrusive) to keep the checklist up to date.
Methods For Collecting Data - Structured Diary
Another way of collecting data is to get users to keep a
structured diary. This involves getting them to describe, in writing, any
problems encountered.
Strengths Of Structured Diary Method
This method helps uncover long term usage problems.
It is also an inexpensive way to collect highly specific information.
Limitations Of Structured Diary Method
Unlike
some other methods, you must have a working system to use structured
diaries. Also, it may be difficult to get users to complete these and
training may be required.
Methods For Collecting Data - Questionnaires
To collect data from a
large number of people, it is helpful to send out questionnaires for
people to fill in anonymously and in their own time. Questionnaires are
made up of structured and uniform questions designed to make it as easy as
possible to answer. Questionnaires are used to gather information which is
both objective and subjective. Objective information is factual and
subjective information is based on opinions, beliefs, feelings, etc.
Examples:
Objective - How many children do you have?
Subjective - How interesting did you find the game?
Strengths Of Questionnaire Method
The good thing
about questionnaires is that they are a non-labour-intensive way of
gathering coarse data from large numbers of people. Direct questions can
be asked and the responses pre-categorised. The data gathered can then be
statistically analysed.
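As a rough sketch of what pre-categorised responses make possible, suppose a subjective question offers five fixed answers. The answer labels, the 1 to 5 scoring in `SCALE` and the sample responses are assumptions made for this illustration.

```python
from collections import Counter
from statistics import median

# Hypothetical pre-categorised answers, scored from 1 (lowest) to 5 (highest).
SCALE = {
    "very dull": 1,
    "dull": 2,
    "neutral": 3,
    "interesting": 4,
    "very interesting": 5,
}


def summarise_responses(responses):
    """Count the answers in each category and find the median score."""
    counts = Counter(responses)
    scores = [SCALE[r] for r in responses]
    return {
        "counts": {label: counts[label] for label in SCALE},
        "median_score": median(scores),
    }


# Illustrative responses only.
responses = ["interesting", "neutral", "interesting", "very interesting", "dull"]
result = summarise_responses(responses)
print(result)
```

Because every respondent chooses from the same categories, the counts can be compared directly across groups or across versions of a system.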
Limitations Of Questionnaire Method
When producing questionnaires, it is very easy to ask
wrong or bad questions. Also, the data provided is coarse and
there is no interaction for elaboration purposes. It is also difficult to
enthuse people into completing and returning questionnaires. It is
therefore common to have a low response rate, which can mean that a bias is
introduced.
Methods For Collecting Data - Interviewing
Interviewing is carried out for the purpose
of information gathering in an environment where the interviewer controls
the situation, sets the purpose of the interview and controls the pacing.
An interview may involve just 2 people (interviewer and interviewee), or
an entire group of people.
Generally speaking, an interview
will be successful if the interviewee is clear about the aim of the
interview and has access to the right information to answer the questions.
Also, the interviewer should remain neutral (though interested) throughout
the interview and have a debriefing session with the interviewee at the
end.
Strengths Of Interviewing Method
Carrying
out interviews is a good way to collect high quality data. The interviewer
can ask direct questions and can probe further where required. This
interactive approach allows the interviewer to gain more information and
uncover any hidden agendas.
Limitations Of Interviewing Method
This method is expensive, time-consuming
and requires highly skilled individuals to be involved. Also, there is a
need to ensure standardisation amongst interviewers. As with
questionnaires, there is the problem that a response bias may emerge.
Methods For Collecting Data - Experiments
Experiments are used to establish the general principles of
interface design. These are carried out in the lab setting and allow
specific aspects of the interface to be measured. However, to be of any
value, experiments must be properly prepared.
Examples:
Decide where and when to conduct the experiment and who will be included. Decide what to test, what to control for and how to test and log the data. Make sure the instructions are right and described at the right level of detail. Make sure the time of the tasks is correct and the data logging tools are adequate. Ensure that all the data you want is collected.
Strengths Of Experiments Method
Experiments allow
us to extract concise and convincing results. We can use this method to
measure very specific aspects of a system and we can generalise over many
users. It is also possible to isolate out any individual differences.
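One common way to isolate individual differences, assumed here and not spelled out in these notes, is to have every participant try both versions of an interface and compare the per-user differences. The sketch below uses made-up task times in seconds for two hypothetical versions, A and B.

```python
from statistics import mean

# Illustrative task times in seconds. Each participant used both versions.
times = {
    "u1": {"A": 30.0, "B": 26.0},
    "u2": {"A": 50.0, "B": 45.0},
    "u3": {"A": 40.0, "B": 37.0},
}


def paired_differences(times, first="A", second="B"):
    """Return each user's time on `first` minus their time on `second`."""
    return {user: t[first] - t[second] for user, t in times.items()}


diffs = paired_differences(times)
mean_saving = mean(diffs.values())
print(diffs)
print(mean_saving)
```

The participants differ widely in overall speed, but each difference is taken within one person, so that variation does not mask the change between versions.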
Limitations Of Experiments Method
This method
is very expensive and it is difficult to isolate the relevant variables.
The value of the results can be questionable, and the ecological validity
of such experiments (how well lab results carry over to real-life
settings) is also doubted.