ANNEX B.1 Evaluation materials for testing with implementers


Table 14: Tools and Frameworks with Graphical User Interfaces for Development (IDEs)

Each entry below gives five fields: Objective (high-level evaluation objective), Indicators (key indicators/constructs), Techniques (evaluation techniques/measuring ways), Tools (measuring tools) and Targets (success targets/thresholds).

Objective: Elimination of Usability Defects and Detailed Identification of General Usability Strengths and Weaknesses.

Indicators:
– Efficiency
– Helpfulness
– Learnability
– Global (e.g. SUS, SUMI)
– Understandability/Comprehensibility
– Affect
– Control/Operability

Techniques: Scenario-based usability testing (controlled/semi-controlled).

Tools: Quantitative/Subjective: standardised usability scales/questionnaires, e.g. the Software Usability Measurement Inventory (SUMI) [1] and/or the System Usability Scale (SUS) [2].

Targets: No major usability defects (less than 5%); identified weaknesses to account for less than 5%.

[1] Kirakowski, J., & Corbett, M. (1990). Effective Methodology for the Study of HCI. Amsterdam: North-Holland.
[2] Brooke, J. (1996). SUS: A quick and dirty usability scale. In Usability Evaluation in Industry (pp. 189-194). London: Taylor & Francis.
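
Where SUS returns are processed in bulk, the scoring can be scripted. A minimal sketch in Python, following Brooke's published 0-100 scoring scheme (the `responses` input format is an assumption for illustration):

```python
def sus_score(responses):
    """Convert one participant's ten SUS item ratings (1-5) to a 0-100 score.

    Odd-numbered items are positively worded (contribution = rating - 1);
    even-numbered items are negatively worded (contribution = 5 - rating).
    The summed contributions (0-40) are scaled by 2.5 to give 0-100.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS expects ten ratings between 1 and 5")
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # index 0 is item 1 (odd-numbered)
        for i, r in enumerate(responses)
    ]
    return 2.5 * sum(contributions)

# Example: a fairly positive participant
print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))  # 85.0
```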

Objective: as above.

Indicators:
– Learnability
– Understandability/Comprehensibility
– Effectiveness

Techniques: Performance testing.

Tools: Quantitative/Subjective: Post-Task Subjective Mental Effort Questionnaire.

Targets: Mental effort lower than 3 (out of 5).

Objective: as above.

Indicators:
– Affect
– Satisfaction/Attractiveness
– Understandability/Comprehensibility
– Control/Operability

Techniques: Pluralistic walkthrough [3].

Tools: Qualitative/Subjective: paper prototypes are used for the walkthrough; hard-copy panels of screens, dialog boxes, menus, etc. are presented in the same order in which they would appear online for each task.

Targets: All four attributes in the indicators column rated above 3 (out of 5).

[3] Bias, R. G. (1994). The pluralistic usability walkthrough: Coordinated empathies. In J. Nielsen & R. Mack (Eds.), Usability Inspection Methods. New York, NY: John Wiley and Sons.

Objective: Improved User Experience.

Indicators:
– Attractiveness
– Efficiency
– Perspicuity
– Dependability
– Stimulation
– Novelty

Techniques: Multivariate self-reported testing in a semi-controlled setting.

Tools: Quantitative/Subjective: User Experience Questionnaire (UEQ) (http://www.ueq-online.org/).

Targets: Above-average performance in the UEQ benchmarks, particularly regarding Perspicuity and Stimulation.
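
UEQ scale scores can be computed in the same spirit. A sketch assuming answers have already been recoded to the questionnaire's -3..+3 range with item polarity corrected; the grouping below reflects the six standard UEQ scales, but the item numbers are illustrative and should be checked against the official data-analysis sheet from ueq-online.org:

```python
from statistics import mean

# Illustrative item-to-scale grouping: the official UEQ analysis sheet
# defines which of the 26 items belongs to which scale (Attractiveness
# has six items, the other five scales have four each).
SCALES = {
    "Attractiveness": [1, 12, 14, 16, 24, 25],
    "Perspicuity":    [2, 4, 13, 21],
    "Efficiency":     [9, 20, 22, 23],
    "Dependability":  [8, 11, 17, 19],
    "Stimulation":    [5, 6, 7, 18],
    "Novelty":        [3, 10, 15, 26],
}

def ueq_scale_means(recoded):
    """recoded: dict item_number -> value in [-3, +3], polarity corrected."""
    return {scale: mean(recoded[i] for i in items)
            for scale, items in SCALES.items()}

neutral = {i: 1.0 for i in range(1, 27)}   # all items slightly positive
print(ueq_scale_means(neutral))            # every scale mean == 1.0
```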

 

Objective: as above.

Indicators:
– Symbolic meaning, which might comprise:
  • Personal meaning
  • Utility meaning
  • Tool meaning

Techniques: Sentence completion technique/survey/interview.

Tools: Qualitative/Subjective: questionnaire with sentences to complete [4].

Targets: Above 70% successful sentence completion (check instrument).

[4] Kujala, S., Walsh, T., Nurkka, P., & Crisan, M. (2013). Sentence completion for understanding users and evaluating user experience. Interacting with Computers, 26(3), 238-255.

Objective: Technology acceptance within relevant user groups.

Indicators: all constructs of the Technology Acceptance Model (TAM):
  • Perceived usefulness (U)
  • Perceived ease of use (E)
  • Attitude towards using (A)
  • Behavioural intention to use (BI)

Techniques: Usability testing/self-reported, repeated over time to capture change.

Tools: Quantitative/Subjective: TAM questionnaires; weighting: anticipated usefulness counts 1.5 times more than ease of use [5].

Targets: Above 5 (out of 7).

[5] Davis, F. D. (1989). Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly, 13(3), 319-340.
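
The 1.5:1 weighting above fixes only the relative weight of the two constructs; one plausible aggregation (an assumption, not prescribed by the source) is a weighted mean of the per-construct item means on their 7-point scales:

```python
from statistics import mean

def weighted_tam_score(usefulness, ease_of_use, w_usefulness=1.5):
    """Weighted mean of the two TAM core constructs on 7-point scales.

    usefulness / ease_of_use: lists of item ratings (1-7) for Perceived
    Usefulness and Perceived Ease of Use. Per the weighting above,
    usefulness counts 1.5 times as much as ease of use; collapsing the
    two constructs into one weighted mean is this sketch's assumption.
    """
    u, e = mean(usefulness), mean(ease_of_use)
    return (w_usefulness * u + 1.0 * e) / (w_usefulness + 1.0)

score = weighted_tam_score([6, 5, 6, 6, 5, 6], [5, 4, 5, 5, 4, 5])
print(score, score > 5)  # target: above 5 (out of 7)
```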

Objective: Technology acceptance within relevant user groups.

Indicators: Perceived voluntariness (Iivari, 1996) [6].

Techniques: Usability testing/self-reported.

Tools: Quantitative/Subjective: structured questionnaire (3-4 items).

Targets: Above 3 (out of 5).

[6] Iivari, J. (1996). Why are CASE tools not used? Communications of the ACM, 39(10), 94-103.

Objective: Matching of notations and graphical elements to relevant user and development activities.

Indicators (cognitive dimensions):
– Abstraction gradient
– Closeness of mapping
– Consistency
– Diffuseness/terseness
– Error-proneness
– Hard mental operations
– Hidden dependencies
– Juxtaposability
– Premature commitment
– Progressive evaluation
– Role-expressiveness
– Secondary notation and escape from formalism
– Viscosity

Techniques: Scenario-based cognitive dimensions analysis, carried out through focus groups/interviews, personas, and expert reviews/heuristic evaluation.

Tools:
  • Qualitative/Subjective: cognitive dimensions questionnaire, included in a semi-structured interview
  • Qualitative/Subjective: heuristic form self-completion

Targets: Approximate matching of cognitive dimensions to the most relevant personas.

Objective: Free the developer to concentrate on the creative aspects of the process.

Indicators:
– Expectations measurement
– Automation achieved
– Use of shortcuts (familiarization)

Techniques: Remote testing/interview.

Tools: Qualitative/Subjective: facilitators' diaries; Qualitative/Subjective: self-reported questionnaire items (open structure).

Targets: Positive/negative expectation ratio above 2; positive regard for automation; shortcut use/no-use ratio above 2.

Objective: as above.

Indicators:
– Existing features/actually used features ratio
– Automation achieved
– Unnecessary events vs. required events
– Use of shortcuts (familiarization)
– Installation effort

Techniques: Ethnographic testing/observation.

Tools: Qualitative/Subjective: facilitators' diaries.

Targets: Features ratio above 2; positive regard for automation; unnecessary/required events ratio above 2; shortcut use/no-use ratio above 2; lower installation effort for more than 65% of participants.
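
The ratio targets above reduce to simple tallies taken from the facilitators' diaries. A sketch with hypothetical counts (the tallies and variable names are illustrative only):

```python
def ratio(numerator, denominator):
    """Plain ratio with a divide-by-zero guard; the targets above call
    for the named ratios to exceed 2."""
    return float("inf") if denominator == 0 else numerator / denominator

# Hypothetical tallies from one participant's diary entries
shortcut_uses, non_shortcut_uses = 18, 7          # familiarization
positive_expectations, negative_expectations = 9, 3

print(ratio(shortcut_uses, non_shortcut_uses) > 2)              # True
print(ratio(positive_expectations, negative_expectations) > 2)  # True
```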

Objective: as above.

Indicators:
– Existing features/actually used features ratio
– Automation achieved
– Unnecessary events vs. required events
– Use of shortcuts (familiarization)

Techniques: Remote testing/interaction metrics.

Tools: Quantitative/Objective: time-on-task, keystrokes, mouse and gaze-path traversal recorded and analyzed [7].

[7] Feldman, L., Mueller, C. J., Tamir, D., & Komogortsev, O. V. (2009). Usability testing with total-effort metrics. In Proceedings of the 3rd International Symposium on Empirical Software Engineering and Measurement (ESEM '09), 426-429. IEEE Computer Society.
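
Recorded traces of this kind reduce to total-effort metrics such as time-on-task and per-device event counts. A sketch with a hypothetical event schema (gaze-path analysis needs dedicated tooling; only the log-reduction step is shown):

```python
from dataclasses import dataclass

@dataclass
class Event:
    t: float        # seconds since task start (hypothetical schema)
    kind: str       # "key", "mouse", "gaze", ...

def summarize(events):
    """Reduce an interaction trace to simple total-effort metrics:
    time-on-task plus per-kind event counts (keystrokes, mouse events...)."""
    counts = {}
    for e in events:
        counts[e.kind] = counts.get(e.kind, 0) + 1
    time_on_task = max(e.t for e in events) if events else 0.0
    return {"time_on_task_s": time_on_task, **counts}

trace = [Event(0.4, "key"), Event(0.9, "key"),
         Event(1.7, "mouse"), Event(6.2, "key")]
print(summarize(trace))  # {'time_on_task_s': 6.2, 'key': 3, 'mouse': 1}
```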

 

Table 15: Building Blocks and Frameworks (with no graphical interfaces) for developers (API)

Entries give the same five fields as Table 14 (Objective, Indicators, Techniques, Tools, Targets).

Objective: API usability within relevant user groups.

Indicators: Abstraction level, learning style, working framework, work-step unit, progressive evaluation, premature commitment, penetrability, API elaboration, API viscosity, consistency, role expressiveness, domain correspondence.

Techniques: Scenario-based cognitive dimensions analysis/persona model.

Tools: Quantitative/Subjective: cognitive dimensions questionnaire.

Targets: Approximate matching of cognitive dimensions to the most relevant personas.

 

Objective: Technology acceptance within relevant user groups.

Indicators: all constructs of TAM:
  • Perceived usefulness (U)
  • Perceived ease of use (E)
  • Attitude towards using (A)
  • Behavioural intention to use (BI)

Techniques: Usability testing/self-reported, repeated over time to capture change.

Tools: Quantitative/Subjective: TAM questionnaires; weighting: anticipated usefulness counts 1.5 times more than ease of use [5].

Targets: Overall TAM score above 60%.

Objective: Assess motivations or attitudes.

Indicators:
– Symbolic meaning, which might comprise:
  • Personal meaning
  • Utility meaning
  • Tool meaning

Techniques: Sentence completion technique/survey/interview.

Tools: Qualitative/Subjective: questionnaire with sentences to complete [8].

[8] Kujala, S., & Nurkka, P. (2012). Sentence completion for evaluating symbolic meaning. International Journal of Design, 6(3), 15-25.

Objective: Efficiency and productivity for common tasks.

Indicators: Complexity.

Techniques: Thinking-aloud protocol.

Tools: Qualitative/Subjective: facilitators' diaries.

Targets: Lower complexity for 65% of users.

Objective: as above.

Indicators:
– Task completion
– Complexity

Techniques: Usability testing/interaction metrics.

Tools: Quantitative/Objective:
  • Task completion times
  • Code metrics (e.g. lines of code or code complexity)

Targets: Task completion duration within 2.5 standard deviations of the expert user's duration.
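
Reading the threshold above as "no more than 2.5 standard deviations above the expert users' mean duration" (an interpretation; the source formula is terse), the check is straightforward:

```python
from statistics import mean, stdev

def within_threshold(duration, expert_durations, k=2.5):
    """Check a participant's task duration against the expert baseline.

    Interpreting the target above as: the duration must not exceed the
    expert users' mean duration by more than k standard deviations.
    (This reading is an assumption made for the sketch.)
    """
    mu = mean(expert_durations)
    sigma = stdev(expert_durations)
    return duration <= mu + k * sigma

expert_times = [95.0, 110.0, 102.0, 98.0]      # hypothetical, in seconds
print(within_threshold(115.0, expert_times))   # True: within mean + 2.5*sd
```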

 

 

Table 16: Web-based Developer Resources

Entries give the same five fields as Table 14 (Objective, Indicators, Techniques, Tools, Targets).

Objective: Global user experience, with focus on perceived attractiveness and efficiency of the offering.

Indicators:
– Attractiveness
– Efficiency
– Perspicuity
– Dependability
– Stimulation
– Novelty

Techniques: Multivariate self-reported testing in an online setting.

Tools: Qualitative/Subjective: personas; Quantitative/Subjective: User Experience Questionnaire (UEQ).

Targets: "Good" to "excellent" ratings in the UEQ benchmarks regarding Attractiveness and Dependability.

Objective: Aspect-differentiated user experience, with focus on helpfulness.

Indicators:
– Attractiveness
– Controllability
– Efficiency
– Helpfulness
– Learnability

Techniques: Multivariate self-reported testing in an online setting.

Tools: Quantitative/Subjective: standardized comparable questionnaires, e.g. WAMMI [9].

Targets: Above-average ratings for the predefined aspects.

[9] WAMMI: the 20-statement Website Analysis and MeasureMent Inventory questionnaire (www.wammi.com).

Objective: Web usability.

Indicators:
– Navigability
– Discoverability
– Accessibility
– Consistency

Techniques: Heuristic evaluation/interview.

Tools: Qualitative/Subjective:
  • Pluralistic usability walkthrough completion form
  • Expert reviews
  • Checklists

Targets: No major defects.