Presentation
DH17 - Usability of a Healthcare App Among Different User Groups
Session
Poster Session 2
Description
The importance of accessibility to healthcare has increased around the world as aging populations continue to grow and current healthcare is not fully equipped to meet their needs. In addition, about 85% of older adults (55+) in the U.S. have at least one chronic health condition or disease. The growing need for accessible healthcare has led to a large increase in telehealth use, but such technologies still have a long way to go in safety, effectiveness, usability, and recognition of the diversity of their users. In particular, older adults may lack the experience or cognitive skills needed to use digital technologies successfully.
Purpose of the Study
The aim of this study was to understand key usability problems that people experience when using telehealth applications. The hope is that this study and others like it may encourage redesign of telehealth apps and procedures to aid all populations in need of easy access to healthcare. Out of the three age groups included in this study, it was expected that the oldest (65+) would score lowest in digital literacy and in the observed usability. However, we also predicted that usability problems would be encountered by all groups.
Method
Participants
The participants involved in the study were younger adults (mean age = 21.17, range = 20-24), middle-aged adults (mean age = 53.33, range = 45-63), and older adults (mean age = 72.60, range = 65-80).
Materials
The materials were a printed consent form, demographic survey, and debriefing form, along with a pen, the experimenter’s laptop, and the experimenter’s phone. The laptop was used for participants to complete the Northstar Digital Literacy Survey and the phone was used for the symptom checker service on the WebMD app.
Predictor Variables
Participants’ age group and digital literacy scores were predictor variables. From the observations made while participants completed the symptom checker, the researcher coded each participant’s results using a heuristic analysis based on Jakob Nielsen’s 10 general principles for interaction design. Each principle was rated on a scale of 1-3: 1 meant the participant had no issues, 2 meant there was at least one issue, and 3 meant there were multiple issues that interfered drastically.
Criterion Variables
The criterion variables were the observed usability problems, which were scored by the number of issues that participants experienced using the WebMD app and the types of problems.
Design
This was a correlational study comparing participants’ digital literacy results with their WebMD symptom checker results (observation heuristics). The digital literacy results were quantitative, while the symptom checker results were qualitative and were quantified through an observer-rated heuristic analysis following Jakob Nielsen’s 10 general principles for interaction design. Results were also compared across age groups. The recurring types of qualitative usability problems were analyzed as well.
Results
Heuristic Analysis
Before interviewing participants, the experimenter conducted a heuristic analysis of the WebMD symptom checker and took notes on what issues participants might encounter. Based on Jakob Nielsen’s 10 general principles for interaction design, the average rating for each principle was as follows: (1) 1.50, (2) 1.75, (3) 2.14, (4) 1.50, (5) 1.75, (6) 1.25, (7) 1.43, (8) 1.13, (9) 2.60, (10) 1.71. The worst-rated principle overall (highest mean rating, where 3 indicates the most severe issues) was principle 9: “Help users recognize, diagnose, and recover from errors.”
Observation Heuristics
The average observation heuristic rating for each participant was compared with that participant’s digital literacy result. A Kruskal-Wallis test was run for each measure. For observation heuristics, the test revealed no significant difference between the age groups (χ² = 4.64, df = 2, p = 0.098), while for digital literacy it revealed a significant difference (χ² = 9.74, df = 2, p = 0.008). A Dwass-Steel-Critchlow-Fligner pairwise comparison of digital literacy revealed a significant difference between the older group and both the younger (W = 3.873, p = 0.017) and middle-aged groups (W = -3.704, p = 0.024). There was no significant difference between the younger and middle-aged groups (W = 0.130, p = 0.995).
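With three age groups, the Kruskal-Wallis statistic is referred to a chi-square distribution with 2 degrees of freedom, whose upper-tail probability has the closed form exp(-H/2). The Python sketch below is not the authors' analysis code; it is a minimal illustration that computes H with the usual tie correction and uses the closed form to check that the reported p-values agree with the reported statistics. The function names are hypothetical.

```python
import math
from collections import Counter

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction (midranks)."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    # Give each distinct value the average of the ranks it occupies.
    counts = Counter(pooled)
    midrank = {}
    start = 1
    for value in sorted(counts):
        t = counts[value]
        midrank[value] = start + (t - 1) / 2
        start += t
    h = 12.0 / (n * (n + 1)) * sum(
        sum(midrank[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    # Divide by the tie-correction factor (equals 1 when there are no ties).
    correction = 1 - sum(t**3 - t for t in counts.values()) / (n**3 - n)
    return h / correction

def p_value_df2(h):
    """Upper-tail chi-square probability; valid only for df = 2 (three groups)."""
    return math.exp(-h / 2)

# Statistics reported in the text above.
print(round(p_value_df2(4.64), 3))  # observation heuristics: 0.098
print(round(p_value_df2(9.74), 3))  # digital literacy: 0.008
```

Both values match the reported p = 0.098 and p = 0.008. The closed form applies only because there are exactly three groups; other group counts would need a general chi-square routine.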
Issues
In total, participants experienced 28 different issues with the symptom checker. All age groups experienced various navigation issues, such as scrolling and general searching for symptoms. Most participants had the planned condition appear somewhere within the first 15 conditions on the list, but few had it as the first option. Younger participants recognized only 52% of the conditions on the “Possible Conditions” page, while middle-aged adults recognized 62% and older adults recognized 68%. When asked about their confidence in the results, middle-aged adults felt the least confident. All middle-aged and older adults mentioned contacting their doctor among their next steps, compared with one-third of younger adults.
Nielsen’s Principles
The average observed rating for each principle was as follows: (1) 1.17, (2) 1.70, (3) 1.37, (4) 1.40, (5) 1.62, (6) 1.40, (7) 1.41, (8) 1.46, (9) 1.67, (10) 1.70. The worst-rated principle overall (highest mean rating) was principle 2: “Match between the system and the real world.” This is mostly due to the medical jargon used for the symptoms and conditions in the app. For both younger and middle-aged adults, the worst-rated principle was principle 2 (1.85 and 1.82 respectively), while for older adults it was principle 10, “Help and documentation” (1.62).
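Because a rating of 1 means no issues and 3 means multiple severe issues, the principle with the most problems is the one with the highest mean. The short Python sketch below is offered only as an illustration, not as the study's analysis code; it applies that reading to the two sets of means reported in the Results. The dictionary and function names are hypothetical.

```python
# Mean ratings per Nielsen principle, copied from the Results text.
expert = {1: 1.50, 2: 1.75, 3: 2.14, 4: 1.50, 5: 1.75,
          6: 1.25, 7: 1.43, 8: 1.13, 9: 2.60, 10: 1.71}
observed = {1: 1.17, 2: 1.70, 3: 1.37, 4: 1.40, 5: 1.62,
            6: 1.40, 7: 1.41, 8: 1.46, 9: 1.67, 10: 1.70}

def most_problematic(means):
    """Return every principle that shares the highest mean rating."""
    worst = max(means.values())
    return sorted(p for p, m in means.items() if m == worst)

print(most_problematic(expert))    # [9]
print(most_problematic(observed))  # [2, 10]
```

At two decimal places, principles 2 and 10 tie at 1.70 in the observed ratings, so the rounded figures alone cannot separate them.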
Discussion
As predicted, older adults scored lower on the digital literacy test and had less experience with technology than middle-aged and younger adults. However, older adults scored very similarly to both groups on the observation heuristic. The fact that all groups had usability issues with the symptom checker suggests there are important issues to address that are not a result of low digital literacy or lack of technology experience. Some issues that could be addressed are the medical jargon, scrolling, and the help and documentation.
Limitations
Only 17 participants were involved in the study, so these results may not accurately represent the differences between these age groups. Most of the adults had a higher level of education and were currently employed, meaning they have most likely had more access to technology than the average person, which may have skewed the results. In addition, older adults are defined differently across the literature (55+, 65+, etc.), as there is no specific age at which one automatically becomes “older”. For the observation heuristics, it is possible that some issues went unnoticed because participants did not verbalize them. The symptom checker portion of the study involved participants talking aloud through the steps. Some participants talked very little, and the interviewer may not have gotten a full grasp of what occurred. Older adults on average talked less than the other groups. This may explain some results, such as one older participant who had the best score (closest to 1) on the observation heuristic (OH = 1.26, DL = 58.4).
Event Type
Poster Presentation
Time
Tuesday, April 1, 4:45pm - 6:15pm EDT
Location
Frontenac Foyer
