Study design and population
This multicenter study was carried out in France in two ICUs and one clinical investigation center for healthy volunteers. Participants were included between March 7, 2019, and September 25, 2020.
Healthy volunteers were included if they were over 18 years old and French speakers. Recruitment was distributed across age ranges so that all age groups were represented: twenty volunteers aged 18 to 39 years, twenty aged 40 to 65 and twenty over 65 were included. ICU patients were included if they were over 18 years old, were French speakers, were calm and awake as evaluated by a Richmond Agitation and Sedation Scale (RASS) score [13] between −1 and +1, had no known neurological disorder prior to hospitalization and had proper hearing and vision, with correction if needed. Both patients undergoing invasive mechanical ventilation and patients breathing spontaneously were included. The non-inclusion criteria were limited to patients or healthy volunteers who were under legal protection or who refused to participate. All participants gave informed consent to participate in the study. Participant blinding was not possible owing to the nature of the intervention. The study protocol was approved by the regional ethics committee (Espace de reflexion éthique région centre val de Loire, Tours, France, no. 2018_090) in accordance with national regulations. The protocol was registered (clinicaltrials.gov: NCT05078632, 2021/10/14, retrospectively registered).
The device
Eye-tracking technology is used in several fields, such as medicine [17,18,19,20,21], surgery [22, 23], psychology and aerospace [24]. This technology relies on a video-based eye tracker, which determines gaze direction by measuring the position of the corneal reflection of an infrared light relative to the pupil. These reflections are then analyzed to determine gaze motion with a high degree of accuracy. This makes it possible to calculate gaze motion over the computer screen in real time and to determine what the user is looking at on the screen. We specifically developed a device consisting of a computer with an eye tracker, mounted on an articulated bracket so that it could be optimally positioned in front of the ICU patient (Fig. 1). The screen must be placed 60–80 cm from the patient’s face so that the eye tracker can detect the eye position and perform a visual calibration. Calibration (measuring characteristics of the user’s eyes to derive gaze calculation based on a physiological 3D eye model) was performed on five gaze positions on the screen [25]. The eye tracker frequency was 60 Hz, enabling the sensor to capture 60 corneal reflections per second. The eye tracker used in our study was the Tobii X2-30 compact (Tobii Pro, Danderyd, Sweden), and data were analyzed with Tobii Pro Studio (Tobii Pro).
The comprehension test
The Montreal Toulouse test (MT-86) is a neuropsychological test created to assess language disorders [26]. It comprises 19 examinations and lasts 3 h when performed comprehensively by a speech therapist. It evaluates different components of oral comprehension. The oral comprehension test is evaluated with 47 sheets of pictures. For the present study, with the help of a team comprising speech therapists and ICU physicians and nurses, we specifically developed an adapted version of the test, implemented in the eye-tracking device, to be used with ICU patients. We divided the sheets into three tests (15 sheets of pictures each) so that the test could be repeated at different moments during the patient’s hospital stay without a learning effect. Each 15-sheet test evaluated three levels of comprehension: words (lexical comprehension), simple sentences (active sentence comprehension) and complex sentences (for example, long sentence or passive sentence comprehension). Each sheet contained 2 to 4 pictures, as in the original test; some plates evaluating lexical comprehension that originally comprised 6 pictures were simplified to 4 pictures, and the sheets were digitized. A recorded human voice was used to instruct patients, in order to remain as close as possible to reality with regard to the prosodic aspects of comprehension. Only the original pictures of the MT-86 were used, without any modification.
Test procedure
Patients underwent the test over three consecutive weekdays (one test a day, always in the same order) with a research nurse or a speech therapist (called here the assessor), according to their availability. The healthy volunteers took the three tests in a room with a research nurse, with a 5-min break between tests (always in the same order). The assessor placed the computer in front of the participant, and the screen position was adjusted. After a successful calibration, the comprehension test began automatically. Several sheets of pictures were presented to the participant, with a vocal instruction indicating which picture to look at (Additional file 1). The instruction given to participants was: “watch as long as possible the pictures as asked by the voice,” for example, “the peacock” or “the horse pulls the boy” (Fig. 2). During the instruction time, a white square on a black background was displayed on the screen; thereafter, the picture sheet was presented to the participant on the screen for 6 s and gaze motion was recorded. Transition between sheets was automatic. The total duration of the test was two and a half minutes. During the test, the assessor stood behind the participant to note any event that impeded proper test completion. Furthermore, a camera recorded the patient’s face, making it possible to identify gaze-tracking failures caused by eye closure.
Gaze movement analysis
Every test sheet was divided into different areas of interest, including the area of the right answer and other areas with distractors (Additional file 2). For instructions comprising an object, several distractors were used: semantic distractors (a word with a close meaning), phonologic distractors (a word with a close sound in the French language) and distractors without any link with the answer. For instructions including a whole sentence with a situation to recognize, the distractors contained a close but different action and/or subject. For each sheet and participant, the total time spent watching the sheet and the total time of gaze fixation on the right picture were calculated in order to define the proportion of gaze time spent on the right picture. The answer to each sheet was classified as “right,” “wrong” or “not interpretable.” An answer was classified as right if the subject watched the sheet for more than 3 s (half of the 6 s total sheet presentation time) and spent more than 50% of this sheet watching time with the gaze on the right picture. If a subject spent less than 50% of the sheet watching time on the right picture, the answer was classified as wrong. We considered an answer as not interpretable if the subject spent less than 3 s watching the sheet presented. (These cutoff values were derived from preliminary feasibility evaluations.)
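The classification rule above can be sketched in a few lines of Python. This is a minimal illustration and not the software used in the study: the function name `classify_answer` and the constant names are hypothetical, and the handling of boundary values (exactly 3 s of watching time, or exactly 50% of that time on the right picture), which the rule does not specify, is an assumption made only for the sketch.

```python
MIN_WATCH_S = 3.0            # half of the 6 s sheet presentation time
MIN_TARGET_PROPORTION = 0.5  # more than 50% of watching time on the right picture


def classify_answer(watch_time_s: float, target_time_s: float) -> str:
    """Classify one sheet from the time spent watching the sheet and the
    time spent with the gaze on the right picture (both in seconds)."""
    if watch_time_s < 0 or target_time_s < 0 or target_time_s > watch_time_s:
        raise ValueError("times must be non-negative and target <= watch")
    # Assumption: a watching time of exactly 3 s is treated as not interpretable.
    if watch_time_s <= MIN_WATCH_S:
        return "not interpretable"
    proportion = target_time_s / watch_time_s
    if proportion > MIN_TARGET_PROPORTION:
        return "right"
    # Assumption: a proportion of exactly 50% is treated as wrong.
    return "wrong"
```

For instance, a sheet watched for 5 s with 3 s on the right picture (60%) would be classified as right, whereas a sheet watched for only 2.5 s would be not interpretable whatever the gaze distribution.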
Data collection
For all participants, the following information was gathered: demographic information (age, gender, handedness, education level), medical history (hearing, visual or neurological impairment) and current medication. For patients, additional information was collected: CAM-ICU scale (Confusion Assessment Method for the ICU) [11, 12], Simplified Acute Physiology Score II (SAPSII), primary diagnosis [27, 28] and organ support during ICU stay and at the time of performing the test (ventilation, dialysis, norepinephrine, etc.).
Statistical analysis
Groups were described with absolute numbers and percentages for qualitative variables, and with median and interquartile range for quantitative variables. For each participant, the rate of right answers for each test was defined as the percentage of sheets with the right answer among the sheets that were watched (with a gaze motion recorded). Boxplots were used to represent right answers among subgroups based on age, education level, SAPSII at inclusion, mode of ventilation and sedation, and comprehension level evaluated by each sheet. We performed Wilcoxon tests to compare the different populations. All the analyses were performed using R software [29]. Given the lack of a gold standard to evaluate oral comprehension among ICU patients, we hypothesized a priori that patients who were older, had a lower education level, had a higher severity of disease, were ventilated or had received sedation would potentially have lower oral comprehension capabilities, including in comparison with healthy volunteers. A higher right answer rate among those subpopulations would cast doubt on the validity of the test, whereas lower rates would appear coherent in the framework of construct validity.
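As an illustration of the right answer rate and of the descriptive summary, the following Python sketch computes the rate for one test and the median and quartiles over a group. It is not the analysis code of the study (which was run in R). The function names are hypothetical, and two choices are assumptions: a sheet with no gaze motion recorded is encoded as `None`, and a sheet classified as “not interpretable” is still counted as watched as long as some gaze motion was recorded.

```python
from statistics import median, quantiles
from typing import Optional, Sequence


def right_answer_rate(answers: Sequence[Optional[str]]) -> Optional[float]:
    """Percentage of 'right' answers among watched sheets.

    Assumption: None marks a sheet with no gaze motion recorded, which is
    excluded from the denominator; every other sheet counts as watched.
    """
    watched = [a for a in answers if a is not None]
    if not watched:
        return None  # no sheet watched: the rate is undefined
    return 100.0 * sum(a == "right" for a in watched) / len(watched)


def median_iqr(rates: Sequence[float]) -> tuple:
    """Median with first and third quartiles of a group of rates."""
    q1, _, q3 = quantiles(rates, n=4, method="inclusive")
    return median(rates), q1, q3
```

Quartile definitions differ between software packages, so the interquartile range returned here may differ slightly from values produced by other tools on small samples.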