DEVELOPMENT OF THE BRIEF BEDSIDE DYSPHAGIA SCREENING TEST-REVISED: A CROSS-SECTIONAL CZECH STUDY

1 Faculty of Health Studies, University of Pardubice, Czech Republic; 2 Department of Neurology, Pardubice Hospital, Hospitals of the Pardubice Region, Czech Republic; 3 Department of Otorhinolaryngology and Head and Neck Surgery, Pardubice Hospital, Hospitals of the Pardubice Region, Czech Republic; 4 Department of Otorhinolaryngology and Head and Neck Surgery, University Hospital Hradec Králové, Charles University in Prague, Faculty of Medicine in Hradec Králové, Czech Republic


Introduction
Dysphagia (impaired swallowing) is a potentially serious health care problem that can lead to various complications. These include malnutrition and dehydration due to decreased efficacy of swallowing (1), as well as aspiration, pneumonia (2), and death (3) due to impaired safety of swallowing (1).
Dysphagia is a relatively frequent condition. However, the reported prevalence rates vary, mainly depending on the exact definition of dysphagia (4) and the diagnostic instrument used (5). It is estimated to occur in up to 52.7% of the elderly (2), 44-53.6% of patients with stroke (6,7), up to 32% of patients with Parkinson's disease (8), and 48.4% of patients with head and neck cancer (9).
A timely and accurate diagnosis of dysphagia is an important prerequisite for planning and implementing effective compensatory and rehabilitative interventions (3). In many settings, bedside dysphagia screening is an important first step in the diagnostic process (3). Ideally, dysphagia screening is a quick, minimally invasive procedure that enables the identification of patients who need further assessment (3,10).
Numerous dysphagia screening instruments (DSI) exist, and several literature reviews focusing on their quality have been published (3). Most of them have focused on dysphagia screening in patients with neurological disorders (3,11), mainly stroke (12,13). Some DSI are intended for patients with head and neck cancer (14) or for heterogeneous patient groups (15,16).
Despite the plethora of published DSI, none of them has been endorsed as the most useful screening method (11), partly due to a lack of consensus as to what constitutes a good DSI (10). It should be valid; therefore, many studies aiming to develop DSI use an "objective" reference test (a gold standard) that determines the presence or absence of dysphagia (10). The most widely recognized gold standards are videofluoroscopy and flexible endoscopic evaluation of swallowing (FEES) (3). The use of a gold standard enables comparisons with bedside assessments and subsequent selection of assessment items for the DSI (17).
A good DSI should be sensitive to the condition being observed (dysphagia). In fact, obtaining the highest possible sensitivity is a priority of most DSI (10). The reason is that missing (not capturing) someone who has dysphagia can lead to serious complications (10). Conversely, high specificity, i.e. correctly ruling out the patients who do not have dysphagia, is secondary, as the consequences of incorrectly identifying someone as having a swallowing problem are not as serious (10). Other experts believe that in addition to high sensitivity, a good DSI should have high negative predictive value (12). It is important that patients with a normal screening result have normal swallowing function (which is reflected in high negative predictive value) because patients with a normal screening result are not referred for a detailed clinical swallow examination.
Despite abundant literature regarding the requirements for DSI, the issue of an a priori sample size calculation in studies that aim to develop a DSI is rarely discussed (18). Yet, it is important to calculate the number of subjects needed to define an expected level of the sought-after diagnostic parameter, for example sensitivity, together with the precision of that estimate (that is, the confidence interval, CI) (19). In other words, researchers should aim for a high point estimate of the diagnostic parameter as well as a reasonably narrow CI, which reflects greater precision of the estimate than a wider CI (18).
The aim of the study was to produce a revised version of the Brief Bedside Dysphagia Screening (BBDS) Test by expanding Mandysova et al.'s (17) pilot study. The revised study was based on the same premise and had the same aim as the pilot study; the patient inclusion criteria and data collection method did not change. However, we present a novel approach in four areas. First, criteria were developed for assessing whether the patient group was "sufficiently homogeneous"; this became a requirement for the development of a generic DSI. If this were not the case, the aim would be to develop a DSI for only the bigger patient subgroup (neurological or ear, nose, and throat [ENT]). Second, the sought-after diagnostic parameters were clearly spelled out: priority was given to achieving the highest possible sensitivity and negative predictive value. Third, an expected level of sensitivity and the precision of that estimate were set a priori and the number of subjects needed was calculated. Fourth, part of the aim was to identify the optimal cut-off point of the instrument by calculating diagnostic parameters for all the possible scores.

Materials and Methods
Design. This cross-sectional study was conducted in a regional Czech hospital in the departments of neurology, ENT, and geriatrics, with a purposive sample of 157 acutely or chronically ill patients (inpatients and outpatients) who were prone to dysphagia based on their primary neurological or ENT diagnosis. It was part of a larger study that started in 2009 and spanned 60 months; this particular phase lasted 31 months.
Two sets of investigations were carried out: a detailed bedside assessment of the swallowing function by a nurse screener and FEES by a physician. Based on a comparison of these two investigations, selected items of the bedside assessment were combined into the BBDS Test-Revised (BBDST-R). A preliminary analysis of the results, leading to the development of the BBDS Test, was conducted after the first 18 months of data collection and was described by Mandysova et al. (17). This period was the pilot study; the collected data were incorporated into the revised study.
Subjects. Inpatients were recruited via nurses or physicians on the wards; outpatients were recruited during their visits to the dysphagia clinic (17). The inclusion criteria were: prone to dysphagia based on the primary neurological or ENT diagnosis; receiving care in one of the mentioned departments; medically stable (not receiving care in the intensive care unit); sufficiently alert (able to follow simple commands); able to maintain a sitting position; able to sign an informed consent (17).
Definitions. Sufficient homogeneity was defined in the following way: the difference between relative frequencies of abnormal results in the neurological subgroup and those in the ENT subgroup should be < 5% for > 50% of the bedside assessment items and FEES (criterion 1) and, simultaneously, it should be < 10% for all bedside assessment items and FEES (criterion 2). Unless both criteria were met, a DSI would be developed for only the bigger patient subgroup (neurological or ENT).
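The two homogeneity criteria can be illustrated with a minimal Python sketch. It assumes that the "< 5%" and "< 10%" thresholds refer to absolute differences in percentage points between the subgroups' relative frequencies; the function name and the example frequencies are hypothetical and are not taken from the study data.

```python
def is_sufficiently_homogeneous(neuro_pct, ent_pct):
    """Check the two homogeneity criteria for a set of items.

    neuro_pct, ent_pct: dicts mapping an item name (bedside items and FEES)
    to the relative frequency of abnormal results in percent.
    Assumption: differences are measured in percentage points.
    """
    diffs = [abs(neuro_pct[item] - ent_pct[item]) for item in neuro_pct]
    # Criterion 1: difference < 5 points for more than half of the items.
    share_small = sum(d < 5 for d in diffs) / len(diffs)
    criterion_1 = share_small > 0.5
    # Criterion 2: difference < 10 points for every item.
    criterion_2 = all(d < 10 for d in diffs)
    return criterion_1 and criterion_2


# Hypothetical relative frequencies (percent abnormal), for illustration only.
neuro = {"item_a": 10.0, "item_b": 20.0, "item_c": 30.0, "fees": 40.0}
ent_similar = {"item_a": 12.0, "item_b": 21.0, "item_c": 38.0, "fees": 41.0}
ent_different = {"item_a": 12.0, "item_b": 21.0, "item_c": 50.0, "fees": 41.0}

print(is_sufficiently_homogeneous(neuro, ent_similar))
print(is_sufficiently_homogeneous(neuro, ent_different))
```

In the first hypothetical comparison both criteria hold; in the second, a single item differing by 20 points violates criterion 2, so the group would not count as sufficiently homogeneous.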
Sample size calculation. Flahault et al.'s (20) guidelines for sample size calculation in diagnostic test studies were followed to ensure a given precision of the sensitivity estimate. The first step entails an assumption on the expected value of the new diagnostic test sensitivity (20). Next, the minimum acceptable lower confidence limit is determined, together with the required probability that this limit is not violated (20). The minimal sample size for the group of "cases" (N_cases) is read from the provided tables, and the corresponding number of "controls" (N_controls) is obtained from an equation that assumes disease prevalence (Prev) < 50% (20):
$$N_{\text{controls}} = N_{\text{cases}} \cdot \frac{1 - \text{Prev}}{\text{Prev}} \qquad (1)$$
For the purpose of this study, the disease (dysphagia) prevalence assumption was based on the relative frequency of abnormal FEES in the pilot study, which was 35.6% for all patients and 29.2% for the neurological subgroup (17). Therefore, the above Equation 1 could be used. For Prev > 50%, Flahault et al. (20) recommend using the same equation with N_controls in place of N_cases (Equation 2), which would be relevant for the development of an ENT instrument based on dysphagia prevalence in the ENT subgroup (66.7%) in the pilot study (17). Expected sensitivity of the instrument was set at 95%, and the lower 95% confidence limit was set at 75% with 0.95 probability. N_cases (= 34) was obtained from the table and N_controls (= 60) from Equation 1 (20); the total required sample size was 94 patients. A 20% reserve was planned for patients who might drop out, as recommended by Bochmann et al. (18), or whose data would be eliminated in the data analysis phase. Therefore, the necessary sample size was 112-113 patients. Since the type of the DSI (generic versus for a homogeneous patient group) was determined only in the data analysis phase, patient enrollment was to continue until one of the two subgroups (rather than the entire group) included 112-113 patients.
Instruments. For the data collection phase, a detailed 32-item bedside assessment instrument was developed. The instrument entailed physical assessment (20 items) and a swallow test (12 items). Twenty-four of the items were based on the Massey Bedside Swallowing Screen (21) and the Gugging Swallowing Screen (GUSS) (22). An additional 8 physical assessment items were based on discussions with dysphagia experts, as particularized by Mandysova et al. (17). Physical assessment focused mainly on the reflexes and the motor function of the muscles involved in swallowing (17). The swallow test comprised three sequential steps: swallowing 1) a thick liquid (pudding-like consistency, four teaspoons), 2) a thin liquid (four teaspoons), and 3) a thin liquid (60 mL, drinking from a cup) (17). Administering a thick liquid before a thin liquid was in accordance with the GUSS (22), and thin liquid testing was congruent with Massey and Jedlicka's procedure (17,21). Testing was terminated if the patient experienced cough, choking, wet/gurgly voice, or the liquid dripping from the mouth during and for up to one minute following each step of the swallow test. The technique of the physical assessment and swallow test was described in depth by Mandysova et al. (17).
The Penetration Aspiration Scale (PAS) was used to score selected aspects of swallowing assessed by the gold standard, FEES (23). PAS is an eight-point ordinal scale quantifying penetration (passage of material into the larynx to the level of the vocal folds) and aspiration (passage of material below the level of the vocal folds) (23). Furthermore, it reflects whether or not the material is expelled (23).
Procedure. Three investigators performed the bedside assessment. Two were university-educated nurses with extensive advanced practice experience in the field. The third one was a master-level nursing student and was involved for only a 12-month period between 03/2010 and 02/2011, which corresponded to the period of her master's degree project. Two specially trained physicians performed FEES.
The aim was to perform both assessments as soon as the patient met the inclusion criteria. At the same time, it was ensured that the period between both assessments was as short as possible. For the neurological subgroup (N = 106), the period was, on average, 1.22 days; for 96 (90.6%) patients, the period was ≤ 24 hours. For the ENT subgroup (N = 38), the period was, on average, 2.84 days; for 25 (65.7%) patients, it was ≤ 24 hours. The sequence of the two assessments depended mainly on the availability of the personnel involved in the study.
Some patients did not complete certain bedside assessment items (patient refused or did not understand more detailed instructions; the swallow test was terminated). Such occurrences were treated as missing items. Conversely, patients who did not complete both the bedside assessment and FEES were excluded.
Data analysis. Each bedside assessment item result was dichotomized: normal/negative versus abnormal/positive, and so were the PAS scores: 1 = normal/negative versus 2-8 = abnormal/positive. Patients with a normal FEES and bedside assessment in all 32 items were excluded from further analysis because their data would not contribute to the explanation of observed variation in the results.
To address the issue of sufficient homogeneity, absolute and relative frequencies of abnormal bedside assessment and FEES results in both patient subgroups (neurological and ENT) were calculated and compared, using the two mentioned criteria. This step determined whether all the subsequent steps of data analysis should use the results of all the patients (for the development of a generic DSI) or the results of only the bigger patient subgroup.
Next, the results between individual bedside assessment items and FEES were compared, using the association coefficient φ (phi). The φ coefficient indicates the tightness and direction (positive or negative) of the association between two dichotomous variables (24). Zero indicates no association and ±1 indicates a perfect association if the frequency of both variables in a 2×2 contingency table is evenly split (24). Calculations were performed with SPSS 19.0 statistical software (IBM SPSS, Inc., Chicago, Illinois). Bedside assessment items with a positive and statistically significant association (p-value ≤ 0.05) with FEES were combined into a DSI if the bedside assessment item contained ≤ 5% of missing values, as recommended by Schafer (25).
The total score of the developed DSI was obtained by summing up the scores of all of its items (normal/negative item = 0; abnormal/positive item = 1). The total score was dichotomized (normal/"pass" versus abnormal/"fail") using all the possible cut-off scores. Next, it was examined which cut-off score resulted in the highest sensitivity and negative predictive value. A 2 × 2 contingency table was created, and using the patients' results, the number of truly positive (FEES and bedside assessment abnormal), truly negative (FEES and bedside assessment normal), falsely positive (FEES normal and bedside assessment abnormal), and falsely negative (FEES abnormal and bedside assessment normal) cases was determined for each possible cut-off score. Subsequently, sensitivity, specificity, and positive and negative predictive values were computed using the Clinical Calculator 1 (26). For sensitivity, the results were compared with the a priori determined expected sensitivity and its lower 95% confidence limit.
Ethical considerations. The study was conducted according to the Declaration of Helsinki and was approved by the hospital ethics committee. Prior to enrolling the participants, the researchers were required to obtain written informed consent while preventing undue influence on potential participants.

Results
Of the 180 patients who were approached, 2 refused to participate, 15 did not undergo FEES (7 refused; 8 were not transported for the examination by the staff for various reasons), and 6 did not undergo detailed bedside assessment (1 refused; 1 died; 4 were discharged). The mean age of the remaining 157 patients (99 men; 58 women) was 67.6 ± 13.6 years (range 21-91); 107 were inpatients and 50 were outpatients. One hundred and twelve patients had a neurological condition (stroke 54; myasthenia gravis 30; amyotrophic lateral sclerosis 6; cranial nerve palsy 5; Parkinson's disease 4; and other conditions 13). Forty-five patients had an ENT condition (head or neck cancer 23; dysphagia, unspecified 7; inflammation or infection 4; and other conditions 11). Of the 157 patients, 12 (6 with a neurological and 6 with an ENT condition) were excluded from subsequent data analysis because all their results were normal. Another patient from the ENT subgroup was excluded as he obtained no PAS score due to a permanent separation of the airway from the oral and pharyngeal pathway after a laryngectomy.
A comparison of the results using the φ coefficient showed a statistically significant, positive association (p-value ≤ 0.05) between FEES and 10 bedside assessment items (Table 1). Two of the bedside assessment items contained significantly > 5% of missing values. Therefore, only the remaining 8 were combined into an 8-item DSI (Figure 1).
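For readers who wish to reproduce this type of item-level comparison outside SPSS, the following minimal Python sketch computes the φ coefficient from a 2 × 2 table. The p-value uses the chi-square approximation (χ² = n·φ², 1 degree of freedom), which is one common approach and is an assumption here, not necessarily the exact procedure used in the study. The cell counts are hypothetical.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient for a 2 x 2 table.

    a: bedside item abnormal, FEES abnormal
    b: bedside item abnormal, FEES normal
    c: bedside item normal,   FEES abnormal
    d: bedside item normal,   FEES normal
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denom


def phi_p_value(phi, n):
    """Two-sided p-value from the chi-square approximation (1 df)."""
    chi2 = n * phi ** 2
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts for one bedside item, for illustration only.
a, b, c, d = 20, 10, 5, 25
phi = phi_coefficient(a, b, c, d)
p = phi_p_value(phi, a + b + c + d)
print(f"phi = {phi:.3f}, p = {p:.5f}")
```

Under the selection rule described in the Methods, an item would be retained only if φ is positive, the p-value is ≤ 0.05, and the item has ≤ 5% missing values.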
The diagnostic parameters of the BBDST-R depended on the number of true and false negative and true and false positive results. This in turn depended on the relevant cut-off score (Table 2). Cut-off score 1 produced 95.5% sensitivity (95% CI 84.9-98.7%) and 88.9% negative predictive value (95% CI 67.2-96.9%). The higher the cut-off score, the lower the value of both of these parameters.
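The calculation behind these parameters can be sketched in Python as follows. Several points are assumptions rather than details confirmed by the text: a patient is taken to "fail" the screen when the total score is at or above the cut-off; the confidence interval is the Wilson score interval without continuity correction; and the counts 42/44 and 16/18 are back-calculated from the reported percentages, not taken from Table 2.

```python
import math


def confusion_at_cutoff(total_scores, fees_abnormal, cutoff):
    """Count TP, FN, FP, TN, assuming score >= cutoff means screen 'fail'."""
    tp = fn = fp = tn = 0
    for score, abnormal in zip(total_scores, fees_abnormal):
        screen_fail = score >= cutoff
        if abnormal and screen_fail:
            tp += 1
        elif abnormal:
            fn += 1
        elif screen_fail:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def diagnostic_parameters(tp, fn, fp, tn):
    """Sensitivity, specificity, PPV, NPV (all denominators must be > 0)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a proportion (no continuity correction)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


# Assumed counts inferred from the reported 95.5% and 88.9%.
sens_lo, sens_hi = wilson_ci(42, 44)   # sensitivity: TP / (TP + FN)
npv_lo, npv_hi = wilson_ci(16, 18)     # NPV: TN / (TN + FN)
print(f"sensitivity CI: {sens_lo:.3f}-{sens_hi:.3f}")
print(f"NPV CI: {npv_lo:.3f}-{npv_hi:.3f}")
```

Under these assumptions, the intervals agree with the reported 84.9-98.7% and 67.2-96.9% after rounding. Repeating `confusion_at_cutoff` for every possible cut-off score would give the per-cut-off comparison described above.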

Discussion
A comparison of relative frequencies of abnormal results obtained for the two subgroups revealed that neither of the two predetermined criteria was met. The two subgroups differed by 16.4-26.9% on 4 bedside assessment items. In all four cases, the frequency of abnormal results was higher in the neurological subgroup. Moreover, they differed by 19% [...] to a larger patient sample (N = 106 after data reduction) than required (N = 94) based on Flahault et al.'s Equation 1 (20). Flahault et al.'s disease prevalence assumption (Prev < 50%) (20) was satisfied: for the neurological subgroup, FEES was abnormal in 41.5% of the cases.
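The sample size arithmetic can be retraced with a short Python sketch. It assumes that Equation 1 has the form N_controls = N_cases × (1 − Prev) / Prev, which is our reading of Flahault et al.'s guidance for Prev < 50%, and that the pilot prevalence of 35.6% was rounded to 36%; both are assumptions, and the function names are hypothetical.

```python
def controls_needed(n_cases, prevalence):
    """Number of controls for a given number of cases (assumed Equation 1).

    Valid only for prevalence below 50%, as assumed in the text.
    """
    if not 0 < prevalence < 0.5:
        raise ValueError("this form of the equation assumes 0 < Prev < 0.5")
    return n_cases * (1 - prevalence) / prevalence


def with_reserve(n_total, reserve=0.20):
    """Inflate the required sample size by a reserve for drop-outs."""
    return n_total * (1 + reserve)


n_cases = 34                      # read from Flahault et al.'s table (per the text)
n_controls = round(controls_needed(n_cases, 0.36))  # Prev rounded to 36% (assumption)
n_total = n_cases + n_controls
n_target = with_reserve(n_total)
print(n_controls, n_total, n_target)
```

With these inputs the sketch gives 60 controls and a total of 94 patients, and the 20% reserve raises the target to about 113, in line with the 112-113 patients stated in the Methods.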
Implementation of the BBDST-R in practice. Screening is the essential first step in identifying risk of dysphagia that expedites referral to speech pathology for evaluation and treatment (31). As for the BBDST-R, it has already been implemented in clinical practice. For example, it is used in the neurology department of a large university hospital in Ostrava, Czech Republic (32). In the first year of its implementation (in 2013), a total of 1051 patients were screened (32). The result was negative in 662 of the patients (33). The remaining 389 patients (with a positive screening result) were referred to a speech-language pathologist for detailed clinical examination of swallowing, who confirmed the diagnosis of dysphagia in 165 patients (33). Almost two thirds of these patients required rehabilitation of swallowing; only 12 were referred for FEES (33). In this particular hospital, implementation of the BBDST-R has promoted multidisciplinary collaboration and has facilitated the identification of patients with swallowing difficulties (32).
Limitations of the study. The first limitation is that the study was not entirely blinded, as the physician performing FEES had access to bedside assessment results. In our opinion, this was important for ethical reasons as the physician needed the information to support high-quality care. The second limitation is that the period between bedside assessment and FEES was short (≤ 24 hours) in "only" 96 (90.6%) patients; in 10 other patients, it ranged from 2 to 22 days.

Conclusion
Our research has led to the development of the eight-item Brief Bedside Dysphagia Screening Test-Revised. In patients with neurological conditions, the BBDST-R has high sensitivity and negative predictive value. As part of the implications for practice, we recommend it for use in departments caring for patients with these conditions. It is very simple; therefore, it does not require extensive training of the personnel (e.g. nurses) involved in the screening. However, dysphagia screening is only the initial part of the diagnostic algorithm, and multidisciplinary collaboration remains of paramount importance. In other words, simple dysphagia screening does not replace the role of other health care providers whose expertise in the area of dysphagia assessment is much broader and deeper.
Table 1. Bedside assessment item results with a statistically significant association to FEES.