ACE RJ validation paper


Validation of the revised Addenbrooke’s Cognitive Examination (ACE-R) for detecting mild cognitive impairment and dementia in a Japanese population


Hidenori Yoshida,¹,² Seishi Terada,¹ Hajime Honda,¹ Yuki Kishimoto,¹ Naoya Takeda,¹ Etsuko Oshima,¹ Keisuke Hirayama,³ Osamu Yokota¹ and Yosuke Uchitomi¹

¹Department of Neuropsychiatry, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Sciences, Okayama, Japan

²Department of Neurology, National Hospital Organization Minami-Okayama Medical Center, Okayama, Japan

³Yura Hospital, Okayama, Japan

ABSTRACT

Background: Early detection of dementia will be important for implementation of disease-modifying treatments in the near future. We aimed to investigate the diagnostic validity and reliability of the Japanese version of the revised Addenbrooke’s Cognitive Examination (ACE-R J) for identifying mild cognitive impairment (MCI) and dementia.

Methods: We translated and adapted the original ACE-R for use with a Japanese population. Standard tests for evaluating cognitive decline and dementing disorders were applied. A total of 242 subjects (controls = 73, MCI = 39, dementia = 130) participated in this study.

Results: The optimal cut-off scores of ACE-R J for detecting MCI and dementia were 88/89 (sensitivity 0.87, specificity 0.92) and 82/83 (sensitivity 0.99, specificity 0.99) respectively. ACE-R J was superior to the Mini-Mental State Examination in the detection of MCI (area under the curve (AUC): 0.952 vs. 0.868), while the accuracy of the two instruments did not differ significantly in identifying dementia (AUC: 0.999 vs. 0.993). The inter-rater reliability (ICC = 0.999), test-retest reliability (ICC = 0.883), and internal consistency (Cronbach’s α = 0.903) of ACE-R J were excellent.

Conclusion: ACE-R J proved to be an accurate cognitive instrument for detecting MCI and mild dementia. Further neuropsychological evaluation is required for the differential diagnosis of dementia subtypes.

Key words: Addenbrooke’s Cognitive Examination revised, mild cognitive impairment, dementia, Japanese, validation study

Introduction

Early detection of dementia, especially in the prodromal stage, will play an important role in the utilization of disease-modifying treatments in the near future. From this point of view, the concept of mild cognitive impairment (MCI) was proposed and is now widely accepted (Petersen, 2004). MCI is defined as a condition comprising (1) memory complaint usually corroborated by an informant, (2) objective memory impairment for the patient’s age, (3) essentially preserved general cognitive function, (4) largely intact functional activities, and (5) no dementia. MCI is assumed to be a prodromal stage of dementia, especially Alzheimer’s disease (AD).

Correspondence should be addressed to: Hidenori Yoshida, Department of Neuropsychiatry, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Sciences, 2-5-1 Shikata-cho, Kita-ku, Okayama 700-8558, Japan. Phone: +81-86-235-7242, Fax: +81-86-235-7246. Email: [email protected]. Received 30 Oct 2010; revision requested 1 Dec 2010; revised version received 22 May 2011; accepted 24 May 2011.

Screening tests for MCI should be sensitive and specific to mild cognitive decline, and validated in various populations. However, a valid and reliable test for detecting MCI that can be used in busy clinical settings has not been established (Lonie et al., 2009). For example, the Mini-Mental State Examination (MMSE; Folstein et al., 1975) is the most commonly used cognitive screening test. However, a meta-analysis of the accuracy of MMSE revealed its very limited value in distinguishing MCI from healthy controls (Mitchell, 2009).

Addenbrooke’s Cognitive Examination (ACE) (Mathuranath et al., 2000) and its revised version (ACE-R) (Mioshi et al., 2006) were developed as a brief test of cognitive functions sensitive to the early stage of dementia. In addition to detection of dementia, the ACE and ACE-R are reported to be useful for detecting MCI (Mioshi et al., 2006; Alexopoulos et al., 2010) and predicting conversion of MCI to dementia (Mitchell et al., 2009). To date, many different language versions of ACE-R (Alexopoulos et al., 2010; Carvalho et al., 2010; Konstantinopoulou et al., 2010; Kwak et al., 2010) have been validated, and we recently developed a Japanese version of ACE-R (ACE-R J).

In the present study, our hypothesis was that ACE-R J would be superior to the conventional MMSE in detecting MCI and dementia in a Japanese population. We therefore aimed to (1) investigate the ability of ACE-R J to differentiate between normal controls, MCI, and dementia patients; (2) investigate the ability to differentiate between subgroups of patients based on severity of dementia; (3) provide detailed normative data for the ACE-R J total and subscores, and allow direct comparison of a subject’s score in a certain cognitive domain against a normal control’s performance; (4) determine the optimal cut-off scores of ACE-R J for identifying MCI and dementia, and compare its diagnostic accuracy with that of MMSE; (5) investigate whether heterogeneity of the dementia patients and demographic parameters affected the results of this study; and (6) examine the inter-rater and test-retest reliabilities, internal consistency, and concurrent validity of ACE-R J.

Methods

The instrument

ACE-R, which incorporates the MMSE, consists of five domains, each representing a specific cognitive function: (1) attention and orientation (18 points), (2) memory (26 points), (3) fluency (14 points), (4) language (26 points), and (5) visuospatial ability (16 points). The total score of ACE-R is 100 points, which includes the MMSE score (30 points). Higher scores indicate better cognitive functioning.
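The domain structure described above can be written down as a small lookup table, which makes the 100-point total easy to check. This is only an illustrative sketch: the names `ACE_R_DOMAIN_MAX` and `total_score` are hypothetical and are not part of the instrument or of the analyses reported here.

```python
# Maximum points per ACE-R domain, as listed in the text above.
ACE_R_DOMAIN_MAX = {
    "attention_orientation": 18,
    "memory": 26,
    "fluency": 14,
    "language": 26,
    "visuospatial": 16,
}


def total_score(subscores):
    """Sum the five domain subscores after checking each against its maximum."""
    total = 0
    for domain, maximum in ACE_R_DOMAIN_MAX.items():
        value = subscores[domain]
        if not 0 <= value <= maximum:
            raise ValueError(f"{domain} must be between 0 and {maximum}")
        total += value
    return total
```

Passing the maxima themselves to `total_score` returns 100, the full ACE-R score.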

We translated and modified ACE-R with advice from the authors of the original version. Regarding the items included in the MMSE, we referred to the Japanese version of MMSE that has recently been translated and adapted by Sugishita (2009). Other modifications concerned the name and address recall and recognition tests, retrograde memory test, letter fluency test, word repetition test, naming test, comprehension test, reading test, and perceptual ability (fragmented letters) test.

On the name and address recall and recognition tests, we changed the composition of the items from the English original (name of person, 2 points; name of street, 3 points; town, 1 point; and county, 1 point) to the Japanese style (prefecture, 1 point; city, 1 point; town, 1 point; block number, 2 points; and name of person, 2 points). On the retrograde memory test, we replaced “the woman who was Prime Minister” with “the last Prime Minister”.

On the letter fluency test, words beginning with the letter “P” were replaced by words beginning with the syllable “ka” because the Japanese language is based on syllables rather than letters. The scoring of the phonological fluency test was also changed because the Japanese phonological fluency test might be a more difficult task than the English letter fluency test. For example, the mean scores of letter fluency for the initial letter “P” were 11.3–13.9 words for cognitively normal elderly (65–74 year-olds) in an urban US community (Ganguli et al., 2010). Meanwhile, the mean scores of phonological fluency for the initial syllable “ka” were 7.9 words for normal Japanese elderly aged 60–69 years and 7.2 words for those aged 70–79 (Ito et al., 2004). In the letter fluency test of the original ACE-R, 7 points were given for >17 words, 6 points for 14–17 words, 5 points for 11–13 words, 4 points for 8–10 words, 3 points for 6–7 words, 2 points for 4–5 words, and 1 point for 2–3 words. Therefore, we changed the scores of the Japanese phonological fluency test as follows: 7 points were given for >13 words, 6 points for 11–13 words, 5 points for 8–10 words, 4 points for 6–7 words, 3 points for 4–5 words, 2 points for 3 words, and 1 point for 2 words.

On the word repetition test, we replaced “hippopotamus, eccentricity, unintelligible, statistician” with “kirigirisu, hototogisu, tororosoba, ikijibiki,” words that correspond to the original criteria (multisyllabic words that are difficult to repeat and occur with low frequency). Souma and Tanabe (2003) suggested that these complex words were useful for detecting repetition disorder. On the naming test, according to the original criteria of words that anyone who has completed compulsory education should know, we replaced “barrel” and “harp” with “light bulb” and “trumpet”, using pictures adopted from Snodgrass and Vanderwart (1980). On the comprehension test, we replaced the instruction “Point to the one which is a marsupial (correct answer is kangaroo)” with “Point to the one which is a reptile (correct answer is crocodile)”, based on their familiarity to the Japanese elderly. On the reading test, irregularly pronounced English words were replaced by irregular Japanese words, all of which were kanji (ideographic script) compounds (what are called “jyukujikun”). On the perceptual ability (fragmented letters) test, we created new fragmented “katakana” (Japanese syllabic characters) to replace the original English letters, referring to the incomplete letters test of the Visual Object and Space Perception Battery (VOSP) (Warrington and James, 1991), because many elderly individuals in Japan might not be familiar with alphabetic characters.
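The two fluency scoring schemes described above can be compared side by side with a short lookup. The sketch below transcribes the word-count bands from the text; the function and constant names are hypothetical, and the score of 0 for fewer than two words is an assumption, since the text does not state it.

```python
# (minimum word count, points) pairs, highest band first, transcribed from the text.
# ">17 words" is encoded as a minimum of 18, and ">13 words" as a minimum of 14.
ORIGINAL_P_BANDS = [(18, 7), (14, 6), (11, 5), (8, 4), (6, 3), (4, 2), (2, 1)]
JAPANESE_KA_BANDS = [(14, 7), (11, 6), (8, 5), (6, 4), (4, 3), (3, 2), (2, 1)]


def fluency_points(word_count, bands):
    """Return the points awarded for a given number of correct words."""
    for minimum, points in bands:
        if word_count >= minimum:
            return points
    # Assumption: fewer than 2 words scores 0 (not stated in the text).
    return 0
```

For example, a subject who produces 8 words receives 4 points under the original bands and 5 points under the Japanese bands.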

In the modification process, we took care that the Japanese version would be linguistically and culturally equivalent to the original English version. Several pilot versions were administered to various patients and controls before the final version of ACE-R J was given to the subjects reported here. Thereafter, a bilingual expert not familiar with the original ACE-R translated it back into English. The retranslated version was similar to the original one except for the modified items. Like the original, ACE-R J can be administered in approximately 15 minutes.

Participants

A total of 242 subjects participated and were placed in one of three groups: a dementia group (n = 130), a mild cognitive impairment (MCI) group (n = 39), and a control group (n = 73).

The dementia and MCI groups comprised consecutive patients referred to the Memory Clinic of Okayama University Hospital. All patients underwent general physical, neurological and psychiatric examinations, laboratory testing (including thyroid function tests, serum vitamin B1, B12, and folate analyses, and syphilis serology), and brain imaging (MRI or CT). Subjects with potential causes of cognitive decline other than neurodegenerative or cerebrovascular disease (e.g. depression, schizophrenia, epilepsy, head injury, alcoholism) were excluded.

Neuropsychological examinations, including MMSE and ACE-R J, were administered by experienced clinical psychologists. The Clinical Dementia Rating (CDR; Morris, 1993) score was rated by other clinical psychologists who were blind to the neuropsychological test scores of the patients and based on interviews with the patients and informants. The CDR sum of boxes (CDR-SB) (Berg et al., 1992), a summation of the scores from the individual domains of CDR, was calculated. After all examinations had been carried out, the chief clinician established the clinical diagnoses independent of the performance on ACE-R J.

Patients in the dementia group were diagnosed with Alzheimer’s disease (AD; n = 106), Alzheimer’s disease with cerebrovascular disorder (AD with CVD; n = 7), vascular dementia (VaD; n = 3), frontotemporal dementia (FTD; n = 8), and dementia with Lewy bodies (DLB; n = 6). All patients with AD met the criteria for probable AD formulated by the NINCDS-ADRDA (McKhann et al., 1984). All patients with VaD and AD with CVD met the criteria for probable VaD and AD with CVD, respectively, formulated by the NINDS-AIREN (Román et al., 1993). All VaD and AD with CVD subjects had multiple lacunar infarcts of the basal gray matter and/or thalamus, in addition to periventricular and deep white matter lesions. FTD was diagnosed according to the international consensus criteria for FTD (Neary et al., 1998). No patient met the criteria for semantic dementia or progressive nonfluent aphasia. All patients with DLB met the criteria for probable DLB formulated by McKeith et al. (2005).

Patients of the MCI group met the criteria of (1) memory complaints, corroborated by an informant; (2) abnormal memory function, documented by delayed recall of one paragraph from the Logical Memory II subtest of the Wechsler Memory Scale Revised (Sugishita, 2001) (cutoff scores: <9 for >15 years of education, <5 for 10–15 years of education, and <3 for 0–9 years of education (the maximum number of paragraph items possible to recall correctly is 25)); (3) normal general cognitive function and no or minimal impairment in activities of daily living, as determined by an interview with the patient and an informant (CDR score = 0.5); and (4) not sufficiently impaired, cognitively and functionally, to meet the NINCDS-ADRDA criteria for AD.
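The education-adjusted cutoff in criterion (2) amounts to a simple lookup. The following sketch restates it in code under the assumption that education is recorded in whole years; the function names are hypothetical and the snippet covers only this one criterion, not the full MCI definition.

```python
def logical_memory_cutoff(education_years):
    """Return the delayed-recall score below which memory is rated abnormal.

    Assumes education is given in whole years, as in the bands quoted in the text.
    """
    if education_years > 15:
        return 9
    if education_years >= 10:
        return 5
    return 3


def memory_is_abnormal(delayed_recall_score, education_years):
    """True if the Logical Memory II delayed-recall score falls below the cutoff."""
    return delayed_recall_score < logical_memory_cutoff(education_years)
```

A delayed-recall score of 4 would therefore count as abnormal for a subject with 12 years of education but not for one with 9 years.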

Control subjects were recruited from family members of patients attending the clinics. All controls underwent the same evaluation procedure as the patients, except for laboratory testing and brain imaging. They were independent in their activities of daily living (ADLs) and did not have any memory complaints (CDR score = 0). Subjects with serious physical, neurological, or psychiatric disorders that could affect cognitive functioning were excluded.

This study was approved by the Internal Ethical Committee of Okayama University Graduate School of Medicine, Dentistry, and Pharmaceutical Sciences. After a complete description of the study to the participants and their relatives, written informed consent was obtained.

Statistical analyses

Statistical analyses were performed using PASW Statistics 18.0 (SPSS, Chicago, IL, USA). A value of p < 0.05 was accepted as significant.

Demographic variables (age and education), MMSE score, CDR sum of boxes (CDR-SB), and total and subscores of ACE-R J were compared by one-way analysis of variance, followed by Bonferroni correction. χ² tests were employed for categorical data (gender). If intergroup differences between demographic variables attained statistical significance, a multiple regression analysis was carried out to investigate possible associations between the demographic variables (gender, age, and education) and the total ACE-R J score.

Sensitivity and specificity of ACE-R J and MMSE were calculated using a receiver operating characteristic (ROC) curve that plotted sensitivity and specificity across the range of possible cut-off scores. The area under the curve (AUC) was used as a measure of the ability of each test to distinguish between groups of participants. In this study, we used StAR software (Vergara et al., 2008) to select optimal cut-off scores for identifying MCI and dementia. The optimal cut-off score was defined as the score that led to the maximal accuracy of classification. Then we estimated positive and negative predictive values at different prevalence rates (1%, 5%, 10%, 20%, and 40%) for each optimal cut-off score. The StAR program was also used to calculate the AUC for each test and to assess statistical differences between AUCs.
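The cut-off selection rule described above (maximal accuracy of classification) can be illustrated with a few lines of code. This is a minimal sketch and not the StAR implementation; it assumes that a notation such as 88/89 means scores of 88 or below are classified as impaired, and the scores used are hypothetical toy values, not data from this study.

```python
def best_cutoff(control_scores, patient_scores):
    """Return (cutoff, sensitivity, specificity, accuracy) maximizing accuracy.

    Assumption: a score <= cutoff is classified as impaired.
    Ties are resolved in favour of the lowest cutoff.
    """
    n_total = len(control_scores) + len(patient_scores)
    best = None
    for cutoff in sorted(set(control_scores) | set(patient_scores)):
        true_pos = sum(s <= cutoff for s in patient_scores)
        true_neg = sum(s > cutoff for s in control_scores)
        accuracy = (true_pos + true_neg) / n_total
        if best is None or accuracy > best[3]:
            best = (
                cutoff,
                true_pos / len(patient_scores),
                true_neg / len(control_scores),
                accuracy,
            )
    return best


# Hypothetical toy scores, NOT data from this study.
toy_controls = [95, 93, 91, 90, 88]
toy_patients = [89, 86, 84, 80, 75]
cutoff, sensitivity, specificity, accuracy = best_cutoff(toy_controls, toy_patients)
```

Sensitivity and specificity at each candidate cut-off are the same quantities that trace out the ROC curve; the sketch simply keeps the cut-off at which the proportion of correctly classified subjects is highest.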

Inter-rater reliability was evaluated by determining the intraclass correlation coefficient (ICC) of 31 participants (control, n = 2; MCI, n = 5; AD, n = 24). Two investigators evaluated patients at the same time but were blind to each other’s ratings. Sixteen participants were actively evaluated by one of them while the other passively observed, and their roles were reversed for the other 15. Test-retest reliability was verified using 21 participants (control, n = 2; MCI, n = 4; AD, n = 15). The second rating for test-retest reliability was performed four weeks after the first rating. The ICC was used to determine the test-retest reliability. The internal consistency reliability within ACE-R J was assessed using Cronbach’s coefficient α (Cronbach and Meehl, 1955). Concurrent and convergent validity was calculated using Spearman’s correlation between ACE-R J total scores and dementia severity scores (CDR and CDR-SB).
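For readers unfamiliar with the internal consistency statistic, Cronbach’s α for k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below implements this standard formula with sample variances; it is an illustration only, with a hypothetical function name and toy data, and does not reproduce the PASW computation used in the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a table with one row per participant, one column per item."""
    k = len(item_scores[0])
    # Sample variance of each item (column) across participants.
    item_variances = [variance(column) for column in zip(*item_scores)]
    # Sample variance of each participant's total score.
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy data: three participants, two items.
toy_items = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy_items)
```

When all items rise and fall together across participants, α approaches 1; unrelated items push it towards 0.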

Results

Demographics of control, MCI, and dementia groups

The demographic parameters, CDR-SB score, MMSE, and ACE-R J total and subscores of control, MCI, and dementia groups are shown in Table 1.

No significant differences in gender distribution between groups were detected. Age (F(2, 239) = 27.58, p < 0.001) and education (F(2, 239) = 11.80, p < 0.001) differed among groups, and post hoc analysis showed that the control group was significantly younger and more educated than the MCI and dementia groups, and that the dementia group was older than the MCI group. The multiple regression analysis using the total score of ACE-R J as a dependent variable and demographic data (gender, age, and education) as independent variables revealed a significant impact of age (β; standardized partial regression coefficient = −0.376, p < 0.001) and education (β = 0.209, p < 0.001) on the ACE-R J score. The same analysis limited to the control subjects (n = 73) also showed that age (β = −0.461, p < 0.001) and education (β = 0.322, p = 0.001) affected ACE-R J performance significantly.

CDR-SB (F(2, 239) = 181.78, p < 0.001), MMSE (F(2, 239) = 142.65, p < 0.001), and ACE-R J total (F(2, 239) = 248.12, p < 0.001) scores were significantly different between groups. On ACE-R J, subscores for attention and orientation (F(2, 239) = 93.15, p < 0.001), memory (F(2, 239) = 355.13, p < 0.001), fluency (F(2, 239) = 107.16, p < 0.001), language (F(2, 239) = 55.98, p < 0.001), and visuospatial ability (F(2, 239) = 33.02, p < 0.001) differed significantly among groups. Post hoc analysis revealed that the dementia group had lower scores in all five domains than the control and MCI groups (p < 0.001). The MCI group had lower scores than the control group in memory and fluency domains (p < 0.001), but there were no significant differences between the two groups in the attention and orientation (p = 0.058), language (p = 0.611), and visuospatial (p = 0.998) scores.

Demographics in dementia group (very mild, mild, and moderate dementia groups)

The dementia group (n = 130) was subdivided into three groups according to the CDR score: very mild (CDR = 0.5), mild (CDR = 1), and moderate dementia (CDR = 2) groups. In the very mild dementia group (n = 71), 68 patients were diagnosed with AD and three were diagnosed with AD with CVD, DLB, or FTD. In the mild dementia group (n = 52), 35 patients were diagnosed with AD, six with AD with CVD, three with VaD, four with DLB, and four with FTD. In the moderate dementia group (n = 7), three patients were diagnosed with AD, one with DLB, and three with FTD. Demographic parameters are shown in Table 2.

No significant differences in gender distribution, age, or education among groups were detected. CDR-SB (F(2, 127) = 277.25, p < 0.001), MMSE (F(2, 127) = 40.22, p < 0.001), and ACE-R J total score (F(2, 127) = 45.12, p < 0.001) were significantly different between groups (p < 0.001). In the ACE-R J subscores, attention and orientation (F(2, 127) = 36.28, p < 0.001), memory (F(2, 127) = 9.48, p < 0.001), fluency (F(2, 127) = 22.88, p < 0.001), language (F(2, 127) = 18.93, p < 0.001), and visuospatial (F(2, 127) = 18.60, p < 0.001) scores differed among groups. Post hoc analysis revealed that the very mild dementia group (CDR = 0.5) had significantly higher scores in all five subdomains than the mild (CDR = 1) and moderate (CDR = 2) dementia groups. The moderate dementia group had lower scores than the mild dementia group in attention and orientation (p = 0.001), fluency (p = 0.049), language (p = 0.049), and visuospatial (p = 0.002) domains, but there was no significant difference between the two groups in the memory domain (p = 0.234).

Table 1. Comparison of demographic data, CDR sum of boxes (CDR-SB), MMSE, ACE-R J total and subscores in control, MCI, and dementia groups (n = 242, standard deviation in parentheses)

                                     Control       MCI           Dementia      Dementia vs.  Dementia vs.  MCI vs.
                                     (n = 73)      (n = 39)      (n = 130)     control       MCI           control
                                                                               p value       p value       p value
Gender, male/female                  27/46         17/22         42/88
Age                                  66.3 (10.0)   71.4 (9.2)    75.4 (7.0)    ∗∗∗           ∗             ∗∗
Education, years                     12.7 (2.3)    11.4 (2.1)    11.1 (2.7)    ∗∗∗           n.s.          ∗
CDR-SB                               0.1 (0.2)     1.6 (1.2)     4.7 (2.2)     ∗∗∗           ∗∗∗           ∗∗∗
MMSE                                 29.0 (1.2)    26.7 (1.8)    21.5 (4.1)    ∗∗∗           ∗∗∗           ∗∗
ACE-R J total score (max. 100)       93.3 (3.9)    82.2 (6.4)    61.5 (12.9)   ∗∗∗           ∗∗∗           ∗∗∗
Attention and Orientation (max. 18)  17.9 (0.4)    16.8 (1.4)    13.4 (3.1)    ∗∗∗           ∗∗∗           n.s.
Memory (max. 26)                     23.6 (1.9)    16.9 (4.3)    8.6 (4.5)     ∗∗∗           ∗∗∗           ∗∗∗
Fluency (max. 14)                    11.0 (2.2)    8.8 (2.3)     5.7 (2.8)     ∗∗∗           ∗∗∗           ∗∗∗
Language (max. 26)                   25.3 (0.9)    24.5 (1.5)    20.9 (3.9)    ∗∗∗           ∗∗∗           n.s.
Visuospatial (max. 16)               15.6 (0.8)    15.2 (1.5)    13.0 (3.1)    ∗∗∗           ∗∗∗           n.s.

n.s. = not significant. ∗p < 0.05. ∗∗p < 0.01. ∗∗∗p < 0.001. Pairwise comparisons were performed using Bonferroni’s test.

Table 2. Comparison of demographic data, CDR sum of boxes (CDR-SB), MMSE, ACE-R J total and subscores in very mild (CDR 0.5), mild (CDR 1), and moderate (CDR 2) dementia groups (n = 130, standard deviation in parentheses)

                                     Very mild     Mild          Moderate      Moderate vs.  Moderate vs.  Mild vs.
                                     (CDR 0.5)     (CDR 1)       (CDR 2)       very mild     mild          very mild
                                     (n = 71)      (n = 52)      (n = 7)       p value       p value       p value
Gender, male/female                  19/52         21/31         2/5
Age                                  74.3 (7.4)    77.1 (6.1)    73.9 (8.2)    n.s.          n.s.          n.s.
Education, years                     11.0 (2.2)    11.0 (2.8)    12.4 (2.7)    n.s.          n.s.          n.s.
CDR-SB                               3.0 (0.6)     6.2 (1.3)     10.1 (0.7)    ∗∗∗           ∗∗∗           ∗∗∗
MMSE                                 23.6 (2.2)    19.7 (3.9)    14.3 (6.2)    ∗∗∗           ∗∗∗           ∗∗∗
ACE-R J total score (max. 100)       68.3 (8.3)    55.3 (11.5)   38.7 (12.9)   ∗∗∗           ∗∗∗           ∗∗∗
Attention and Orientation (max. 18)  14.9 (1.8)    12.0 (3.0)    8.1 (4.5)     ∗∗∗           ∗∗            ∗∗∗
Memory (max. 26)                     9.9 (4.2)     7.3 (4.5)     4.3 (2.8)     ∗∗            n.s.          ∗∗
Fluency (max. 14)                    6.9 (2.5)     4.5 (2.3)     2.1 (2.1)     ∗∗∗           ∗             ∗∗∗
Language (max. 26)                   22.5 (3.0)    19.4 (3.8)    16.0 (5.0)    ∗∗∗           ∗             ∗∗∗
Visuospatial (max. 16)               14.1 (2.4)    12.1 (3.0)    8.1 (4.2)     ∗∗∗           ∗∗            ∗∗∗


Normative data

Control data were used to generate normative scores for the ACE-R J total and subdomain scores based upon the mean minus two standard deviations (lower limit of normal) for four age bands (50–59, 60–69, 70–79, and 80–86 years old) and all age groups, as shown in Table 3.
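The lower limit of normal defined above is the control mean minus two standard deviations. A minimal sketch of that calculation follows; the function name and the toy scores are hypothetical, the use of the sample standard deviation is an assumption, and the result is left unrounded because the rounding rule used for the tabulated limits is not stated.

```python
from statistics import mean, stdev


def lower_limit_of_normal(control_scores):
    """Control mean minus two sample standard deviations, left unrounded."""
    return mean(control_scores) - 2 * stdev(control_scores)


# Hypothetical toy scores, NOT data from this study.
toy_control_totals = [90, 92, 94, 96, 98]
limit = lower_limit_of_normal(toy_control_totals)
```

Applied to the summary statistics reported for all controls, the same formula gives 93.3 − 2 × 3.9 = 85.5 for the ACE-R J total score.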

Among these four age groups, no significant differences in education were detected (F(3, 69) = 1.878, p = 0.141). ACE-R J total scores (F(3, 69) = 7.645, p < 0.001), attention and orientation (F(3, 69) = 3.043, p = 0.035), memory (F(3, 69) = 6.570, p = 0.001), fluency (F(3, 69) = 3.262, p = 0.027), and language subscores (F(3, 69) = 3.474, p = 0.021) differed among groups. There was no significant difference between groups in the visuospatial subscore (F(3, 69) = 1.135, p = 0.341). Post hoc analysis showed that the 70–79 and 80–86 age band groups had lower total scores than the 50–59 and 60–69 age band groups (p = 0.006–0.009). In the memory subscore, the 80–86 age band group had a lower score than the 50–59 and 60–69 age band groups (respectively, p = 0.003 and 0.004). In the language subscore, the 70–79 age band group had a lower score than the 50–59 and 60–69 age band groups (respectively, p = 0.04 and 0.032). Pairwise comparisons using Bonferroni correction did not show significant differences between groups in the attention and orientation domain and fluency domain.

Diagnostic interpretation

ROC analyses revealed the cut-off scores and sensitivity and specificity of ACE-R J and MMSE for identifying MCI and dementia (Table 4). For discriminating between control and MCI groups, the optimal cut-off score of ACE-R J was 88/89 (sensitivity 0.87, specificity 0.92) and that of MMSE was 26/27 (sensitivity 0.41, specificity 0.99). The AUC of ACE-R J (= 0.952) was significantly (p = 0.009) larger than that of the MMSE (= 0.868). For discriminating between control and dementia groups, the optimal cut-off score of ACE-R J was 82/83 (sensitivity 0.99, specificity 0.99) and that of MMSE was 26/27 (sensitivity 0.97, specificity 0.99). The AUC of ACE-R J (= 0.999) was slightly but not significantly (p = 0.097) larger than that of the MMSE (= 0.993).

We did not assess the utility of the VLOM ratio (the ratio of scores of verbal fluency plus language to orientation plus name and address delayed recall memory), which was designed to differentiate AD from FTD patients in the original ACE-R, because the number of FTD patients was small.

Table 3. Lower limit of normal (cut-off scores) of ACE-R J total and subscores for four age bands (50–59, 60–69, 70–79, 80–86 years old) and all age groups, showing control mean minus two standard deviations.

Age range              Education      Total score   Attention/     Memory      Fluency     Language    Visuospatial
                       (mean years)   (max. 100)    orientation    (max. 26)   (max. 14)   (max. 26)   (max. 16)
                                                    (max. 18)
50–59 (n = 18)         13.6           88            18             22          8           25          14
60–69 (n = 26)         13.0           89            18             21          8           25          15
70–79 (n = 21)         12.1           86            17             20          7           23          15
80–86 (n = 8)          11.9           81            17             18          5           24          16
All controls (n = 73)  12.7           86            18             20          7           24          15


Table 4. Sensitivity, specificity and positive predictive values (PPV) at different prevalence rates of optimal cut-off ACE-R J and MMSE scores for identifying MCI and dementia. Values in parentheses represent the respective negative predictive value.

MCI
           Optimal                                PPV at different prevalence rates
           cut-off score  Sensitivity  Specificity  1%           5%           10%          20%          40%
ACE-R J    88/89          0.87         0.92         0.10 (1.00)  0.36 (0.99)  0.54 (0.98)  0.73 (0.97)  0.88 (0.91)
MMSE       26/27          0.41         0.99         0.23 (0.99)  0.61 (0.97)  0.77 (0.94)  0.88 (0.87)  0.95 (0.72)

Dementia
           Optimal                                PPV at different prevalence rates
           cut-off score  Sensitivity  Specificity  1%           5%           10%          20%          40%
ACE-R J    82/83          0.99         0.99         0.42 (1.00)  0.79 (1.00)  0.89 (1.00)  0.95 (1.00)  0.98 (0.99)
MMSE       26/27          0.97         0.99         0.42 (1.00)  0.79 (1.00)  0.89 (1.00)  0.95 (0.99)  0.98 (0.98)
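Predictive values at a given prevalence follow from sensitivity and specificity by Bayes’ theorem. The sketch below shows that standard calculation for the ACE-R J cut-off for MCI; the function name is hypothetical, and because it starts from the rounded sensitivity and specificity quoted in the text, some results may differ in the second decimal place from the tabulated values, which were presumably computed from unrounded figures.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# ACE-R J at the 88/89 cut-off for MCI, using the rounded values from the text.
for prevalence in (0.01, 0.05, 0.10, 0.20, 0.40):
    ppv, npv = predictive_values(0.87, 0.92, prevalence)
    print(f"{prevalence:.0%}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

At 1% prevalence the PPV is only about 0.10 even though sensitivity and specificity are high, which illustrates why a positive screening result in a low-prevalence setting calls for further evaluation.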

AD, MCI, and controls: performance on ACE-R J

To investigate whether the inclusion of the FTD patients affects the results of this study, we reanalyzed the data of the controls (n = 73) and patients excluding FTD (n = 122). For discriminating between these groups, the optimal cut-off score of ACE-R J was 82/83 (sensitivity 0.99, specificity 0.99) and that of MMSE was 26/27 (sensitivity 0.98, specificity 0.99). The AUC of ACE-R J (= 0.999) was not significantly (p = 0.168) larger than that of the MMSE (= 0.995). These results were similar to the findings of the analyses including the FTD patients.

Then, to investigate the influence of the heterogeneity of the dementia group, we reanalyzed the data limited to the patients with AD (n = 106), MCI (n = 39), and controls (n = 73). The data of the AD patients were as follows: gender = 31 males and 75 females; age = 74.9 ± 7.2 years (mean ± standard deviation); years of education = 11.2 ± 2.4; CDR-SB score = 4.3 ± 2.0; MMSE score = 22.0 ± 3.7; ACE-R J total score = 63.3 ± 11.8; attention and orientation subscore = 13.8 ± 2.9; memory subscore = 8.7 ± 4.1; fluency subscore = 6.0 ± 2.7; language subscore = 21.5 ± 3.6; and visuospatial subscore = 13.3 ± 3.0.

The AD group was significantly older (p < 0.001) and less educated (p < 0.001) than the control group, but there were no significant differences between the AD and MCI groups in the demographic parameters. The AD group had higher CDR-SB scores (p < 0.001) and lower scores of MMSE, ACE-R J total, and all subdomains (p < 0.001) than the control and MCI groups. For discriminating between control and AD groups, the optimal cut-off score of ACE-R J was 82/83 (sensitivity 0.99, specificity 0.99) and that of MMSE was 26/27 (sensitivity 0.97, specificity 0.99). The AUC of ACE-R J (= 0.999) was not significantly (p = 0.167) larger than that of the MMSE (= 0.993). These findings suggested that the heterogeneity of the dementia group did not affect the results of this study.

Additionally, we took subgroups of AD (n = 24) and MCI (n = 24) patients individually matched to a control group (n = 24) for gender, age, and education to exclude the influences of these demographic parameters (Table 5). Intergroup comparison showed that the AD subgroup had lower scores than the control and MCI subgroups in ACE-R J total and all subdomain scores (p < 0.001). The MCI subgroup had lower scores than controls in total score and memory and fluency subdomains (p < 0.001), but there was no significant difference in attention and orientation (p = 0.292), language (p = 0.995), or visuospatial (p = 0.953) subdomains. ROC analyses of the matched groups revealed the cut-off scores for identifying MCI (88/89; sensitivity 0.88, specificity 0.96) and AD (83/84; sensitivity 0.99, specificity 0.99). These results were similar to the findings of the whole group analyses.

Table 5. Comparison of performance on ACE-R J (total and subscores) in controls, MCI, and AD subgroups matched for gender, age, and education (n = 24 per group). ACE-R J subscores shown as mean (proportion to the maximum subscore).

                           Controls       MCI            AD             AD vs.     AD vs.     MCI vs.
                           (n = 24)       (n = 24)       (n = 24)       controls   MCI        controls
                                                                        p value    p value    p value
ACE-R J total score        93.0           81.2           58.8           ∗∗∗        ∗∗∗        ∗∗∗
Attention and Orientation  17.9 (99.5%)   16.8 (93.5%)   12.8 (70.8%)   ∗∗∗        ∗∗∗        n.s.
Memory                     23.2 (89.3%)   16.1 (61.9%)   7.0 (26.9%)    ∗∗∗        ∗∗∗        ∗∗∗
Fluency                    11.2 (80.1%)   8.5 (60.4%)    5.3 (37.8%)    ∗∗∗        ∗∗∗        ∗∗∗
Language                   25.0 (96.3%)   24.5 (94.2%)   21.2 (81.4%)   ∗∗∗        ∗∗∗        n.s.
Visuospatial               15.7 (97.9%)   15.3 (95.6%)   12.5 (78.4%)   ∗∗∗        ∗∗∗        n.s.

Reliability and validity

Inter-rater reliabilities of ACE-R J and MMSE were very good, as evaluated with ICC of 0.999 and 0.997, respectively. Test-retest reliabilities of the instruments were also good (ICC = 0.883 and 0.764 for ACE-R J and MMSE, respectively). The internal consistency reliability for ACE-R J (26 items) was very high (Cronbach’s coefficient α = 0.903).

Correlations of ACE-R J total score with CDR (Spearman’s ρ = −0.851, p < 0.001) and CDR-SB (Spearman’s ρ = −0.895, p < 0.001) scores were very good. The negative value reflects the fact that as CDR and CDR-SB scores increase, the ACE-R J total score decreases.

Discussion

The Japanese version of ACE-R proved to be a sensitive and specific cognitive instrument for the diagnosis of MCI and dementia in our sample. Particularly for identifying MCI, ACE-R J was superior to the conventional MMSE in accuracy. The inter-rater reliability, test-retest reliability, internal consistency, and concurrent validity of this instrument were also good. These findings suggest that ACE-R J is a valid and reliable screening tool.

ACE-R J and MMSE had equivalent diagnostic accuracies for detecting dementia. The AUC of the former was slightly, but not significantly, larger than that of the latter. Similar ratings of ACE-R were reported for other language versions (Alexopoulos et al., 2010; Carvalho et al., 2010). Although ACE-R takes a few minutes more to administer than MMSE, the ACE-R assesses a broader range of cognitive abilities than MMSE, especially on memory, language, and visuospatial components. Further, MMSE lacks a component measuring executive function, but ACE-R includes tasks to measure verbal fluency, which is considered to be associated with frontal lobe function. Thus, we consider that ACE-R can provide more useful and precise information than MMSE for the differential diagnosis of dementia subtypes.

We modified ACE-R mainly in the memory and language subdomains for cultural adaptation, and changed the scores of the phonological fluency test. In spite of these modifications, the Japanese version of ACE-R was equivalent to the original ACE-R in important aspects as a cognitive screening test. The optimal cut-off scores for identifying MCI (88/89) and dementia (82/83) were similar to the original (higher (88) and lower (82) cut-off scores for identifying dementia). The ACE-R total and subscores of the control group were almost identical to each other in both studies (mean total score of ACE-R J vs. original ACE-R: 93.3 vs. 93.7; mean attention and orientation subscore: 17.9 vs. 17.7; mean memory subscore: 23.6 vs. 23.4; mean fluency subscore: 11.0 vs. 11.9; mean language subscore: 25.3 vs. 25.1; and mean visuospatial subscore: 15.6 vs. 15.7).

However, some minor differences were also found between the two studies. The participants of this study were somewhat older than those in the original study (mean age of control group: 66.3 vs. 64.4, MCI group: 71.4 vs. 68.8, dementia group: 75.4 vs. 65.7). In the MCI and dementia groups, the MMSE and ACE-R scores in our study were similar to those of the original, except for the apparently lower memory subdomain score in the Japanese dementia group (mean memory subscore: 8.6 vs. 12.4). The difference in the memory subdomain might be due to the difficulty of the memory task in the modified Japanese version, or to the larger number of AD patients and smaller number of FTD patients in our study than in the original study (106 AD and 8 FTD patients in our study compared to 67 AD and 55 FTD patients in the original study).

In both studies, the dementia groups had significantly lower scores in ACE-R total and all subdomains. However, in this study the MCI group had a significantly lower score than the control group only in the total score and in the memory and fluency subdomains. Moreover, in the comparison between matched groups, the MCI group had a slightly, but not significantly, lower score than the controls in the other three subdomains (attention and orientation, language, and visuospatial subdomains). On the other hand, in the original study, the MCI group had significantly lower scores in the memory, fluency, and language subdomains than the controls, and the matched MCI group had significantly lower scores than the matched controls in the four subdomains other than the visuospatial subdomain. Unlike in the original study, the difference between the MCI and control groups in subdomains other than memory and fluency was unclear in our sample. Because the Bonferroni correction is very conservative, the power to detect differences among groups might have been diminished in our study (Bland and Altman, 1995).
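The conservativeness of the Bonferroni correction can be illustrated with a short sketch. The number of comparisons (six) and the example p-value used below are hypothetical and are not taken from this study; only the rule itself, dividing the nominal significance level by the number of tests, is standard.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value: float, alpha: float, n_tests: int) -> bool:
    """True if p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


if __name__ == "__main__":
    # Hypothetical values: six comparisons at a nominal level of 0.05.
    alpha, n_tests, p_value = 0.05, 6, 0.02
    print(bonferroni_threshold(alpha, n_tests))
    print(is_significant(p_value, alpha, n_tests))  # corrected for six tests
    print(is_significant(p_value, alpha, 1))        # uncorrected single test
```

Under these assumed values, a difference with p = 0.02 would count as significant as a single test but not after correction for six comparisons, which is the kind of loss of power referred to above.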

Comparisons within the dementia group suggest that ACE-R J might differentiate very mild dementia (CDR = 0.5) from more severe dementia, but the difference between mild (CDR = 1) and moderate (CDR = 2) dementia was somewhat unclear, and some subscores of ACE-R J, for example the memory subscore, might suffer from a floor effect. Using more data from patients with severe dementia, we need to examine whether ACE-R J can monitor progressive cognitive decline in the advanced stages of dementia.

The heterogeneity of the dementia patients and demographic parameters did not affect the results of group comparisons and the cut-off scores in this study, but age and education had a significant influence on the performance of ACE-R J. Younger participants had higher total ACE-R J scores than older individuals, and those with higher levels of education had higher total scores than participants with fewer years of education. Similar effects of age and education have been reported for the Brazilian (Carvalho et al., 2010), Greek (Konstantinopoulou et al., 2010), and Korean (Kwak et al., 2010) versions of ACE-R. Moreover, we found the same effects of age on the ACE-R J total, memory, and language scores in the analysis of normative data, whereas the original study reported no significant effect of age (Mioshi et al., 2006). We hope to examine the detailed correlation of ACE-R J total and subscores with age and years of education in a future study.

Our study has other limitations. First, we could not examine the construct validity for subdomains of ACE-R J and its usefulness to discriminate AD patients from other dementing disorders (e.g. FTD, primary progressive aphasia, and DLB). Second, the participants were recruited at a university center. Thus, our results apply only to a clinic-based patient population. The applicability and reliability of ACE-R J in community samples require further investigation. Third, we used the clinical diagnosis based on a comprehensive diagnostic workup and international diagnostic criteria as the gold standard. Despite the high validity of the diagnostic criteria, clinical diagnoses are not always confirmed at autopsy. Thus, we should take into account the possibility of erroneous clinical assessments, and the validity of ACE-R J may be lower than our results suggest. However, the Japanese version of ACE-R seems to be an accurate instrument for detection of MCI and mild dementia, and should be widely used in clinical practice.

Conflict of interest

None.

Description of authors’ roles

H. Yoshida designed the study, analyzed the data, and wrote the paper. H. Honda, Y. Kishimoto, N. Takeda, E. Oshima, K. Hirayama, and O. Yokota collected the data. S. Terada and Y. Uchitomi supervised the study design, participated in data analysis and assisted with writing the paper.

Acknowledgments

This work was supported by grants from the Japanese Ministry of Education, Culture, Sports, Science and Technology, and the Zikei Institute of Psychiatry. We would like to thank Professor John Hodges and Dr. Eneida Mioshi for granting permission to translate the ACE-R into Japanese, and Ms. Ido, Ms. Horiuchi, Ms. Imai, and Ms. Yabe for their skillful assistance with this study. Copies of ACE-R J can be obtained from the website of “FRONTIER – Frontotemporal Dementia Research Group” at http://www.ftdrg.org.

References

Alexopoulos, P. et al. (2010). Validation of the German revised Addenbrooke’s Cognitive Examination for detecting mild cognitive impairment, mild dementia in Alzheimer’s disease and frontotemporal lobar degeneration. Dementia and Geriatric Cognitive Disorders, 29, 448–456. doi: 10.1159/000312685.

Berg, L., Miller, J. P., Baty, J., Rubin, E. H., Morris, J. C. and Figiel, G. (1992). Mild senile dementia of the Alzheimer type: 4. Evaluation of intervention. Annals of Neurology, 31, 242–249. doi: 10.1002/ana.410310303.

Bland, J. M. and Altman, D. G. (1995). Multiple significance tests: the Bonferroni method. BMJ, 310, 170.

Carvalho, V. A., Barbosa, M. T. and Caramelli, P. (2010). Brazilian version of the Addenbrooke’s Cognitive Examination-revised in the diagnosis of mild Alzheimer disease. Cognitive and Behavioral Neurology, 23, 8–13. doi: 10.1097/WNN.0b013e3181c5e2e5.

Cronbach, L. J. and Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.

Folstein, M. F., Folstein, S. E. and McHugh, P. R. (1975). “Mini-mental state”: a practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12, 189–198. doi: 10.1016/0022-3956(75)90026-6.

Ganguli, M., Snitz, B. E., Lee, C. W., Vanderbilt, J., Saxton, J. A. and Chang, C. C. (2010). Age and education effects and norms on a cognitive test battery from a population-based cohort: the Monongahela-Youghiogheny Healthy Aging Team. Aging and Mental Health, 14, 100–107. doi: 10.1080/13607860903071014.

Ito, E., Hatta, T., Ito, Y., Kogure, T. and Watanabe, H. (2004). Performance of verbal fluency tasks in Japanese healthy adults: effect of gender, age and education on the performance. Japanese Journal of Neuropsychology, 20, 254–263 (in Japanese).

Konstantinopoulou, E., Kosmidis, M. H., Ioannidis, P., Kiosseoglou, G., Karacostas, D. and Taskos, N. (2010). Adaptation of Addenbrooke’s Cognitive Examination-Revised for the Greek population. European Journal of Neurology, Epublished ahead of print. doi: 10.1111/j.1468-1331.2010.03173.x.

Kwak, Y. T., Yang, Y. and Kim, G. W. (2010). Korean Addenbrooke’s Cognitive Examination Revised (K-ACER) for differential diagnosis of Alzheimer’s disease and subcortical ischemic vascular dementia. Geriatrics and Gerontology International, 10, 295–301. doi: 10.1111/j.1447-0594.2010.00624.x.

Lonie, J. A., Tierney, K. M. and Ebmeier, K. P. (2009). Screening for mild cognitive impairment: a systematic review. International Journal of Geriatric Psychiatry, 24, 902–915. doi: 10.1002/gps.2208.

Mathuranath, P. S., Nestor, P. J., Berrios, G. E., Rakowicz, W. and Hodges, J. R. (2000). A brief cognitive test battery to differentiate Alzheimer’s disease and frontotemporal dementia. Neurology, 55, 1613–1620.

McKeith, I. G. et al. (2005). Diagnosis and management of dementia with Lewy bodies: third report of the DLB Consortium. Neurology, 65, 1863–1872.

McKhann, G., Drachman, D., Folstein, M., Katzman, R., Price, D. and Stadlan, E. M. (1984). Clinical diagnosis of Alzheimer’s disease: report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer’s Disease. Neurology, 34, 939–944.

Mioshi, E., Dawson, K., Mitchell, J., Arnold, R. and Hodges, J. R. (2006). The Addenbrooke’s Cognitive Examination Revised (ACE-R): a brief cognitive test battery for dementia screening. International Journal of Geriatric Psychiatry, 21, 1078–1085. doi: 10.1002/gps.1610.

Mitchell, A. J. (2009). A meta-analysis of the accuracy of the Mini-Mental State Examination in the detection of dementia and mild cognitive impairment. Journal of Psychiatric Research, 43, 411–431. doi: 10.1016/j.jpsychires.2008.04.014.

Mitchell, J., Arnold, R., Dawson, K., Nestor, P. J. and Hodges, J. R. (2009). Outcome in subgroups of mild cognitive impairment (MCI) is highly predictable using a simple algorithm. Journal of Neurology, 256, 1500–1509. doi: 10.1007/s00415-009-5152-0.

Morris, J. C. (1993). The Clinical Dementia Rating (CDR): current version and scoring rules. Neurology, 43, 2412–2414.

Neary, D. et al. (1998). Frontotemporal lobar degeneration: a consensus on clinical diagnostic criteria. Neurology, 51, 1546–1554.

Petersen, R. C. (2004). Mild cognitive impairment as a diagnostic entity. Journal of Internal Medicine, 256, 183–194. doi: 10.1111/j.1365-2796.2004.01388.x.

Román, G. C. et al. (1993). Vascular dementia: diagnostic criteria for research studies. Report on the NINDS-AIREN International Work Group. Neurology, 43, 250–260.

Snodgrass, J. G. and Vanderwart, M. (1980). A standardized set of 260 pictures: norms for name agreement, image agreement, familiarity, and visual complexity. Journal of Experimental Psychology: Human Learning and Memory, 6, 174–215.

Souma, Y. and Tanabe, H. (2003). Shitsugo no shoukougaku (Symptomatology of Aphasia). Tokyo: Igaku-Shoin Ltd.

Sugishita, M. (2001). Wechsler Memory Scale-Revised (WMS-R) Japanese version. Tokyo: Nihon Bunka Kagakusha Co. Ltd.

Sugishita, M. (2009). Present status and perspective of cognitive assessment in dementia. Dementia Japan, 23, 55–63 (in Japanese).

Vergara, I. A., Norambuena, T., Ferrada, E., Slater, A. W. and Melo, F. (2008). StAR: a simple tool for the statistical comparison of ROC curves. BMC Bioinformatics, 9, 265. doi: 10.1186/1471-2105-9-265.

Warrington, E. K. and James, M. (1991). Visual Object and Space Perception Battery. Bury St. Edmunds: Thames Valley Test Company.