## Calculating the number of people with Alzheimer’s disease in any country using saturated mutation models of brain cell loss that also predict widespread natural immunity to the disease

Ivan Kramer*

Department of Physics, University of Maryland Baltimore County, 1000 Hilltop Circle, Catonsville, MD 21250, USA

(Received 3 May 2007; final version received 16 March 2009)

The series of mutations that cause brain cells to spontaneously and randomly die leading to Alzheimer’s disease (AD) is modelled. The prevalence of AD as a function of age in males and females is calculated from two very different mutation models of brain cell death. Once the prevalence functions are determined, the number of people with AD in any country or city can be estimated.

The models developed here depend on three independent parameters: the number of mutations necessary for a brain cell associated with AD to spontaneously die, the average time between mutations, and the fraction of the risk population that is immune to developing the disease, if any. The values of these parameters are determined by fitting the model’s AD incidence function to the incidence data.

The best fits to the incidence rate data predict that as much as 74.1% of males and 79.5% of females may be naturally immune to developing AD. Thus, the development of AD is not a normal or inevitable result of the aging process. These fits also predict that males and females develop AD through different pathways, requiring a different number of mutations to cause the disease. The number of people in the USA with AD in the year 2000 is estimated to be 451,000.

It is of paramount importance to determine the nature of the immunity to AD predicted here. Finding ways of blocking the mutations leading to the random, spontaneous death of memory brain cells would prevent AD from developing altogether.

Keywords: Alzheimer’s disease; mutation model; prevalence; incidence rate; brain cell loss

1. The ordered mutation model of late-onset Alzheimer’s disease

Alzheimer’s disease (AD) begins when neurons in the brain spontaneously begin dying in a disorganized way, taking some memory with them. Research data suggest that AD results from a series of random missense mutations in brain cells. In this section, the possibility that all these mutations occur one-by-one in a definite order in a neuron until the final mutation causes the cell to die will be explored. For example, in this ordered mutation model a normal, unmutated brain cell can only mutate into the first mutation state and no other, and cells that are in the first mutation state can only mutate into the second mutation state and no other, etc. It is assumed that the characteristics of the cell in each mutation state are unique and different from the characteristics of the cell in every other mutation state. It is further assumed that AD progresses as more and more neurons die. At some point, the disease impacts the central nervous system to such an extent that the patient dies.

ISSN 1748-670X print/ISSN 1748-6718 online
© 2010 Taylor & Francis
DOI: 10.1080/17486700902910076 http://www.informaworld.com
*Email: kramer@umbc.edu
Vol. 11, No. 2, June 2010, 119–159

Clearly, the number of mutations in a brain cell necessary to cause the onset of clinical symptoms of AD in males or females (risk populations) is important to determine. Equally important to ascertain is the average time a cell spends in any particular mutation state, a quantity that will be called the mutation lifetime of the state. Perhaps the most important thing to calculate from modelling the AD incidence rate in a particular risk population is the prevalence of natural immunity to AD if, indeed, any exists. The model to be presented below will enable the values of all three of these quantities to be calculated by fitting the model’s incidence function to AD incidence data.

An important, novel feature of the AD model constructed here is that it is inherently saturated, i.e. the maximum percentage of a risk population that can develop AD can never exceed 100%. Thus, the model’s AD incidence rate function always increases, peaks, and subsequently declines towards zero as the age of the risk population increases to arbitrarily large values (in practice, the peak could lie above the maximum human lifespan).

In this model, it will be assumed that an ordered series of mutations, numbering $m$, of a brain cell is required for the cell to spontaneously die. At any age $t$, every brain cell of a person is in one of the mutation states. The probability or risk of a person developing AD is exactly equal to the probability that a random brain cell was mutated to death in this fashion.

Suppose the total number of brain cells in the average member of a risk population is denoted by $N_T$. If a fraction $f_s$ of these cells are susceptible to the ordered mutations leading to death of this model, then the number of susceptible brain cells in the average brain of the risk population is $N_s = f_s N_T$.

Consider a random representative cohort of a risk population whose members all have the same age $t$. Here, $t = 0$ is coincident with birth. It will also be assumed that a total of $m$ mutations are required in a brain cell for it to spontaneously die.

Then, the number of brain cells in the average brain of the cohort that are in the $r$th mutation state at age $t$ will be denoted by $N(r/m, t)$, or $N_r(t)$ for short, where $r = 0, 1, 2, \ldots, m$. Assuming that all cells in the average brain are in the zeroth mutation state at birth, then at age $t = 0$, $N_r(0) = 0$ for $r = 1, 2, 3, \ldots, m$, and $N_0(0) = N_s$. A schematic representation of the values of the $N_r(t)$ at age $t$ is shown in Figure 1, where the ordered mutations are represented by the arrows connecting two sequential mutation states.

Consider the first mutation of a brain cell necessary to cause AD. The fraction of an average brain with no mutated cells that experiences the first mutation per unit time will be denoted by $k_1$ and called the first mutation rate. The average time required for a brain cell to experience the first mutation will be defined as the first mutation lifetime and will be denoted by $T_1$. The mutation lifetime and the mutation rate will be shown to be reciprocals of each other so that $k_1 \equiv T_1^{-1}$. Continuing in this way, the number of cells in the average brain $N_r(t)$ that are in the $r$th mutation state at age $t$ depends on the values of the $r$ mutation rates $(k_r) \equiv [k_1, k_2, \ldots, k_r]$, as depicted in Figure 1. It will be assumed that all the mutation rates are constants so that the mutations experienced by the cohort occur randomly.

The Appendix contains a complete description of how the functions $N_r(t)$ are computed from the model just described. One of the most important features of the model is that it allows the possibility that a fraction $f_i$ of the brain cells are naturally immune to the mutation process leading to death described here. The fraction $f_s$ of brain cells that are susceptible to mutation to death is related to the fraction that is immune to it since $f_s + f_i = 1$.
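The chain of mutation states described above can be explored numerically. The following is a minimal sketch, assuming first-order kinetics of the decay-chain type (each state $r$ is fed by state $r-1$ at rate $k_r$ and drained at rate $k_{r+1}$); the rate equations themselves are in the Appendix and are not reproduced in this section, so the form used in `deriv` is an assumption. The function name `integrate_ordered_chain`, the step size and the illustrative parameter values are hypothetical and are not fitted values.

```python
def integrate_ordered_chain(rates, f_s, t_end, dt=0.05, n_total=1.0):
    """Integrate the ordered chain N_0 -> N_1 -> ... -> N_m with classical RK4.

    rates   : [k_1, ..., k_m], constant mutation rates (per year)
    f_s     : susceptible fraction, so that N_0(0) = N_s = f_s * N_T
    n_total : N_T, total number of brain cells (1.0 means "work in fractions")
    Returns the list [N_0(t_end), ..., N_m(t_end)].
    """
    m = len(rates)
    state = [f_s * n_total] + [0.0] * m  # all susceptible cells start unmutated

    def deriv(n):
        # Assumed first-order (decay-chain) kinetics; not copied from the Appendix.
        d = [0.0] * (m + 1)
        d[0] = -rates[0] * n[0]
        for r in range(1, m):
            d[r] = rates[r - 1] * n[r - 1] - rates[r] * n[r]
        d[m] = rates[m - 1] * n[m - 1]  # dead cells accumulate in the last state
        return d

    def shifted(n, d, h):
        return [x + h * dx for x, dx in zip(n, d)]

    for _ in range(int(round(t_end / dt))):
        a = deriv(state)
        b = deriv(shifted(state, a, dt / 2))
        c = deriv(shifted(state, b, dt / 2))
        e = deriv(shifted(state, c, dt))
        state = [x + dt * (p + 2 * q + 2 * r + s) / 6
                 for x, p, q, r, s in zip(state, a, b, c, e)]
    return state


# Hypothetical illustration values (not fitted parameters from the paper).
final = integrate_ordered_chain(rates=[0.1, 0.1, 0.1], f_s=0.5, t_end=200.0)
prevalence = final[-1] / 1.0  # N_m(t) / N_T with N_T = 1
print("sum of all states:", sum(final))
print("fraction of cells mutated to death:", prevalence)
```

In this sketch the sum of all states stays equal to $N_s$ throughout, and the dead-cell fraction approaches $f_s$ at large ages, which is the saturation property the model is built around.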

The probability $P(t)$ that a neuron will undergo the ordered set of $m$ mutations and die at age $t$ is given by

$$P(t) = \frac{N_m(t)}{N_T}. \tag{1}$$

[Schematic: $N_0(t) \xrightarrow{k_1} N_1(t) \xrightarrow{k_2} N_2(t) \xrightarrow{k_3} \cdots \xrightarrow{k_{m-1}} N_{m-1}(t) \xrightarrow{k_m} N_m(t)$ (Alzheimer’s disease), with $N_s = N_0(t) + N_1(t) + N_2(t) + \cdots + N_{m-1}(t) + N_m(t)$.]

Figure 1. Mutation model of brain cells leading to AD.

[Plot (a): male AD IR(t) mean value and standard deviation error; vertical axis IR(t) mean, horizontal axis age t in years.]

Figure 1a. Male AD data for annual incidence rate IR(t) mean value and standard deviation error as a function of age t in years from Table 1(c).

This probability is also equal to the probability of developing AD at age $t$ in the risk population, a quantity also known as the prevalence of AD at age $t$. Thus, in a population of a total number of $H_T$ humans, if the number of humans that have been diagnosed with AD by age $t$ is denoted by $H_{AD}(t)$, then

$$P(t) = \frac{H_{AD}(t)}{H_T} = \frac{N_m(t)}{N_T}. \tag{2}$$

The fraction of the population that comes down with AD per unit time at age $t$ (the fractional AD incidence rate) is given by

$$\mathrm{IR}(m, (k_m), t) \equiv \frac{1}{H_T}\frac{dH_{AD}(t)}{dt} = \frac{dP(t)}{dt}, \tag{3}$$

where $m$ is the number of mutations necessary to cause an average brain cell to spontaneously die, $(k_m)$ is the set of $m$ mutation rates, and $f_s$ is the fraction of brain cells that are susceptible to mutation to death. All these model parameters are determined by fitting the incidence rate function to AD incidence data.

The AD incidence rate IR(t) and prevalence P(t) functions arising from the ordered mutation model described here are derived in the Appendix. For the particularly simple case when all the mutation rates are equal to each other, the AD incidence rate is given by the simple function in (A8) in the Appendix; integrating this function gives the AD prevalence function P(t) given in (A7). Notice that as the age $t$ of the cohort becomes arbitrarily large, the prevalence function in (A7) approaches $f_s$ in value. Since $f_s \le 1$, the prevalence function is naturally saturated in that it can never exceed unity in value. The incidence rate function given in (A8) characteristically rises monotonically to a peak as the age $t$ of the cohort increases from birth and then decreases monotonically to zero as $t$ continues increasing. Thus, the area under the IR(t) curve, namely P(t), is finite and can never exceed $f_s$ in value. These particularly simple functions will be shown to give surprisingly good fits to the AD data.

[Plot (b): male AD mean prevalence P(t) and standard deviation error; vertical axis P(t) mean, horizontal axis age t in years.]

Figure 1b. Male AD data for mean prevalence P(t) and standard deviation error as a function of age t in years from Table 1(b2).

Fitting the incidence rate function IR(t) to the AD incidence rate data determines the values of the model parameters. If every member of the population is susceptible to developing AD, then $f_s = 1$ and the fit involves determining the values of the remaining model parameters.

The saturated ordered mutation model constructed and solved in this paper is isomorphic to the physical model describing an ordered chain of radioactive nuclei decays, with the exception that it allows for the possibility that a fraction of a risk population may be immune to developing AD.

2. The unordered, independent mutation model of AD

A completely different model of the mutations in a brain cell that lead to AD postulates that every mutation occurs independently of all the others. In terms of Figure 1, only the first transition is permitted for each mutation. From Equation (A3a) in the Appendix with $f_s = 1$, the fraction of brain cells that have made the first mutation necessary to cause AD as a function of age is

$$p_1(t) \equiv 1 - \exp(-k_1 t), \tag{4}$$

where the transition constant $k_1$ is related to the average time $T_1$ necessary for this mutation to occur in a cell by $T_1 = 1/k_1$. If $m$ independent mutations are required for a brain cell to be destroyed, then the probability that a cell will be destroyed at age $t$ is given by

$$P(t) = p_1(t)\,p_2(t)\,p_3(t) \cdots p_{m-1}(t)\,p_m(t), \tag{5}$$

where the values of the $m$ mutation constants $k_1, k_2, \ldots, k_{m-1}, k_m$ are all independent of each other in general. Notice that the mutations here are not ordered, as they were in the model described in section 1 above, but are completely independent of each other. Thus, in this model the mutations can occur in any order, simultaneously, or at completely different times.

In the limit when $k_i t \ll 1$ for all $i$, (5) gives

$$P(t) = k_1 k_2 k_3 \cdots k_m \times t^m, \qquad k_i t \ll 1. \tag{6}$$

This result is to be compared with (A4) in the Appendix obtained from the ordered model.
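The step from (5) to (6) can be spelled out as follows. This is only the leading-order Taylor expansion of each factor in (5), written with the same symbols as (4) and (5); no result from the Appendix is assumed.

```latex
\begin{align*}
p_i(t) &= 1 - \exp(-k_i t)
        = k_i t - \tfrac{1}{2}(k_i t)^2 + \cdots
        \approx k_i t, \qquad k_i t \ll 1, \\
P(t)   &= \prod_{i=1}^{m} p_i(t)
        \approx \prod_{i=1}^{m} k_i t
        = k_1 k_2 \cdots k_m \, t^{m}.
\end{align*}
```

At ages that are small compared with every mutation lifetime $1/k_i$, the prevalence therefore grows as the $m$-th power of age.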

The simplest such independent mutation model occurs if all the mutation constants have the same positive value $k$. Under these circumstances, using Equations (4) and (5), the probability that a brain cell will be destroyed at time $t$ becomes

$$P(t) = [1 - \exp(-kt)]^m. \tag{7}$$

The quantity $k$ defined in (7) can be regarded as an effective mutation constant that reproduces the experimental value of P(t) in a simplified simulation of the data.

In the general case, where only a fraction $f_s$ of the population is susceptible to developing AD, the prevalence of AD at age $t$ is then given by

$$P(t) = f_s [1 - \exp(-kt)]^m. \tag{8}$$

Since P(t) also represents the probability of developing AD at age $t$ in the population, the incidence rate of AD at age $t$ is given by

$$\mathrm{IR}(t) = \frac{dP(t)}{dt} \equiv I'(f_s, k, m) = f_s\, m\, k\, [1 - \exp(-kt)]^{m-1} \exp(-kt). \tag{9}$$

Thus, in a simplified simulation of AD incidence rate data using the independent mutation model, the values of the three independent parameters, $m$, $k$ and $f_s$, are returned by the fit.

Notice that as the age $t$ of the cohort increases, the prevalence function P(t) in Equation (8) saturates at the value of $f_s \le 1$. Thus, the model prevalence function automatically satisfies the physical requirement that its value never exceed one. Thus, both the ordered and independent mutation models are naturally saturated, as they must be to be physically realistic.
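Equations (8) and (9) are simple enough to evaluate directly. The sketch below implements them and checks numerically that (9) is the derivative of (8) and that the prevalence saturates at $f_s$. The helper `peak_age` is an addition that is not stated in the text: setting the derivative of (9) to zero gives $\exp(-kt) = 1/m$, so the incidence rate peaks at $t = \ln(m)/k$. The parameter values are hypothetical illustration values, not fitted results.

```python
import math


def prevalence(t, f_s, k, m):
    """Equation (8): P(t) = f_s * [1 - exp(-k t)]^m."""
    return f_s * (1.0 - math.exp(-k * t)) ** m


def incidence(t, f_s, k, m):
    """Equation (9): IR(t) = f_s * m * k * [1 - exp(-k t)]^(m-1) * exp(-k t)."""
    x = math.exp(-k * t)
    return f_s * m * k * (1.0 - x) ** (m - 1) * x


def peak_age(k, m):
    """Age at which Equation (9) is largest: exp(-k t) = 1/m, i.e. t = ln(m)/k."""
    return math.log(m) / k


# Hypothetical illustration values (not the paper's fitted parameters).
F_S, K, M = 0.3, 0.05, 20

t = 70.0
h = 1e-4
numerical_slope = (prevalence(t + h, F_S, K, M) - prevalence(t - h, F_S, K, M)) / (2 * h)
print("IR(70) from (9):        ", incidence(t, F_S, K, M))
print("dP/dt at 70, numerical: ", numerical_slope)
print("P at very large age:    ", prevalence(1.0e4, F_S, K, M))
print("age of peak incidence:  ", peak_age(K, M))
```

Because the peak age depends only on $m$ and $k$, a fit with a small $k$ or a large $m$ can place the peak beyond the age range covered by the data.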

3. AD incidence data

The most extensive analysis of the incidence of AD comes from the Rochester, Minnesota study [1,2]. The original data was compiled over a 25-year period (1960–1984) by Kokmen et al. [1]. This data was reanalysed and extended by Rocca et al. [2] for the last 10 years of the study (1975–1984). The modelling of AD in this paper will be based on the data in the Rocca [2] reanalysis.

The incidence of AD in men and women for Rochester, Minnesota in Ref. [2] is reproduced here in Tables 1(a) and 2(a), respectively, for convenience. The fractions that appear in these tables are incidences over 5-year intervals. To get annual incidences, which are required in the modelling in this paper, these fractions must be divided by 5.

Since females live longer than males, the female cohorts in Table 2(a) are substantially larger than the corresponding male cohorts appearing in Table 1(a), especially above 80 years of age; thus, the female AD data is more reliable than the male data. The incidence data in these tables differ from the incidence data in Ref. [1] in one important respect: the population numbers appearing in the denominators in Tables 1(a) and 2(a) were those within an age interval that were initially free of dementia. In the incidence data in Ref. [1], all residents of Rochester, Minnesota within an age interval were included in the denominators of the AD incidence tables, including those that were previously diagnosed with dementia. Since the goal of the modelling in this paper is to compute the prevalence of AD as a function of age, it will be necessary to convert the form of incidence data in Ref. [2] back into the form for the incidence data used in Ref. [1].

If $H_T$ is a random sample of a risk population at birth (age $t = 0$) and $H_{AD}(t)$ is the number within this population that has developed AD at age $t$, then the prevalence of AD is defined as $P(t) = H_{AD}(t)/H_T$. The incidence rate of AD at any age $t$ used in Ref. [1] was defined as $(dH_{AD}(t)/dt)/H_T = dP(t)/dt$, which will be denoted by IR(t) for short. If this expression is applied to any age interval for AD, then the denominator $H_T$ can be approximated by the total number of city residents with ages falling within this interval, and the numerator is the annual number of these residents who developed AD with ages within this interval. Although not all people diagnosed with dementia have AD, in the case of Rochester, Minnesota, the great majority did. Thus, to a very good approximation of the data, it will be assumed that all dementia cases are cases of AD.

The error induced in the modelling results by making this simplifying assumption is certainly far less than the observed variation in the incidence of AD from one year to the next.

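Table 1(a). Male AD incidence data per 5-year age interval from Rocca et al. [2].

Male AD incidence IR*(t) over 5-year age intervals (a)

| Year | 60–64 years ($u_{0,1}$) | 65–69 years ($u_{1,2}$) | 70–74 years ($u_{2,3}$) | 75–79 years ($u_{3,4}$) | 80–84 years ($u_{4,5}$) | 85–89 years ($u_{5,6}$) | 90–94 years ($u_{6,7}$) | 95–99 years ($u_{7,8}$) |
|---|---|---|---|---|---|---|---|---|
| 1975 | 1/838* | 1/604 | 4/477 | 2/363 | 1/199 | 2/93 | 1/32 | 0/6 |
| 1976 | 0/848 | 0/611 | 2/481 | 2/366 | 7/201 | 5/93 | 2/32 | 0/6 |
| 1977 | 2/859 | 3/622 | 2/486 | 2/368 | 4/201 | 2/97 | 0/31 | 0/6 |
| 1978 | 0/873 | 1/630 | 3/488 | 1/374 | 1/204 | 3/100 | 0/29 | 0/6 |
| 1979 | 0/885 | 0/640 | 1/494 | 2/374 | 1/204 | 1/104 | 4/29 | 0/6 |
| 1980 | 0/895 | 1/649 | 2/498 | 1/380 | 3/206 | 4/105 | 2/29 | 1/6 |
| 1981 | 1/906 | 3/676 | 2/516 | 4/390 | 5/215 | 5/111 | 0/32 | 0/6 |
| 1982 | 1/918 | 1/700 | 2/536 | 3/400 | 6/221 | 3/111 | 4/33 | 0/7 |
| 1983 | 0/928 | 0/728 | 1/553 | 6/406 | 2/227 | 1/118 | 0/32 | 0/8 |
| 1984 | 1/939 | 2/751 | 1/572 | 4/416 | 4/233 | 3/120 | 0/33 | 0/9 |
| Average | 6.77 × 10⁻⁴ | 1.81 × 10⁻³ | 4.00 × 10⁻³ | 6.92 × 10⁻³ | 1.60 × 10⁻² | 2.79 × 10⁻² | 4.21 × 10⁻² | 1.66 × 10⁻² |
| Standard Dev. | 8.01 × 10⁻⁴ | 1.72 × 10⁻³ | 2.04 × 10⁻³ | 3.73 × 10⁻³ | 1.03 × 10⁻² | 1.44 × 10⁻² | 5.32 × 10⁻² | 5.27 × 10⁻² |

\* The denominators in this table are the populations within an age interval that were initially free of dementia.
(a) The annual incidence rate within each 5-year (y) time interval is $u_{i,j}/5$.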

In contrast to the definition of the incidence rate used in Ref. [1], the incidence rate used in Ref. [2] replaced $H_T$, which includes previously diagnosed cases of AD, with $H^*_{AD}(t) \equiv H_T - H_{AD}(t)$, which does not. Thus, the incidence rate in Ref. [2] was defined as $(dH_{AD}(t)/dt)/H^*_{AD}(t)$, which will be denoted as IR*(t) for short. The connection between these two different incidence rates is given by

$$\mathrm{IR}(t) = \mathrm{IR}^*(t)\,\frac{H^*_{AD}(t)}{H_T} = \mathrm{IR}^*(t)\,\frac{[H_T - H_{AD}(t)]}{H_T} = \mathrm{IR}^*(t)\,[1 - P(t)] = \frac{dP(t)}{dt}. \tag{10}$$

The last equality in (10) can be rewritten as

$$\mathrm{IR}^*(t) = -\frac{d}{dt}\ln[1 - P(t)]. \tag{11a}$$

Since the data shows that $P(t) = 0$ for $t \le 60$ years, Equation (11a) can be integrated to give

$$P(t) = 1 - \exp[-u(t)], \quad \text{where} \quad u(t) \equiv \int_{60\,\mathrm{y}}^{t} \mathrm{IR}^*(t')\,dt'. \tag{11b}$$

The following relationship between the two incidence rates then follows from (10) and (11b):

$$\mathrm{IR}(t) \equiv \frac{dP(t)}{dt} = \mathrm{IR}^*(t)\exp[-u(t)]. \tag{12}$$

Since the dimensionless quantity $u(t) \ge 0$, it is always true that $\mathrm{IR}(t) \le \mathrm{IR}^*(t)$. Using the data in Tables 1(a) and 2(a), the male and female AD incidence rates IR(t) and prevalences P(t) can be computed from the incidence rates IR*(t) appearing in Ref. [2].

The male prevalences P(t) are shown in Table 1(b1),(b2), while the female prevalences are shown in Table 2(b1),(b2). The prevalences appearing in these tables were calculated using Equation (11b), where the time-dependent parameter $u(t)$ appearing in (11b) is computed from the fractions $u_{i,j}$ in these tables as follows: $u(65) = u_{0,1}$, $u(70) = u_{0,1} + u_{1,2}$, $u(75) = u_{0,1} + u_{1,2} + u_{2,3}$, etc.

The male annual incidence rates IR(t) are shown in Table 1(c), while the female annual incidence rates are shown in Table 2(c). The annual incidence rates appearing in these tables were computed using Equation (12) where, for example, $\mathrm{IR}^*(62.5) = u_{0,1}/5$ and $u(62.5) = u_{0,1}/2$, so that $\mathrm{IR}(62.5) = (u_{0,1}/5)\exp[-u_{0,1}/2]$. Similarly, $\mathrm{IR}^*(67.5) = u_{1,2}/5$ and $u(67.5) = u_{0,1} + u_{1,2}/2$, so that $\mathrm{IR}(67.5) = (u_{1,2}/5)\exp[-u_{0,1} - u_{1,2}/2]$. Continuing in this way generates all of the annual incidence rates appearing in Tables 1(c) and 2(c).
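The conversion just described can be written out as a short calculation. The sketch below applies Equations (11b) and (12) to the 1975 male row of Table 1(a); the function name `convert_row` and the dictionary keys are illustrative choices, and the row data are copied from the table.

```python
import math

# 1975 male row of Table 1(a): (new AD cases, dementia-free population)
# for the 5-year intervals 60-64, 65-69, ..., 95-99.
ROW_1975_MALE = [(1, 838), (1, 604), (4, 477), (2, 363),
                 (1, 199), (2, 93), (1, 32), (0, 6)]
START_AGE = 60   # P(t) = 0 for t <= 60 years
WIDTH = 5        # interval width in years


def convert_row(row):
    """Turn 5-year IR* fractions u_ij into prevalences and annual IR values."""
    results = []
    u_start = 0.0                      # u(t) at the start of the interval
    for i, (cases, population) in enumerate(row):
        u_ij = cases / population      # 5-year incidence fraction
        u_mid = u_start + u_ij / 2     # u(t) at the interval midpoint
        u_end = u_start + u_ij         # u(t) at the end of the interval
        results.append({
            "age_mid": START_AGE + WIDTH * i + WIDTH / 2,
            "P_mid": 1.0 - math.exp(-u_mid),              # Equation (11b)
            "P_end": 1.0 - math.exp(-u_end),              # Equation (11b)
            "IR_mid": (u_ij / WIDTH) * math.exp(-u_mid),  # Equation (12)
        })
        u_start = u_end
    return results


rows = convert_row(ROW_1975_MALE)
for entry in rows:
    print(entry["age_mid"], entry["P_mid"], entry["P_end"], entry["IR_mid"])
```

For this row the first interval gives P(65) close to 1.192 × 10⁻³ and IR(62.5) close to 2.385 × 10⁻⁴, matching the 1975 entries of Tables 1(b1) and 1(c).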

Multiplying the annual incidence rates in Tables 1(c) and 2(c) by 5 gives the 5-year interval incidence rates IR(t) that can be compared with the 5-year interval incidence rates IR*(t) given in Tables 1(a) and 2(a); indeed, the ratios of 5-year interval incidence rates IR(t)/IR*(t) computed from these tables are in agreement with the IR(t) and IR*(t) curves appearing in Figure 1 in Ref. [2].

In the language of the models constructed here, remembering that $m$ mutations are required to initiate AD, the prevalence P(t) of AD at age $t$ is given by the probability that a brain neuron has died (see (5), (7), or (A6) in the Appendix).

So, what do the annual AD incidence IR(t) and prevalence P(t) data appearing in these tables say about the disease?

Table 1(b1). Male AD prevalence as a function of age (t) in years computed directly from the Rocca data in Table 1(a) using Equation (11b).

Male AD prevalence P(t) at age t (in years) computed from Rocca data in Table 1(a)

| Year | P(65) | P(70) | P(75) | P(80) | P(85) | P(90) | P(95) | P(100) |
|---|---|---|---|---|---|---|---|---|
| 1975 | 1.192 × 10⁻³ | 0.002844 | 0.01117 | 0.01660 | 0.02153 | 0.04235 | 0.07181 | 0.07181 |
| 1976 | 0 | 0 | 0.004149 | 0.009576 | 0.04347 | 0.09354 | 0.1484 | 0.1484 |
| 1977 | 0.002325 | 0.007125 | 0.01120 | 0.01656 | 0.03594 | 0.05561 | 0.05561 | 0.05561 |
| 1978 | 0 | 0.001586 | 0.007705 | 0.01035 | 0.01519 | 0.04429 | 0.04429 | 0.04429 |
| 1979 | 0 | 0 | 0.002022 | 0.007344 | 0.01219 | 0.02165 | 0.1477 | 0.1477 |
| 1980 | 0 | 0.001539 | 0.005541 | 0.008155 | 0.02249 | 0.05903 | 0.1217 | 0.2565 |
| 1981 | 0.001103 | 0.005526 | 0.009393 | 0.01948 | 0.04202 | 0.08421 | 0.08421 | 0.08421 |
| 1982 | 0.001088 | 0.002514 | 0.006229 | 0.01365 | 0.04007 | 0.06567 | 0.1723 | 0.1723 |
| 1983 | 0 | 0 | 0.001806 | 0.01645 | 0.02507 | 0.03330 | 0.03330 | 0.03330 |
| 1984 | 0.001064 | 0.003721 | 0.005461 | 0.01497 | 0.03174 | 0.05565 | 0.05565 | 0.05565 |
| Average | 0.0006774 | 0.002485 | 0.006466 | 0.01331 | 0.02897 | 0.05553 | 0.09351 | 0.1070 |
| Standard Deviation | 0.0008011 | 0.002419 | 0.003388 | 0.004178 | 0.01127 | 0.02192 | 0.04993 | 0.07181 |

Table 1(b2). Male AD prevalence as a function of age (t) in years computed directly from the Rocca data in Table 1(a) using Equation (11b).

Male AD prevalence P(t) at age t (in years) computed from Rocca data in Table 1(a)

| Year | P(62.5) | P(67.5) | P(72.5) | P(77.5) | P(82.5) | P(87.5) | P(92.5) | P(97.5) |
|---|---|---|---|---|---|---|---|---|
| 1975 | 0.0005964 | 0.002019 | 0.007017 | 0.01389 | 0.01907 | 0.03199 | 0.05719 | 0.07181 |
| 1976 | 0 | 0 | 0.002076 | 0.006866 | 0.02667 | 0.06884 | 0.1214 | 0.1484 |
| 1977 | 0.001163 | 0.004728 | 0.009166 | 0.01388 | 0.02630 | 0.04582 | 0.05561 | 0.05561 |
| 1978 | 0 | 0.0007933 | 0.004650 | 0.009030 | 0.01277 | 0.02985 | 0.04429 | 0.04429 |
| 1979 | 0 | 0 | 0.001011 | 0.004687 | 0.009774 | 0.01693 | 0.08685 | 0.1477 |
| 1980 | 0 | 0.0007701 | 0.003542 | 0.006849 | 0.01535 | 0.04093 | 0.09092 | 0.1919 |
| 1981 | 0.0005517 | 0.0033172 | 0.007451 | 0.01444 | 0.03081 | 0.06335 | 0.08421 | 0.08421 |
| 1982 | 0.0005445 | 0.001802 | 0.004374 | 0.009949 | 0.02695 | 0.05295 | 0.1206 | 0.1723 |
| 1983 | 0 | 0 | 0.0009037 | 0.009155 | 0.02077 | 0.02920 | 0.03330 | 0.03330 |
| 1984 | 0.0005323 | 0.002393 | 0.004591 | 0.01023 | 0.02339 | 0.04377 | 0.05565 | 0.05565 |
| Average | 0.0003388 | 0.001582 | 0.004478 | 0.009898 | 0.02118 | 0.04236 | 0.07501 | 0.1005 |
| Standard Deviation | 0.0004007 | 0.001584 | 0.002764 | 0.003325 | 0.006886 | 0.01617 | 0.03066 | 0.05853 |

Table 1(c). Male annual AD incidence rate IR(t) computed from IR*(t) data in Table 1(a) using Equation (12).

Male annual AD incidence rate IR(t) at indicated age (t) in years

| Year/age | 62.5 | 67.5 | 72.5 | 77.5 | 82.5 | 87.5 | 92.5 | 97.5 |
|---|---|---|---|---|---|---|---|---|
| 1975 | 2.385 × 10⁻⁴ | 3.304 × 10⁻⁴ | 1.665 × 10⁻³ | 1.086 × 10⁻³ | 9.858 × 10⁻⁴ | 4.163 × 10⁻³ | 5.892 × 10⁻³ | 0 |
| 1976 | 0 | 0 | 8.298 × 10⁻⁴ | 1.085 × 10⁻³ | 6.779 × 10⁻³ | 1.001 × 10⁻² | 1.098 × 10⁻² | 0 |
| 1977 | 4.651 × 10⁻⁴ | 9.600 × 10⁻⁴ | 8.155 × 10⁻⁴ | 1.071 × 10⁻³ | 3.875 × 10⁻³ | 3.934 × 10⁻³ | 0 | 0 |
| 1978 | 0 | 3.172 × 10⁻⁴ | 1.223 × 10⁻³ | 5.299 × 10⁻⁴ | 9.678 × 10⁻⁴ | 5.820 × 10⁻³ | 0 | 0 |
| 1979 | 0 | 0 | 4.044 × 10⁻⁴ | 1.064 × 10⁻³ | 9.708 × 10⁻⁴ | 1.890 × 10⁻³ | 2.519 × 10⁻² | 0 |
| 1980 | 0 | 3.079 × 10⁻⁴ | 8.003 × 10⁻⁴ | 5.227 × 10⁻⁴ | 2.867 × 10⁻³ | 7.307 × 10⁻³ | 1.253 × 10⁻² | 2.693 × 10⁻² |
| 1981 | 2.206 × 10⁻⁴ | 8.846 × 10⁻⁴ | 7.694 × 10⁻⁴ | 2.021 × 10⁻³ | 4.507 × 10⁻³ | 8.438 × 10⁻³ | 0 | 0 |
| 1982 | 2.177 × 10⁻⁴ | 2.852 × 10⁻⁴ | 7.430 × 10⁻⁴ | 1.485 × 10⁻³ | 5.283 × 10⁻³ | 5.119 × 10⁻³ | 2.131 × 10⁻² | 0 |
| 1983 | 0 | 0 | 3.613 × 10⁻⁴ | 2.928 × 10⁻³ | 1.725 × 10⁻³ | 1.645 × 10⁻³ | 0 | 0 |
| 1984 | 2.128 × 10⁻⁴ | 5.313 × 10⁻⁴ | 3.480 × 10⁻⁴ | 1.903 × 10⁻³ | 3.353 × 10⁻³ | 4.781 × 10⁻³ | 0 | 0 |
| Average | 1.354 × 10⁻⁴ | 3.616 × 10⁻⁴ | 7.961 × 10⁻⁴ | 1.370 × 10⁻³ | 3.131 × 10⁻³ | 5.311 × 10⁻³ | 7.592 × 10⁻³ | 2.693 × 10⁻³ |
| Stand. Dev. | 1.602 × 10⁻⁴ | 3.439 × 10⁻⁴ | 4.064 × 10⁻⁴ | 7.392 × 10⁻⁴ | 2.012 × 10⁻³ | 2.683 × 10⁻³ | 9.575 × 10⁻³ | 8.517 × 10⁻³ |

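Table 2(a). Female AD incidence data per 5-year age interval from Rocca et al. [2].

Female AD incidence IR*(t) over 5-year age intervals (a)

| Year | 60–64 years ($u_{0,1}$) | 65–69 years ($u_{1,2}$) | 70–74 years ($u_{2,3}$) | 75–79 years ($u_{3,4}$) | 80–84 years ($u_{4,5}$) | 85–89 years ($u_{5,6}$) | 90–94 years ($u_{6,7}$) | 95–99 years ($u_{7,8}$) |
|---|---|---|---|---|---|---|---|---|
| 1975 | 0/1086* | 0/999 | 3/923 | 10/735 | 12/509 | 8/284 | 2/96 | 1/24 |
| 1976 | 0/1087 | 0/1005 | 5/936 | 7/751 | 15/524 | 10/290 | 2/101 | 0/24 |
| 1977 | 0/1089 | 1/1013 | 6/950 | 11/769 | 10/539 | 12/296 | 2/107 | 0/25 |
| 1978 | 1/1088 | 0/1019 | 1/968 | 2/785 | 9/554 | 14/302 | 5/112 | 1/26 |
| 1979 | 0/1090 | 1/1025 | 3/983 | 2/804 | 7/569 | 7/308 | 8/118 | 0/28 |
| 1980 | 1/1091 | 3/1033 | 4/997 | 11/821 | 16/584 | 15/314 | 6/124 | 0/28 |
| 1981 | 0/1107 | 1/1044 | 4/1001 | 8/834 | 14/599 | 8/326 | 3/131 | 2/28 |
| 1982 | 0/1119 | 0/1056 | 8/1005 | 6/844 | 11/611 | 12/339 | 4/141 | 1/28 |
| 1983 | 1/1134 | 0/1068 | 1/1009 | 5/859 | 14/623 | 10/349 | 4/147 | 4/28 |
| 1984 | 0/1146 | 1/1078 | 0/1013 | 4/870 | 16/637 | 13/362 | 6/157 | 3/28 |
| Average | 2.71 × 10⁻⁴ | 6.75 × 10⁻⁴ | 3.59 × 10⁻³ | 8.27 × 10⁻³ | 2.15 × 10⁻² | 3.44 × 10⁻² | 3.36 × 10⁻² | 4.37 × 10⁻² |
| Standard Deviation | 4.37 × 10⁻⁴ | 9.17 × 10⁻⁴ | 2.50 × 10⁻³ | 4.48 × 10⁻³ | 5.16 × 10⁻³ | 8.61 × 10⁻³ | 1.59 × 10⁻² | 4.98 × 10⁻² |

\* The denominators in this table are the populations within an age interval that were initially free of dementia.
(a) The annual incidence rate within each 5-year (y) time interval is $u_{i,j}/5$.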

Table 2(b1). Female AD prevalence as a function of age (t) in years computed directly from the Rocca data in Table 2(a) using Equation (11b).

Female AD prevalence P(t) at age t (in years) computed from Rocca data in Table 2(a)

| Year | P(65) | P(70) | P(75) | P(80) | P(85) | P(90) | P(95) | P(100) |
|---|---|---|---|---|---|---|---|---|
| 1975 | 0 | 0 | 3.245 × 10⁻³ | 0.01671 | 0.03962 | 0.06630 | 0.08555 | 0.1228 |
| 1976 | 0 | 0 | 5.327 × 10⁻³ | 0.01455 | 0.04236 | 0.07482 | 0.09296 | 0.09296 |
| 1977 | 0 | 9.866 × 10⁻⁴ | 7.276 × 10⁻³ | 0.02137 | 0.03936 | 0.07753 | 0.09461 | 0.09461 |
| 1978 | 9.187 × 10⁻⁴ | 9.187 × 10⁻⁴ | 1.950 × 10⁻³ | 0.004489 | 0.02053 | 0.06490 | 0.1057 | 0.1394 |
| 1979 | 0 | 9.751 × 10⁻⁴ | 4.019 × 10⁻³ | 0.006493 | 0.01864 | 0.04069 | 0.1035 | 0.1035 |
| 1980 | 9.161 × 10⁻⁴ | 3.813 × 10⁻³ | 7.802 × 10⁻³ | 0.02100 | 0.04746 | 0.09189 | 0.1347 | 0.1347 |
| 1981 | 0 | 9.574 × 10⁻⁴ | 4.941 × 10⁻³ | 0.01444 | 0.03720 | 0.06054 | 0.08181 | 0.1451 |
| 1982 | 0 | 0 | 7.928 × 10⁻³ | 0.01495 | 0.03253 | 0.06617 | 0.09229 | 0.1241 |
| 1983 | 8.814 × 10⁻⁴ | 8.814 × 10⁻⁴ | 1.871 × 10⁻³ | 0.007664 | 0.02971 | 0.05712 | 0.08243 | 0.2045 |
| 1984 | 0 | 9.272 × 10⁻⁴ | 9.272 × 10⁻⁴ | 0.005510 | 0.03017 | 0.06438 | 0.09947 | 0.1909 |
| Average | 2.716 × 10⁻⁴ | 9.460 × 10⁻⁴ | 4.528 × 10⁻³ | 0.01272 | 0.03376 | 0.06643 | 0.09732 | 0.1353 |
| Stand. Dev. | 4.374 × 10⁻⁴ | 1.101 × 10⁻³ | 2.565 × 10⁻³ | 0.006279 | 0.009272 | 0.01346 | 0.01553 | 0.03763 |

Table 2(b2). Female AD prevalence as a function of age (t) in years computed directly from the Rocca data in Table 2(a) using Equation (11b).

Female AD prevalence P(t) at age t (in years) computed from Rocca data in Table 2(a)

| Year | P(62.5) | P(67.5) | P(72.5) | P(77.5) | P(82.5) | P(87.5) | P(92.5) | P(97.5) |
|---|---|---|---|---|---|---|---|---|
| 1975 | 0 | 0 | 0.001623 | 0.01000 | 0.02823 | 0.05305 | 0.07597 | 0.1044 |
| 1976 | 0 | 0 | 0.002667 | 0.009952 | 0.02856 | 0.05873 | 0.08393 | 0.09296 |
| 1977 | 0 | 0.0004934 | 0.004136 | 0.01435 | 0.03041 | 0.05864 | 0.08611 | 0.09461 |
| 1978 | 0.0004594 | 0.0009187 | 0.001434 | 0.003220 | 0.01254 | 0.04297 | 0.08554 | 0.1227 |
| 1979 | 0 | 0.0004876 | 0.002498 | 0.005257 | 0.01258 | 0.02973 | 0.07266 | 0.1035 |
| 1980 | 0.0004581 | 0.002365 | 0.005809 | 0.01442 | 0.03432 | 0.06994 | 0.1136 | 0.1347 |
| 1981 | 0 | 0.0004788 | 0.002951 | 0.009706 | 0.02589 | 0.04895 | 0.07124 | 0.1140 |
| 1982 | 0 | 0 | 0.003972 | 0.01144 | 0.02378 | 0.04950 | 0.07933 | 0.1083 |
| 1983 | 0.0004408 | 0.0008814 | 0.001376 | 0.004771 | 0.01875 | 0.04351 | 0.06986 | 0.1456 |
| 1984 | 0 | 0.0004637 | 0.0009272 | 0.003221 | 0.01792 | 0.04743 | 0.08209 | 0.1464 |
| Average | 0.0001358 | 0.0006089 | 0.002739 | 0.008635 | 0.02330 | 0.05024 | 0.08203 | 0.1167 |
| Stand. Dev. | 0.0002187 | 0.0007020 | 0.001531 | 0.004266 | 0.007537 | 0.01089 | 0.01258 | 0.01983 |

Table 2(c). Female annual AD incidence rate IR(t) computed from IR*(t) data in Table 2(a) using Equation (12).

Female annual AD incidence rate IR(t) at indicated age (t) in years

| Year/age | 62.5 | 67.5 | 72.5 | 77.5 | 82.5 | 87.5 | 92.5 | 97.5 |
|---|---|---|---|---|---|---|---|---|
| 1975 | 0 | 0 | 6.490 × 10⁻⁴ | 2.693 × 10⁻³ | 0.004582 | 0.005334 | 0.003850 | 0.007463 |
| 1976 | 0 | 0 | 1.065 × 10⁻³ | 1.845 × 10⁻³ | 0.005561 | 0.006491 | 0.003628 | 0 |
| 1977 | 0 | 1.973 × 10⁻⁴ | 1.257 × 10⁻³ | 2.819 × 10⁻³ | 0.003597 | 0.007632 | 0.003416 | 0 |
| 1978 | 1.837 × 10⁻⁴ | 0 | 2.063 × 10⁻⁴ | 5.079 × 10⁻⁴ | 0.003208 | 0.008873 | 0.008164 | 0.006748 |
| 1979 | 0 | 1.950 × 10⁻⁴ | 6.088 × 10⁻⁴ | 4.949 × 10⁻⁴ | 0.002429 | 0.004410 | 0.01257 | 0 |
| 1980 | 1.832 × 10⁻⁴ | 5.794 × 10⁻⁴ | 7.977 × 10⁻⁴ | 2.641 × 10⁻³ | 0.005291 | 0.008885 | 0.008578 | 0 |
| 1981 | 0 | 1.914 × 10⁻⁴ | 7.968 × 10⁻⁴ | 1.899 × 10⁻³ | 0.004553 | 0.004667 | 0.004253 | 0.01265 |
| 1982 | 0 | 0 | 1.585 × 10⁻³ | 1.405 × 10⁻³ | 0.003515 | 0.006729 | 0.005223 | 0.006368 |
| 1983 | 1.762 × 10⁻⁴ | 0 | 1.979 × 10⁻⁴ | 1.158 × 10⁻³ | 0.004410 | 0.005481 | 0.005062 | 0.02440 |
| 1984 | 0 | 1.854 × 10⁻⁴ | 0 | 9.165 × 10⁻⁴ | 0.004933 | 0.006841 | 0.007015 | 0.01829 |
| Average | 5.432 × 10⁻⁵ | 1.348 × 10⁻⁴ | 7.165 × 10⁻⁴ | 1.638 × 10⁻³ | 0.004208 | 0.006534 | 0.006176 | 0.007593 |
| Stand. Dev. | 8.749 × 10⁻⁵ | 1.831 × 10⁻⁴ | 4.991 × 10⁻⁴ | 8.830 × 10⁻⁴ | 9.915 × 10⁻⁴ | 1.594 × 10⁻³ | 0.002919 | 0.008535 |

Figure 1(a) is a plot of the mean male annual incidence rate data IR(t) appearing in Table 1(c) together with the standard deviation error of each data point. Figure 1(b) is a similar plot of the male AD prevalence P(t) data appearing in Table 1(b1),(b2).
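The mean and standard deviation plotted for each age can be recomputed from the yearly entries. The sketch below does this for the 62.5-year column of Table 1(c), using Python's standard `statistics` module; the variable names are illustrative. It also compares the sample ($n-1$) and population ($n$) forms of the standard deviation, since the tables do not state which was used.

```python
import statistics

# Male annual IR(62.5) for 1975-1984, copied from the first column of Table 1(c).
IR_62_5_MALE = [2.385e-4, 0.0, 4.651e-4, 0.0, 0.0,
                0.0, 2.206e-4, 2.177e-4, 0.0, 2.128e-4]

mean_ir = statistics.mean(IR_62_5_MALE)
sample_std = statistics.stdev(IR_62_5_MALE)       # divides by n - 1
population_std = statistics.pstdev(IR_62_5_MALE)  # divides by n

print("mean:            ", mean_ir)
print("sample std:      ", sample_std)
print("population std:  ", population_std)
```

For this column the sample form reproduces the tabulated standard deviation of 1.602 × 10⁻⁴, while the population form gives a smaller value, so the error bars appear to be sample standard deviations over the ten study years.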

Figure 1(a),(b) clearly shows that the errors in the male data above 90 years old are so great that it is impossible to extrapolate the IR(t) and P(t) data into the region above 100 years old with any confidence of accuracy. A very similar situation results from the female AD data.

Figure 2(a) is a plot of the mean female annual incidence rate data IR(t) appearing in Table 2(c) together with the standard deviation error of each data point. Figure 2(b) is a similar plot of the female AD prevalence P(t) data appearing in Table 2(b1),(b2).

Figure 2(a),(b) again shows that the errors in the data above 90 years old are so great that it is impossible to accurately extrapolate the IR(t) and P(t) data into the region above 100 years old.

This is a situation where modelling the disease may help answer the question of how the AD incidence rate IR(t) and prevalence P(t) functions behave in the region above 100 years of age.

4. Other studies of age-related AD incidence rates

The results of other studies of age-related AD incidence rates are shown in Table 3 along with the data for the Rochester, Minnesota study that was used for the modelling in this paper. The Rochester study was clearly more detailed than the others and benefits from having a much greater cohort population. The cohort population in the Rochester Study was about ten times greater than that of the Baltimore Longitudinal Study. Moreover, by choosing an entire city for the study, the Rochester results avoid errors stemming from choosing a sample cohort that is not representative of the population as a whole. For example, the ratio of women to men in the Rochester study is about 1.5, a result that is in general agreement with the elderly population in the USA. However, the ratio of women to men in the Baltimore Longitudinal Study is 0.54, a ratio that is not representative of the USA population as a whole. Finally, the Rochester Study is the only one of the three studies shown in Table 3 that has detailed data for the 85–89, 90–94 and 95–99 year age groups. For all of these reasons, the data of the Rochester Study was deemed more reliable and was used in the modelling in this paper. Nonetheless, the corresponding AD incidence rate data between any two studies in Table 3 are within a factor of 2 of each other in general. Thus, the general features of the model results using any of these three data sets are very similar and differ only in how fast the AD prevalence rate rises with age.

Table 3. Age-specific AD incidence rates (% per year) in cohorts initially free of dementia.

| Age range (in years) | Rochester, Minnesota study [2]: Men | Rochester, Minnesota study [2]: Women | Baltimore Longitudinal study [3]: Men | Baltimore Longitudinal study [3]: Women | East Boston, Mass. study [4] |
|---|---|---|---|---|---|
| 55–59 | 0 | 0 | 0 | 0 | |
| 60–64 | 0.067 | 0.027 | 0 | 0.25 | |
| 65–69 | 0.181 | 0.067 | 0.09 | 0.22 | 0.6 |
| 70–74 | 0.400 | 0.359 | 0.55 | 0.17 | 1.0 |
| 75–79 | 0.692 | 0.827 | 0.75 | 1.10 | 2.0 |
| 80–84 | 1.60 | 2.5 | 1.25 | 3.61 | 3.3 |
| 85–89/(85+) | 2.79/(3.00) | 3.44/(3.49) | (7.2) | (5.27) | (8.4) |
| 90–94 | 4.21 | 3.36 | | | |
| 95–99 | 1.66 | 4.37 | | | |
| Study population size/(year) | 4,680/(1975) to 5,484/(1984) | 7,140/(1975) to 7,933/(1984) | 802 | 434 | 2,313 |

[Plots: (a) female AD IR(t) mean value and standard deviation error, vertical axis IR(t) mean data points; (b) female AD mean prevalence P(t) and standard deviation error, vertical axis P(t) mean; horizontal axis age t in years in both panels.]

Figure 2. (a) Female AD data for annual incidence rate IR(t) mean value and standard deviation error as a function of t in years from Table 2(c) and (b) Female AD data for mean prevalence P(t) and standard deviation error as a function of age t in years from Table 2(b2).

5. Modelling the AD incidence rate curve

The AD incidence rate curve produced by the ordered mutation model is given by Equation (A8) in the Appendix. Setting the susceptibility fraction parameter f_{s} equal to 1 and fitting this function to the male AD data in Table 1(c) yields the fit shown in Figure 3(a). To achieve this fit, the mean data point at 97.5 years was ignored since this point was deemed too unreliable. Nonetheless, the model incidence rate function IR(t) returned by the fit falls within one standard deviation of not only the 97.5 year data point but of all of the data points, as shown in Figure 3(a). In this fit, the entire male population was assumed to be vulnerable to developing AD (f_{s} = 1). The values of k and m, the only remaining independent parameters, were determined by the fit and are shown in Figure 3(a). The projection of the IR(t) model function into the region above 100 years old is also shown in Figure 3(a). Notice that the IR(t) curve peaks at around 120 years old and monotonically declines thereafter towards zero. The projected values of the model incidence rate function IR(t) in the region t < 60 years old are vanishingly small, in agreement with experiment; indeed, this is one of the strongest features of both the ordered and independent mutation models.
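As a rough illustration of the single-group ordered model, the sketch below evaluates the incidence function in the form printed in the Figure 3 legends, IR(t) = f_s k (kt)^(m-1) exp(-kt)/(m-1)!, and locates its peak, which for this functional form lies at t = (m-1)/k. The function names are illustrative and do not come from the paper; the parameter values are those quoted in the legend of Figure 3(a).

```python
import math

def ordered_ir(t, f_s, k, m):
    """Ordered-model incidence rate f_s*k*(k t)^(m-1)*exp(-k t)/(m-1)!."""
    if t <= 0:
        return 0.0
    # Work in log space so that (k t)^(m-1) and (m-1)! cannot overflow.
    log_ir = math.log(f_s * k) + (m - 1) * math.log(k * t) - k * t - math.lgamma(m)
    return math.exp(log_ir)

def ordered_peak_age(k, m):
    """Age at which the ordered-model incidence rate is largest: (m-1)/k."""
    return (m - 1) / k

# Figure 3(a) legend values: f_s = 1, m = 31, k = 0.25232 per year.
peak = ordered_peak_age(0.25232, 31)
print(f"peak age ~ {peak:.1f} years")
print(f"IR(60)   = {ordered_ir(60, 1.0, 0.25232, 31):.2e}")
print(f"IR(peak) = {ordered_ir(peak, 1.0, 0.25232, 31):.2e}")
```

With these inputs the peak age comes out just under 120 years, consistent with the projection described for Figure 3(a), and the rate at 60 years is orders of magnitude below the rate at the peak.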

Letting f_{s} become an independent parameter and repeating the ordered mutation model fit gives the result shown in Figure 3(b). The fit to the data here is better than the one in Figure 3(a), with the (chisq) error [see (A11) in the Appendix] reduced from what it is in Figure 3(a) by 35.5%. As before, the model incidence rate curve falls within one standard deviation of all of the data points. However, this fit predicts that only 45.9% of men are susceptible to acquiring AD (f_{s} = 0.459), and, therefore, that 54.1% are immune to getting it. This fit also predicts that the average number of mutations necessary for a brain cell associated with AD to spontaneously die is m = 45, and the average elapsed time between consecutive mutations is T = k^{-1} = (0.414)^{-1} y = 2.41 years. The incidence rate function IR(t) extrapolated into the non-fitted region peaks around 105 years and monotonically declines towards zero thereafter.
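The three-parameter fit can be sketched as a small grid search. The code below is a minimal illustration, not the fitting procedure used in the paper: it assumes that (chisq) is an unweighted sum of squared residuals, it uses SYNTHETIC incidence values generated from the Figure 3(b) parameters rather than the Rochester data, and the age grid and search ranges are hypothetical. Because the model is linear in f_s, the best f_s for each trial pair (m, k) is available in closed form.

```python
import math

def ordered_shape(t, k, m):
    """Ordered-model incidence rate for f_s = 1: k*(k t)^(m-1)*exp(-k t)/(m-1)!."""
    return math.exp(math.log(k) + (m - 1) * math.log(k * t) - k * t - math.lgamma(m))

def fit_ordered(ages, rates, m_values, k_values):
    """Grid search over (m, k); f_s is solved in closed form for each pair."""
    best = None
    for m in m_values:
        for k in k_values:
            g = [ordered_shape(t, k, m) for t in ages]
            # Least-squares value of f_s for fixed (m, k).
            f_s = sum(y * gi for y, gi in zip(rates, g)) / sum(gi * gi for gi in g)
            chisq = sum((y - f_s * gi) ** 2 for y, gi in zip(rates, g))
            if best is None or chisq < best[0]:
                best = (chisq, f_s, k, m)
    return best

# SYNTHETIC data built from f_s = 0.459, k = 0.414, m = 45 (not the study data).
ages = [62.5, 67.5, 72.5, 77.5, 82.5, 87.5, 92.5]
rates = [0.459 * ordered_shape(t, 0.414, 45) for t in ages]

k_grid = [i / 1000 for i in range(300, 551)]  # 0.300 to 0.550 per year
chisq, f_s, k, m = fit_ordered(ages, rates, range(35, 56), k_grid)
print(f"m = {m}, k = {k:.3f} per year, f_s = {f_s:.3f}, chisq = {chisq:.2e}")
```

On noiseless synthetic input the search returns the generating parameters. With real data a continuous optimiser over k and f_s for each integer m would normally replace the grid.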

Looking carefully at the fit in Figure 3(b) shows that it is particularly poor for the early incidence data points. To improve the fit to this data, a more sophisticated ordered model will be constructed where it will be assumed that, not one, but two susceptible groups are responsible for generating all of the AD data. It will be assumed that, because of genetic inheritance, a fraction of the population is born with a head-start in development of AD in that it is born with many of the mutations needed to cause it. Thus, it will be assumed that the male AD incidence data is generated by a compound incidence function given by

I(t) = I_{1}(f_{s1}, k, m_{1}) + I_{2}(f_{s2}, k, m_{2}),    (13)

where the notation in Equation (A8) in the Appendix has been used. Thus, the compound model assumes that there are two different groups of the population that are susceptible to acquiring AD, one that has model parameters (f_{s1}, k and m_{1}) and the second that has model parameters (f_{s2}, k and m_{2}). Notice that it is assumed that the mutation rates k for both susceptible groups are identical, and it will be assumed that m_{1} < m_{2} so that I_{1}(f_{s1}, k, m_{1}) describes the group born with some common average number (not zero) of AD mutations. Fitting the male AD incidence data with the compound incidence function in Equation (13) gives the much-improved result shown in Figure 3(c), with a fit error (chisq) that is 60.8% lower than the error in Figure 3(b). Interestingly, the fit returns the values f_{s1} = 0.01163 and f_{s2} = 0.2465 so that f_{s1} = 0.0472 f_{s2}. Thus, a particular small fraction of the population (about 1%) is predicted to be born with AD mutations on the average, certainly a plausible result. The fit also produces the values m_{1} = 62 and m_{2} = 79 so that the particularly susceptible population is born with an average of 79 − 62 = 17 AD mutations. The projection of the model incidence function returned by this fit is also shown in Figure 3(c) along with the standard deviation error in the data points.

Integrating the model incidence function shown in Figure 3(c) gives the model prevalence function which is given as a series of points in the second column in Table 4.

The prevalence curve must be a monotonically increasing function of age t, and it must saturate at the value of f_{s1} + f_{s2} = 0.258136, which, as seen in Table 4, it certainly does. Thus, the results in Table 4 constitute a check on the modelling results in Figure 3(c).
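The saturation check can be reproduced approximately. The sketch below treats the prevalence as the plain integral of the compound ordered incidence function from age zero, which is an assumption about how Table 4 was generated; the integral of each term k(ks)^(m-1)exp(-ks)/(m-1)! is the Erlang cumulative distribution, so no numerical quadrature is needed. Function names are illustrative, and the printed values should not be expected to match Table 4 digit for digit, since the paper's integration procedure is not described in this section.

```python
import math

def erlang_cdf(t, k, m):
    """Integral from 0 to t of k*(k s)^(m-1)*exp(-k s)/(m-1)! ds."""
    if t <= 0:
        return 0.0
    x = k * t
    # 1 - sum_{n=0}^{m-1} exp(-x) x^n / n!, with each term built in log space.
    tail = sum(math.exp(n * math.log(x) - x - math.lgamma(n + 1)) for n in range(m))
    return max(0.0, 1.0 - tail)

def compound_prevalence(t, f_s1, m1, f_s2, m2, k):
    """Prevalence P(t) for two susceptible groups sharing the mutation rate k."""
    return f_s1 * erlang_cdf(t, k, m1) + f_s2 * erlang_cdf(t, k, m2)

# Male compound ordered fit, values from the Figure 3(c) legend.
male = dict(f_s1=0.011636, m1=62, f_s2=0.2465, m2=79, k=0.78924)

for age in (60, 80, 100, 120, 150):
    print(age, round(compound_prevalence(age, **male), 8))
print("saturation value:", round(male["f_s1"] + male["f_s2"], 6))
```

Whatever the integration details, the curve is non-decreasing in t and approaches f_{s1} + f_{s2} at advanced ages, which is the property the text uses as a consistency check.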

A measure of the error of the least-squares fit is given by the sum (chisq) given in (A11) of the Appendix. A completely different positive definite sum that measures the statistical quality of a fit is known as the chi-square distribution goodness-of-fit test. Even though these two tests have virtually the same name, they are not to be confused with each other. The last part of the Appendix contains a description of the chi-square distribution goodness-of-fit test and all of the details of the application of this test to the fit in Figure 3(c).

Since there are eight data points in the fit in Figure 3(c), there are 8 − 1 = 7 degrees of freedom in this problem. The positive definite sum in the chi-square distribution goodness-of-fit test will be denoted by χ^{2}, and for the fit in Figure 3(c), it is found that χ^{2} = 3.83. For a problem with 7 degrees of freedom, a χ^{2} value of χ^{2} = 3.83 means that there is a 79.8% probability (the p-value is 0.798) that the expected (model) and observed (data) distributions are statistically identical. Thus, by any conventionally used criterion, the model fit in Figure 3(c) is very good.
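The quoted p-value is the upper-tail probability of the chi-square distribution at the observed statistic. A minimal sketch using only the standard library follows; it evaluates the regularised incomplete gamma function by its power series, and the function name `chi2_sf` is an illustrative choice rather than anything defined in the paper. A statistics package would normally be used instead.

```python
import math

def chi2_sf(x, dof):
    """Upper-tail probability of a chi-square variable with `dof` degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = dof / 2.0, x / 2.0
    # Series for the regularised lower incomplete gamma function P(a, z):
    # P(a, z) = exp(-z) * z^a * sum_{n>=0} z^n / Gamma(a + n + 1).
    total, n = 0.0, 0
    while True:
        term = math.exp((a + n) * math.log(z) - z - math.lgamma(a + n + 1))
        total += term
        if term < 1e-15 * total:
            break
        n += 1
    return 1.0 - total

# Seven degrees of freedom and the statistic quoted for Figure 3(c).
print(f"p-value = {chi2_sf(3.83, 7):.3f}")
```

The result is close to 0.80, in line with the value quoted in the text; small differences in the last digit can arise from rounding of the statistic itself.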

It is interesting to see what sort of fit to the same data is returned by the independent mutation model. Using the incidence rate function given in Equation (9) gives the fit shown in Figure 4(a). Here, choosing the susceptible fraction parameter to equal f_{s} = 1 gives a fit with the least error. Thus, no immunity to AD in the male population is predicted by the independent mutation model. Although the error of the fit here is about 10% higher than the fit in Figure 3(b) for the ordered mutation model, this model is nonetheless about as credible. Notice that the projected values of the independent model incidence rate function IR(t) in Figure 4(a) in the region t < 60 years old are vanishingly small, in agreement with experiment. As has already been pointed out, this is one of the strongest features of both the ordered and independent mutation models.
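For comparison with the ordered model, the independent-model incidence function can be sketched in the form printed in the Figure 4(a) legend, IR(t) = f_s m k (1 - exp(-kt))^(m-1) exp(-kt). Its integral from zero is f_s (1 - exp(-kt))^m, and setting the derivative of IR(t) to zero gives a peak at t = ln(m)/k. The function names below are illustrative, and the parameter values are those of the Figure 4(a) legend.

```python
import math

def independent_ir(t, f_s, k, m):
    """Independent-model incidence rate f_s*m*k*(1 - exp(-k t))^(m-1)*exp(-k t)."""
    if t <= 0:
        return 0.0
    return f_s * m * k * (1.0 - math.exp(-k * t)) ** (m - 1) * math.exp(-k * t)

def independent_prevalence(t, f_s, k, m):
    """Integral of independent_ir from 0 to t: f_s*(1 - exp(-k t))^m."""
    return f_s * (1.0 - math.exp(-k * t)) ** m if t > 0 else 0.0

def independent_peak_age(k, m):
    """d(IR)/dt = 0 requires exp(-k t) = 1/m, that is t = ln(m)/k."""
    return math.log(m) / k

# Figure 4(a) legend values: f_s = 1, m = 101, k = 0.039599 per year.
f_s, k, m = 1.0, 0.039599, 101
print(f"IR(60) = {independent_ir(60, f_s, k, m):.2e}")
print(f"IR(90) = {independent_ir(90, f_s, k, m):.2e}")
print(f"peak age ~ {independent_peak_age(k, m):.1f} years")
```

As with the ordered model, the rate at 60 years is far smaller than the rate at 90 years, which is the low-age behaviour the text highlights.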

To improve the fit of the independent mutation model, especially for the early points, a compound version of this model will be tried in exactly the fashion outlined in Equation (13) for the ordered model. The compound independent model fit is shown in Figure 4(b), and it has reduced the fit error (chisq) by 75.5% from that in Figure 4(a). The most important result of this fit is that this model now predicts widespread immunity to AD.

Since the fit returns the values f_{s1} = 0.0359 and f_{s2} = 0.329, the compound independent model predicts that 63.4% of men are immune to acquiring AD.

Because the fit errors of the compound ordered and independent models are so close to each other (compare Figures 3(c) and 4(b)), it is impossible to decide which of these two models is more credible. However, both models predict widespread immunity to AD in the male population.

Fitting the female AD incidence data by the ordered mutation model with the value of the susceptible fraction set equal to f_{s} = 1 gives the results plotted in Figure 5(a). Notice that the mean value of the data point at t = 92.5 years was left out of the fit since it was deemed improbable: the mean incidence rate at this point was lower than the rates on either side of it. The fit here is determined by only two independent parameters (m and k) whose values are shown in Figure 5(a), and the (chisq) error of the fit is relatively large. Notice that the fit to the earlier data points is particularly poor, lying outside one standard deviation from the mean values of these points.

Repeating the above fit with f_{s} now being an independent parameter leads to the convincing fit shown in Figure 5(b). The (chisq) error of the fit in Figure 5(b) with f_{s} = 0.205 is over 62 times smaller than the fit in Figure 5(a) with f_{s} = 1. Again, the data point at t = 92.5 years was left out of the fit, but nonetheless the resulting model incidence rate function IR(t) lies within one standard deviation of the mean for all of the data points,
Table 4. Prevalence curve P(t) for AD computed from the compound ordered model fitted to AD male and female data.

| Age t (in years) | Male | Female |
|---|---|---|
| 60 | 0.00020017 | 0.00012724 |
| 62.5 | 0.00040647 | 0.00022508 |
| 65 | 0.00075842 | 0.00038961 |
| 67.5 | 0.00131803 | 0.00068675 |
| 70 | 0.00216453 | 0.00125721 |
| 72.5 | 0.00340978 | 0.00236271 |
| 75 | 0.00522482 | 0.00442234 |
| 77.5 | 0.00787097 | 0.00801061 |
| 80 | 0.01171971 | 0.01379102 |
| 82.5 | 0.01724228 | 0.02237979 |
| 85 | 0.02495481 | 0.03416782 |
| 87.5 | 0.03532024 | 0.04915843 |
| 90 | 0.04862735 | 0.06688494 |
| 92.5 | 0.06488146 | 0.08644810 |
| 95 | 0.08374332 | 0.10666889 |
| 97.5 | 0.10453950 | 0.12630920 |
| 100 | 0.12634750 | 0.14429146 |
| 102.5 | 0.14813128 | 0.15985641 |
| 105 | 0.16889489 | 0.17262767 |
| 110 | 0.20432751 | 0.18998329 |
| 115 | 0.22928373 | 0.19878664 |
| 120 | 0.24435096 | 0.20254407 |
| 125 | 0.25224469 | 0.20392637 |
| 130 | 0.25587292 | 0.20434332 |
| 135 | 0.25735047 | 0.20446173 |
| 140 | 0.25788830 | 0.20449045 |
| 145 | 0.25806467 | 0.20449664 |
| 150 | 0.25811714 | 0.20449785 |

Table 5. Total number of AD cases in the USA in 2000.

| Age range (in years) | Male population (a) | Male prevalence P(t) from Table 1(b2) | Male AD cases | Female population (a) | Female prevalence P(t) from Table 2(b2) | Female AD cases |
|---|---|---|---|---|---|---|
| 60–64 | 5,165,703 | 0.00033885 | 1,750 | 5,699,027 | 0.00013585 | 774 |
| 65–69 | 4,402,844 | 0.0015824 | 6,967 | 5,131,111 | 0.00060897 | 3,124 |
| 70–74 | 3,904,321 | 0.0044786 | 17,485 | 4,945,625 | 0.0027398 | 13,550 |
| 75–79 | 3,051,227 | 0.0098989 | 30,203 | 4,374,151 | 0.0086356 | 37,773 |
| 80–84 | 1,854,596 | 0.021189 | 39,297 | 1,754,838 | 0.023301 | 40,889 |
| 85–89 | 1,099,019 | 0.042369 | 46,564 | 1,690,798 | 0.050249 | 84,960 |
| 90–94 | 438,269 | 0.075011 | 32,874 | 674,261 | 0.082038 | 55,315 |
| 95–99 | 112,975 | 0.10054 | 11,358 | 173,808 | 0.11676 | 20,293 |
| Over 100 | 19,875 | 0.14813 (b) | 2,944 | 30,578 | 0.15985 (b) | 4,888 |
| Gender AD sum | | | 189,442 | | | 261,566 |
| AD total | | | | | | 451,008 |

(a) In year 2000 from US Census Bureau.
(b) Computed prevalence at age 102.5 years from Table 4.
[Total USA population in 2000 was 282,338,631.]
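The case counts in Table 5 follow from multiplying each age-band population by the corresponding prevalence and summing. The short sketch below repeats that arithmetic with the values printed in the table; the variable names are illustrative. Because the table rounds each band to whole cases, the unrounded sums differ from the printed totals by a few cases.

```python
# (age range, male population, male prevalence, female population, female prevalence)
rows = [
    ("60-64",    5_165_703, 0.00033885, 5_699_027, 0.00013585),
    ("65-69",    4_402_844, 0.0015824,  5_131_111, 0.00060897),
    ("70-74",    3_904_321, 0.0044786,  4_945_625, 0.0027398),
    ("75-79",    3_051_227, 0.0098989,  4_374_151, 0.0086356),
    ("80-84",    1_854_596, 0.021189,   1_754_838, 0.023301),
    ("85-89",    1_099_019, 0.042369,   1_690_798, 0.050249),
    ("90-94",      438_269, 0.075011,     674_261, 0.082038),
    ("95-99",      112_975, 0.10054,      173_808, 0.11676),
    ("Over 100",    19_875, 0.14813,       30_578, 0.15985),
]

# Expected cases in a band = population in the band * prevalence for the band.
male_cases = sum(pop_m * prev_m for _, pop_m, prev_m, _, _ in rows)
female_cases = sum(pop_f * prev_f for _, _, _, pop_f, prev_f in rows)
total_cases = male_cases + female_cases

print(f"male:   {male_cases:,.0f}")
print(f"female: {female_cases:,.0f}")
print(f"total:  {total_cases:,.0f}")
```

The same loop applies to any country or city once its age-banded population counts are substituted for the US Census figures.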

[Figure 3: three panels plotting male AD incidence rate IR(t) against age t in years (40 to 160), showing fitted mean IR(t) data points, the non-fitted mean IR(t) data point and the model IR(t) fit projection. Fit legends:
(a) IR(t) = 1*k*((kt)^30)*exp(−kt)/30!; f_{s} = 1, m = 31, k = 0.25232 (year)^{−1} (error 0.00042651); Chisq = 1.4074e−07; R = 0.99871.
(b) IR(t) = f_{s}*k*((kt)^44)*exp(−kt)/44!; m = 45, f_{s} = 0.45947 (error 0.027221), k = 0.41405 (year)^{−1} (error 0.0030385); Chisq = 9.0663e−08; R = 0.99917.
(c) IR(t) = I_{1}(f_{s1}, k, m_{1} = 62) + I_{2}(f_{s2}, k, m_{2} = 79); f_{s1} = 0.011636 (error 0.0022471), k = 0.78924 (error 0.0042474), f_{s2} = 0.2465 (error 0.0086084); Chisq = 3.5548e−08; R = 0.99967.]

Figure 3. (a) Male AD mean incidence rate IR(t) data and fit using f_{s} = 1 ordered mutation model. (b) Male AD incidence rate IR(t) data (mean and standard deviation error) as a function of age t together with ordered mutation model fit and projection. (c) Male AD incidence rate IR(t) data (mean and standard deviation error) as a function of age t together with COMPOUND ordered mutation model fit and projection.

as seen in Figure 5(b). This fit predicts that 79.4% of females are immune to developing AD, the number of mutations necessary for a brain cell associated with AD to spontaneously die is m = 87, and the average time between consecutive mutations is T = k^{-1} = [0.916]^{-1} y = 1.09 years. Figure 5(b) also shows the projection of the model

[Figure 4: two panels plotting mean male AD incidence rate IR(t) against age t in years (50 to 100), showing fitted mean IR(t) data points, the non-fitted mean IR(t) data point and the independent mutation model fit. Fit legends:
(a) IR(t) = 1*101*k*((1−exp(−kt))^100)*exp(−kt); f_{s} = 1.000, m = 101, k = 0.039599 (year)^{−1} (error 6.0249e−05); Chisq = 1.0657e−07; R = 0.99902.
(b) IR(t) = I'_{1}(f_{s1}, k, m_{1} = 351) + I'_{2}(f_{s2}, k, m_{2} = 1218); f_{s1} = 0.035978 (error 0.0043731), k = 0.070267 (error 0.00034572), f_{s2} = 0.32949 (error 0.0086705); Chisq = 2.6033e−08; R = 0.99976.]

Figure 4. (a) Male AD incidence rate IR(t) data and f_{s} = 1 independent mutation model fit. (b) Compound independent mutation model fit to male AD incidence rate IR(t) data.

incidence rate IR(t) curve into the region above 100 years of age. Notice that the model incidence curve peaks around 98 years of age and monotonically declines thereafter.

To possibly improve the fit in Figure 5(b), a fit using the compound model incidence function in (13) will now be executed. The improved result is shown in Figure 5(c), with a fit error (chisq) that is 6.80% lower than the error in Figure 5(b). In the female data case, the fit returns the values f_{s1} = 0.001078 and f_{s2} = 0.2034 so that f_{s1} = 0.0053 f_{s2}, roughly an order of magnitude smaller than the corresponding male ratio. Thus, once again, a particular small fraction of the considered population (about 0.1%) is predicted to be born with AD mutations on the average. The fit also produces the values m_{1} = 66 and m_{2} = 89 so that the particularly susceptible female population is born with an average of 89 − 66 = 23 AD mutations, slightly higher than it was for the male population. The projection of the model incidence function returned by this fit is also shown in Figure 5(c) along with the standard deviation error in the data points.

Integrating the model incidence function shown in Figure 5(c) gives the model prevalence function which is given as a series of points on the right-hand side in Table 4.

The prevalence curve must be a monotonically increasing function of age t, and it must saturate at the value of f_{s1} + f_{s2} = 0.204498, which, as seen in Table 4, it certainly does.

Thus, the results in Table 4 again constitute a check on the modelling results in Figure 5(c), and the modelling passes this test.

Applying the chi-square distribution goodness-of-fit test to the fit in Figure 5(c) gives the results found in the last part of the Appendix summarized here.

Since there are again eight data points in the fit in Figure 5(c), there are again 8 − 1 = 7 degrees of freedom in this problem. The value of the test statistic for this fit turned out to be χ^{2} = 4.64. For a problem with 7 degrees of freedom, a χ^{2} value of χ^{2} = 4.64 means that there is a 70.3% probability (the p-value is 0.703) that the expected (model) and observed (data) distributions are statistically identical. Thus, once again, by any conventionally used criterion, the model fit in Figure 5(c) is very good.

Proceeding in the same way, fitting the same female AD incidence data using the independent mutation model yields the result in Figure 6(a). Since the value of the susceptible fraction returned by this fit is f_{s} = 0.271, this model predicts that 72.8% of females are naturally immune to developing AD. In the independent model result in Figure 6(a), the number of mutations necessary for a brain cell associated with AD to spontaneously die is m = 1605, and the average time required for a mutation to occur is T = k^{-1} = [0.780]^{-1} y = 1.28 years.

Fitting the same female data with the compound independent model incidence rate function gives the fit shown in Figure 6(b). This more sophisticated model reduces the fit error by 59.1% over what it was in the single term fit in Figure 6(a). Here, the values of the susceptible fractions returned by the fit are f_{s1} = 0.0177 and f_{s2} = 0.239 so that this model predicts that 74.3% of females are immune to developing AD.

Interestingly, both the compound ordered and independent mutation models predict that most females are naturally immune to developing AD. The fits of both models produce comparable errors, so it is not possible to decide on the basis of the modelling which model produces superior results. Since females live longer than males, the female AD cohorts above 70 years old in the Rochester, Minnesota study were generally 2–4 times larger than the corresponding male cohorts (see Tables 1(a) and 2(a)). Thus, the female AD data is probably more reliable than the male AD data, and, therefore, the female modelling results are expected to be more reliable than the male modelling results. The different values of the fit parameters (f_{s}, m and k) predicted by a mutation model for male and female cohorts demonstrate that AD develops through different pathways in males and females, as is commonly true for various cancers.

[Figure 5: three panels plotting female AD incidence rate IR(t) against age t in years, showing fitted mean IR(t) data points, non-fitted IR(t) data points and the model IR(t) fit projection. Fit legends:
(a) IR(t) = 1*k*((kt)^18)*exp(−kt)/18!; k = 0.14489 (error 0.0019596); Chisq = 5.0268e−06; R = 0.96163.
(b) IR(t) = f_{s}*k*((kt)^86)*exp(−kt)/86!; m = 87, f_{s} = 0.20533 (error 0.0024947), k = 0.91631 (year)^{−1} (error 0.0016505); Chisq = 8.0388e−08; R = 0.9994.
(c) IR(t) = I_{1}(f_{s1}, k, m_{1} = 66) + I_{2}(f_{s2}, k, m_{2} = 89); f_{s1} = 0.0010781 (error 0.001692), k = 0.93809 (error 0.0019176), f_{s2} = 0.20342 (error 0.0026198); Chisq = 7.5993e−08; R = 0.99943.]

Figure 5. (a) Female AD mean incidence rate IR(t) fitted data using f_{s} = 1 ordered mutation model. (b) Female AD mean incidence rate IR(t) data and fit using ordered mutation model. (c) Female AD incidence rate IR(t) data (mean and standard deviation error) as a function of age t together with COMPOUND ordered mutation model fit and projection.