A new look at multi-stage models of cancer incidence

Tyler Lian; Rick Durrett

doi:10.1101/243972

Abstract

Multi-stage models have long been used in combination with SEER data to make inferences about the mechanisms underlying cancer initiation. The main method for studying these models mathematically has been the computation of generating functions by solving hyperbolic partial differential equations. Here, we analyze these models using a probabilistic approach similar to the one Durrett and Moseley [7] used to study branching process models of cancer. This more intuitive approach leads to simpler formulas and new insights into the behavior of these models. Unfortunately, the examples we consider suggest that fitting multi-stage models has very little power to make inferences about the number of stages unless parameters are constrained to take on realistic values.

1 Introduction

Investigation of the age distribution of cancer incidence goes back to the middle of the 20th century. Fisher and Hollomon [10] and Nordling [20] found that within the 25–74 age range, the logarithm of the cancer death rate increased in direct proportion to the logarithm of the age, with a slope of about six on a log-log plot. Nordling grouped all types of cancer together and considered only men, but the pattern persisted when Armitage and Doll [2] separated cancers by their type and considered men and women separately.

Nordling [20] suggested that the slope of six on a log-log plot would be explained if a cancer cell was the end result of seven successive mutations. There was no model underlying that conclusion, just the observation that if one sums k exponential random variables with rates μ_i, i.e., with probability densities $\mu_i e^{-\mu_i t}$, then when t is much smaller than the mean of the sum, $1/\mu_1 + \cdots + 1/\mu_k$, the probability that the kth change occurs in the short time interval (t, t + dt) is asymptotically

$$\mu_1 \mu_2 \cdots \mu_k \, \frac{t^{k-1}}{(k-1)!} \, dt. \tag{1}$$

Later Armitage [1] gave a rigorous proof of this result.
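The small-t behaviour described above can be checked numerically. The following Python sketch compares the exact density of a sum of independent exponentials with distinct rates (the standard hypoexponential formula) with the leading-order term μ₁⋯μ_k t^{k−1}/(k − 1)!. The rate values and function names are illustrative assumptions and are not taken from the papers cited.

```python
import math

def hypoexponential_density(rates, t):
    """Exact density at t of a sum of independent exponentials with distinct rates."""
    total = 0.0
    for i, mu_i in enumerate(rates):
        weight = 1.0
        for j, mu_j in enumerate(rates):
            if j != i:
                weight *= mu_j / (mu_j - mu_i)
        total += weight * mu_i * math.exp(-mu_i * t)
    return total

def small_time_density(rates, t):
    """Leading-order term mu_1 ... mu_k t^(k-1)/(k-1)!, accurate only for small t."""
    k = len(rates)
    return math.prod(rates) * t ** (k - 1) / math.factorial(k - 1)

# Hypothetical per-year rates, chosen only for illustration.
rates = [0.01, 0.02, 0.03]
for t in (1.0, 5.0, 20.0):
    exact = hypoexponential_density(rates, t)
    approx = small_time_density(rates, t)
    print(f"t={t:5.1f}  exact={exact:.3e}  approx={approx:.3e}  ratio={exact / approx:.3f}")
```

For these rates the mean of the sum is about 183 years, so the ratio is close to 1 at t = 1 and drifts below 1 as t grows toward the mean.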

A few years later, Armitage and Doll [3] wrote that the hypothesis described in the previous paragraph was “however, unsatisfactory in that there was no direct experimental evidence to suggest that carcinogenesis was likely to involve more than two stages.” Because of this, they introduced in [3] a two-stage model in which ordinary cells (type 0) mutate at rate μ₁ into initiated (type 1) cells that grow at exponential rate λ and mutate at rate μ₂ into malignant cells (type 2). The two-stage model has been thoroughly analyzed in the literature, see e.g. [12], [13].

In the studies cited above, the stages were unspecified events. That changed in 1971 with Knudson’s study of retinoblastoma [14]. Based on observations of 48 cases of retinoblastoma and published reports, he hypothesized that the disease is a cancer caused by two mutational events. In the dominantly inherited form, one mutation is inherited via the germinal cells. In the nonhereditary form both mutations occur in somatic cells. The underlying gene, named RB1, was found 15 years later. In current terminology, RB1 is a tumor suppressor gene. Trouble begins when both copies are knocked out.

Colorectal tumors provide an excellent system in which to search for and study the genetic alterations involved in the development of cancer because tumors of various stages of development, from very small adenomas to very large carcinomas, can be obtained for study. The initiating event is thought to involve the inactivation of the tumor suppressor gene APC (adenomatous polyposis coli). As in retinoblastoma, an inherited germ line mutation in this gene causes greater risk of disease. Individuals with this mutation have numerous polyps form early in their lives, mainly in the epithelium of the large intestine.

In 1990, Fearon and Vogelstein [8] found a second piece of the puzzle when they noted that approximately 50% of colorectal carcinomas, and a similar percentage of adenomas greater than 1 cm, have mutations in the RAS gene family, while only 10% of adenomas smaller than 1 cm have these mutations. In the modern terminology, the members of the RAS family are oncogenes. A mutation to a single allele is sufficient for progression. The analysis in [8] also suggested a role for TP53 (which produces the tumor protein p53) in the progression to cancer. The protein p53 has been described as “the guardian of the genome” because of its role in conserving stability by preventing genome mutations. TP53 has since been implicated in many cancers, see [11] and [25].

Combining the ideas in the last two paragraphs leads to a four (or five) stage description for colon cancer that is described, for example, in the books of Vogelstein and Kinzler [24] and Frank [9]. In 2002, Luebeck and Moolgavkar [16] developed a mathematical model in order to fit the age-incidence of colorectal cancer. We will describe the model in detail in the next section. They tried models with k = 2, 3, 4, 5 stages and found that the four-stage model gave the best fit. The techniques developed in [16] have been applied to study a number of other cancers. See e.g. [17], [18], [19].

2 Analytic approach

In the k-stage model there is a fixed number of stem cells, N, each of which mutates at rate μ₀ to become a type 1 cell, so cells of type 1 are born at times of a Poisson process with rate γ = Nμ₀. Cells of types i = 1, …, k − 2 are pre-initiated cells that mutate at rate μ_{i,k} to become a cell of type i + 1. Cells of type k − 1 are initiated cells that divide into two at rate α, die at rate β, where λ = α − β > 0, and mutate at rate μ_{k−1,k} to become malignant (type k). Let T_k be the time of appearance of the first malignant cell in the k-stage model. Here we will be interested primarily in

the survival function H_k(t) = P(T_k > t), which gives the fraction of individuals that are cancer free at time t, and
the hazard rate h_k(t) = −H_k′(t)/H_k(t), which gives the rate at which healthy individuals become sick at time t.

The traditional approach to studying the k-stage model, as explained for example in the supplementary materials of [16], has been to define the generating function Ψ_k of the numbers of cells of each type at time t and to compute Ψ_k by solving a hyperbolic PDE using the method of characteristics.

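To explain this approach, we will consider the case k = 2, in which type 1 cells are initiated and type 2 cells are malignant, and write Ψ = Ψ₂(y₁, y₂; t) for the generating function of the numbers of type 1 and type 2 cells.

The transition rates are as follows. When there are j cells of type 1, a new type 1 cell is produced by mutation of a stem cell at rate γ, a type 1 cell divides at rate αj, a type 1 cell dies at rate βj, and a type 1 cell gives birth to a type 2 cell at rate μ₁j.

From these rates we get a differential equation for ∂Ψ/∂t in which the rates proportional to j appear multiplied by j. Using the identity jΨ = y₁∂Ψ/∂y₁ and rearranging, this becomes

$$\frac{\partial \Psi}{\partial t} = \gamma (y_1 - 1)\Psi + \left[\alpha y_1^2 + \beta + \mu_1 y_1 y_2 - (\alpha + \beta + \mu_1) y_1\right] \frac{\partial \Psi}{\partial y_1}. \tag{2}$$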

To find the generating function Ψ_k(z₁, z₂, …, z_k; t), one follows the solution along characteristic curves y_{i,k}(s, t), which satisfy the characteristic equations (3), where the derivative is taken with respect to the s variable. The generating function can then be found from the solution of these equations, see (4), and the survival function is obtained by setting z_k = 0 and z_i = 1 for 1 ≤ i < k.

2.1 Solving the equations

To begin to solve the equations in (3), we note that the variable corresponding to the malignant type k is constant. We write S_{1,k}(t) = y_{k−1,k}(t) since it is the first step in solving the system of equations. The subscript k is needed because the mutation rates that enter into the differential equations (3) depend on the number of stages. Throughout this paper, when we write S_{i,k}(t) it is assumed that z_k = 0 and z_i = 1 for 1 ≤ i < k. Changing notation, we want to solve

$$\frac{dS_{1,k}}{dt} = \beta - (\alpha + \beta + \mu) S_{1,k} + \alpha S_{1,k}^2, \tag{5}$$

where μ = μ_{k−1,k} and S_{1,k}(0) = 1. The quadratic equation αx² − (α + β + μ)x + β = 0 has two roots q > 1 > r > 0. See (31). Solving (5), see (34), gives

$$S_{1,k}(t) = \frac{q(1-r) + r(q-1)e^{\alpha(q-r)t}}{(1-r) + (q-1)e^{\alpha(q-r)t}}. \tag{6}$$

Having solved for S_{1,k}(t), the other S_{i,k}(t) = y_{k−i,k}(t) can be found by induction:

$$S_{i,k}(t) = \exp\left(-v_i \int_0^t \left(1 - S_{i-1,k}(s)\right) ds\right), \tag{7}$$

where in the k-stage model v_i = μ_{k−i,k}. The computation of the S_{i,k}(t) is not easily found in the literature, so we will give the details in Section 7.

While the recursion in (7) was derived by the method of characteristics, it has a simple probabilistic interpretation. Each individual of type k − i gives birth to individuals of type k − i + 1 at times of a rate v_i Poisson process. A type k − i + 1 born at time s will give rise to a malignant cell by time t with probability 1 − S_{i−1,k}(t − s). The number of type k − i + 1 individuals that are successful in doing this has a Poisson distribution with mean

$$\rho_{i,k}(t) = v_i \int_0^t \left(1 - S_{i-1,k}(t-s)\right) ds, \tag{8}$$

so the probability that none of the type k − i + 1 individuals are successful in creating a malignant cell is S_{i,k}(t) = exp(−ρ_{i,k}(t)).

2.2 Hazard rate formulas

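It is clear from (4) that H_k = S_{k,k}, so we have

$$H_k(t) = \exp\left(-\gamma \int_0^t \left(1 - S_{k-1,k}(t-s)\right) ds\right). \tag{9}$$

Using (9) and changing variables u = t − s before differentiating, it follows that

$$h_k(t) = -\frac{d}{dt}\log H_k(t) = \gamma\left(1 - S_{k-1,k}(t)\right), \tag{10}$$

so we do not have to evaluate H_k(t) to find h_k(t).

Using (10) with (35) gives

$$h_2(t) = \gamma\,\frac{(q-1)(1-r)\left(e^{-\alpha(r-1)t} - e^{-\alpha(q-1)t}\right)}{(q-1)e^{-\alpha(r-1)t} + (1-r)e^{-\alpha(q-1)t}}. \tag{11}$$

Using (37) we have

$$h_3(t) = \gamma\left[1 - \left(\frac{q-r}{(q-1)e^{-\alpha(r-1)t} + (1-r)e^{-\alpha(q-1)t}}\right)^{\mu_1/\alpha}\right]. \tag{12}$$

When it comes to the fourth stage, the integral in (7) can no longer be computed in closed form, and (39) gives

$$h_4(t) = \gamma\left[1 - \exp\left(-\mu_1 \int_0^t \left[1 - \left(\frac{q-r}{(q-1)e^{-\alpha(r-1)s} + (1-r)e^{-\alpha(q-1)s}}\right)^{\mu_2/\alpha}\right] ds\right)\right]. \tag{13}$$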
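As a numerical illustration of this section, the Python sketch below evaluates h_k(t) on a time grid for any k ≥ 2. It assumes the closed form for S_{1,k}(t) in terms of the roots q and r, the recursion S_{i,k}(t) = exp(−v_i ∫₀ᵗ (1 − S_{i−1,k}(u)) du) evaluated with the trapezoid rule, and h_k(t) = γ(1 − S_{k−1,k}(t)). The function names, the grid, and the parameter values are hypothetical choices made only for this example.

```python
import math

def roots_q_r(alpha, beta, mu):
    """Roots q > 1 > r of alpha*x**2 - (alpha + beta + mu)*x + beta = 0."""
    b = alpha + beta + mu
    disc = math.sqrt(b * b - 4.0 * alpha * beta)
    return (b + disc) / (2.0 * alpha), (b - disc) / (2.0 * alpha)

def s1(t, alpha, beta, mu):
    """Assumed closed form for S_1(t), written with a decaying exponential only."""
    q, r = roots_q_r(alpha, beta, mu)
    e = math.exp(-alpha * (q - r) * t)
    return (r * (q - 1.0) + q * (1.0 - r) * e) / ((q - 1.0) + (1.0 - r) * e)

def hazard(t_max, n_steps, gamma, pre_rates, alpha, beta, mu_last):
    """Hazard h_k on a uniform grid, where k = len(pre_rates) + 2.

    pre_rates = [mu_1, ..., mu_{k-2}] are the mutation rates of the
    pre-initiated types; mu_last is the rate from initiated to malignant.
    """
    dt = t_max / n_steps
    times = [i * dt for i in range(n_steps + 1)]
    s = [s1(t, alpha, beta, mu_last) for t in times]
    # Work back from type k-2 to type 1, one application of the recursion each.
    for v in reversed(pre_rates):
        integral, nxt = 0.0, [1.0]
        for i in range(1, n_steps + 1):
            # Trapezoid rule for the running integral of 1 - S_{i-1}.
            integral += 0.5 * dt * ((1.0 - s[i - 1]) + (1.0 - s[i]))
            nxt.append(math.exp(-v * integral))
        s = nxt
    return times, [gamma * (1.0 - x) for x in s]

# Hypothetical three-stage parameters, chosen only to exercise the code.
times, h3 = hazard(t_max=50.0, n_steps=5000, gamma=1.0,
                   pre_rates=[1e-2], alpha=1.0, beta=0.8, mu_last=1e-3)
print(f"h_3({times[-1]:.0f}) = {h3[-1]:.5f}")
```

Passing an empty `pre_rates` list gives the two-stage hazard, and a list of two rates gives the four-stage hazard, for which no closed form is available.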

3 Probabilistic approach

We begin by giving probabilistic interpretations for some of the computations above. To explain the differential equation for y_{k−1,k}, note that if we ignore mutation then each individual of type k − 1 initiates a linear birth and death process L(t) in which the number of individuals increases from m → m + 1 at rate αm and decreases from m → m − 1 at rate βm, where α > β.

Theorem 1.

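As t → ∞, e^{−λt}L(t) → W with P(W = 0) = β/α = P(L(t) = 0 for some t > 0) and

$$P(W > x \mid W > 0) = \exp(-x\lambda/\alpha),$$

i.e., if we condition on non-extinction then W has an exponential density with rate λ/α. If we let V₀ = (W | W > 0) then

$$E e^{-\theta V_0} = \left(1 + \frac{\alpha\theta}{\lambda}\right)^{-1}. \tag{14}$$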

Proof.

We will sketch the proof since it contains details that will be useful later. For more details see Section 3 of [6]. It is well known that if we start with L(0) = 1 then the generating function F(x, t) = E x^{L(t)} satisfies

$$\frac{\partial F}{\partial t} = \beta - (\alpha + \beta)F + \alpha F^2, \tag{15}$$

with boundary condition F(x, 0) = x. This equation can be solved with the result that

$$F(x,t) = \frac{\beta(x-1) - e^{-\lambda t}(\alpha x - \beta)}{\alpha(x-1) - e^{-\lambda t}(\alpha x - \beta)},$$

where λ = α − β is the exponential growth rate. By considering what happens on the first step (which is a birth with probability α/(α + β) and a death with probability β/(α + β)) we can conclude that the probability ρ that the process dies out satisfies

$$\rho = \frac{\beta}{\alpha+\beta} + \frac{\alpha}{\alpha+\beta}\,\rho^2.$$

The extinction probability is the root which is < 1, i.e., ρ = β/α.
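The extinction probability can also be checked by simulation. The Python sketch below estimates it by Monte Carlo, using only the embedded jump chain of the birth and death process (each event is a birth with probability α/(α + β) and a death otherwise). The values of α and β, the population cap, and the function name are assumptions made for illustration.

```python
import random

def extinction_fraction(alpha, beta, trials=20000, cap=40, seed=1):
    """Fraction of simulated families, started from one cell, that die out.

    A family that reaches `cap` cells is counted as surviving; from there
    the chance of later extinction is (beta/alpha)**cap, which is negligible
    when alpha > beta.
    """
    rng = random.Random(seed)
    p_birth = alpha / (alpha + beta)
    extinct = 0
    for _ in range(trials):
        m = 1
        while 0 < m < cap:
            m += 1 if rng.random() < p_birth else -1
        if m == 0:
            extinct += 1
    return extinct / trials

alpha, beta = 2.0, 1.0  # hypothetical rates with alpha > beta
rho = beta / alpha
# rho = beta/alpha solves the first-step equation rho = (beta + alpha*rho^2)/(alpha + beta).
residual = rho - (beta + alpha * rho ** 2) / (alpha + beta)
print(f"simulated: {extinction_fraction(alpha, beta):.3f}  beta/alpha: {rho:.3f}  residual: {residual:.1e}")
```

Holding times are irrelevant for the question of eventual extinction, which is why the simulation tracks only the sequence of births and deaths.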

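Comparing with (15) we see that (5) has an additional term −μy(t). In probabilistic terms, this corresponds to killing the process at rate μm when there are m individuals in the branching process. Consider the birth and death chain conditioned not to die out. Using this observation and Theorem 1, the probability of no malignant cell by time t in the conditioned chain, which we denote by G(t), is given by (16), where the last step of the computation uses (14). When the branching process dies out it does so quickly, so the probability of a mutation before extinction is small. From this it follows that

$$1 - S_{1,k}(t) \approx \frac{\lambda}{\alpha}\,\bigl(1 - G(t)\bigr). \tag{17}$$

To compare with (6) we need to change notation. Equations (41) and (42) imply that

$$q - 1 \approx \frac{\mu}{\lambda}, \qquad 1 - r \approx \frac{\lambda}{\alpha}, \qquad \alpha(q - r) \approx \lambda. \tag{18}$$

Using these in (6) gives an expression for 1 − S_{1,k}(t) that agrees with the new formula in (17).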

To compute the survival function H_k(t) from this we start at type 1 and work up to type k − 1. Let η_j(s) be the rate at which type j's are born at time s. Integrating we find

$$\eta_j(s) = N\mu_0\mu_1\cdots\mu_{j-1}\,\frac{s^{j-1}}{(j-1)!}.$$

We call a type k − 1 family that does not die out “successful.” Using Theorem 1, the probability that a type k − 1 is successful is λ/α, so successful type k − 1 families are born at rate η_{k−1}(s)λ/α at time s. Recalling that 1 − G(t − s) is the probability that a birth and death process conditioned to not die out gives rise to a malignant cell within time t − s, and using the reasoning that led to (8), the times of successes will be roughly a Poisson process, so

$$H_k(t) \approx \exp\left(-\frac{\lambda}{\alpha}\int_0^t \eta_{k-1}(s)\,\bigl(1 - G(t-s)\bigr)\,ds\right). \tag{19}$$

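Using this approach we find formulas for the survival functions H_k(t) for k = 2, 3, 4 (see Section 8 for details). Differentiating with respect to t and using h_k(t) = −H_k′(t)/H_k(t) then gives the corresponding hazard rates.

At first glance the new formulas look much different than the ones from the analytic approach. However, a closer look shows they are closely related. When k = 2, η_{k−1} = γ and (17) implies γ(λ/α)(1 − G(t)) ≈ γ(1 − S_{1,2}(t)), so the formulas for h₂(t) agree. Comparing the two formulas for h₃(t) and h₄(t), we see that it is enough to establish the approximation (23), which is proved in Section 8.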

4 The two-stage model

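The formula for the hazard rate in the two-stage case, given in (11), is

$$h_2(t) = \gamma\,\frac{(q-1)(1-r)\left(e^{-\alpha(r-1)t} - e^{-\alpha(q-1)t}\right)}{(q-1)e^{-\alpha(r-1)t} + (1-r)e^{-\alpha(q-1)t}}.$$

If we introduce P = α(r − 1) and Q = α(q − 1) we can rewrite it as

$$h_2(t) = \frac{\gamma}{\alpha}\,\frac{PQ\left(e^{-Qt} - e^{-Pt}\right)}{Qe^{-Pt} - Pe^{-Qt}}. \tag{24}$$

Note that the two-stage model has five parameters γ, μ₁, μ₂, α, and β, but the new formula for the hazard rate has only three: γ/α, P = α(r − 1), and Q = α(q − 1). In the terminology of statistics we have an identifiability problem, i.e., not all the parameters in the model can be estimated.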

Table 1:

Fits of the two stage model hazard rate given in (24) to thyroid cancer in white females by Meza and Chang [18] and to peritoneal mesothelioma by Moolgavkar, Meza, and Turim [17].

Figure 1 gives a picture of h₂(t) for the parameters of [18]. It should be obvious from the picture that as t → ∞, h₂(t) converges to a limit. Since P < 0 < Q, letting t → ∞ in (24) gives

$$\lim_{t\to\infty} h_2(t) = -\frac{\gamma}{\alpha}\,P. \tag{25}$$
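A minimal Python sketch of the three-parameter form of the two-stage hazard, assuming h₂(t) = (γ/α)PQ(e^{−Qt} − e^{−Pt})/(Qe^{−Pt} − Pe^{−Qt}) with P < 0 < Q, together with its limiting value −(γ/α)P. The parameter values below are hypothetical and are not the fitted values of [17] or [18].

```python
import math

def two_stage_hazard(t, gamma_over_alpha, P, Q):
    """h_2(t) in the (gamma/alpha, P, Q) parametrization, with P < 0 < Q."""
    # Numerator and denominator are multiplied by exp(P*t), so only the
    # decaying factor exp(-(Q - P)*t) appears and large t cannot overflow.
    decay = math.exp(-(Q - P) * t)
    return gamma_over_alpha * P * Q * (decay - 1.0) / (Q - P * decay)

def two_stage_limit(gamma_over_alpha, P):
    """Limiting value of h_2(t) as t grows."""
    return -gamma_over_alpha * P

# Hypothetical values for illustration only.
g_over_a, P, Q = 1e-3, -0.2, 1e-5
for age in (20, 40, 60, 80, 120):
    print(age, two_stage_hazard(age, g_over_a, P, Q), two_stage_limit(g_over_a, P))
```

Only γ/α, P, and Q enter the function, which is the identifiability problem in concrete form: any two parameter sets that share these three values produce the same hazard curve.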

Figure 2 shows the fit of the two stage model to peritoneal mesotheliomas in SEER data from 1973–2005. Parameters are given in Table 1. This time the asymptote has not been reached by age 85. The dotted line gives the fit of the Armitage-Doll formula (1) Ct^k to the data with C = 1.75 × 10⁻¹¹ and k = 2.79. Visually the second fit is worse. This is confirmed by the values of Akaike Information Criterion scores. Interested readers can consult [17] for further details.

Figure 1:

A graph of h₂(t) for the thyroid cancer parameters. x axis is age in years. y axis is cases per 100,000 per year. The asymptotic value, which is 10.02 by (25) is reached at age ≈ 40. For comparison, we give a histogram of age at diagnosis in 508 individuals in a TCGA study [22]. If one transforms the data so that it is cases per 100,000 individuals in each age group, the model fits the data. See Figure 4 in [18].

Figure 2:

The solid line is a graph of h₂(t) for the peritoneal mesothelioma parameters. The dotted line is a fit of the Armitage-Doll model with k = 2.79. x axis is age in years. y axis is cases per 100,000 per year. The asymptotic value has not been reached by age 85.

5 The three-stage model

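Using (12) and changing variables P = α(r − 1) and Q = α(q − 1), it follows that the hazard rate is

$$h_3(t) = \gamma\left[1 - \left(\frac{Q-P}{Qe^{-Pt} - Pe^{-Qt}}\right)^{\mu_1/\alpha}\right]. \tag{26}$$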

Meza et al. [19] show that h₃ is asymptotically linear. To state their results we need two definitions. The probability that the birth and death process does not die out is p_∞ = 1 − r ≈ λ/α by (6) and (18). Let T_{2,3} be the time to malignancy of a single type 2 clone in the 3-stage model conditional on it not becoming extinct.

Theorem 2.

If t is large and t ≪ 1/(μ₁p_∞) then

$$h_3(t) \approx \gamma\mu_1 p_\infty\,\bigl(t - ET_{2,3}\bigr). \tag{27}$$

To better understand the formula for h₃(t), and to check the accuracy of the linear approximation, it is useful to have concrete examples.

Table 2:

Meza et al [19] estimated parameters for the 3-stage model and 4-stage model (described in the next section) for pancreatic cancer in men.

To be able to compute the hazard function we need a value for α. Meza et al. [19] suggest α = 9 cell divisions per year and say that the fit is not sensitive to the value of α chosen. In pancreatic cancer, when α = 9, if we take μ₀ = μ₁ = 10⁻⁶ then N = 2 × 10⁹ and μ₁/α = 1.1 × 10⁻⁷. Figure 3 gives a graph of h₃(t) (and h₄(t)) for the pancreatic cancer parameters. 1/(μ₁p_∞) = 5 × 10⁹ years, so the condition t ≪ 1/(μ₁p_∞) holds. As the graph shows, the straight line approximation is good for t ≥ 65.
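The accuracy of a straight-line approximation of this kind can be explored with a short Python sketch. It assumes the three-stage hazard has the form h₃(t) = γ[1 − ((Q − P)/(Qe^{−Pt} − Pe^{−Qt}))^{μ₁/α}] and compares it with the line obtained by expanding that expression for large t, whose slope is γμ₁(−P/α) and whose intercept on the time axis is log((Q − P)/Q)/(−P). The parameter values are hypothetical and are not the fitted pancreatic cancer values.

```python
import math

def three_stage_hazard(t, gamma, mu1, alpha, P, Q):
    """h_3(t), evaluated through the logarithm of the ratio to avoid overflow."""
    # log[(Q - P) / (Q e^{-Pt} - P e^{-Qt})], after factoring out e^{-Pt}.
    log_ratio = math.log(Q - P) + P * t - math.log(Q - P * math.exp(-(Q - P) * t))
    # expm1 keeps precision when the exponent is tiny.
    return -gamma * math.expm1((mu1 / alpha) * log_ratio)

def linear_approximation(t, gamma, mu1, alpha, P, Q):
    """Large-t expansion of h_3: slope * (t - lag)."""
    slope = gamma * mu1 * (-P) / alpha
    lag = math.log((Q - P) / Q) / (-P)
    return slope * (t - lag)

# Hypothetical values; gamma is set to 1 so hazards are in units of gamma.
params = dict(gamma=1.0, mu1=1e-6, alpha=9.0, P=-0.2, Q=1e-4)
for age in (30, 50, 70, 90):
    exact = three_stage_hazard(age, **params)
    line = linear_approximation(age, **params)
    print(f"age {age}: exact={exact:.3e}  linear={line:.3e}")
```

With these values the lag is about 38 years, so the line is a poor guide at younger ages and close to the exact curve at older ages.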

Figure 3:

Graphs of h₃(t) and h₄(t) for the pancreatic cancer parameters. x axis is age in years. y axis is cases per 100,000 per year. Straight line is the linear approximation to the three stage model (27). The bar graph gives the age at diagnosis for 186 patients in the TCGA study of pancreatic cancer [4]. Again if one transforms the data to be cases per 100,000 in each age group, the theoretical curve fits the data, see Figure 5 in [19].

6 The four-stage model

6.1 Hazard rate

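Using (13) and changing variables P = α(r − 1) and Q = α(q − 1), the hazard rate is

$$h_4(t) = \gamma\left[1 - \exp\left(-\mu_1\int_0^t\left[1 - \left(\frac{Q-P}{Qe^{-Ps} - Pe^{-Qs}}\right)^{\mu_2/\alpha}\right]ds\right)\right]. \tag{28}$$

Let T_{2,4} be the time for a single type 2 clone to produce a malignant cell in the four-stage model, and note that P(T_{2,4} < ∞) = 1, since a type 2 will give rise to infinitely many type 3's, and one of these will start a branching process that does not die out.

The asymptotic behavior of the hazard rate is constant in the two-stage case and linear in the three-stage case. One might naively guess that in the four-stage case it is asymptotically quadratic, but the simple proof given below shows it is asymptotically linear. It should be clear from the proof that this holds for any k ≥ 3.

Theorem 3.

When t is large and t ≪ 1/μ₁,

$$h_4(t) \approx \gamma\mu_1\,\bigl(t - ET_{2,4}\bigr).$$

Proof of Theorem 3.

When μ₁t is small,

$$h_4(t) = \gamma\left[1 - \exp\left(-\mu_1\int_0^t P(T_{2,4} \le s)\,ds\right)\right] \approx \gamma\mu_1\int_0^t P(T_{2,4} \le s)\,ds.$$

Using a well-known formula for expected value, when t is large

$$\int_0^t P(T_{2,4} \le s)\,ds = t - \int_0^t P(T_{2,4} > s)\,ds \approx t - ET_{2,4}.$$

Combining the last two equations gives the desired result.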

Note: due to the complexity of the formula for S_{2,4}(t) given in (36), we do not have a formula for ET_{2,4}. However, it is easy to compute numerically.
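One way to carry out such a numerical computation is sketched below in Python. It assumes that P(T_{2,4} > s) = S_{2,4}(s) = ((Q − P)/(Qe^{−Ps} − Pe^{−Qs}))^{μ₂/α} and uses the tail formula ET = ∫₀^∞ P(T > s) ds with the trapezoid rule on a truncated range. The parameter values, the truncation point, and the function names are hypothetical.

```python
import math

def survival_type2(s, mu2, alpha, P, Q):
    """Assumed form of S_{2,4}(s), evaluated in log form to avoid overflow."""
    log_ratio = math.log(Q - P) + P * s - math.log(Q - P * math.exp(-(Q - P) * s))
    return math.exp((mu2 / alpha) * log_ratio)

def expected_time(mu2, alpha, P, Q, t_max, n_steps):
    """Trapezoid-rule estimate of the integral of P(T > s) over [0, t_max]."""
    dt = t_max / n_steps
    total = 0.5 * (survival_type2(0.0, mu2, alpha, P, Q)
                   + survival_type2(t_max, mu2, alpha, P, Q))
    for i in range(1, n_steps):
        total += survival_type2(i * dt, mu2, alpha, P, Q)
    return total * dt

# Hypothetical values, chosen so that the tail beyond t_max is negligible.
mu2, alpha, P, Q = 1.0, 9.0, -0.15, 1e-4
t_max = 3000.0
estimate = expected_time(mu2, alpha, P, Q, t_max, 30000)
tail = survival_type2(t_max, mu2, alpha, P, Q)
print(f"E T_(2,4) is approximately {estimate:.2f} years (survival at t_max: {tail:.1e})")
```

The truncation point should be checked for each parameter set: the survival function decays roughly like exp(−μ₂(−P/α)s) for large s, so small values of μ₂ require a much longer integration range.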

Luebeck and Moolgavkar [16] estimated parameters for the 2, 3, 4, and 5 stage models for colorectal cancer in women. As Figure 4 shows, the fits from the four models are all very good. To explain how this could happen, we take a look at the parameters used in fitting: N = 10⁸, α = 9. Note that in the four-stage model, μ₂ = 6.3, and in the five-stage model μ₃ = μ₄ = 0.9. These large values speed these processes up, effectively eliminating one and two stages respectively. In the other direction, in the two-stage model the very slow mutation rates μ₀ = 4.5 × 10⁻⁹ and μ₁ = 1.44 × 10⁻⁷ effectively add a stage. Thus if we judge the fitted model by the size of the mutation rates, it seems that the three-stage model gives the best fit.

Figure 4:

A comparison of the fitted values of the hazard functions for the two, three, four, and five stage models of [16]. The three and five stage fits are almost identical so you can only see three curves on the graph.

Table 3:

Parameter values in four fits of colon cancer data from [16].

It is interesting to note that Tomasetti et al. [23] have arrived at the conclusion that colon cancer is a three-stage process by completely different reasoning. They compared patients with and without a mismatch repair deficiency. They found that patients with the deficiency have 7.7 to 8.8 times as many mutations, versus a 114.2-fold increase in colon cancer rates, and argued that the increase would be more substantial if the process had four stages. See pages 119–120 in [23] for more details and an analysis of lung adenocarcinomas.

7 Computing S_i,k (t): analytic approach

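Let q > 1 > r be the roots of the quadratic equation αx² − (α + β + μ)x + β = 0, that is,

$$q, r = \frac{(\alpha+\beta+\mu) \pm \sqrt{(\alpha+\beta+\mu)^2 - 4\alpha\beta}}{2\alpha}. \tag{31}$$

If we write y(t) = S_{1,k}(t) then the differential equation (5) can be written as

$$\frac{dy}{dt} = -\alpha\,(q - y)(y - r). \tag{32}$$

From this we see that y(t) is decreasing and will converge to r as t → ∞. Rearranging (32), we have

$$\alpha\,dt = -\frac{dy}{(q-y)(y-r)} = -\frac{1}{q-r}\left(\frac{dy}{q-y} + \frac{dy}{y-r}\right).$$

Here we have written the right-hand side to avoid taking the logarithm of a negative number in the next step. Multiplying both sides by q − r and then integrating from 0 to t, we have for some constant D

$$\alpha(q-r)t + D = \log(q - y) - \log(y - r).$$

Exponentiating we have

$$\frac{q-y}{y-r} = e^{D}e^{\alpha(q-r)t}. \tag{33}$$

Solving for y gives

$$y(t) = \frac{q + r e^{D}e^{\alpha(q-r)t}}{1 + e^{D}e^{\alpha(q-r)t}}. \tag{34}$$

Using (33) and recalling y(0) = 1 we have e^D = (q − 1)/(1 − r), which implies

$$S_{1,k}(t) = \frac{q(1-r) + r(q-1)e^{\alpha(q-r)t}}{(1-r) + (q-1)e^{\alpha(q-r)t}},$$

which is (6). Our next step is to write

$$1 - S_{1,k}(t) = \frac{(q-1)(1-r)\left(e^{-\alpha(r-1)t} - e^{-\alpha(q-1)t}\right)}{(q-1)e^{-\alpha(r-1)t} + (1-r)e^{-\alpha(q-1)t}}. \tag{35}$$

To compute S_{2,k}(t) we use the recursion (7):

$$S_{2,k}(t) = \exp\left(-v_2\int_0^t \bigl(1 - S_{1,k}(s)\bigr)\,ds\right).$$

To compute the integral let f(t) = (q − 1)e^{−α(r−1)t} − (r − 1)e^{−α(q−1)t} and note that f′(t) = α(q − 1)(1 − r)(e^{−α(r−1)t} − e^{−α(q−1)t}), so that 1 − S_{1,k}(s) = f′(s)/(αf(s)). Since f(0) = q − r, we have

$$S_{2,k}(t) = \left(\frac{q-r}{f(t)}\right)^{v_2/\alpha}. \tag{36}$$

When k = 3, v₂ = μ₁ so we have

$$S_{2,3}(t) = \left(\frac{q-r}{f(t)}\right)^{\mu_1/\alpha}. \tag{37}$$

Integrating again we conclude that

$$H_3(t) = S_{3,3}(t) = \exp\left(-\gamma\int_0^t\left[1 - \left(\frac{q-r}{f(s)}\right)^{\mu_1/\alpha}\right]ds\right). \tag{38}$$

When k = 4, v₃ = μ₁ and v₂ = μ₂ so

$$S_{3,4}(t) = \exp\left(-\mu_1\int_0^t\left[1 - \left(\frac{q-r}{f(s)}\right)^{\mu_2/\alpha}\right]ds\right). \tag{39}$$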
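The closed form for S_{1,k}(t) can be checked against a direct numerical solution of the differential equation. The Python sketch below integrates dy/dt = αy² − (α + β + μ)y + β from y(0) = 1 with a classical fourth-order Runge-Kutta scheme and compares the result with the closed form written in terms of q and r. The parameter values, step count, and function names are assumptions chosen for illustration.

```python
import math

def closed_form_y(t, alpha, beta, mu):
    """Assumed closed-form solution, expressed with a decaying exponential."""
    b = alpha + beta + mu
    disc = math.sqrt(b * b - 4.0 * alpha * beta)
    q, r = (b + disc) / (2.0 * alpha), (b - disc) / (2.0 * alpha)
    e = math.exp(-alpha * (q - r) * t)
    return (r * (q - 1.0) + q * (1.0 - r) * e) / ((q - 1.0) + (1.0 - r) * e)

def rk4_y(t_end, n_steps, alpha, beta, mu):
    """Integrate y' = alpha*y^2 - (alpha + beta + mu)*y + beta from y(0) = 1."""
    def rhs(y):
        return alpha * y * y - (alpha + beta + mu) * y + beta
    y, dt = 1.0, t_end / n_steps
    for _ in range(n_steps):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * dt * k1)
        k3 = rhs(y + 0.5 * dt * k2)
        k4 = rhs(y + dt * k3)
        y += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y

alpha, beta, mu = 1.0, 0.8, 0.01  # hypothetical rates
for t_end in (5.0, 15.0, 30.0):
    numeric = rk4_y(t_end, 3000, alpha, beta, mu)
    exact = closed_form_y(t_end, alpha, beta, mu)
    print(f"t={t_end:4.1f}  rk4={numeric:.10f}  closed form={exact:.10f}")
```

Both columns decrease from 1 toward the smaller root r, as the factored form of the equation predicts.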

7.1 Approximations for q and r

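When μ = 0 the roots in (31) are 1 and β/α. Typically the mutation rate μ is much smaller than α and β. When it is,

$$\sqrt{(\alpha+\beta+\mu)^2 - 4\alpha\beta} = \sqrt{\lambda^2 + 2\mu(\alpha+\beta) + \mu^2} \approx \lambda + \frac{\mu(\alpha+\beta)}{\lambda}, \tag{40}$$

so we have

$$q \approx 1 + \frac{\mu}{\lambda}, \tag{41}$$

$$r \approx \frac{\beta}{\alpha} - \frac{\mu\beta}{\alpha\lambda}, \tag{42}$$

and it follows that q − 1 ≈ μ/λ, 1 − r ≈ λ/α, and α(q − r) ≈ λ.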
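A quick numerical check of approximations of this kind, sketched in Python. It computes the exact roots of αx² − (α + β + μ)x + β = 0 and compares q − 1 with μ/λ, 1 − r with λ/α, and α(q − r) with λ. The values of α, β, and μ are hypothetical.

```python
import math

def roots_q_r(alpha, beta, mu):
    """Exact roots q > 1 > r of alpha*x**2 - (alpha + beta + mu)*x + beta = 0."""
    b = alpha + beta + mu
    disc = math.sqrt(b * b - 4.0 * alpha * beta)
    return (b + disc) / (2.0 * alpha), (b - disc) / (2.0 * alpha)

# Hypothetical values with mu much smaller than alpha and beta.
alpha, beta, mu = 9.0, 8.8, 1e-6
lam = alpha - beta
q, r = roots_q_r(alpha, beta, mu)

print(f"q - 1       = {q - 1.0:.6e}   mu/lambda    = {mu / lam:.6e}")
print(f"1 - r       = {1.0 - r:.6e}   lambda/alpha = {lam / alpha:.6e}")
print(f"alpha(q-r)  = {alpha * (q - r):.6e}   lambda       = {lam:.6e}")
```

The relative errors are of order μβ/λ², so the approximations degrade when the net growth rate λ is very small compared with the square root of μβ.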

8 Hazard functions H_k(t) probabilistic approach

Using (19) with the formula for G(t) given in (16)

If we differentiate (43) we get

Theorem 4.

and differentiating gives

Proof.

Using (19) with the formula for G(t) given in (16)

We integrate by parts with f(s) = s and g′(s) equal to the fraction under the integral. The boundary term vanishes since f(s)g(s) = 0 when s = 0 and s = t. Now change variables u = t − s.

Proof of (23).

If x is small then x ≈ 1 − e^{−x}. Using this with we have

Using (18) q − r ≈ λ, and q − 1 ≈ μ₂/λ, so the above is proving the desired result.

Theorem 5.

Differentiating with respect to t

Proof.

Using (19) with the formula for G(t) given in (16)

We integrate by parts with f(s) = s²/2 and g′(s) equal to the fraction inside the integral. The boundary term vanishes since f(s)g(s) = 0 when s = 0 and s = t. Changing variables u = t − s gives the formula for H₄(t).

9 Conclusions

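Here we have taken a probabilistic approach to analyze multi-stage models of cancer incidence. This leads to an intuitive proof of a simple and general formula for the distribution of the waiting time T_k for the first type k to appear:

$$P(T_k > t) \approx \exp\left(-\int_0^t \eta_{k-1}(s)\,\frac{\lambda}{\alpha}\,\bigl(1 - G(t-s)\bigr)\,ds\right), \tag{48}$$

where η_{k−1}(s) = Nμ₀μ₁⋯μ_{k−2}s^{k−2}/(k − 2)! is the rate at which type k − 1 mutations are produced at time s, λ/α is the probability that a type k − 1 is successful, i.e., does not die out, and 1 − G(t − s) is the probability that a successful type k − 1 born at time s produces a malignant cell by time t.

Differentiating (48) we can get a formula for the hazard rate h_k(t) = −(d/dt) log P(T_k > t). To do this it is convenient to change variables u = t − s and write Γ_{k−1} = Nμ₀μ₁⋯μ_{k−2}, so that

$$-\log P(T_k > t) \approx \Gamma_{k-1}\,\frac{\lambda}{\alpha}\int_0^t \frac{(t-u)^{k-2}}{(k-2)!}\,\bigl(1 - G(u)\bigr)\,du.$$

In the case k = 2 we have (t − u)^{k−2}/(k − 2)! ≡ 1, so there is no t in the integrand and the derivative is

$$h_2(t) \approx \Gamma_1\,\frac{\lambda}{\alpha}\,\bigl(1 - G(t)\bigr).$$

When k ≥ 3 we have a positive power of t − u, so differentiating the upper limit does not contribute and the derivative is

$$h_k(t) \approx \Gamma_{k-1}\,\frac{\lambda}{\alpha}\int_0^t \frac{(t-u)^{k-3}}{(k-3)!}\,\bigl(1 - G(u)\bigr)\,du.$$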

We have verified that our new formulas are almost exactly the same as the traditional ones for the k-stage model.

In Sections 4, 5, and 6 we considered four concrete applications that have been analyzed in the literature. In the case of pancreatic cancer, three and four stage models gave similar fits. See Figure 3. In the case of colon cancer, one gets almost identical fits from k-stage models with k = 2, 3, 4, 5. The parameter values for those fits (see Table 3) indicate how this is possible. The two-stage fits have very small mutation rates, while in the four and five stage fits, one or two mutation rates take large values. The pancreatic and colon cancer examples suggest that fitting k-stage models has little power to estimate the number of stages, but that power might be restored by constraining the parameters to take on “realistic” values.

Footnotes

* This research was done while TL was participating in the Huang Fellows Summer Program at Duke.
† RD is partially supported by NSF grant DMS 1614838 from the math biology program.

References

[1] Armitage, P. (1953) A note on the time-homogeneous birth process. J. Royal Statistical Society, B. 15, 90
[2] Armitage, P., and Doll, R. (1954) The age distribution of cancer and a multi-stage theory of carcinogenesis. British J. Cancer. 8, 1–12
[3] Armitage, P., and Doll, R. (1957) A two-stage theory of carcinogenesis in relation to the age-distribution of human cancers. British J. Cancer. 11, 161–169
[4] Bailey, P. et al. (2016) Genomic analyses identify molecular subtypes of pancreatic cancer. Nature. 531, 47–65
[5] Durrett, R. (2012) Essentials of Stochastic Processes. Springer, New York
[6] Durrett, R. (2015) Branching Process Models of Cancer. Springer, New York
[7] Durrett, R., and Moseley, S. (2010) Evolution of resistance and progression to disease during clonal expansion of cancer. Theor. Pop. Biol. 77, 42–48
[8] Fearon, E.R., and Vogelstein, B. (1990) A genetic model for colorectal tumorigenesis. Cell. 61, 759–767
[9] Frank, S.A. (2007) Dynamics of Cancer: Incidence, Inheritance and Evolution. Princeton U. Press
[10] Fisher, J.C., and Hollomon, J.H. (1951) A hypothesis for the origin of cancer foci. Cancer. 4, 916–918
[11] Garraway, L.A., and Lander, E.S. (2013) Lessons from the cancer genome. Cell. 153, 17–37
[12] Heidenreich, W.F., Luebeck, E.G., and Moolgavkar, S.H. (1997) Some properties of the hazard function of the two-mutation clonal expansion model. Risk Analysis. 17, 391–399
[13] Hoogenveen, R.T., Clewell, H.J., Andersen, M.E., and Slob, W. (1999) An alternative exact solution of the two-stage clonal growth model of cancer. Risk Analysis. 19, 9–14
[14] Knudson, A.G. (1971) Mutation and cancer: Statistical study of retinoblastoma. Proc. Natl. Acad. Sci. 68, 820–823
[15] Knudson, A.G. (2001) Two genetic hits (more or less) to cancer. Nature Reviews Cancer. 1, 157–162
[16] Luebeck, E.G., and Moolgavkar, S.H. (2002) Multistage carcinogenesis and the incidence of colorectal cancer. Proc. Natl. Acad. Sci. 99, 15095–15100
[17] Moolgavkar, S.H., Meza, R., and Turim, J. (2009) Pleural and peritoneal mesothelioma in SEER: age effects and temporal trends, 1973–2005. Cancer Causes Control. 20, 935–944
[18] Meza, R., and Chang, J.T. (2015) Multistage carcinogenesis and the incidence of thyroid cancer in the US by sex, race, stage and histology. BMC Public Health. 15, paper 789
[19] Meza, R., Jeon, J., Moolgavkar, S.H., and Luebeck, E.G. (2008) Age-specific incidence of cancer: Phases, transitions, and biological implications. Proc. Natl. Acad. Sci. 105, 16284–16289
[20] Nordling, C.O. (1953) A new theory on cancer inducing mechanism. Brit. J. Cancer. 7, 68–72
[21] The Cancer Genome Atlas Network. (2012) Comprehensive molecular characterization of human colon and rectal cancer. Nature. 487, 330–337
[22] The Cancer Genome Atlas Research Network. (2014) Integrated genomic characterization of papillary thyroid carcinoma. Cell. 159, 676–690
[23] Tomasetti, C., Marchionni, L., Nowak, M.A., Parmigiani, G., and Vogelstein, B. (2015) Only three driver gene mutations are required for the development of lung and colorectal cancers. Proc. Natl. Acad. Sci. 112, 118–123
[24] Vogelstein, B., and Kinzler, K.W. (1998) The Genetic Basis of Human Cancer. McGraw Hill
[25] Vogelstein, B., et al. (2013) Cancer genome landscapes. Science. 339, 1546–1558

Posted January 05, 2018.

Subject Area

Cancer Biology
