Abstract
In rapidly growing bacteria initiation of DNA replication occurs at intervals shorter than the time required for completing genome duplication, leading to overlapping rounds of replication. We propose a mathematical model of DNA replication defined by the periodicity of replication initiation. Our model predicts that a steeper gradient of the replication profile is to be expected in origin proximal regions due to the overlapping rounds of synthesis. By comparing our model with experimental data from a strain with an additional replication origin, we predict defined alterations to replication parameters: (i) a reduced fork velocity when there were twice as many forks as normal; (ii) a slower fork speed if forks move in a direction opposite to normal, in line with head-on replication-transcription collisions being a major obstacle for fork progression; (iii) slower cell doubling for a double origin strain compared to wild-type cells; and (iv) potentially an earlier initiation of replication at the ectopic origin than at the natural origin, which, however, does not a˙ect the overall time required to complete synthesis.
Introduction
Eukaryotic DNA replication is highly regulated to ensure that replication origins are activated only once during each synthesis phase (Blow and Gillespie, 2008). In contrast, bacteria can, under favorable conditions, increase their growth rate by permitting overlapping rounds of replication (Nordstrom and Dasgupta, 2006). Concurrent rounds of replication have also been shown to occur in some archaea with multiple replication origins (Hawkins et al, 2013). The majority of bacteria, including Escherichia coli, have one circular chromosome and a single replication origin, termed oriC (Duggin et al, 2008; Rocha, 2008). Two replication forks are recruited at oriC and proceed bidirectionally until they meet in an area opposite the origin. This area is flanked by multiple ter sites, which, if the Tus terminator protein is bound, forms a replication fork trap that allows forks to enter but not to leave (Rocha, 2008). This divides the E. coli chromosome into two defined halves or replichores. Progression of forks within each replichore is not necessarily smooth. Replication forks may be blocked or displaced from the chromosome when they encounter obstacles such as DNA lesions, proteins bound to DNA or transcribing RNA polymerase complexes. The majority of highly transcribed genes are oriented co-directionally with replication fork movement, indicating that head-on collisions of replication and transcription complexes might be particularly problematic (Brewer, 1988; McLean et al, 1998; McGlynn et al, 2012).
Cells undergoing rapid growth may initiate multiple, concurrent rounds of DNA replication such that each cell may contain two or more origins at the time when septation is complete (Fig 1A). This allows cells to grow with a doubling time less than that required to copy the whole genome (Nordstrom and Dasgupta, 2006). The relative replication time of chromosomal loci can be determined experimentally by measuring the locus copy number in replicating cells relative to non-replicating cells; an approach termed marker frequency analysis. Recent technical advances have allowed genome-wide application of marker frequency analysis using microarrays or deep sequencing. Such experimental measures of genome replication can identify deviations from the predicted marker frequency that indicate perturbations to genome replication, such as DNA damage (Sangurdekar et al, 2010), chromosomal fragmentation (Kouzminova and Kuzminov, 2008), regions of variable fork velocity (DeMassy et al, 1987; Srivatsan et al, 2010), chromosomal rearrangements (Skovgaard et al, 2011), or pathological consequences of replication termination (Rudolph et al, 2013). However, in order to fully analyse the consequences of changes to replication dynamics in cells with distortions of DNA replication, we need a robust understanding of the parameters that define the experimentally obtained replication profiles.
A model for DNA replication during the bacterial division cycle was proposed by Cooper and Helmstetter (Cooper and Helmstetter, 1968). In this model, the average frequency of a particular locus per cell (e.g. a marker) can be predicted by taking into account the duration of chromosome replication, the time from the completion of chromosome duplication to complete cell division, the cell doubling time and distance between the loci and the origin. The model assumes that replication is initiated at every doubling in cell mass. A number of mathematical models have been formulated to describe other biological processes related to or influencing replication: simulation of gene expression in exponentially growing bacterial cells (Luo et al, 2013; Gomez et al, 2014), computation of theoretical DNA histograms for flow cytometry (Stokke et al, 2012), simulation of the bacterial cell division cycle (Zaritsky et al, 2011), the average cell oscillations in DnaAATP/DNA (Grant et al, 2011), or even a wide variety of 128 models that make di˙erent assumptions about the regulation of replication (Creutziger et al, 2012), but none of these studies considered the genome-wide parameters of replication.
In the present work, we have developed a new mathematical model for concurrent rounds of DNA replication. The model derivation is based on a spatiotemporal representation of the replication program (Retkute et al, 2012; Baker and Bechhoefer, 2014; Retkute, 2014). We test the proposed model on experimental data from the bacterium E.coli (Guzman et al, 2002; Riber et al, 2006; Skovgaard et al, 2011; Rudolph et al, 2013; Maduike et al, 2014). We found that a steeper gradient in replication profile is to be expected in the origin proximal region due to overlapping cell cycles. Furthermore, we extend this model for the example of DNA replication in E.coli with two origins (Wang et al, 2011; Rudolph et al, 2013) to gain valuable biological insight into this perturbed system. We found that fork velocity was reduced during times when there were twice as many forks as normal and synthesis was even slower for forks moving in a direction opposite to the normal direction of replication. As a consequence, a double origin strain grows slower compared to cells with a single origin. Finally, the ectopic origin may start ahead of the natural origin, but this has no e˙ect on the time required to complete synthesis.
Results
Modelling concurrent rounds of DNA replication
The initiation of DNA replication in fast growing bacteria occurs prior to the completion of the previous round of replication resulting in concurrent rounds of synthesis. Therefore, we define a parameter s as the periodicity of replication initiation, i.e. the time between two concurrent rounds of replication normalised by the time required to replicate the whole chromosome (Fig 1 and Supplementary Fig S1; for details see Materials and Methods). The model is defined in dimensionless variables: the chromosome positions are scaled relative to the distance between oriC and ter; while time, periodicity and di˙erence in initiations between origins, are scaled by the time required to replicate half of the chromosome. Assuming that the replication fork velocity is constant from initiation to termination, the periodicity can describe both temporal coordination of replication initiation and relative location of forks along the chromosome.
Figure 1 shows three representations of bacterial chromosome replication: schematics of chromosome replication configuration, spatiotemporal representation, and predicted marker frequency profiles; graphs are shown for two (s = 0.8, left panel) and three concurrent rounds of replication (s = 0.4, right panel). The marker frequency at a position x is the average number of copies in a population. The origin proximal regions have the highest copy number, and sites at −1 + s and 1 − s have exactly two copies, as these are the positions on the genome where forks from the subsequent round of replication are located when forks from the current round reach the ter site.
Replication in wild-type E. coli
Our mathematical model predicts that replication profiles from fast growing bacteria will have at least one change in the slope, with the steepest replication profile gradient in the origin proximal regions. This is in contrast to other models which predict a continuous variation in slope approximating to an exponential decay (Cooper and Helmstetter, 1968). To test whether this model prediction is accurate, we fitted the model to 5 experimental data-sets (see Materials and Methods; Fig 2). Using only a single parameter (s), our model was able to give a fit to experimental data comparable or better than the Cooper and Helmstetter model (Cooper and Helmstetter, 1968), which requires knowledge of at least two cell cycle parameters (doubling time and time required to complete synthesis). Parameters for both models were fitted by minimising a mean squared error (Supplementary Table S1).
Replication in E. coliwith two origins
Initiation of DNA synthesis occurring at multiple sites along a chromosome has been studied in E.coli by constructing a strain with two replication origins: the natural origin oriC and an ectopic origin oriZ, inserted 1 Mb away from the normal origin on the right replichore (Wang et al, 2011). Figure 3A shows the marker frequency profile (Rudolph et al, 2013) and the fitted model with assumption that both origins have identical properties. The fit to the experimental data is good, considering the simplicity of the model. However, the model shows some deviations from the marker frequency profile, in particular the peak heights and the slopes in the oriC and oriZ proximal region.
In order to improve the fit, we refined our model to test our assumptions and gain biological insight into DNA replication in this two origin system. We allowed (i) forks to move with di˙erent velocities, and (ii) one of two origins to initiate earlier than the other. Our refined model gives a significantly better fit to the experimental data and provides evidence to support the following perturbations to DNA replication resulting from the addition of the ectopic origin (Fig 3B and C, Supplementary Fig S2). First, the model predicts that replication forks proximal to oriC and oriZ are slower than forks proximal to the termination site. A reduction in speed by 30% gave an significantly improved fit. The spatiotemporal representation indicates that fork velocity reduction occurs during times when there are twice as many forks compared to cells with a single origin (Fig 3C). Following the termination event between oriZ towards oriC the number of replication forks is comparable to that in wild-type cells and this is concomitant with a change to faster fork velocities. Second, the replication fork proceeding from oriZ towards oriC is further reduced in velocity (total 50% reduction relative to forks proximal to the termination site) and it is noteworthy that this fork is moving against the direction of highly transcribed genes. Third, one of the explanation for the di˙erence in peak heights in the oriC and oriZ proximal region (Supplementary Fig S3) may be that initiation at the ectopic origin occurs before initiation at the natural origin (population average δ = 0.06, i.e 6% of the time required to replicate the whole chromosome).
By directly comparing the experimentally measured profile with the profile based on the simple model, we were able to uncovered how cells respond when an extra replication origin is introduced to the chromosome. Most of the deviations can be readily explained by biological features which may be expected − activation of the ectopic origin will require additional replication forks, and some of these forks will move against the direction of highly transcribed genes. Our accurate mathematical model should be useful tool in experiments where perturbations to replication process are not known in advance, for example, in the presence of defects in nucleic acid metabolism.
Discussion
We have formulated a mathematical model that allows interpretation of DNA replication initiation dynamics from marker frequency profiles. We have successfully applied this model to study replication dynamics in wild type E. coli cells. Analysis of this mathematical model shows that the replication profile is composed of a set of varying slopes (replication profile gradients). This has an important biological implication − the change in the gradient in the oriC proximal region is predicted as a consequence of overlapping cell cycles, not as a consequence of variability in replication fork velocity. The increased steepness of the slope of the profile 0.3 Mb on each side of oriC previously has been attributed to either replication forks moving slower over this interval, or some unusual perturbation of replication initiation (Maduike et al, 2014) (Fig 2E). This region coincides with a marker frequency gradient change predicted by the model without the need to invoke a variable fork velocity. The model precisely predicts the coordinates where the slope of replication profile changes: it is (1 − 0.71) × 4.63 Mb/2 = 0.42 Mb. Therefore, our mathematical model is able to explain that a steeper gradient is to be expected due to the overlapping cell cycles without assuming any perturbation of the speed of synthesis. The model can describe replication in other bacteria such as Bacillus subtilis, as shown in Supplementary Fig S4 (Kono et al, 2014).
To maintain constant genomic content at division, the doubling time should be directly proportional to the periodicity of initiation. If the time required to replicate the chromosome, C, is known, then doubling time τ = s×C (where τ and C are in minutes and s is unitless). This means that the periodicity of initiation is what sets the cell division time. The growth rate (doublings per minute) can be calculated as 1/(s × C).
We assumed that there is little variation in the periodicity of origin firing. This implied a discrete location for the change in gradient in replication profiles (Fig 2). If this assumption is relaxed and initiation can occur in a window [s − Δs, s + Δs], replication profiles will have a smoother transition between gradients (Supplementary Fig S5). However, allowing less defined periodicity does not improve the model fit (Supplementary Table S2). This suggests that there is a strict periodicity in replication initiation, which is consistent with known feedback loop that regulates initiation time (Browning et al, 2004).
The comparative analysis of our simple model with the data obtained in a double ori background reveals di˙erences, which allows us to gain insights into the processes that govern DNA replication when replication is perturbed. Any deviation from the model can indicate alterations to replication parameters. By adjusting the model assumptions we can add extra parameters, analyse their influence and uncovered how cells respond to the presence of two origins. The model deviations predict variability of the following parameters: reduced fork velocity, longer chromosome synthesis time, and possible earlier initiation of the ectopic origin.
The overall reduction in fork velocity proximal to oriC and oriZ could be a consequence of limited dNTP concentrations if there are insu˚cient compensatory mechanisms to supply the increased number of forks resulting from the addition of the ectopic origin (Nordman and Wright, 2008). For example, in cells with a single origin undergoing concurrent rounds of replication the total number of replisomes per cell is limited to approximately 10 (Kelman and O’Donnell, 1995). The maximum number of replisomes per genome is 10 in E. coli with two origins (Figure 3D). Therefore, the rapid speed of DNA synthesis may be compromised due to the intracellular scarcity of dNTP molecules.
The further reduction in fork velocity upstream of oriZ is likely to be due to head-on collisions between this replication fork and transcription complexes. There are 36 mRNA promoters with transcriptional start sites encoded on the leading strand in this region (Supplementary Information, Hershberg et al, 2001). The replisome may stall when it encounters a RNA polymerase moving in the opposite direction but eventually resumes elongation after displacing the RNAP from DNA (McGlynn et al, 2012). Similar behaviour was observed when transcription was reversed by inverting a quarter of the B. subtilis chromosome downstream of the origin, this led to 30% decrease of replication rate (Srivatsan et al, 2010). This implies that the site of replication termination between oriC and oriZ is dictated not solely by replication timing and fork velocity, but is modulated by transcription (Rudolph et al, 2013).
We found a slower cell doubling for the double origin strain compared to wild-type cells: 23 min in rapidly growing bacteria with one origin, and 25 min in rapidly growing bacteria with two origins (Supplementary Information). This is in agreement with experimentally measured doubling times in rich media, which were 23 min for the single origin strain and 24 min for the two origin strain, however, the di˙erence observed experimentally is not statistically significant (Wang et al, 2011).
Fitting the modelled replication profile to the experimental data indicated that initiation at the ectopic origin may occur before initiation at the natural origin, which could be a consequence of the di˙erences in accessibility for DnaA binding due to nuclear di˙usion (Berg, 1984) or macromolecular crowding (Akabayov et al, 2013). However, this does not a˙ect the time required to complete synthesis, as the distance travelled by the rightward moving fork initiated at oriZ is about 50% shorter then the distance traveled by the leftward moving fork initiated at oriC (1.25 Mb and 2.31 Mb). So the former reaches ter site earlier, but is blocked by terminator proteins and has to wait in the ter region. Under normal conditions, the left and right replicons are approximately equal in size, so the di˙erence in initiation timing may indicate there is a mechanism which tries to regulate time spend by replisome in the termination zone. Further experimental data is required to verify if this e˙ect is a consequence of the particular growth conditions or bacterial strain.
We expect that our proposed mathematical formulation can be adopted for various DNA replication alterations. Although the model is formulated for a bacterial genome, it is relevant to replication in archaea (Hawkins et al, 2013), the development of genetic instability in tumour cells due to rereplicating DNA segments (Vaziri et al, 2003), plant cell growth during endoreplication cell cycle (Edgar and Orr-Weaver, 2001) or replication in polytene nurse cell from Drosophila (Hammond and Laird, 1985). By focussing on specific and more specialised aspects of chromosome replication, our model can be further refined to gain deeper biological insight into the regulation of DNA replication. With deep sequencing technologies providing cheaper and higher-resolution profiles of whole genome replication, there is an increasing requirement for mathematical models to maximise our biological understanding of these fundemental cellular processes.
Materials and Methods
Marker frequency
We have analysed five experimental data-sets for E. coli with one origin, and one experimental data-set for E. coli with two origins. Details on the data, method and growth conditions are given in Table 1. The chromosome positions were scaled relative the distance between oriC and ter (Fig 2).
Mathematical model for DNA replication in bacteria with one origin
Mathematical model for DNA replication during overlapping cell cycles in bacteria is based on the spatiotemporal representation of the replication program (Retkute et al, 2012; Baker and Bechhoefer, 2014; Retkute, 2014). Replication initiates at replication origin oriC and proceeds bidirectionally around the chromosome to a site opposite oriC called the replication terminus ter (Fig. 1A). For simplicity we virtually linearise the chromosome by putting replication origin at position x = 0, and normalise its length so that the distance between oriC and ter is one arbitrary unit (a.u.). The chromosome is divided into a left replichore at x ∊ [−1, 0] and right replichore at x ∊ [0, 1]. Each replication initiation event creates two replication forks which move with a constant velocity v to the opposite directions away from oriC. Replication timing R(x) is defined as the time at which the locus x is replicated. For a single initiation event the replication timing at the site x is given by (de Moura et al, 2010; Yang et al, 2010):
If we define a time unit one as the time required to complete synthesis of the whole chromosome, and assume that the replication fork velocity is constant from initiation at oriC to termination at ter, we obtain from Eq.(1) that ,therefore v = 1.
We defined the periodicity of initiation, s, as a time interval at which new initiation events occur. Therefore replication timing of kth initiation is Rk(x) = |x| + (k − 1)s. When a new round of replication is initiated, forks from the previous round have moved v × s = s away from oriC, i.e. the distance is exactly s.
Based on the spatiotemporal representation of the replication program we can derive an analytical expression for marker frequency at a position x which depend only on the parameter s. Marker frequency profile for replication case where up to two replication rounds occur is given by the following equation:
Derivation of the model is given in Supplementary Information.
Mathematical model for DNA replication in bacteria with two origin
If there is more then one origin present on the chromosome, then the replication time of a given position in the chromosome is found by evaluating where xi is the positions of the origin i (de Moura et al, 2010; Yang et al, 2010). The marker frequency is calculated from the spatiotemporal representation of the replication program (details in Supplementary Information).
Parameter estimation
Parameters were fitted by minimising a mean squared error between model predicted marker frequency, fi,and experimentally measured marker frequency, di:
Author Contributions
R.R., C.A.N, C.R. and M.H. designed the study. R.R. developed the mathematical model and conducted calculations. R.R., C.A.N, C.R. and M.H. wrote and edited the manuscript.
Conflict of interest
The authors declare that they have no conflict of interest.
Acknowledgements
The authors thank Prof Anders Lobner-Olesen, Prof Ole Skovgaard and Dr Kono Nobuaki for providing experimental data.
Footnotes
Subject categories: DNA Replication, Repair & Recombination