ABSTRACT
The human brain can be represented as a graph in which neural units such as cells or small volumes of tissue are heterogeneously connected to one another through structural or functional links. Brain graphs are parsimonious representations of neural systems that have begun to offer fundamental insights into healthy human cognition, as well as its alteration in disease. A critical open question in network neuroscience lies in how neural units cluster into densely interconnected groups that can provide the coordinated activity that is characteristic of perception, action, and adaptive behaviors. Tools that have proven particularly useful for addressing this question are community detection approaches, which can be used to identify communities or modules in brain graphs: groups of neural units that are densely interconnected with other units in their own group but sparsely interconnected with units in other groups. In this paper, we describe a common community detection algorithm known as modularity maximization, and we detail its applications to brain graphs constructed from neuroimaging data. We pay particular attention to important algorithmic considerations, especially in recent extensions of these techniques to graphs that evolve in time. After recounting a few fundamental insights that these techniques have provided into brain function, we highlight potential avenues of methodological advancements for future studies seeking to better characterize the patterns of coordinated activity in the brain that accompany human behavior. This tutorial provides a naive reader with an introduction to theoretical considerations pertinent to the generation of brain graphs, an understanding of modularity maximization for community detection, a resource of statistical measures that can be used to characterize community structure, and an appreciation of the utility of these approaches in uncovering behaviorally-relevant network dynamics in neuroimaging data.
The brain is a complex system composed of neural units that often communicate with one another in spatially intricate and temporally dynamic patterns (Alivisatos et al., 2012). Modern neuroscience seeks to understand how these patterns of neural communication reflect thought, accompany cognition, and drive behavior (Bressler and Menon, 2010). Many conceptual theories and computational methods have been developed to offer mechanisms and rules by which heterogeneous interaction patterns between neural units might produce behavior. A particularly appropriate mathematical language in which to couch these theories and methods is network science. In its simplest form, network science summarizes a system by isolating its component parts (nodes) and their pairwise interactions (edges) in a graph. The application of network science to neuroscience (also known as network neuroscience (Bassett and Sporns, 2017)) has offered intuitions for the fundamental principles of organization and function in the brain (Bullmore and Sporns, 2009). In particular, network modularity has proven useful in identifying neural units that are structurally or functionally connected to one another in clusters or modules (Meunier et al., 2010). Intuitively, modularity is an architectural design feature that allows neurophysiological processes to implement local integration of information. To assess the presence and strength of modularity in the brain, we first build graphs from neural data and then apply community detection techniques to identify modules and to characterize their structure and function. In this tutorial, we introduce community detection and its application to neuroimaging data, including approaches to define graph nodes and edges from diverse data sources. 
We then describe methods and summary statistics to identify and characterize community structure in single graphs, as well as some methods and summary statistics to identify and characterize community structure in time-evolving graphs. Finally, we discuss methodological innovations that are needed to drive the future of network neuroscience, enabling critical advancements in our knowledge about how patterns of coordinated activity in the brain account for human behavior.
Modularity in mind and brain
Before describing computational methods to detect network communities, we first motivate the investigation of modularity in the brain by drawing on historical observations in philosophy, psychology, and neuroanatomy. The concept of modularity was central in Greek philosophy, and is illustrated by Plato’s famous passage stating, “That of dividing things again by classes, where the natural joints are, and not trying to break any part” (Plato, Phaedrus, section 265e). This intuitive notion of modularity strongly influenced the precursors of contemporary psychology, as perhaps best illustrated by Franz Josef Gall in the 18th century, whose work was predicated on the notion of morphometric modularity. Skull landmarks and aberrations in cranial morphology were used to identify cognitive modules, leading to the pseudo-science of phrenology (Gall, 1835). Although modern psychology, cognitive science, and neuroscience have replaced phrenology, the modular account of brain function remains a principal feature of cognitive theory (Fodor, 1983; Bilder et al., 2009; Price and Friston, 2005). Indeed, a key focus of modern neuroimaging lies in understanding the computational specificity of brain regions that serve as the building blocks for brain modules, which in turn give rise to human thought and behavior (Poldrack, 2010).
Modularity offers fundamental advantages for evolution and development. Research across the biological sciences suggests that modular organization allows for rapid adaptation (Kashtan and Alon, 2005; Gross and Blasius, 2008) and provides robustness to either sudden or gradual perturbations in genes or environment (Kirschner and Gerhart, 1998; Kashtan and Alon, 2005; Kashtan et al., 2007). Unlike homogeneously connected networks, modular networks can effectively buffer the impact of perturbations by keeping their effects relatively local, sometimes even to the point of constraining them to remain within the boundaries of a single community (Nematzadeh et al., 2014). This segregation of the system also enables efficient information processing (Espinosa-Soto and Wagner, 2010; Baldassano and Bassett, 2016), supporting functional specialization (Gallos et al., 2012) and efficient learning (Ellefsen et al., 2015). These benefits of modularity are particularly relevant for the human brain, which evolved under evolutionary pressures for adaptability (Lipson et al., 2002), energy efficiency, and cost minimization (Clune et al., 2013; Bullmore and Sporns, 2012; Raj and Chen, 2011; Chen et al., 2006; Betzel et al., 2017b), and which also develops under biological pressures to balance segregation and integration of function (Bullmore and Sporns, 2012).
From a systems perspective, modularity can also play a role in shaping neural activity. Compared to random networks, modular networks can give rise to more complex dynamics (Sporns et al., 2000). Modular networks of coupled oscillators also promote synchronizability (Arenas et al., 2006) as well as the formation of chimera states, characterized by the coexistence of synchronized and desynchronized neural elements (Wildie and Shanahan, 2012). In systems like the brain that have a large number of interacting elements, modularity often exists across hierarchical levels of organization (Meunier et al., 2009b), enabling rapid responses to fluctuating external input (Kinouchi and Copelli, 2006; Beggs, 2008; Moretti and Muñoz, 2013). As its core advantage for brain function, hierarchical modularity has been shown to enable complex dynamics alongside functional efficiency in both biological and man-made systems (Simon, 1962; Kaiser and Hilgetag, 2010). Collectively, the functional benefits of modularity provide a strong motivation for studying the modular organization of the human brain in both healthy and clinical populations (Fig. 1).
A language for probing modularity in the brain
Exactly how modularity is instantiated in the brain is a question that has fascinated neuroscientists for more than a century. In reality, the answer depends on what spatial scale of organization is considered: single neurons (y Cajal, 1954), larger neural ensembles (Hubel and Wiesel, 1962), or the whole brain (Hagmann et al., 2008). Even at a single spatial scale, the answer could depend on the neuroimaging modality used to measure ongoing dynamics, such as local field potentials from small groups of neurons in ECoG (Garell et al., 1998), ensemble electrical activity from estimated neural sources in EEG (Berger, 1929), or indirect energy consumption of regions from blood oxygenated level-dependent (BOLD) data in fMRI (Logothetis and Wandell, 2004). Each spatial scale and measurement technique can offer different insights into how modularity influences brain function.
Ideally, to fully understand modularity in the brain, one might seek a language that can describe and mathematically quantify the grouping of neural units in a way that is agnostic to many of the biological details that differ across these scales and measurements. Network science offers exactly such a language. In its simplest form, network science distills a system into its component parts and their interactions, and then represents them in the form of a graph. In broad terms, graphs can be used to represent relationships (edges) between objects or processing units (nodes). Across both manmade and natural systems, observed graphs frequently display appreciable heterogeneity that is critical for system function and that can be extracted, quantified, and explained using graph-based tools (Fortunato, 2010). Intuitively, graph representations reduce the natural complexity characteristic of neuroimaging data, while seeking to maintain the most important organizational features of that data.
Building brain graphs
A brain graph represents neuroimaging data in a manner that is agnostic to its spatial scale and measurement modality (Fig. 2). In its general form, a brain graph is a set of nodes characterizing anatomical, functional, or computational units, and a set of edges that represent a pairwise relation between two nodes. The flexibility of this representation allows one to distill the brain into the pieces that are most fundamental to the function under study or the hypothesis being tested (Butts, 2009). The graph representation also allows for the notion of relation to differ across graphs by changing the way in which edges are defined. In brain graphs, for example, edge definitions can range from direct structural connections between nodes (Yeh et al., 2016; Muldoon et al., 2016; Vettel et al., in press; Kahn et al., 2017) to higher order pairwise relations between anatomical or functional units (Betzel and Bassett, 2016; Bassett et al., 2014; Davison et al., 2015, 2016; Giusti et al., 2016; Schmälzle et al., 2017). Irrespective of one’s choice of how to define nodes and edges, it is critical to ensure that the interpretations that are made from the graph are consistent with the spatial and temporal truths about those choices (Power et al., 2011; Butts, 2009; Wig et al., 2011). In the next two sections, we describe some of the common choices for defining nodes and edges, and a few relevant considerations that affect interpretation.
Defining nodes
The most common nodes in brain graphs represent regions delineated by anatomical boundaries or functional contrasts in MRI data. However, nodes have also been defined from other types of neuroimaging data with higher temporal sampling of neural activity, including electrocorticography (ECoG), electroencephalography (EEG), and magnetoencephalography (MEG). More recently, methods have been developed to define nodes by combining data from multiple imaging modalities. Below, we briefly present the specifics of each approach with a focus on how each definition constrains the interpretation of the resulting graph.
The first few MRI studies using graphs to examine human brain function defined brain nodes using anatomical landmarks (Achard et al., 2006): each large-scale brain region was defined by features of cytoarchitecture including cell density, synaptic density, and myelination (Brodmann, 1909; Von Economo and Koskinas, 1925). While a reliable relationship between cytoarchitecture and network architecture has been identified (Beul et al., 2015), research has also productively used sulcal and gyral landmarks to delineate brain regions in several popular parcellations, including the Lausanne (Hagmann et al., 2008), Desikan (Desikan et al., 2006), and Destrieux (Destrieux et al., 2010) atlases. These anatomically-defined nodes provide insight into how the variation in cellular structure across regions can give rise to efficient brain function.
Brain regions can also be defined based on the boundaries of functional activation to a task (Yeo et al., 2011). Functionally-defined nodes have been used to study changes in brain graphs due to different experimental conditions, variation in brain graphs across individuals that map on to differences in task performance, and differences in brain graphs between healthy and clinical groups (Poldrack, 2007). One key benefit of this approach is that it includes task-specific functional characteristics in the definition of a node. However, it is also important to note that this specificity could also limit the generalizability of resultant findings (Friston et al., 2006; Saxe et al., 2006); by definition, these parcellations discretize the cortex based on one biologically plausible rule (e.g., activation in response to a single task), but ignore other potentially important biologically plausible rules (e.g., activation in response to other tasks, anatomy, connectivity, or spatial constraints). To overcome this limitation, functional atlases (Cohen et al., 2008; Nelson et al., 2010) can be constructed that incorporate additional functional characteristics in the definition of a node by collating activation data over many tasks (Dosenbach et al., 2006) or by identifying regions of the brain that tend to be activated independently (Power et al., 2011).
While MRI data is most commonly used to construct brain graphs, nodes can also be estimated from neuroimaging techniques that capture synchronized activity of neuronal ensembles at a finer temporal scale (Passaro et al., 2017; Gordon et al., 2015; Lau et al., 2012). ECoG measures local field potentials from electrodes implanted directly on cortical tissue. Graph analyses based on these data have addressed — for example — the characteristics of normative brain state transitions (Khambhati et al., 2017a) and their alteration during seizures (Burns et al., 2014; Khambhati et al., 2016). However, since ECoG is an invasive procedure that requires a craniotomy, data is limited to patients who are undergoing brain surgery for clinical treatment (Wang et al., 2016; Cervenka et al., 2013). Fortunately, both EEG and MEG also capture neural dynamics at the millisecond time scale and only require non-invasive sensors on or near the scalp (Garcia et al., 2013; Brooks et al., 2016). One of the earliest uses of graph statistics to understand human brain function capitalized on the temporal resolution of MEG to examine empirical evidence for small-world organization (Bassett and Bullmore, 2006, 2016) in healthy adults (Stam, 2004). Despite the behaviorally-relevant time scales of EEG and MEG (Garcia et al., 2017), these techniques have the disadvantage that the signals themselves represent a combination of signals from cortical and subcortical sources and are susceptible to artifacts from muscle contractions, head movements, and environmental noise (Vindiola et al., 2014). Consequently, neuroimaging signals recorded at the scalp are often reduced to their component sources (for review, Grech et al., 2008; Sakkalis, 2011), and the sources are then used as nodes in the brain graph (Smit et al., 2008; Muraskin et al., 2016).
Recent innovations in defining nodes for brain graphs include unimodal approaches that capitalize on fine-scale graph structure, as well as multimodal approaches that combine anatomical and/or functional imaging data. In the former category, one new method using MRI data defines nodes at the regional level based on small-scale connectivity at the voxel level in an approach called connectivity-based parcellation (Wiegell et al., 2003; Behrens and Johansen-Berg, 2005). Most commonly, the process begins with voxel-level estimates of structural connectivity from diffusion MRI (Vettel et al., in press), and then applies a clustering technique to extract modules of densely interconnected voxels; each module is then treated as a node in the brain graph (Eickhoff et al., 2015). In the latter category, some methods use multiple functional neuroimaging measurements to define nodes (Brown et al., 2012). In simultaneous fMRI and EEG recordings, nodes can be jointly defined from MRI activity (dependent on a slow, indirect neural response that evolves over 16-20 seconds) and EEG activity (dependent on fast, synaptic activity that adapts on shorter 100-500 ms epochs), identifying nodes that lie at the intersection of these measurements (Muraskin et al., 2017). By combining the strengths across multimodal imaging techniques, brain nodes can represent fundamental brain processes that are independent from the limitations of particular measurement techniques.
Defining edges
Commonly, graph edges reflect anatomical connectivity or synchronized functional activity measured by MRI, ECoG, EEG, or MEG. Less commonly, hyperedges can be used to reflect groups of nodes in a hypergraph. Below, we briefly present the specifics of each approach with a focus on how each edge definition constrains the interpretation of the resultant graph.
In a brain graph of disparate regions, the most straightforward edge definition relates to the structural connections between two regions. These structural connections include synapses between neurons at the cellular level, and bundles of axonal fibers at the level of large-scale brain regions. In humans, structural connections are commonly estimated from diffusion MRI (dMRI) which measures water diffusion in the brain by relying on the clever insight that the presence of an axonal bundle in a voxel will restrict the movement of water molecules to align with the direction of the bundle’s principal spatial axis (for review, see Assaf and Pasternak, 2008; Vettel et al., in press). Once the fiber directions are reconstructed within a voxel, trajectories of axonal bundles can be modeled using either deterministic or probabilistic tractography methods (Mori and van Zijl, 2002; Behrens et al., 2007). The complete map of fiber pathways crisscrossing the cortex can be used as edges in the resultant brain graph to identify specific connections that enable efficient and rapid communication between regions (Bassett et al., 2011a).
A second common definition of an edge in brain graphs relates to coordinated activity between regions as estimated by functional MRI (Honey et al., 2009; Hermundstad et al., 2013, 2014; Goñi et al., 2014). Interregional similarities are often quantified using time series methods such as a correlation, coherence, phase lag index, or a measure of synchronization (Bastos and Schoffelen, 2015; He et al., 2011). A complementary set of measures known as effective connectivity methods estimate causal relations (rather than similarities) between regions (Friston, 2011). Generally, functional connectivity is thought to reflect patterns of interregional communication that underlie cognition (Fries, 2005, 2015; Gilbert and Sigman, 2007). Yet, it is important to note that two regions could display strong similarities in their time series if they were driven by a third source, perhaps even located outside of the body such as an environmental factor (Blinowska et al., 2013; Gramann et al., 2011). Care therefore must be taken to ensure that interpretations of the brain network dynamics are aligned with the fact that the observed signals can be driven by many factors.
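To make the simplest of these similarity measures concrete, the following sketch computes a Pearson correlation between two hypothetical regional time series; the function name `pearson_edge` and the toy data are ours, chosen purely for illustration, and real pipelines would of course operate on full BOLD time series.

```python
from math import sqrt

def pearson_edge(x, y):
    """Pearson correlation between two regional time series.
    In a functional brain graph, this coefficient (or its absolute
    value) is one common choice of edge weight."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Two perfectly correlated series yield the maximal edge weight.
x = [0.1, 0.5, 0.3, 0.9]
y = [0.2, 1.0, 0.6, 1.8]  # y = 2x, so r = 1
print(round(pearson_edge(x, y), 6))  # → 1.0
```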
Brain graph edges can also be defined to reflect frequency-specific relations reflecting inter-regional coupling based on synchronized oscillations measured with ECoG, EEG, and MEG. Several frequency bands have been robustly linked to functional roles in cognitive tasks (Buzsaki, 2006; Bressler and Kelso, 2001), including δ (1-4Hz), θ (4-7Hz), α (8-12Hz), β (12-32Hz), and γ (32Hz and greater). Low frequency activity is often associated with global processing and is thought to reflect long-range coordination and synchronization (Fries et al., 2001). In contrast, high frequency activity is often associated with local processing and is thought to reflect the transient coordination of inhibitory and excitatory neighboring neurons (Brunel and Wang, 2003; Geisler et al., 2005). Importantly, global dynamics carried by low frequency activity can interact with local dynamics driven by high frequency activity (for review, see Buzsáki and Draguhn, 2004; Buzsáki and Wang, 2012), suggesting the utility of defining edges based on cross-frequency coupling, as was recently done in (Siebenhuhner et al., 2013).
A promising avenue for studying relations among edges defined either on the same or different graphs is to build on the notion of a hypergraph, a mathematical object that can be used to formalize the idea that groups of edges — rather than single edges alone — can form a fundamental unit of interest (Bassett et al., 2014; Davison et al., 2015; Gu et al., 2017). This approach is partially motivated by evidence suggesting that edges can develop differentially in a coordinated fashion over the lifespan (Davison et al., 2016), leading to architectural features that cannot simply be defined by graphs composed of dyads (Bassett et al., 2014). Such developmental coordination of functional connections might be driven by intrinsic computations (Bassett et al., 2014), and subsequently have mutually trophic effects on underlying structural connectivity (Bassett et al., 2008). Co-varying functional connections in early life could support the emergence of cognitive systems observed in adulthood (Gu et al., 2017). Hypergraphs can formalize these relationships, and thereby offer a unique perspective on brain graph architecture.
Putting it all together
The graph representation is flexible enough to allow one to choose a definition of a node, and a definition of an edge, that best enable one to test one’s hypothesis. Once defined, the graph is formalized as an adjacency matrix A, whose element Aij represents the weight of the edge linking nodes i and j. The type of edge weights indicates the type of graph. In a binary graph, elements take values of 0 or 1, indicating whether or not an edge exists, while in a weighted graph, elements take non-binary values that reflect the strength of the pairwise connection. Graphs are further differentiated by the directionality of the connections. If the edge weight between every node pair is symmetric, the graph is called undirected and Aij = Aji for all (i, j); the graph is called directed otherwise. Once an adjacency matrix has been constructed, we are ready to evaluate community structure of the underlying network dynamics in our brain graph of interest.
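These distinctions can be made concrete in a few lines of code. In this minimal sketch (all names and values are hypothetical), a small weighted, undirected graph is stored as an adjacency matrix, its symmetry is checked, and a binary graph is derived from it by thresholding — a common, though lossy, practice.

```python
# Toy weighted, undirected brain graph on 4 nodes, stored as an
# adjacency matrix A whose element A[i][j] is the edge weight.
A = [
    [0.0, 0.8, 0.6, 0.1],
    [0.8, 0.0, 0.7, 0.0],
    [0.6, 0.7, 0.0, 0.2],
    [0.1, 0.0, 0.2, 0.0],
]

def is_undirected(A):
    """True if A[i][j] == A[j][i] for all node pairs (i, j)."""
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))

def binarize(A, threshold):
    """Binary graph: keep an edge (value 1) only if its weight
    exceeds the threshold; otherwise no edge (value 0)."""
    return [[1 if w > threshold else 0 for w in row] for row in A]

print(is_undirected(A))     # → True
print(binarize(A, 0.5)[0])  # → [0, 1, 1, 0]
```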
Evaluating community structure in brain graphs
The overarching goal of community detection is to provide an understanding of how nodes are joined together into tightly knit groups, or modules, and whether such groups are organized into natural hierarchies in the graph. Community structure exists in a variety of real world systems including several social, biological, and political systems (Fortunato, 2010), and community detection methods can be used to uncover that structure algorithmically. Recent applications of these methods to real-world systems have uncovered segregated committees in the US House of Representatives (Porter et al., 2005), segregated protein classes in protein-protein interaction networks (Chen and Yuan, 2006), and segregated functional groups of areas in brain graphs (Bassett et al., 2011b). Uncovering community structure can provide important intuition about the system’s function, and the large-scale functional units that drive the system’s most salient processes (Girvan and Newman, 2002).
Mathematics of modularity maximization
Many methods exist for community detection (Fortunato and Hric, 2016). Some draw on notions in physics such as the Potts model (Reichardt and Bornholdt, 2004), while others draw on notions in mathematics such as random walks (Zhou and Lipowsky, 2004). Still others more closely track other concepts and techniques in computer science and engineering. In this review, we will primarily discuss a single method – modularity maximization – due to its frequent use in the network neuroscience community. However, readers interested in understanding various other algorithms and approaches may enjoy several recent reviews (Fortunato, 2010; Porter et al., 2009). After describing the method in its simplest instantiation, we will define several common metrics that can be extracted from the estimated community structure to provide insight into the system’s organization. We will then turn to reviewing recent extensions of modularity maximization to time-varying graphs and discuss appropriate null model networks for statistical inference.
Modularity maximization refers to the maximization of a modularity quality function, whose output is a hard partition of a graph’s nodes into communities. The most common modularity quality function studied in network neuroscience to date is

Q = \sum_{ij} \left[ A_{ij} - \gamma P_{ij} \right] \delta(C_i, C_j),

where Aij is the ijth element of the adjacency matrix, node i is assigned to community Ci, and node j is assigned to community Cj. The Kronecker delta δ(Ci, Cj) is 1 if i and j are in the same community and zero otherwise, γ is called a structural resolution parameter (Bassett et al., 2013), and Pij is the expected weight of the edge connecting nodes i and j under a null model. A common null model is the Newman-Girvan null model, which is given by Pij = kikj/2m, where ki is the degree of node i and m is the total edge weight of the graph (Newman and Girvan, 2004). For a discussion of alternative null models, see (Bassett et al., 2013).
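For intuition, the quality function can be evaluated directly on a toy graph. The sketch below (function and variable names are ours, not drawn from any software package) computes an unnormalized Q with the Newman-Girvan null model; some conventions additionally divide the sum by 2m, which rescales Q but does not change which partition is best.

```python
def modularity(A, labels, gamma=1.0):
    """Q = sum_ij [A_ij - gamma * k_i * k_j / (2m)] * delta(C_i, C_j),
    using the Newman-Girvan null model P_ij = k_i * k_j / (2m).
    Unnormalized; some conventions divide the result by 2m."""
    n = len(A)
    k = [sum(row) for row in A]  # node degrees (strengths)
    two_m = sum(k)               # twice the total edge weight
    return sum(A[i][j] - gamma * k[i] * k[j] / two_m
               for i in range(n) for j in range(n)
               if labels[i] == labels[j])

# A 4-node path graph 0-1-2-3 with two natural communities {0,1}, {2,3}.
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
print(round(modularity(A, [0, 0, 1, 1]), 6))  # → 1.0
print(round(abs(modularity(A, [0, 0, 0, 0])), 6))  # grouping all nodes gains nothing
```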
The structural resolution parameter, γ, is often set to unity for simplicity. However, due to a well-known resolution limit (Reichardt and Bornholdt, 2004), this choice will tend to produce a fixed number of communities, even if a stronger community structure could be identified at smaller or larger topological scales. To deal with this limitation, it is common to vary γ over a wide range of values. The benefit of such a parameter sweep is that it can also uncover hierarchical organization in the graph: robust community structure across several topological scales (Porter et al., 2005). Some graphs contain a single scale (or several discrete scales) at which community structure is present. For these graphs, it has been suggested that a useful method by which to identify that scale(s) is to search for γ values at which all partitions estimated (from multiple runs of the modularity maximization algorithm) are statistically similar (Bassett et al., 2013).
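The effect of the structural resolution parameter can be demonstrated by brute force on a graph small enough that every partition can be enumerated — an approach that is, of course, infeasible for real brain graphs and is shown here purely for intuition (all names are ours).

```python
def all_partitions(nodes):
    """Enumerate every partition of a node list (tractable only for tiny graphs)."""
    if not nodes:
        yield []
        return
    first, rest = nodes[0], nodes[1:]
    for part in all_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [part[i] + [first]] + part[i + 1:]
        yield [[first]] + part

def modularity(A, labels, gamma):
    """Unnormalized Q with the Newman-Girvan null model."""
    n = len(A)
    k = [sum(row) for row in A]
    two_m = sum(k)
    return sum(A[i][j] - gamma * k[i] * k[j] / two_m
               for i in range(n) for j in range(n) if labels[i] == labels[j])

def to_labels(part, n):
    """Convert a list-of-groups partition to a per-node label list."""
    labels = [0] * n
    for c, group in enumerate(part):
        for node in group:
            labels[node] = c
    return labels

# Path graph 0-1-2-3: small gamma merges everything into one community,
# large gamma fragments the graph into singletons.
A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
n_communities = {}
for gamma in (0.5, 1.0, 4.0):
    best = max(all_partitions(list(range(4))),
               key=lambda p: modularity(A, to_labels(p, 4), gamma))
    n_communities[gamma] = len(best)
print(n_communities)  # → {0.5: 1, 1.0: 2, 4.0: 4}
```

Sweeping γ over a range of values and tracking the number (and stability) of communities in this way is the small-scale analogue of the parameter sweep described above.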
It is worth noting that the maximization of the modularity quality function defined above is NP-hard. Because exact maximization is therefore computationally out of reach, various heuristics have been devised to maximize (or nearly maximize) Q without resorting to an exhaustive search of all possible partitions, which for most real-world graphs proves to be intractable (Porter et al., 2009). Heuristics vary in terms of their relative speed, fidelity, and appropriateness for large versus small graphs. Greedy algorithms tend to be relatively swift (Clauset et al., 2004), while simulated annealing (Guimera and Amaral, 2005), extremal optimization (Duch and Arenas, 2005), and others (Noack and Rotta, 2009) can be slower yet provide quite stable partitions. With most heuristics, one should perform the optimization many times in order to create an ensemble of partitions, and both understand and report the variability in those solutions.
The modularity landscape is rough, containing many near degeneracies (Good et al., 2010). This means that there are many structurally diverse alternative partitions of nodes into communities with modularity values very close to the optimum. Near degeneracy is particularly prevalent in large binary graphs, and less prevalent in small weighted graphs. Degeneracy becomes especially problematic when the partitions identified by multiple optimizations of the modularity quality function are dissimilar. In these cases, we might wish to identify a single representative partition from the set of partitions observed. One common approach to identify a consensus community structure is similarity maximization (Doron et al., 2012), where the partition of interest is that which has the greatest similarity to all other observed partitions. A second common approach is an association-recluster method (Lancichinetti and Fortunato, 2012; Bassett et al., 2013; Betzel and Bassett, 2016), which uses a clustering algorithm to find a consensus partition by exploiting the fact that across an ensemble of partitions, a single node may be affiliated with the same other nodes. Partition degeneracy can also be addressed by expressing the best partition as an average across multiple near-optimal partitions, and by treating the community allegiance of nodes as fuzzy variables (Bellec et al., 2010) or via probabilistic clustering (Hinne et al., 2015).
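The association-recluster family of consensus methods begins from a module allegiance (or association) matrix computed over an ensemble of partitions. A minimal sketch, using three hypothetical optimization runs (the function name is ours; consensus methods then recluster this matrix):

```python
def allegiance_matrix(partitions):
    """Module allegiance: the fraction of partitions in which each pair
    of nodes is assigned to the same community. Reclustering this matrix
    is the basis of association-recluster consensus methods."""
    n = len(partitions[0])
    P = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    P[i][j] += 1.0
    runs = len(partitions)
    return [[v / runs for v in row] for row in P]

# Three hypothetical runs of a modularity maximization heuristic.
runs = [[0, 0, 1, 1],
        [0, 0, 0, 1],
        [1, 1, 0, 0]]
P = allegiance_matrix(runs)
print(P[0][1])           # nodes 0 and 1 co-assigned in every run → 1.0
print(round(P[1][2], 4)) # co-assigned in one of three runs → 0.3333
```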
Summarizing community structure in brain graphs
Topological summary statistics
Several summary statistics that can be derived from community detection methods are reported in neuroimaging studies. Many of these can be defined based on the network’s topology, independent of any embedding of that network into a physical space (Fig. 3). We begin by denoting G = (V, A) to be a complex network of N nodes, where V = {1, …, N} is the node set and A ∈ ℝ^{N×N} is the adjacency matrix whose elements Aij give the weight of the edge between node i and node j. A community structure is a partition C = {C_1, …, C_K}, where Ci ⊂ V consists of the nodes in the ith community and K is the number of communities in G. Here we only consider non-overlapping community structure, which means that Ci ∩ Cj = ∅ if i ≠ j.
Number of communities
The number of communities provides an indication of the scale of community structure in a network. Note that N_{C_k} = |C_k| is the number of nodes in module Ck. A large number of communities suggests a small scale of structure in the network, while a small number of communities suggests a large scale of structure in the network.
Size of communities
The average size of communities and the distribution of community sizes are also useful diagnostics of community structure. The number of nodes N divided by the number of communities K gives the mean community size, N/K.
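Both diagnostics are trivial to compute once a hard partition is in hand; a short sketch using a hypothetical assignment of nine nodes to three communities:

```python
from collections import Counter

# Hypothetical community assignment of 9 nodes to 3 communities.
labels = [0, 0, 0, 1, 1, 2, 2, 2, 2]

sizes = Counter(labels)      # number of nodes per community
K = len(sizes)               # number of communities
mean_size = len(labels) / K  # mean community size, N / K

print(K, sorted(sizes.values()), mean_size)  # → 3 [2, 3, 4] 3.0
```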
Modularity quality index
For community structure identified with modularity-based approaches, the modularity quality index Q serves as a useful measure of the quality of the partition of nodes into communities. To some degree, higher values indicate more optimal partitions than lower values, after accounting for caveats such as the roughness of the modularity landscape (Good et al., 2010), the size of the graph, and the edge weight distribution, among potentially other confounds.
Within- and between-module connectivity
It is also of interest to calculate the strength of edges inside of modules, and the strength of edges between modules. We refer to these notions as within- and between-module connectivity, respectively, and define

I_{C_{k1}, C_{k2}} = \frac{\sum_{i \in C_{k1}, j \in C_{k2}} A_{ij}}{N_{C_{k1}} N_{C_{k2}}}

to be the interaction strength between module Ck1 and module Ck2. When the two modules are identical (k1 = k2), this measure amounts to the average strength of that module, and we interpret it as the recruitment of the module. When the two modules are different (k1 ≠ k2), we might also wish to compute the relative interaction strength

\tilde{I}_{C_{k1}, C_{k2}} = \frac{I_{C_{k1}, C_{k2}}}{\sqrt{I_{C_{k1}, C_{k1}} I_{C_{k2}, C_{k2}}}}

to account for statistical differences in module size. Following (Bassett et al., 2015), we interpret this interaction strength as the integration between modules.
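A minimal sketch of the interaction strength, applied to a hypothetical weighted graph with two modules (the function name and data are ours, for illustration only):

```python
def interaction_strength(A, labels, k1, k2):
    """I_{C_k1, C_k2}: total edge weight between communities k1 and k2,
    normalized by the product of their sizes. k1 == k2 gives the
    recruitment of a module; k1 != k2 gives the integration between two."""
    nodes1 = [i for i, c in enumerate(labels) if c == k1]
    nodes2 = [i for i, c in enumerate(labels) if c == k2]
    total = sum(A[i][j] for i in nodes1 for j in nodes2)
    return total / (len(nodes1) * len(nodes2))

# Hypothetical weighted graph with two modules {0, 1} and {2, 3}.
A = [[0.0, 0.8, 0.6, 0.1],
     [0.8, 0.0, 0.7, 0.0],
     [0.6, 0.7, 0.0, 0.2],
     [0.1, 0.0, 0.2, 0.0]]
labels = [0, 0, 1, 1]
print(round(interaction_strength(A, labels, 0, 0), 6))  # recruitment of module 0 → 0.4
print(round(interaction_strength(A, labels, 0, 1), 6))  # integration of modules 0 and 1 → 0.35
```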
Intra-module strength z-score
One might also wish to quantify how well connected a node is to other nodes in its community, a notion that is formalized in the intra-module strength z-score (Guimera and Amaral, 2005):

zi = (SCi − ⟨SCi⟩) / σ(SCi),

where SCi denotes the strength (i.e., total edge weight) of node i’s edges to other nodes in its own community Ci, ⟨SCi⟩ is the mean of SCi over all of the nodes in Ci, and σ(SCi) is the standard deviation of SCi over Ci. This statistic was recently applied to brain graphs to study category learning (Soto et al., 2016).
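This z-score can be sketched as follows; the convention of returning zero for communities whose within-module strengths have no variance is our own choice.

```python
import numpy as np

def intra_module_zscore(A, labels):
    """z_i = (S_Ci - mean(S_Ci)) / std(S_Ci), computed within each community."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    z = np.zeros(len(labels))
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        s = A[np.ix_(idx, idx)].sum(axis=1)   # within-community strength per node
        sd = s.std()
        z[idx] = (s - s.mean()) / sd if sd > 0 else 0.0
    return z

# a 3-node path treated as one community: the middle node is best connected
z = intra_module_zscore([[0, 1, 0], [1, 0, 1], [0, 1, 0]], [0, 0, 0])
```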
Participation coefficient
One might also wish to measure how the connections emanating from a node are distributed among the different communities, a notion that is formalized in the participation coefficient (Guimera and Amaral, 2005):

Pi = 1 − Σk (SiCk / Si)²,

where SiCk is the strength of edges from node i to nodes in community Ck, and Si is the total strength of node i. In (Soto et al., 2016), this statistic was used to better understand how learning is impacted by patterns of inter-modular connectivity.
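The participation coefficient can be sketched directly from this formula; the implementation below assumes every node has nonzero strength.

```python
import numpy as np

def participation_coefficient(A, labels):
    """P_i = 1 - sum_k (S_iCk / S_i)^2 (Guimera & Amaral, 2005)."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    S = A.sum(axis=1)                       # total strength of each node
    P = np.ones(len(labels))
    for c in np.unique(labels):
        Sc = A[:, labels == c].sum(axis=1)  # strength to community c
        P -= (Sc / S) ** 2
    return P

# nodes 0 and 2 split their edges evenly between the two communities
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
P = participation_coefficient(A, [0, 0, 1, 1])
```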
Spatial summary statistics
It is often interesting to quantify how a network is embedded into physical space (Fig. 3), and specifically the spatial properties of communities. Currently, relatively few measures exist and future work should focus on this important area. Below we present five measures previously proposed to quantify the spatial aspects of community structure.
Community average pairwise spatial distance
The community average pairwise spatial distance lCk is the average Euclidean distance between all pairs of nodes within a community (Feldt Muldoon et al., 2013):

lCk = (1 / (NCk(NCk − 1))) Σi≠j∈Ck ∥ri − rj∥,

where ri is the position vector of node i. The average pairwise spatial distance of the entire network is given by the same equation calculated over all nodes within the network.
Community spatial diameter
The community spatial diameter dCk is defined as the maximum Euclidean distance between pairs of nodes within a community (Feldt Muldoon et al., 2013):

dCk = maxi,j∈Ck ∥ri − rj∥.
The spatial diameter of the entire network is given by the same equation, but calculated over all nodes within the network.
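Both spatial measures above follow directly from the pairwise distance matrix of a community's node coordinates, as in this minimal sketch:

```python
import numpy as np

def spatial_distance_and_diameter(coords):
    """Average pairwise Euclidean distance and spatial diameter of one
    community, given an array of node coordinates (one row per node)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))    # full pairwise distance matrix
    iu = np.triu_indices(len(coords), k=1)   # count each pair once
    return D[iu].mean(), D[iu].max()

# corners of a unit square: the diameter is the diagonal, sqrt(2)
l_avg, d_max = spatial_distance_and_diameter([[0, 0], [1, 0], [0, 1], [1, 1]])
```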
Community spatial extent
The spatial extent of a community is an inverse estimate of the density of a community: it quantifies the area or volume spanned by the community, normalized by the number of nodes within the community (Feldt Muldoon et al., 2013). Specifically, we can define the spatial extent as Vh / NCk, where Vh is the volume (in 3 dimensions) or area (in 2 dimensions) of the region bounded by the convex hull of the nodes within the community. The convex hull is the minimal convex set containing all of the points within the community, informally described as the polygon created by connecting the points that define the perimeter of the community. It should be noted that in this definition of the spatial extent, the normalization assumes that the average size of a region is approximately constant. If this is not the case, the equation could be modified to take the boundaries or sizes of individual regions into account to better estimate this inverse measure of density.
Community radius
We can define the community radius ρCk as the length of the vector of standard deviations of the positions of all nodes in the community (Lohse et al., 2014):

ρCk = ∥σ({ri}i∈Ck)∥,

where σ(·) returns the component-wise standard deviation of the node position vectors.
The average community radius of the entire network is a dimensionless quantity that expresses the average relationship between individual community radii and the radius of the network as a whole:

ρ = (1 / (N R)) Σk NCk ρCk,

where NCk serves to weight every community by the number of nodes it contains, and R is a normalization constant equal to the radius of the entire network.
Community laterality
Laterality is a property that can be applied to any network in which each node can be assigned to one of two categories, J1 and J2, and describes the extent to which a community localizes to one category or the other.
For an individual community Ck, the laterality ΛCk is defined as (Doron et al., 2012):

ΛCk = |NJ1 − NJ2| / (NJ1 + NJ2),

where NJ1 and NJ2 are the number of nodes of the community located in each category, respectively. The value of ΛCk ranges between zero (i.e., the nodes in the community are evenly distributed between the two categories) and unity (i.e., all nodes in the community are located in a single category).
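The per-community laterality can be sketched in a few lines; the encoding of the two categories as 0 and 1 (e.g., left and right hemisphere) is our own convention.

```python
def community_laterality(categories):
    """Lambda_Ck = |N_J1 - N_J2| / (N_J1 + N_J2), where `categories` holds a
    binary label (e.g. 0 = left hemisphere, 1 = right) for each node of one
    community."""
    n1 = sum(1 for c in categories if c == 0)
    n2 = len(categories) - n1
    return abs(n1 - n2) / (n1 + n2)
```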
The laterality of a given partition C of a network is defined as:

Λ = (1/N) Σk NCk ΛCk − ⟨Λ⟩,

where ⟨Λ⟩ denotes the expectation value of the laterality under a null model in which nodes are randomly reassigned to the two categories while the total number of nodes in each category is kept fixed.
Strength and significance of communities
When reporting values for either topological or spatial diagnostics, it is important to consider potential sources of error or variation that would inform the confidence in the measured values. For example, there may be error in the estimated weights of individual edges in the network, either from errors in the images themselves, or errors in the statistical estimates of structural or functional connectivity from those images. There may also be variance associated with multiple estimates of a network, either from different subjects, or from the same subject at different instances in time or in different brain states. In each case, it is useful to discuss the potential errors or sources of variance contributing to the estimated diagnostics of community structure, and to quantify them where possible.
In addition to accurately describing the potential sources of error in one’s data, it can also be useful to explicitly measure the significance of a given community structure. In this section, we describe two notions that can be used to quantify the strength and significance of communities. (Note that in this section, we use a few variable names that have been defined differently in earlier sections, largely to remain consistent with the traditional use of these variable names in their relevant subfields.)
Normalized persistence probability
The persistence probability is a measure of the strength of a community in a graph with salient community structure (Piccardi, 2011). Given an adjacency matrix A, we construct an N-state Markov chain with transition matrix P by row-normalizing A. Specifically, the transition probability from node i to node j is given by

Pij = Aij / Σj′ Aij′.
Under some mild conditions, there exists a unique equilibrium distribution π∈ℝN that satisfies π=πP. Roughly speaking, this implies that if an individual takes a random walk on V with transition probabilities given by P, then — after some sufficiently long period of time — the probability that the individual is on the ith node is πi regardless of where the individual started.
Now, given P and a distribution π on V, we can construct a K-state Markov chain with transition matrix

Q = [diag(πH)]⁻¹ Hᵀ diag(π) P H,

where H is an N × K binary matrix encoding the partition C; that is, Hnk indicates whether the nth node is in the kth community. We call this K-state Markov chain a lumped Markov chain. One can check that Π = πH is an equilibrium distribution of the lumped Markov chain, satisfying Π = ΠQ, and therefore the lumped Markov chain can be treated as an approximation of the transitions between communities in the original Markov chain. The expected escape time from Ck is τk = (1 − qkk)⁻¹, which implies that if the individual is currently in Ck, then on average it will take τk jumps for the individual to reach another community. The persistence probability of the kth community is therefore defined as qkk; the larger this value, the longer the expected escape time, and the more significant the community.
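This construction can be sketched compactly with matrix operations. The sketch below assumes an undirected, connected graph, for which the stationary distribution of the random walk is proportional to node strength.

```python
import numpy as np

def persistence_probabilities(A, labels):
    """Diagonal entries q_kk of the lumped Markov chain built from the
    row-normalized adjacency matrix (Piccardi, 2011)."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    P = A / A.sum(axis=1, keepdims=True)                  # N-state transition matrix
    pi = A.sum(axis=1) / A.sum()                          # stationary distribution
    coms = np.unique(labels)
    H = (labels[:, None] == coms[None, :]).astype(float)  # N x K partition indicator
    Q = np.diag(1.0 / (pi @ H)) @ H.T @ np.diag(pi) @ P @ H
    return np.diag(Q)

# two triangles joined by a single bridge edge (between nodes 2 and 3)
tri = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
A = np.block([[tri, np.zeros((3, 3))], [np.zeros((3, 3)), tri]])
A[2, 3] = A[3, 2] = 1.0
q = persistence_probabilities(A, [0, 0, 0, 1, 1, 1])
```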
In practical applications, the persistence probability is influenced by the size of the community: larger communities tend to have larger persistence probabilities. Importantly, this fact can bias empirical results for graphs whose community size distribution is relatively broad. To address this limitation, we can normalize the persistence probability as follows:

q̃kk = qkk − (NCk − 1)/(N − 1).

The kth community is significant if q̃kk > 0. Intuitively, this normalization assumes that the graph is fully connected and that the weights of all edges are equal; in that case, the persistence probability of the kth community is (NCk − 1)/(N − 1). Whenever a community has a persistence probability that is larger than some threshold α, we refer to it as an α-community. If all communities are α-communities, we call the entire partition an α-partition.
Statistical comparison to a permutation-based null model
Given a community structure C, we can compute the contribution of each community to the modularity quality index as follows:

Qk = Σi,j∈Ck (Aij − γPij),

where as before γ is the structural resolution parameter, A is the adjacency matrix, and P is a null model adjacency matrix. Intuitively, Qk measures how strong the kth community is, and it is interesting to ask whether it is stronger than expected under an appropriate null model.
To address this question, we can algorithmically generate a community structure C(π) that has exactly the same number of communities and the same number of nodes in each corresponding community as C, simply by permuting the order of the nodes in V. We use this permutation-based approach to construct an ensemble of partitions, and for each partition we calculate Qk(π). We can then define the normalized contribution

Q̃k = (Qk − ⟨Qk(π)⟩) / σ(Qk(π)),

where ⟨·⟩ and σ(·) denote the mean and standard deviation over the permutation ensemble. The quantity Q̃k provides information about how strong the community is in comparison to what is expected under a permutation-based null model (Gu et al., 2017).
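A minimal sketch of this permutation test follows. The z-score normalization against the permutation ensemble is one reasonable choice, not necessarily the exact form used by Gu et al. (2017), and the configuration-model null for P is likewise an assumption.

```python
import numpy as np

def community_contributions(A, labels, gamma=1.0):
    """Q_k = sum over i, j in C_k of (A_ij - gamma * P_ij), with the
    configuration-model null P_ij = s_i * s_j / 2m."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    s = A.sum(axis=1)
    B = A - gamma * np.outer(s, s) / s.sum()
    return np.array([B[np.ix_(labels == c, labels == c)].sum()
                     for c in np.unique(labels)])

def permutation_zscores(A, labels, n_perm=200, seed=0):
    """Score each Q_k against an ensemble of label permutations that preserve
    the number and sizes of communities."""
    rng = np.random.default_rng(seed)
    Qk = community_contributions(A, labels)
    null = np.array([community_contributions(A, rng.permutation(labels))
                     for _ in range(n_perm)])
    return (Qk - null.mean(axis=0)) / null.std(axis=0)

tri = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
A = np.block([[tri, np.zeros((3, 3))], [np.zeros((3, 3)), tri]])
z = permutation_zscores(A, [0, 0, 0, 1, 1, 1])
```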
Modularity maximization for temporal graphs
The methods described above can be applied to a single graph, or separately to each graph in a graph ensemble. However, in the study of neural function and its relation to cognition, or its change with age and disease, we often have a set of graphs ordered in time (Fig. 4). In this case, it is useful to consider methods for modularity maximization in temporal graphs — sets of graphs ordered from the earliest time to the latest (Sizemore and Bassett, 2017). A recent generalization of modularity maximization to graphs with L layers is given by the multilayer modularity quality function (Mucha et al., 2010):

Qmultilayer = (1/2μ) Σijlr [ (Aijl − γl Pijl) δlr + δij ωjlr ] δ(gil, gjr),

where the adjacency matrix of layer l has elements Aijl, the null model matrix of layer l has elements Pijl, γl is the structural resolution parameter of layer l, ωjlr is the temporal resolution parameter giving the strength of the inter-layer link between node j in layer l and node j in layer r, gil is the community assignment of node i in layer l, 2μ is the total edge weight of the multilayer network, and δ is the Kronecker delta. Small values of the temporal resolution parameter result in greater independence of partitions across neighboring layers, and large values result in greater dependence of partitions across neighboring layers. Note that ω can vary from 0 to infinity.
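For a fixed set of partitions, the multilayer quality function can be evaluated directly. The sketch below assumes the common special case of ordinal (temporal) coupling with a uniform ω linking each node to itself in adjacent layers, and a configuration-model null within each layer.

```python
import numpy as np

def multilayer_modularity(A_layers, partitions, gamma=1.0, omega=1.0):
    """Mucha et al. (2010) quality function for ordinal inter-layer coupling."""
    L, N = len(A_layers), A_layers[0].shape[0]
    # 2*mu: total intra-layer weight plus total inter-layer coupling weight
    two_mu = sum(A.sum() for A in A_layers) + 2.0 * omega * N * (L - 1)
    Q = 0.0
    for l, A in enumerate(A_layers):                 # intra-layer terms
        s = A.sum(axis=1)
        P = np.outer(s, s) / s.sum()
        same = np.equal.outer(partitions[l], partitions[l])
        Q += (A - gamma * P)[same].sum()
    for l in range(L - 1):                           # inter-layer terms
        stay = np.asarray(partitions[l]) == np.asarray(partitions[l + 1])
        Q += 2.0 * omega * stay.sum()
    return Q / two_mu

tri = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
A = np.block([[tri, np.zeros((3, 3))], [np.zeros((3, 3)), tri]])
Q = multilayer_modularity([A, A], [[0, 0, 0, 1, 1, 1]] * 2)
```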
Determining appropriate choices for the values of the structural (γ) and temporal (ω) resolution parameters is an important enterprise. In some cases, one might have information about the system under study that would dictate the number of communities expected, or their relative size, or their relative variation over time. However, if such information is not available for the system under study, then one must turn to data-driven methods to obtain values for γ and ω that most accurately reflect the spatial and temporal scales of community structure within the data. Several heuristics have been suggested in the literature, including (i) comparison to statistical null models (which we will describe in a later section) (Bassett et al., 2013),(ii) identifying a point in the γ-ω plane where the set of partitions obtained from multiple maximizations of the multilayer modularity quality function are statistically similar (Chai et al., 2016), or (iii) identifying the point in the γ-ω plane where the dynamic community structure displays certain features (Telesford et al., 2016).
Topological summary statistics for dynamic community structure
Several summary statistics are frequently reported to characterize dynamic community structure in empirical studies. A few particularly simple statistics include (i) the mean and temporal variance of the number of communities, (ii) the mean and temporal variance of the size of communities, and (iii) the multilayer modularity quality index Qmultilayer. In addition to these simple statistics — which have their counterparts in the single-layer case — we can also define several statistics that explicitly capitalize on the temporal nature of the data.
Flexibility
The flexibility of a single node i, ξi, is defined as the number of times the node changes community allegiance across network layers, normalized by the number of possible changes (Bassett et al., 2011b). Mathematically,

ξi = gi / (L − 1),

where gi is the number of times node i changes its community assignment. The flexibility of the entire multilayer graph is then given by the mean flexibility over all nodes, Ξ = (1/N) Σi ξi.
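Representing dynamic community structure as an L × N array of labels (layers by nodes), flexibility can be sketched as:

```python
import numpy as np

def flexibility(labels_over_time):
    """Per-node flexibility: number of community switches between consecutive
    layers divided by the L - 1 possible switches."""
    G = np.asarray(labels_over_time)
    changes = (G[1:] != G[:-1]).sum(axis=0)
    return changes / (G.shape[0] - 1)

# node 0 never switches; node 1 switches at every opportunity
xi = flexibility([[0, 0], [0, 1], [0, 0]])
```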
Node disjointedness
Node disjointedness describes how often a node changes communities independently of other nodes. Specifically, we are interested in instances in which a node moves from community s to community k while no other node makes the same move. If node i makes gid such independent changes out of L − 1 possible changes, we define its node disjointedness as (Telesford et al., 2017):

Δi = gid / (L − 1).
Cohesion strength
Node cohesion can be defined as the number of times a node changes communities mutually with another node. Specifically, node cohesion is a pairwise measure expressed as a cohesion matrix M, where the edge weight Mij denotes the number of times the pair of nodes i and j moves to the same community together, divided by the L − 1 possible changes. The cohesion strength of node i is then defined as the total weight of its edges in the cohesion matrix (Telesford et al., 2017):

Θi = Σj Mij.
Promiscuity
The promiscuity ψi of node i is defined as the fraction of all communities in which the node participates at least once across all network layers (Papadopoulos et al., 2016). The network promiscuity Ψ is then the average promiscuity over all nodes, Ψ = (1/N) Σi ψi.
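Using the same L × N label-array representation as for flexibility, promiscuity can be sketched as follows; defining "all communities" as the set of labels observed anywhere in the multilayer partition is our own convention.

```python
import numpy as np

def promiscuity(labels_over_time):
    """psi_i: fraction of all observed communities that node i joins at least
    once across layers; also returns the network promiscuity Psi."""
    G = np.asarray(labels_over_time)
    K = len(np.unique(G))                  # total number of communities observed
    psi = np.array([len(np.unique(G[:, i])) for i in range(G.shape[1])]) / K
    return psi, psi.mean()

# node 0 visits all three communities; node 1 stays in one
psi, Psi = promiscuity([[0, 0], [1, 0], [2, 0]])
```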
Stationarity
To define stationarity, we first write the autocorrelation J(Cl, Cl+m) between a given community at layer l, Cl, and the same community at layer l + m, Cl+m, as

J(Cl, Cl+m) = |Cl ∩ Cl+m| / |Cl ∪ Cl+m|,

where |Cl ∩ Cl+m| is the number of nodes present in the community in both layer l and layer l + m, and |Cl ∪ Cl+m| is the number of nodes present in the community in either layer l or layer l + m (Palla et al., 2005). Then, if li is the layer in which the community first appears and lf is the layer in which it disappears, the stationarity of the community is

ζ = Σl=li…lf−1 J(Cl, Cl+1) / (lf − li).

The stationarity of the entire multilayer network is then given by the mean stationarity over all communities.
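Tracking one community's node set across the consecutive layers in which it exists, its stationarity can be sketched as:

```python
def stationarity(memberships):
    """Mean Jaccard autocorrelation of one community's node set across
    consecutive layers; `memberships` is a list of sets, one per layer,
    from first appearance to disappearance."""
    J = [len(a & b) / len(a | b)
         for a, b in zip(memberships[:-1], memberships[1:])]
    return sum(J) / len(J)

# J = 2/4 between the first pair of layers, then 3/3 between the second
zeta = stationarity([{1, 2, 3}, {2, 3, 4}, {2, 3, 4}])
```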
Statistical Validation and Prediction
After estimating community structure from a single brain graph, or from a multilayer brain graph, one is next faced with the questions of (i) whether and how that community structure is statistically significant, (ii) how to compare community structure in one graph ensemble to community structure in a second graph ensemble, and (iii) how to infer underlying mechanisms driving the observed community structure. Answering these questions requires tools from statistics that are directly informed by network architecture, and tools from generative modeling that can provide insights into possible mechanisms.
The statistical significance of a community structure can only be determined in relation to a defined null model. One of the most common approaches to defining null models for brain graphs is via permutation: for example, the placement or weight of the edges in the true graph can be permuted uniformly at random (Fig. 5). In prior work, this null model has been referred to both as a connectional null model (Bassett et al., 2011b) and as a random edges null model (Sizemore and Bassett, 2017). If the graph is a temporal multilayer brain graph, one could also consider permuting the inter-layer links uniformly at random (sometimes referred to as a nodal null model). One could also consider permuting the order of the layers uniformly at random (sometimes referred to as a temporal null model) (Bassett et al., 2011b). For a discussion of related null models specifically for dynamic graphs, see (Sizemore and Bassett, 2017; Khambhati et al., 2017b).
When graphs are built from functional data, one can also consider null models that are constructed from surrogate time series (Bassett et al., 2013; Khambhati et al., 2017b). Perhaps the simplest surrogate data technique begins by permuting the elements of each time series uniformly at random and then continues by recomputing the measure of functional connectivity between pairs of time series (Theiler et al., 1992). This approach is sometimes referred to as a random shuffle null model. While a fundamental benchmark, this null model is quite lenient, and it is commonly complemented by more stringent tests (Bassett et al., 2013). For example, the Fourier Transform surrogate preserves the linear correlation of the series by randomizing the phases of the time series in Fourier space before applying the inverse transform to return the series to the time domain. A related technique — the Amplitude Adjusted Fourier Transform — works similarly except that it also preserves the amplitude distribution of the original time series (Theiler and Prichard, 1996). For helpful additional discussion of surrogate time series, see (Schreiber and Schmitz, 1996, 2000).
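A minimal sketch of the Fourier Transform surrogate follows; the choices of random number generator and of keeping the DC (and, for even-length series, Nyquist) terms untouched are implementation details we assume here.

```python
import numpy as np

def fourier_surrogate(x, seed=0):
    """Phase-randomized (Fourier Transform) surrogate: keeps the amplitude
    spectrum, and hence the linear autocorrelation, while scrambling phases."""
    rng = np.random.default_rng(seed)
    Xf = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, len(Xf))
    surrogate_f = np.abs(Xf) * np.exp(1j * phases)
    surrogate_f[0] = Xf[0]                 # keep the mean (DC) term untouched
    if len(x) % 2 == 0:
        surrogate_f[-1] = Xf[-1]           # the Nyquist term must remain real
    return np.fft.irfft(surrogate_f, n=len(x))

x = np.random.default_rng(1).normal(size=128)
s = fourier_surrogate(x)
```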
After confirming that the community structure observed in the empirical graph is unlike that observed in either graph-based or time-series-based null models, one might next wish to compare two sets of empirical graphs. Specifically, one might wish to state that the community structure in one graph ensemble (e.g., healthy brains) is significantly different from the community structure in another graph ensemble (e.g., brains from individuals with disorders of mental health). One simple approach would be to use traditional parametric statistics to determine group differences in a summary measure such as those defined in earlier sections of this review. However, such an approach is naive in that it assumes that the distributions of these summary measures are well understood, and that the data do not violate the assumptions of those parametric tests. It is arguably more appropriate to instead consider non-parametric permutation testing, which accounts for the true variation in the empirically observed data.
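A generic group-level permutation test of this kind can be sketched as follows; the difference-in-means test statistic and the conservative (+1) p-value correction are common conventions we assume here.

```python
import numpy as np

def permutation_pvalue(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in group means of a
    scalar network summary statistic (e.g. per-subject modularity Q)."""
    rng = np.random.default_rng(seed)
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        hits += abs(perm[:len(a)].mean() - perm[len(a):].mean()) >= observed
    return (hits + 1) / (n_perm + 1)       # conservative permutation p-value

# clearly separated groups yield a small p-value
p = permutation_pvalue(np.arange(10) + 10.0, np.arange(10.0))
```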
Finally, moving beyond statistical validation, one might also wish to understand the mechanisms by which community structure arises in one’s data of interest, and predict how alterations in those mechanisms could lead to altered community structure. These sorts of topics are particularly important in understanding normative development and aging, and in understanding changes in graph architecture with disease or injury. To begin to build an intuition for possible mechanisms of community structure, it is natural to turn to generative network modeling techniques (Betzel and Bassett, 2017), in which wiring rules are posited and the resultant graph is compared to the empirically observed graphs; if the observed graph displays similar architecture to the modeled graph, then the wiring rule is said to constitute a potential mechanism. Such generative models can be either static or growing models (Klimm et al., 2014), and can be defined either in a deterministic or probabilistic manner (Sporns, 2006). A particularly useful model for mesoscale structure — including but not limited to community structure — is the stochastic blockmodel, which has recently been used in the context of both structural (Betzel et al., 2017a) and functional (Pavlovic et al., 2014) brain graphs (Fig. 6). Importantly, stochastic blockmodels have also recently been extended to multilayer graphs (Stanley et al., 2016), suggesting their potential utility in understanding mechanisms of brain dynamics as well.
Collectively, these statistical approaches provide a rich set of tools to examine the robustness and reliability of brain graphs constructed from neuroimaging techniques sensitive to neural structure and activity across different spatial scales. Furthermore, recent advancements in generative network modeling provide new avenues to examine the mechanisms supporting network modularity that will complement work using community detection to characterize the static and dynamic evolution of these networks.
From modularity in neural systems to behavior
Mounting evidence supports the notion that modularity in brain graphs is important for healthy task-based and resting-state dynamics. Functional network communities correspond to groups of regions that are activated by the performance of specific cognitive and behavioral tasks requiring, for example, perception, action, and emotion (Crossley et al., 2013). Interestingly, evidence suggests that the human brain transitions among functional states that maximize either segregation or integration of communities, and the integrated states are associated with faster and more accurate performance on a cognitive task (Shine et al., 2016; Shine and Poldrack, 2017). Several studies have identified relationships between individual differences in modularity and memory performance (Vatansever et al., 2015; Chen et al., 2016; Alavash et al., 2016; Shine et al., 2016; Finc et al., 2017; Stanley et al., 2014). Changes in global modularity predict effective memory retrieval (Westphal et al., 2017), account for reaction time on correct responses (Vatansever et al., 2015), and relate to individual variability on other measures of behavioral performance (Stanley et al., 2014). Converging evidence from electroencephalography studies further suggests that increased integration is required for successful working memory function (Bola and Borchardt, 2016; Kitzbichler et al., 2011; Zippo et al., 2016).
The relationship between network modularity and performance is also expressed in the resting brain. Global network modularity at rest has been shown to predict inter- and intra-individual differences in memory capacity (Stevens et al., 2012). When network modularity in resting state dynamics decreases following sleep deprivation, it accounts for behavioral performance impairments (Ben Simon et al., 2017). Aging brains typically become less modular at the global scale (Meunier et al., 2009a; Geerligs et al., 2014; Chan et al., 2014), including specific modularity decreases in the executive control network and attention subsystems associated with typical cognitive decline (Betzel et al., 2014). Similarly, increased modularity is associated with improved learning and neuroplasticity. Patients with brain injury (Arnemann et al., 2015) as well as older adults (Gallen et al., 2016) with more modular brain networks at baseline have been shown to exhibit greater improvements following cognitive training.
Importantly, community detection approaches have also revealed the importance of time-evolving changes in modular networks that underlie human behavior. When participants successfully learn a simple motor skill across several days, the community organization and its dynamics change as the skill becomes more automatic (Bassett et al., 2011b, 2015). Motor skill learning is also accompanied by a growing autonomy of the sensorimotor system, and by a disengagement of frontal-cingulate circuitry which predicts individual differences in learning rate (Bassett et al., 2015). Even at much shorter time-scales and over the course of a single session, dynamic community structure can capture changes in task demands and changes in cognitive state (Andric and Hasson, 2015; Godwin et al., 2015; Braun et al., 2015; Betzel et al., 2017c; Cohen and D’Esposito, 2016).
Finally, the importance of modular network organization for healthy brain function is underscored by its alteration in clinical samples (Fornito and Bullmore, 2015). Connectopathy has been documented in patients with several mental health disorders including but not limited to schizophrenia, depression, anxiety, dementia, and autism (Micheloyannis, 2012; Menon, 2011; Yerys et al., 2017). Schizophrenia has been characterized by diminished small-world organization (Micheloyannis et al., 2006; Liu et al., 2008; Rubinov et al., 2009), altered modular organization (Yu et al., 2011; Lerman-Sinkoff and Barch, 2016; Micheloyannis, 2012; Kim et al., 2014; Bassett et al., 2008; Alexander-Bloch et al., 2012), and dysmodularity: an overall increase in both structural and functional connectivity that greatly reduces the anatomical specialization of network activity (David, 1993). Other disorders of mental health, such as depression, have also been documented to exhibit altered network modularity (Lord et al., 2012; Ye et al., 2015; Satterthwaite et al., 2015), and emerging evidence suggests that changes in inter-module connectivity could underlie common reward deficits across both mood and psychotic disorders (Sharma et al., 2017).
Methodological considerations and future directions
There are several methodological considerations that are important to mention in the context of applying modularity maximization techniques to neuroimaging data condensed into brain graphs. Perhaps the most fundamental consideration relates to the notion that one might be able to identify an “optimal” structural or temporal resolution parameter with which to uncover the graph’s most salient community structure. Such a notion presupposes that the graph displays its strongest community structure at a single topological or temporal scale. Yet, in many real-world systems — including brain graphs — modules exist across a range of topological scales from small to large, each contributing in a different manner to the system’s overall function. Moreover, such nested modules might display dynamics over different temporal scales, enabling segregation and integration of computational processes from transient control to long-distance synchrony. Thus, choosing an optimal resolution parameter may be not only difficult but also unfounded, depending on the architecture of the single-layer or multilayer graph under study.
Several approaches have been proposed to address the multi-scale organization of brain graphs (Fig. 8; for a recent review, see (Betzel and Bassett, 2016)). One intuitive solution is to sweep across the topological and temporal scales of the system by incrementally changing the resolution parameters (Fenn et al., 2009). The advantage of this approach is that it allows us to track the stability of partitions across topological scales and to identify robust modules. Nevertheless, the communities in this approach are identified independently at each scale, and thus a secondary algorithm is necessary to reconstruct a continuous community structure across scales. An explicit multi-scale community detection algorithm can be used to address this limitation by allowing simultaneous identification of the community organization across several scales (Mucha et al., 2010). A recent application of this approach to neuroimaging data has uncovered notable topological heterogeneity in the community structure of both structural and functional brain graphs, and in the extent of coupling across these modalities (Ashourvan et al., 2017) (Fig. 9).
In addition to understanding community structure across different scales in a single data modality, it is becoming increasingly important to identify and characterize community structure across different data modalities. The multilayer network formalism, which we described in this review in the particular context of temporal graphs, can also be used to link graphs from different imaging modalities together (Vaiana and Muldoon, 2017; Muldoon and Bassett, 2016). Intuitively, community structure — and the topological or temporal scales at which it is most salient — can differ significantly across imaging modalities. In functional brain graphs estimated over long time scales, the community structure is of neural origin, and thus communities at coarser scales imply higher temporal independence and functional segregation between the communities. By contrast, in structural brain graphs, the community structure can be more reflective of the brain’s spatial organization, constituted by small focal clusters, mesoscale distributed circuits, and gross-scale hemispheres. Since the topological organization of a brain graph can differ across scales in different imaging modalities, it is useful to apply methods that can explicitly compare and contrast community structure across a range of topological and temporal resolutions (Ashourvan et al., 2017).
Advantages and Disadvantages of the Graph Approach
A key advantage of community detection techniques is their relative simplicity. Nevertheless, this same simplicity can challenge mechanistic understanding of the organizational principles that shape emerging real-time dynamics of the system. This challenge is particularly apparent in the interpretation of the value of the modularity quality index: researchers often interpret higher (lower) modularity values as an increase (decrease) in overall segregation (integration) of brain networks. Yet, it is critical to realize that one can change the structure of a network in a host of ways that all lead to comparable changes in the value of the modularity quality index, but that also lead to strikingly different large-scale functional dynamics. Moreover, modularity values themselves depend on the resolution parameter at which they are calculated, and direct comparison between modularity values in two graphs using the same resolution parameter hinges on the assumption that both graphs display “optimal” community structure at the same topological scale. Modularity values are also difficult to compare in two graphs that exhibit community structure at different topological scales, because the resolution parameters used for the calculation of the modularity differ. Thus, in general, the interpretability of the modularity value is quite limited.
More generally, it is important to bear in mind that community detection techniques such as modularity maximization assess one specific type of organization in a graph, and often should be combined with other techniques to examine other types of organization present in the same graph. Specifically, community detection can be used to examine a specific type of meso-scale organization (for others, see (Betzel et al., 2017a)), while other graph measures can be used to examine organization at other scales (Nadakuditi and Newman, 2012). Examples of these other measures include centralities (Barthelemy, 2004), the clustering coefficient (Saramäki et al., 2007), path length (Barabasi and Albert, 2002), and global and local efficiency (Latora and Marchiori, 2001), to name a few. Future work could also use generalizations of these network measures to multilayer data (see Kivelä et al. (2014) for a recent review, and see (Sizemore and Bassett, 2017) for a toolbox for applying those notions to neuroimaging data). Furthermore, these tools may also provide novel avenues for studying the coupling between the time-varying and multi-scale community structure in functional brain graphs and the underlying hierarchical scaffold in structural brain graphs (Ashourvan et al., 2017).
Finally, it is important to note that the most common approach used to construct a brain graph treats brain regions as nodes and inter-regional connections as edges. Although this simple graph model has proven useful in advancing our understanding of the organization of brain networks in health and disease, it suffers from an implicit assumption of node homogeneity. That is, each node is distinguished not by any feature of its own, but by its relation to other nodes. Future work could aim to explore and advance community detection methods for annotated graphs (Newman and Clauset, 2016) in the context of brain networks to account for the heterogeneous function and anatomy of different brain regions (Murphy et al., 2016) (Fig. 10).
Moreover, exploring alternative ways to construct brain networks, such as hypergraphs (Bassett et al., 2014; Gu et al., 2017), and alternative methods to identify community structure, such as link communities (Ahn et al., 2009; de Reus et al., 2014), could offer important and complementary information regarding the organizational principles of brain network architecture.
Conclusion
Here, we have reviewed recent efforts to model brain structure and function using graphs. We focused on describing methods to identify, characterize, and interpret community structure in such graphs, with the goal of better understanding cognitive processes and resulting behavior in health and disease. We began by describing how brain graphs are commonly built, and then we discussed two community detection algorithms based on modularity maximization: one constructed for use on single graphs, and one constructed for use on multilayer graphs. We also offered a collation of summary statistics that can be used to characterize topological features of community structure, spatial features of community structure, and features of dynamic community structure. We closed with a discussion of methodological considerations and future directions, as well as a few comments on the advantages and disadvantages of the graph approach. Our hope is that this review will serve as a useful introduction to the study of community structure in brain graphs, and will spur the development of new tools to more accurately parse and interpret modularity in human brain structure and function.
Acknowledgements
This work was supported by mission funding to the Army Research Laboratory, as well as research executed under contract number W911NF-10-2-0022. D.S.B. and A.A. would also like to acknowledge support from the John D. and Catherine T. MacArthur Foundation, the Alfred P. Sloan Foundation, the Army Research Office through contract number W911NF-14-1-0679, the National Institutes of Health (2-R01-DC-009209-11, 1R01HD086888-01, R01-MH107235, R01-MH107703, R01MH109520, 1R01NS099348, and R21-MH-106799), the Office of Naval Research, and the National Science Foundation (BCS-1441502, CAREER PHY-1554488, BCS-1631550, and CNS-1626008). The content is solely the responsibility of the authors and does not necessarily represent the official views of any of the funding agencies.