### Animal models and experimental design

All animal care and experimental procedures were approved by the Institutional Animal Care and Use Committees of Seoul National University and Kyung Hee University and were performed in accordance with the relevant guidelines and regulations on the care and use of laboratory animals. Thirty-four SHRs and 12 WKYs were housed in laboratory cages under standard conditions (22–24 °C, 12-hour light/dark cycle) with unrestricted access to standard chow and water. All rats underwent brain FDG PET scans twice, at 4 and 6 weeks of age, which represent childhood and the onset of puberty, respectively^{47}. Behaviors, including hyperactivity and impulsivity, were assessed with the MBT at 5 weeks of age and with the OFT and DDT at 8 to 9 weeks of age (Fig. 1). SHRs in the lower quartile of each behavioral test were excluded from the ADHD-model group, which left twelve ADHD-model rats, matching the number of WKY rats. Whole-brain connectivity based on brain metabolic activity was analyzed in four groups: (1) ADHD-model rats at 4 weeks old, (2) ADHD-model rats at 6 weeks old, (3) control rats at 4 weeks old, and (4) control rats at 6 weeks old.

### Behavioral tests

The MBT was performed individually to assess each rat's degree of impulsivity. Rats were acclimated for 15 minutes and then tested for 15 minutes with 15 glass marbles arranged in a 3 × 5 grid. The number of buried marbles was counted after the rats were removed from the cages. A marble was considered buried when 50% or more of it was covered by bedding. The OFT was performed individually to assess hyperactivity by measuring the total distance traveled over 30 minutes. The movement of the rats was tracked by a video camera system installed above the open-field apparatus.

The DDT, which measures intolerance to delay, was performed individually using previously described methods with minor changes^{48,49}. Briefly, after habituation in the animal room without food restriction, followed by 2 consecutive days of food restriction, rats were trained for 5 days on two levers that returned different amounts of food pellets per press. A press on the right lever delivered one food pellet (about 45 mg) immediately (small and immediate reward), whereas a press on the left lever delivered five food pellets later (large and delayed reward). After pellet delivery, a time-out period lasted 20 seconds, during which the light was on. During the 4-day testing phase, the delay for the large reward was increased sequentially over the test days (0, 10, 20, 30, and 40 seconds). Each test took 30 minutes. During the delay-adjustment sessions, feeding was restricted to 5 g of pellets per 100 g of body weight. The DDT score was defined as the mean percentage of choices for the larger reward at delays from 20 to 40 seconds.

### Brain PET scanning and reconstruction

Brain PET images were obtained after overnight fasting using a dedicated small-animal PET/computed tomography (CT) scanner (eXplore VISTA, GE Healthcare, WI). Rats were anesthetized with 2% isoflurane at an oxygen flow of 1.5–2 L/min for 5 minutes before FDG (150–220 MBq/kg) was injected via a tail vein. Rats were then kept awake and rested in a dark room until brain PET/CT scanning. A static brain PET scan was acquired for 20 minutes, 45 minutes after FDG injection. The energy window of PET scanning was 250–700 keV. PET images were reconstructed using the three-dimensional ordered-subsets expectation maximization algorithm with corrections for attenuation, random coincidences, and scatter. The voxel size of the reconstructed PET images was 0.3875 × 0.3875 × 0.775 mm^{3}.

### Preprocessing of brain FDG PET

Voxel size was rescaled by a factor of 10 in each dimension. The rescaled brain PET images were manually realigned to the Schiffer rat brain T1 MRI template in PMOD 2.7 (PMOD group, Zurich, Switzerland)^{50}. Non-linear spatial registration was then applied in Statistical Parametric Mapping (SPM8, University College London, London, UK) using the Schiffer rat brain PET template and a binary brain mask. Global normalization of voxel counts was applied as the last step of preprocessing.

### Brain parcellation and distance matrix computation

Among the 58 predefined ROIs on the Schiffer template, 32 ROIs, including the cortices and subcortical structures, were selected as nodes to construct a brain metabolic network in each group. See the Supplemental Methods and Materials for the details of the 32 ROIs. The Pearson correlation coefficient (*r*_{ij}, *r*_{ij} > 0) between two nodes (*p*_{i}, *p*_{j}) was computed to obtain a positive correlation matrix. The distance (*d*_{ij}) between two nodes (*p*_{i}, *p*_{j}) was defined as follows:

$${d}_{ij}=\sqrt{1-{r}_{ij}}$$
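The distance definition above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' pipeline: the input layout `roi_values[k][i]` (subject *k*, ROI *i*) and the function names are hypothetical, and clipping negative correlations to zero is an assumption, since the text states only that a positive correlation matrix was used.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def distance_matrix(roi_values):
    """Return d[i][j] = sqrt(1 - r_ij) across the subjects of one group.

    roi_values[k][i] is the normalized uptake of ROI i in subject k
    (hypothetical layout, not taken from the paper).
    """
    p = len(roi_values[0])
    cols = [[row[i] for row in roi_values] for i in range(p)]
    d = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1, p):
            # ASSUMPTION: negative correlations are clipped to zero.
            r = max(pearson(cols[i], cols[j]), 0.0)
            # max(..., 0.0) guards against r slightly above 1 from rounding.
            d[i][j] = d[j][i] = math.sqrt(max(1.0 - r, 0.0))
    return d
```

With this definition, perfectly correlated ROIs are at distance 0 and uncorrelated (or clipped) ROIs are at distance 1.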

### Graph filtration

We used a multiscale approach to analyze networks to avoid fixing the threshold of distance^{18}. In this study, we performed graph filtration, which decomposes a weighted network into unweighted networks at all possible thresholds. For each unweighted network, we found the connected components and their number, denoted by *β*_{0}. As the threshold increased, *β*_{0} decreased from the number of nodes in the network to one, as pairs of connected components merged into one. The connected component obtained at the minimum threshold for which *β*_{0} = 1 is called a GCC here. The change of *β*_{0} with respect to the threshold is visualized by the *β*_{0}-curve. The *β*_{0}-curve carries the same information as the barcode of connected components. All bars in the barcode of connected components start from zero, so only the end of each bar, which corresponds to the death of a connected component, carries information about the change of connected structures. The *β*_{0}-curve is obtained by connecting the ends of the bars in the barcode.
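As an illustration of the filtration described above, the following sketch counts *β*_{0} while edges are added in order of increasing distance. It uses a union-find structure, which is one standard way to track connected components and is not necessarily what the authors implemented; the function name `beta0_curve` is hypothetical.

```python
def beta0_curve(d):
    """Return [(threshold, beta0)] pairs for a symmetric distance matrix d.

    The first entry records the p isolated nodes before any merge; each
    later entry is a threshold at which two components merge.
    """
    p = len(d)
    parent = list(range(p))

    def find(i):
        # Follow parent links to the component root (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted((d[i][j], i, j) for i in range(p) for j in range(i + 1, p))
    beta0 = p
    curve = [(0.0, p)]
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:           # this edge joins two different components
            parent[ri] = rj
            beta0 -= 1
            curve.append((w, beta0))
    return curve
```

The last threshold in the returned list is the one at which *β*_{0} first reaches 1, that is, where the GCC forms.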

The threshold of distance when two connected components were merged into a connected component during the graph filtration is called a single linkage distance (SLD) between the two connected components. If the two connected components are denoted by *A* and *B* (\(A\cap B=\varnothing\)), then the SLD between *A* and *B* is defined by

$$d(A,B)=\min_{x\in A,\,y\in B}d(x,y),\ \text{and}\ d(A,B)\ge d({C}_{1},{C}_{2})\ \text{for any}\ {C}_{1},{C}_{2}\subseteq A\ \text{or}\ B.$$

The SLD between two nodes in *A* and *B* is defined by^{51}

$$d(x,y)=d(A,B)\ \text{for all}\ x\in A,\ y\in B.$$

When the number of nodes in a network is *p*, the SLM is a *p*-by-*p* matrix whose elements are the SLDs between pairs of nodes. The *β*_{0} was counted along the filtration to make a dendrogram, which is equivalent to an SLM. An example of counting *β*_{0} and constructing an SLM is provided in Supplemental Fig. S2.
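The SLM construction can be sketched directly from the two definitions above: whenever two components *A* and *B* merge at threshold *w*, every pair (*x*, *y*) with *x* in *A* and *y* in *B* receives the SLD *w*. The function name and data layout below are hypothetical, and this is a small illustrative implementation rather than the authors' code.

```python
def single_linkage_matrix(d):
    """Return the p-by-p single linkage matrix of a symmetric distance matrix d."""
    p = len(d)
    members = {i: [i] for i in range(p)}   # component label -> its nodes
    label = list(range(p))                 # node -> component label
    slm = [[0.0] * p for _ in range(p)]
    edges = sorted((d[i][j], i, j) for i in range(p) for j in range(i + 1, p))
    for w, i, j in edges:
        a, b = label[i], label[j]
        if a == b:
            continue                       # already in the same component
        # d(x, y) = d(A, B) = w for all x in A, y in B.
        for x in members[a]:
            for y in members[b]:
                slm[x][y] = slm[y][x] = w
        # Merge component b into component a.
        for y in members[b]:
            label[y] = a
        members[a].extend(members.pop(b))
    return slm
```

Each SLD in the result is never larger than the corresponding original distance, because the direct edge between two nodes is considered no later than the threshold equal to its own distance.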

### Minimum spanning tree

The minimum spanning tree (MST) is a subnetwork of a weighted network that contains all nodes and only those edges that connect \(\hat{x}\) and \(\hat{y}\) satisfying

$$(\hat{x},\hat{y})=\arg\min_{x\in A,\,y\in B}d(x,y),\ \text{and}\ d(\hat{x},\hat{y})=d(A,B).$$

The MST of a weighted network is the network with the minimum number of edges that has the same SLM as the weighted network^{51}. Therefore, the modular structure of a weighted network is easier to see in its MST.

We evaluated modularization of the reward-motivation system (CP, mPFC, and ACC) and memory system (anterodorsal hippocampus [ADH], posteroventral hippocampus [PVH], RSC, THA, and insula [INS]) during growth by counting the number of direct connections between nodes included in each system^{52}.
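A minimal sketch of these two steps follows, assuming Kruskal's algorithm for the MST (the text does not name an algorithm) and assuming that "direct connections" are counted on MST edges. The function names are hypothetical, and nodes are referred to by integer index; any mapping from ROI names to indices is left to the caller.

```python
def minimum_spanning_tree(d):
    """Kruskal's algorithm: return the MST as a list of (i, j, distance) edges."""
    p = len(d)
    parent = list(range(p))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    tree = []
    edges = sorted((d[i][j], i, j) for i in range(p) for j in range(i + 1, p))
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                 # the shortest edge joining two components
            parent[ri] = rj
            tree.append((i, j, w))
    return tree

def count_within_system(tree, system_nodes):
    """Count MST edges whose two endpoints both belong to system_nodes.

    ASSUMPTION: a 'direct connection' is an edge of the MST.
    """
    nodes = set(system_nodes)
    return sum(1 for i, j, _ in tree if i in nodes and j in nodes)
```

The edges selected here are exactly the merge events of the graph filtration, which is why the MST and the weighted network share the same SLM.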

### Volume entropy, and edge and node capacities

The graph filtration shows how the connected structure of a weighted network changes until a GCC is constructed. To quantify the pattern of information flow in the weighted network that forms a GCC, we used the GMS of volume entropy.

The GMS is the generalization of the Markov chain defined on edges. In our GMS, an edge from *v* to *w* is different from an edge from *w* to *v* for any two nodes *v* and *w* in a network. Therefore, when the number of nodes is *p*, the number of all possible edges is *q* = *p*(*p* − 1). Moreover, the GMS assumes that the sum of edge weights in a weighted network is equal to 2. Then, the edge-transition matrix is defined by

$$L(h)=[{L}_{ef}={a}_{ef}{e}^{-hl(f)}]\in {R}^{q\times q},$$

where *a*_{ef} is 1 if an edge *e* is connected to an edge *f* in the network and 0 otherwise, *h* is a nonnegative constant, and *l*(*f*) is the weight of the edge *f*^{23,53}.

Unlike a random walk in a Markov chain, the state transition by *L*(*h*) runs from step *i* + 1 to step *i* such that

$${z}_{i}=L(h){z}_{i+1},$$

where *z*_{i} is the state of the *q* edges at the *i*th step. When \(z={z}_{i}={z}_{i+1}\), the GMS is stationary; *z*, the eigenvector of *L*(*h*) corresponding to an eigenvalue of 1, is called the stationary state of the *q* edges, and the *h* satisfying \(z=L(h)z\) is called the volume entropy.
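The stationarity condition above says that the volume entropy is the value of *h* at which the largest eigenvalue of *L*(*h*) equals 1. The sketch below finds that *h* by bisection and estimates the eigenvector by power iteration. Several points are assumptions not stated in the text: an edge *e* = (*u*, *v*) is taken to be connected to *f* = (*s*, *t*) when *s* = *v* and *t* ≠ *u* (no immediate backtracking), *z* is scaled to sum to 1, the numerical method is our choice, and all function names are hypothetical. The normalization of the total edge weight is left to the caller.

```python
import math

def edge_transition(weights, h):
    """Build L(h); weights[(v, w)] is the weight l of the directed edge v -> w."""
    edges = sorted(weights)
    L = [[0.0] * len(edges) for _ in edges]
    for a, (u, v) in enumerate(edges):
        for b, (s, t) in enumerate(edges):
            # ASSUMPTION: e is connected to f if f starts where e ends
            # and f is not the reverse of e.
            if s == v and t != u:
                L[a][b] = math.exp(-h * weights[(s, t)])
    return edges, L

def perron(L, iters=300):
    """Largest eigenvalue and eigenvector of a nonnegative matrix L.

    Power iteration is run on L + I, which has the same eigenvectors and
    eigenvalues shifted by 1; the shift avoids oscillation.
    """
    q = len(L)
    z = [1.0 / q] * q
    lam = 0.0
    for _ in range(iters):
        y = [z[a] + sum(L[a][b] * z[b] for b in range(q)) for a in range(q)]
        s = sum(y)             # tends to lam + 1 because sum(z) == 1
        lam = s - 1.0
        z = [v / s for v in y]
    return lam, z

def volume_entropy(weights, tol=1e-10):
    """Return (h, z): h with largest eigenvalue of L(h) equal to 1, and z = L(h) z."""
    lo, hi = 0.0, 1.0
    while perron(edge_transition(weights, hi)[1])[0] > 1.0:
        hi *= 2.0              # the eigenvalue decreases as h grows
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if perron(edge_transition(weights, mid)[1])[0] > 1.0:
            lo = mid
        else:
            hi = mid
    h = 0.5 * (lo + hi)
    edges, L = edge_transition(weights, h)
    _, z = perron(L)
    return h, dict(zip(edges, z))
```

For a complete graph on four nodes with all edge weights equal to 1, each directed edge has two admissible successors under the no-backtracking assumption, so the largest eigenvalue of *L*(*h*) is 2e^{−h} and the sketch returns *h* close to ln 2 with a uniform stationary state.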

The *z* can be represented in matrix form as \(Z=[{z}_{vw}]\), where the (*v*, *w*)th element is the stationary state of the edge from a node *v* to a node *w* in *z*. The *Z* is called an edge capacity matrix. The afferent node capacity of a node *w* is the sum of the *w*th column of *Z*, i.e., \(\sum_{v}{z}_{vw}\), and the efferent node capacity of a node *v* is the sum of the *v*th row of *Z*, i.e., \(\sum_{w}{z}_{vw}\). The node capacity of a node *v* is the difference between its afferent and efferent node capacities, i.e., \(\sum_{w}{z}_{wv}-\sum_{w}{z}_{vw}\)^{23}.
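The three capacities reduce to column sums, row sums, and their difference. A short sketch, with a hypothetical function name and `Z` given as a list of rows where `Z[v][w]` is the stationary state of the edge from *v* to *w*:

```python
def node_capacities(Z):
    """Return (afferent, efferent, net) node capacities from an edge capacity matrix."""
    p = len(Z)
    # Afferent capacity of node w: sum over v of z_vw (column sum).
    afferent = [sum(Z[v][w] for v in range(p)) for w in range(p)]
    # Efferent capacity of node v: sum over w of z_vw (row sum).
    efferent = [sum(Z[v][w] for w in range(p)) for v in range(p)]
    # Node capacity: afferent minus efferent.
    net = [afferent[v] - efferent[v] for v in range(p)]
    return afferent, efferent, net
```

Because every edge contributes once to a column sum and once to a row sum, the node capacities always sum to zero over the network.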

### Statistics

To test for statistical differences in the graph parameters, including SLD, modulation score, volume entropy, global efficiency, edge capacity, and afferent and efferent node capacities, we applied permutation tests, which are non-parametric. In the unpaired permutation tests between the ADHD-model and control groups, individual PET images from the pooled data set were randomly reassigned in each iteration to two pseudogroups with the same number of subjects as the original groups (*n* = 12). The graph parameters were calculated for the resampled pseudogroups in each iteration. Repeating this process 10,000 times yielded the permutation distributions of the differences in the graph parameters between the pseudogroups, which were used to test the significance of the differences between the original groups with a two-tailed *P*-value < 0.05 (Supplemental Fig. S3). In the paired permutation tests between 4 and 6 weeks of age within the ADHD-model or control group, each of the 12 pairs could either be swapped or left unchanged. Therefore, the paired permutation test enumerated all possible distinct permutations of the 12 pairs (2^{12} = 4,096) to test the developmental changes in the graph parameters of each group from 4 to 6 weeks of age^{54}. Multiple comparisons were controlled at an FDR of less than 0.05 with the Benjamini-Hochberg procedure, using the fdr_bh function in MATLAB^{55}.
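The paired test and the FDR control can be sketched as follows. This is an illustrative Python version, not the MATLAB code used in the study: `statistic` is a hypothetical callable that returns the difference of a graph parameter between two groups, and `benjamini_hochberg` is a plain implementation of the standard step-up procedure rather than a port of `fdr_bh`.

```python
from itertools import product

def paired_permutation_pvalue(a, b, statistic):
    """Exact two-tailed p-value from all 2**n within-pair swaps.

    a[k] and b[k] are the two measurements of subject k (for example, the
    4-week and 6-week data); statistic(a, b) returns a group difference.
    """
    observed = abs(statistic(a, b))
    count = total = 0
    for swaps in product((False, True), repeat=len(a)):
        pa = [b[k] if s else a[k] for k, s in enumerate(swaps)]
        pb = [a[k] if s else b[k] for k, s in enumerate(swaps)]
        # Small tolerance so that ties with the observed value are counted.
        if abs(statistic(pa, pb)) >= observed - 1e-12:
            count += 1
        total += 1
    return count / total

def benjamini_hochberg(pvalues, q=0.05):
    """Return a list of booleans, True where the null is rejected at FDR q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda k: pvalues[k])
    cutoff = 0
    for rank, k in enumerate(order, start=1):
        if pvalues[k] <= q * rank / m:
            cutoff = rank      # step-up: keep the largest rank that passes
    rejected = [False] * m
    for k in order[:cutoff]:
        rejected[k] = True
    return rejected
```

With 12 pairs the enumeration visits 4,096 configurations, so the cost is dominated by recomputing the graph parameter inside `statistic` for each one.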