Genomic Selection Using Maize Ex-Plant Variety Protection Germplasm for the Prediction of Nitrogen-Use Traits

Maize (Zea mays L.) yield increases associated with better usage of N fertilizer (i.e., increased N use efficiency [NUE]) will require innovative breeding efforts. Genomic selection (GS) for N-use traits (e.g., uptake or utilization efficiency) may speed up the breeding cycle of programs targeting NUE in maize. We evaluated the GS accuracy of 12 N-use traits for training populations (TPs) varying in composition (TC) and size, predicted yield performance under different N fertilizer rates, and investigated the usefulness of GS for NUE in maize breeding programs. A total of 522 maize hybrids were planted under low (0 kg N ha−1) and high N fertilizer (252 kg N ha−1) conditions across 10 environments. Training composition scenarios included T0 (hybrids in which none of the parents were included in the random subset of inbreds), T1 (hybrids in which one of their parents was included in the random subset of inbreds), and T2 (hybrids in which both of their parents were included in the random subset of inbreds). Training population sizes ranged from 10 to 40 or 30 to 90 hybrids, depending on the N-use trait. Across different TC, TP sizes, and N-use traits, GS accuracy ranged from −0.12 to 0.78 and was greatest with larger TP sizes when both parents of untested hybrids appeared in the training and validation sets (T2 hybrids). Moreover, GS accuracy in response to different TC and TP sizes was dependent on the N-use trait. Successful breeding for N stress tolerance or improved yield response to N fertilizer level will require selection of specific N-use traits.
Dep. of Crop Sciences, Univ. of Illinois, Urbana, IL 61801-4730. Received 26 June 2018. Accepted 3 Oct. 2018. *Corresponding author (fbelow@illinois.edu). Assigned to Associate Editor Owen Hoekenga.
Abbreviations: BLUE, best linear unbiased estimator; EBV, estimated breeding value; ex-PVP, expired Plant Variety Protection Act certification; G-BLUP, genomic best linear unbiased prediction; GCA, general combining ability; GEBV, genomic estimated breeding value; GS, genomic selection; GU, genetic utilization; HI−N, harvest index at low nitrogen; NHI−N, nitrogen harvest index at low nitrogen; NSSS, non-stiff stalk synthetic; NUE, nitrogen use efficiency; NUpE, nitrogen uptake efficiency; NUtE, nitrogen utilization efficiency; SCA, specific combining ability; SNP, single nucleotide polymorphism; SSS, stiff stalk synthetic; TC, training composition; TP, training population; Yield−N, yield under nitrogen stress; Yield+N, yield under high nitrogen.
Published in Crop Sci. 59:212–220 (2019). doi: 10.2135/cropsci2018.06.0398 © Crop Science Society of America | 5585 Guilford Rd., Madison, WI 53711 USA. This is an open access article distributed under the CC BY license (https://creativecommons.org/licenses/by/4.0/). Published December 13, 2018.

for secondary traits (Bänziger et al., 2000). Nitrogen use efficiency is the product of N uptake efficiency (NUpE, the ratio of the additional plant N content due to fertilizer N to the amount of fertilizer-applied N) and N utilization efficiency (NUtE, the ratio of yield increase to the difference in plant N content compared with those of an unfertilized crop) (Moll et al., 1982). Therefore, NUpE and NUtE are important component traits for the genetic characterization of maize NUE (Uribelarrea et al., 2007; Haegele et al., 2013; Mastrodomenico et al., 2018). In addition, secondary traits relevant for NUE improvement include harvest index, grain protein concentration, and genetic utilization (GU) (Haegele et al., 2013). Genetic utilization is defined as the plant's physiological efficiency under N stress (unfertilized) conditions to utilize N for grain production.
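The relationship among NUE, NUpE, and NUtE described above can be illustrated with a minimal Python sketch. The function name, argument names, and input values are hypothetical and chosen only for illustration; the exact formulas and units used in this study are those listed in Table 1.

```python
def n_use_efficiencies(yield_high, yield_low, plant_n_high, plant_n_low, fert_n):
    """Return (NUpE, NUtE, NUE) from the verbal definitions in the text.

    All inputs are per-area quantities in the same mass unit (e.g., kg ha^-1):
    grain yield and plant N content at high and low N, and fertilizer N applied.
    """
    delta_n = plant_n_high - plant_n_low      # extra plant N due to fertilizer
    if fert_n <= 0 or delta_n == 0:
        raise ValueError("fertilizer N must be > 0 and plant N must differ")
    nupe = delta_n / fert_n                   # N uptake efficiency
    nute = (yield_high - yield_low) / delta_n # N utilization efficiency
    nue = nupe * nute                         # equals yield increase / fertilizer N
    return nupe, nute, nue


# Hypothetical values, not data from this study.
nupe, nute, nue = n_use_efficiencies(
    yield_high=12000.0, yield_low=6960.0,
    plant_n_high=200.0, plant_n_low=74.0, fert_n=252.0,
)
print(nupe, nute, nue)
```

Because the plant N difference cancels in the product, NUE reduces to the yield increase per unit of fertilizer N, which is why NUE can be derived from yield at low and high N alone.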
Nitrogen use efficiency can be improved by applying conventional selection strategies using phenotypic data for the above traits. In addition, marker-assisted breeding is becoming more commonly used in breeding programs as the cost of acquiring genotypic data becomes cheaper than phenotypic data (Bernardo, 2008). Linkage mapping studies were the first marker-assisted breeding efforts for improved NUE (Bertin and Gallais, 2001; Agrama, 2005; Liu et al., 2012). However, given the high genetic complexity of NUE in combination with diverse mapping populations, environments, and experimental designs, it is not surprising that these studies identified different sets of genes and genomic regions involved in NUE. The identification of maize genes consistently associated with improved NUE is a complex task, since gene expression is dependent on the soil N level (Chen et al., 2015), the source of N fertilizer (Patterson et al., 2010), and the plant's growth stage (Amiour et al., 2012). In addition to linkage mapping, several proteome studies designed to better understand the genetics of NUE have shown no relationship between transcriptome and metabolomic results (Simons et al., 2014). Recent research efforts using constitutive gene promoters have been made in plants to improve NUE (Xu et al., 2012), but transgenic genotypes with improved NUE performance are not yet available in the commercial seed market. However, marker-assisted breeding approaches with the ability to estimate small-effect genes (commonly associated with complex polygenic traits) may be another strategy for developing genotypes with superior NUE performance.
Genomic selection (GS) uses molecular markers to predict the genotype's breeding value (Meuwissen et al., 2001). Therefore, GS could potentially be used for NUE improvement, where many loci with small effects contribute to the phenotypic expression of a genotype. Genomic estimated breeding values (GEBV) can be calculated by using the genomic relationship or the marker effects from individuals. Thus, the GEBV prediction uses a set of individuals that were both phenotyped and genotyped (i.e., a training population [TP]). The prediction accuracy is determined by the correlation between GEBV and the true genetic value estimated from the TP. The response to GS and the genetic gain of a genomic breeding program depend on the prediction accuracy (Falconer and Mackay, 1996).
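The dependence of genetic gain on prediction accuracy can be made explicit with the standard breeder's equation, given here as a textbook reminder rather than as a formula taken from this study. The symbols are assumptions introduced for illustration: $R$ is the response to selection per unit time, $i$ the selection intensity, $r$ the prediction accuracy, $\sigma_A$ the additive genetic standard deviation, and $L$ the length of the breeding cycle.

```latex
\[
  R = \frac{i \, r \, \sigma_A}{L}
\]
```

Under this expression, GS can raise the gain per unit time either by increasing $r$ or by shortening $L$, even when $r$ is lower than the accuracy of phenotypic selection.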
The main questions that plant breeders have in developing GS models pertain to the optimal composition of the training sets and their size. Breeders must define their trait of interest based on the selection criteria (one or more traits), trait heritability, and phenotyping ability and capacity. These factors will define the training set size that is necessary to obtain accurate prediction and is related to the resources (e.g., budget, space, personnel) available to the breeding program.
A recent study evaluating the genetic variation for NUE in the US maize germplasm found a large range of broad-sense heritability estimates (from 0.11 to 0.77) across 12 N-use traits described in Table 1 (Mastrodomenico et al., 2018). This study suggested that breeding for improved NUE will require an integration between selection of highly heritable secondary traits, such as grain protein concentration or harvest index at high N, and marker-assisted breeding. However, no information about the accuracy of GS for maize yield performance under different N fertilizer rates or for different N-use traits is available. The objectives of this experiment, which uses ex-Plant Variety Protection (ex-PVP) germplasm adapted to the US Corn Belt, were (i) to evaluate the GS accuracy of different N-use traits, (ii) to predict yield performance under different N fertilizer rates, and (iii) to investigate the usefulness of applying GS in NUE maize breeding programs.

Germplasm Identification and Hybridization
A diversity panel of 89 ex-PVP and two public (B73 and Mo17) lines adapted to the US Corn Belt was tested in this study (Supplemental Table S1). The original seed source was acquired from the USDA-ARS North Central Regional Plant Introduction Station, Ames, IA. A collection of 12 important progenitor lines previously characterized by Hauck et al. (2014) and a random collection of inbreds originating from six different seed companies and released between 1972 and 2011 were also included in the diversity panel. Although diversity panels can capture the historical genetic recombination performed by previous breeding programs (Lipka et al., 2015), the more recently released ex-PVP lines may expose the genetic diversity shifts observed during the past 20 yr (Smith et al., 2004).
For all inbreds, DNA was isolated from 14-d-old seedlings. Inbreds were genotyped using the genotype-by-sequencing method (Elshire et al., 2011), and two enzyme combinations were used to reduce genomic complexity: PstI-HF plus BfaI and PstI-HF plus HinP1I. Sequenced data were obtained from Illumina HiSeq2000 (University of Illinois W.M. Keck Center for Comparative and Functional Genomics, Urbana, IL), and single-nucleotide-polymorphism (SNP) data were called using the genotype-by-sequencing pipeline in TASSEL 3.0 (Bradbury et al., 2007), resulting in 86,592 SNPs. The minor allele frequency cutoff was set to 10%, and SNPs with >50% missing data were removed. A total of 26,769 SNPs were used for the analyses and may be found online as supplemental data. Principal component analysis using the full set of SNP markers was performed on all inbreds categorized into distinct heterotic groups (Fig. 1). Moreover, the genomic relationship matrix (K matrix) between all inbreds was calculated according to VanRaden (2008) (Supplemental Fig. S1). Different heterotic groups consisted of 36 stiff-stalk synthetic (SSS) lines originating from the B73 cluster, vs. 53 non-SSS (NSSS) lines, in which 19 lines were from the Iodent subheterotic group (PH207 cluster), and 34 lines were from the Lancaster subheterotic group (Mo17 cluster).
Single-cross hybrid seeds were produced between SSS and NSSS parental lines in nurseries from 2011 to 2015 at Champaign, IL. Between 2011 and 2015, 259 hybrids (ex-PVP inbred combinations within and across seed companies) were tested in a randomized complete block design with three replications. In 2016, 263 new (i.e., not tested) hybrid combinations, in addition to 50 tested hybrids, were planted in an augmented design with four commercial hybrid checks randomly assigned in nine blocks. Although experimental error of genotypes cannot be estimated in augmented designs, this experimental design provides the opportunity to test more genotypes. New hybrids tested in 2016 were created based on the genomic prediction results using the phenotypic data obtained from the 2011 to 2014 experiments. A schematic representation of the incomplete hybrid combination factorial is shown in the Supplemental Fig. S2. Across all 10 environments, a total of 522 single-cross maize hybrids were evaluated. On average, each SSS line was used in 16 (range = 6-57) and each NSSS line in nine (range = 2-38) different hybrid combinations.

Agronomic Practices, Ex-PVP Hybrids, and Experimental Design
Field experiments were conducted in 11 environments from 2011 to 2016. Experimental data from 2012 were removed from the analysis due to drought stress. Plots were planted using a precision plot planter (SeedPro 360, ALMACO) for one environment at DeKalb, IL (41°47′ N, 88°50′ W), six environments at Champaign, IL (40°3′ N, 88°14′ W), and three environments at Harrisburg, IL (37°43′ N, 88°27′ W). Planting, N application, plant sampling, and harvesting dates are listed in Supplemental Table S2, whereas preplanting soil test values are presented in Supplemental Table S3.
Plots were 5.6 m in length with 0.76-m row spacing, and two rows in width. The previous crop planted in each environment was soybean [Glycine max (L.) Merr.]. Final population was adjusted to 79,000 plants ha−1. Preemergence weed control consisted of the herbicide Lumax EZ (mixture of S-metolachlor, atrazine, and mesotrione; Syngenta Crop Protection) applied at a rate of 7 L ha−1 to control early-season weeds. Before planting, seeds were treated with Maxim XL fungicide (fludioxonil and mefenoxam at rate of 0.07 mg a.i. kernel−1; Syngenta Crop Protection) and Cruiser 5FS insecticide (thiamethoxam at 0.80 mg a.i. kernel−1; Syngenta Crop Protection) for disease and insect damage protection, respectively. Additionally, Force 3G insecticide [tefluthrin 2,3,5,6-tetrafluoro-4 Syngenta Crop Protection] was applied at planting in furrow (at a rate of 0.15 kg a.i. ha−1) to prevent western corn rootworm (Diabrotica virgifera virgifera) larvae infestation. At all research sites, hybrids were planted at two N fertilizer rates (0 and 252 kg N ha−1; designated low and high N, or −N and +N, respectively) in a split-plot arrangement. The main plot was hybrid and the split plot was N fertilizer rate. Nitrogen fertilizer was hand applied in a diffuse band as urea (46-0-0 N-P-K) during the V2 to V3 growth stages (Ritchie et al., 1997). At maturity, plots were harvested with a two-row plot combine (SPC40, ALMACO). Grain yield is reported as megagrams per hectare at 15.5% grain moisture. Grain protein concentration was estimated using a representative grain subsample from each plot collected during harvest using near-infrared transmittance (NIT) spectroscopy (Infratec 1241, FOSS).

Phenotypic Data Analysis and Genomic Prediction Model
Nitrogen use traits were measured according to Mastrodomenico et al. (2018). Briefly, six whole plants from each experimental plot were harvested at the R6 growth stage (Ritchie et al., 1997) to measure biomass, grain weight, and plant N concentration. The 12 N-use traits measured in this study are described in Table 1. Adjusted means for hybrids tested in 2016 were calculated using best linear unbiased estimators (BLUEs) with hybrid as a fixed effect and rows, columns, and environments used as random effects. General combining ability (GCA) and specific combining ability (SCA) were calculated using the phenotypic data obtained between 2011 and 2015 and the hybrid BLUEs from 2016. Best linear unbiased predictors (BLUPs) of the GCA were calculated using the restricted maximum likelihood method according to the model described by Reif et al. (2013). The estimated breeding value (EBV) of each hybrid was calculated according to Eq. [1]:

\[ \mathrm{EBV}_{kl} = \mu + \mathrm{GCA}_{k} + \mathrm{GCA}_{l} + \mathrm{SCA}_{kl} \qquad [1] \]

where $\mathrm{EBV}_{kl}$ is the estimated breeding value of the klth hybrid, $\mu$ is the grand mean, $\mathrm{GCA}_{k}$ is the general combining ability effect of the kth SSS inbred (k = 1-57), $\mathrm{GCA}_{l}$ is the general combining ability effect of the lth NSSS inbred (l = 1-38), and $\mathrm{SCA}_{kl}$ is the specific combining ability effect of the klth hybrid (kl = 1-522). All variance components were estimated using the lme4 package in R Studio (R Development Core Team, 2015). Phenotypic variance was calculated as the sum of all variance components, except the variance component for block effect and environment (Holland et al., 2003). Therefore, broad-sense heritability was calculated according to Eq. [2]:

\[ H^{2} = \frac{\sigma^{2}_{\mathrm{GCA}_{k}} + \sigma^{2}_{\mathrm{GCA}_{l}} + \sigma^{2}_{\mathrm{SCA}}}{\sigma^{2}_{\mathrm{GCA}_{k}} + \sigma^{2}_{\mathrm{GCA}_{l}} + \sigma^{2}_{\mathrm{SCA}} + \sigma^{2}_{\mathrm{GCA} \times E} + \sigma^{2}_{\mathrm{SCA} \times E} + \sigma^{2}_{R}} \qquad [2] \]

where $H^{2}$ is the broad-sense heritability, and $\sigma^{2}_{\mathrm{GCA}_{k}}$, $\sigma^{2}_{\mathrm{GCA}_{l}}$, and $\sigma^{2}_{\mathrm{SCA}}$ are the variance components for the SSS GCA, NSSS GCA, and SCA, respectively. Similarly, $\sigma^{2}_{\mathrm{GCA} \times E}$, $\sigma^{2}_{\mathrm{SCA} \times E}$, and $\sigma^{2}_{R}$ are variance components for the GCA × environment interaction, the SCA × environment interaction, and the residual, respectively. Genomic best linear unbiased prediction (G-BLUP) of untested hybrids ($y_{U}$) was calculated according to Eq. [3]:

\[ y_{U} = C_{UT} \, C_{TT}^{-1} \, y_{T} \qquad [3] \]

where $C_{UT}$ is the covariance matrix among untested and tested hybrids, $C_{TT}^{-1}$ is the inverse of the variance-covariance matrix of the tested hybrids, and $y_{T}$ is the EBV of a set of tested hybrids. The hybrid's EBV will vary according to the phenotypic information and the additive effect between the individuals (Lynch and Walsh, 1998). Therefore, the genomic relationship coefficients between SSS and NSSS inbreds (Supplemental Fig. S2) were assigned to $C_{UT}$ and $C_{TT}^{-1}$ according to Bernardo (1996).
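The G-BLUP step in Eq. [3] can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the code used in the study: the covariance between two hybrids is built from the parental relationship matrices in the style of Bernardo (1996), with GCA terms weighted by the within-group relationships and an SCA term weighted by their product. The function names, toy relationship matrices, variance components, and EBVs are all hypothetical, and no residual variance is added to the diagonal of the tested-hybrid matrix.

```python
import numpy as np


def hybrid_covariance(pairs_a, pairs_b, k_sss, k_nsss,
                      var_gca_sss, var_gca_nsss, var_sca):
    """Covariance between two sets of hybrids given as (sss_idx, nsss_idx) pairs.

    Assumed form: var_GCA(SSS) * K_SSS + var_GCA(NSSS) * K_NSSS
                  + var_SCA * K_SSS * K_NSSS  (elementwise product).
    """
    a_s, a_n = np.array(pairs_a).T
    b_s, b_n = np.array(pairs_b).T
    ks = k_sss[np.ix_(a_s, b_s)]    # relationships among the SSS parents
    kn = k_nsss[np.ix_(a_n, b_n)]   # relationships among the NSSS parents
    return var_gca_sss * ks + var_gca_nsss * kn + var_sca * ks * kn


def gblup_predict(c_ut, c_tt, y_t):
    """Eq. [3]: y_U = C_UT C_TT^{-1} y_T, using a linear solve instead of an inverse."""
    return c_ut @ np.linalg.solve(c_tt, y_t)


# Toy inputs (hypothetical): two SSS and two NSSS inbreds.
k_sss = np.array([[1.0, 0.5], [0.5, 1.0]])
k_nsss = np.array([[1.0, 0.2], [0.2, 1.0]])
tested = [(0, 0), (1, 1)]       # hybrids with EBVs
untested = [(0, 1), (1, 0)]     # hybrids to predict
y_t = np.array([10.0, 8.0])     # EBVs of the tested hybrids

c_tt = hybrid_covariance(tested, tested, k_sss, k_nsss, 1.0, 1.0, 0.5)
c_ut = hybrid_covariance(untested, tested, k_sss, k_nsss, 1.0, 1.0, 0.5)
y_u = gblup_predict(c_ut, c_tt, y_t)
print(y_u)
```

Solving the linear system rather than forming the explicit inverse is a numerical convenience only; the two are algebraically equivalent to Eq. [3].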

Cross-Validation
The estimation of prediction accuracy was performed in R Studio (R Development Core Team, 2015) using the cross-validation approach described by Technow et al. (2014). For investigating the effect of training composition (TC) on the prediction accuracy, a random subset of 16 SSS and 30 NSSS lines was selected in each iteration. Based on each random subset of lines, hybrids were categorized as T0 (i.e., hybrids in which none of the parents were included in the random subset of inbreds), T1 (i.e., hybrids in which one of their parents was included in the random subset of inbreds), and T2 (i.e., hybrids in which both of their parents were included in the random subset of inbreds). For investigating the effect of TP size on the prediction accuracy, a random subset of either 10, 20, 30, or 40 T2 hybrids (for the traits of harvest index, N harvest index, NUE, NUpE, NUtE, and GU) or 30, 50, 70, or 90 T2 hybrids (for the traits of yield and grain protein concentration) was used to predict the T0, T1, or the remaining T2 hybrids. Whereas harvest index, NUE, NUpE, NUtE, and GU were evaluated in 259 hybrids, yield and grain protein concentration were evaluated in 522 hybrids. The restriction in TP sizes between the two dataset sizes allowed a reasonable number of hybrids to be compared in the validation set. Prediction accuracy was calculated as the Pearson correlation between a hybrid's EBV and predicted values (GEBV). Moreover, the cross-validation process was repeated 1000 times. On average, predictions of T0, T1, and T2 consisted of 65, 129, and 63 hybrids, respectively, across iterations.
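One iteration of the T0/T1/T2 categorization described above can be sketched as follows. This is a minimal Python illustration, not the R code used in the study; the function names, line labels, and the complete factorial of hybrids are hypothetical (the study used an incomplete factorial of 522 hybrids).

```python
import math
import random


def classify_hybrids(hybrids, sss_lines, nsss_lines, n_sss=16, n_nsss=30, rng=None):
    """Split hybrids into T0, T1, and T2 by how many parents fall in a random subset."""
    rng = rng or random.Random()
    subset = set(rng.sample(sorted(sss_lines), n_sss))
    subset |= set(rng.sample(sorted(nsss_lines), n_nsss))
    groups = {"T0": [], "T1": [], "T2": []}
    for sss, nsss in hybrids:
        shared = (sss in subset) + (nsss in subset)   # 0, 1, or 2 parents sampled
        groups[f"T{shared}"].append((sss, nsss))
    return groups


def pearson(x, y):
    """Pearson correlation, used here as the prediction accuracy (EBV vs. GEBV)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical labels: 36 SSS and 53 NSSS lines, crossed in a full factorial.
sss_lines = [f"S{i}" for i in range(36)]
nsss_lines = [f"N{i}" for i in range(53)]
hybrids = [(s, n) for s in sss_lines for n in nsss_lines]

groups = classify_hybrids(hybrids, sss_lines, nsss_lines, rng=random.Random(1))
print({name: len(members) for name, members in groups.items()})
```

In a full workflow, a TP of the chosen size would be drawn from the T2 group, the remaining hybrids in each group would be predicted, and `pearson` would be applied to the EBV and GEBV of each validation group before repeating the whole procedure over many iterations.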

Genetic Relationship and Population Structure between Ex-PVP Lines
Principal component analysis using SNP markers of all inbreds revealed distinct clusters among heterotic groups (Fig. 1). Mean relationship coefficients between SSS, NSSS, and SSS by NSSS lines were 0.59, 0.31, and 0.31 with SDs of 0.31, 0.23, and 0.15, respectively (Fig. 2). Inbreds from the SSS group were more genetically related than inbreds from the NSSS group. This variation in genetic relatedness between heterotic groups is due to the fact that most SSS inbreds are B73 descendants and the inbreds of the NSSS group originated from two separate subheterotic groups (Iodent [PH207] and Lancaster [Mo17]).
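Relationship coefficients of this kind are derived from a marker-based relationship matrix. The sketch below shows, in Python with NumPy, how SNP filtering with the thresholds stated in the methods (minor allele frequency of 10%, at most 50% missing data) and a relationship matrix could be computed. It assumes the first method of VanRaden (2008), a 0/1/2 allele-count coding with NaN for missing calls, and mean imputation of missing values; the study does not state these details, and the function names and toy genotypes are hypothetical.

```python
import numpy as np


def filter_snps(geno, maf_min=0.10, max_missing=0.50):
    """Keep SNP columns that pass the minor allele frequency and missing-data cutoffs.

    geno: (n_lines, n_snps) array of allele counts (0, 1, 2), NaN = missing.
    """
    missing = np.isnan(geno).mean(axis=0)
    freq = np.nanmean(geno, axis=0) / 2.0
    maf = np.minimum(freq, 1.0 - freq)
    keep = (missing <= max_missing) & (maf >= maf_min)
    return geno[:, keep]


def vanraden_grm(geno):
    """Genomic relationship matrix G = ZZ' / (2 * sum(p * (1 - p)))."""
    p = np.nanmean(geno, axis=0) / 2.0
    z = geno - 2.0 * p                    # center each SNP by twice its allele frequency
    z = np.where(np.isnan(z), 0.0, z)     # missing calls contribute zero (mean imputation)
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))


# Toy genotypes for four inbred lines (hypothetical). The fifth SNP is
# monomorphic and the sixth has 75% missing data, so both are dropped.
geno = np.array([
    [0.0, 2.0, 2.0, 0.0, 0.0, 2.0],
    [2.0, 0.0, 2.0, 0.0, 0.0, np.nan],
    [0.0, 2.0, 0.0, 2.0, 0.0, np.nan],
    [2.0, 0.0, 0.0, 2.0, 0.0, np.nan],
])
filtered = filter_snps(geno)
grm = vanraden_grm(filtered)
print(filtered.shape)
print(grm)
```

With fully homozygous lines, the diagonal of this matrix is close to 2 rather than 1, which reflects the inbreeding of the parents.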

Phenotypic Variation and Correlation
Variance components and broad-sense heritability (H²) varied across N-use phenotypic traits for the 522 hybrids when grown with differing N supplies in 10 environments (Fig. 3). Broad-sense heritability ranged from 0.11 to 0.77. Genetic variance and H² were greater for traits when hybrids were grown under high-N compared with low-N conditions. Across all phenotypic traits, residual and GCA variances accounted for the majority of the total phenotypic variance, regardless of the N treatment. In contrast, SCA and genetic × environment interaction variances had a small contribution to the total phenotypic variance.
A biplot depiction of the principal component analysis revealed correlations among phenotypic traits due to fertilization level for the field-grown hybrids (Fig. 4). Overall, N-use traits associated with plant N partitioning and redistribution to the grain (i.e., GU, harvest index, and N harvest index at high N [NHI+N]) were highly correlated with yield under low N, whereas traits associated with the yield response to N fertilizer (i.e., NUE, NUtE, and NUpE) were correlated with yield under high-soil-N conditions. Within each N fertilizer treatment, grain protein concentration was negatively correlated to yield. In addition, N harvest index at low N (NHI−N) accounted for a small portion of the total phenotypic variation, likely due to a large residual variance (Fig. 3).

Genomic Prediction Accuracy
Across all N-use traits and TC schemes, G-BLUP accuracy ranged from −0.12 to 0.78 (Supplemental Tables S4 and S5). Prediction accuracy increased by 13% as the TC changed from T0 to T1 and increased by 10% as the TC changed from T1 to T2 hybrids when averaged across N-use traits and TP sizes (Fig. 5). Increasing TP size was more effective when more genetic information (TC) was available in the TP. Consequently, increased TP size (from 30 to 90 or 10 to 40 hybrids) improved GS accuracy by 5, 19, and 31% using T0, T1, and T2 hybrids in the TC, respectively. Changes in prediction accuracy as a result of variation in TC and TP sizes were dependent on the N-use trait. For example, harvest index at low N (HI−N) and GU, compared with other traits, exhibited a greater increase in prediction accuracy due to increased TP size than due to increased inclusion of parents in the TC. In contrast, grain protein concentration at low N (Protein−N) exhibited a greater increase in prediction accuracy from increasing the TC rather than the training size.

DISCUSSION

Prediction Accuracy Response to Different Training Composition
Hybrid performance prediction is mostly driven by the coancestry coefficient between individuals (Bernardo, 1996). In addition, prediction accuracy is affected by the composition of the TP (Riedelsheimer et al., 2013; Technow et al., 2013, 2014). The related genetic constitution between training and validation sets allows for similar linkage phases between markers and quantitative trait loci among these groups (Technow et al., 2013). Similar to these previous studies, we also observed higher prediction accuracies as a result of increasing the number of parents shared between the training and validation sets (Fig. 5). However, differences in prediction accuracy associated with TC were trait specific.
High H² and genetic relatedness between individuals are important factors for increased prediction accuracy in additive genetic models (Daetwyler et al., 2010). We observed a low genetic relatedness among NSSS lines (Supplemental Fig. S1) and a large range of H² for N-use traits (Fig. 3). Technow et al. (2014) compared different cross-validation methods using G-BLUP by changing the TC and found higher prediction accuracy values than reported here. High prediction accuracies obtained by Technow et al. (2014) were associated with higher H² and realized relationships between parental lines used in training and validation populations, due to a greater number of both hybrids and environments. The low genetic relatedness between inbreds used in our study is not surprising, given that competing seed companies developed these inbreds. Therefore, the development of breeding programs targeted for specific N-use trait improvement will likely increase the genomic prediction accuracy as the number of newly selected breeding populations and the number of individuals are increased.

Prediction Accuracy Response to Increased Training Size
Increased TP size has an important effect on GS accuracy with both animals (VanRaden, 2008) and crops (Lorenzana and Bernardo, 2009). However, increased TP size was minimally or not effective when no parental information was available in the TP (T0 hybrids, Fig. 5). Our results suggest that increased TP size may have a greater influence on prediction accuracy when training and validation populations become more genetically related.
One of the possible reasons for the success of increasing prediction accuracy by increasing TP size is that the SCA effect had only a small contribution to the total phenotypic variance (Fig. 3). Therefore, increasing TP size will increase the precision of estimating GCA effects (Technow et al., 2014). The greater importance of GCA than SCA for genomic prediction is mainly found in plant species with genetically distinct heterotic patterns (Reif et al., 2007). This heterotic pattern condition may be one explanation why prediction accuracy benefits more from increasing TP size using T2 hybrids than T0 hybrids. Although the genetic covariance between inbreds was used for hybrid performance prediction in this study, recent research showed that applying genomic-estimated GCA and SCA might result in higher prediction accuracy when using T0 hybrids in the TP (Kadam et al., 2016).
Another interesting finding from this study was the negative prediction accuracy observed for NUtE and NHI−N (Fig. 5). Although negative prediction accuracy for NUtE was observed only when using unrelated parents (T0 hybrids) in the TP, NHI−N provided negative accuracies regardless of TC or TP size. Negative prediction accuracy in maize has been reported, using variations in full and half-sib doubled haploid lines in the TP evaluating five agronomic traits (Riedelsheimer et al., 2013). In agreement with the previous research, large residual variance and small genetic variance of NUtE and NHI−N traits have provided a poor genetic signal for the training model and, therefore, reduced the GS accuracy.

Use of Secondary Traits for Nitrogen Use Efficiency Breeding
Whereas the genotypic correlation between yield at low and high N is ?0.31 (Mastrodomenico et al., 2018), the secondary traits associated with yield at low and high N are negatively correlated (Fig. 4). One strategy for a maize NUE breeding program is to improve genotypes for the desirable phenotypic traits correlated to increased yield under N-stress (Yield−N) or high N conditions (Yield+N). The use of secondary traits may improve the precision to identify a genotype, identify the degree of N stress, and aid plant breeders in making selections. Under high-N conditions, NUE and NUpE were the secondary traits that combined the highest GS accuracy and correlation with Yield+N. However, NUE exhibited greater GS accuracy than NUpE across different TP scenarios. The derivation of NUpE from multiple plant measurements (individual plant biomass, seed weight, and N concentration) may have contributed to increased residual error, as the error variance from each of these trait components adds up to the final error variance of NUpE (Table 1). On the other hand, NUE is derived from only two component measurements (yield at low and high N). In addition to higher H², NUE was more stable than NUpE across high-N environments (Mastrodomenico et al., 2018). Therefore, NUE may be the most effective secondary trait to be selected in breeding programs developing hybrids for agricultural systems using high N fertilizer inputs.
Under low-N stress conditions, GU and HI−N both displayed higher GS accuracy and correlation with Yield−N. However, prediction accuracy for HI−N was higher than for GU across different TC and training sizes. Therefore, under N stress environments, the most effective phenotypic trait to select for is HI−N for the following reasons: (i) harvest index is highly genetically controlled and associated with yield under N stress conditions, (ii) harvest index requires less genetic information in the TP than GU to reach the same prediction accuracy value, (iii) harvest index requires fewer hybrids to be phenotyped than GU within the same prediction accuracy, and (iv) harvest index is easier and cheaper to measure than GU.
The large genotypic variation of N-use traits found among the ex-PVP germplasm highlights the opportunity that exists for selecting maize genotypes with desirable NUE performance. Previous research has demonstrated that maize genotypes can reach high NUE using different plant physiological strategies (Mastrodomenico et al., 2018). Genomic prediction can be integrated into NUE breeding programs using specific phenotypic traits depending on the target environment for N condition (low or high N). Since H² and prediction accuracy for yield at low N are less than for yield at high N, breeding for low N tolerance may benefit more from using secondary traits in GS than breeding for increased yields at high N. Furthermore, HI−N exhibited higher H² and higher prediction accuracy than Yield−N (Fig. 5).

CONCLUSIONS
Nitrogen-use traits are highly polygenic and complex. Phenotyping for maize NUE under field conditions is time consuming and costly. The use of GS for NUE improvement holds great promise, since it can reduce the number of breeding cycles and the cost for field phenotyping. The best GS scenario was when both parents were present in the training and validation sets with larger TP size. However, prediction accuracy response to TP size and composition was dependent on the N-use trait. The identification of highly heritable and therefore predictable N-use traits is important and must be targeted according to the breeding objectives (tolerance to N stress or yield response to N fertilizer). This research suggests that genotypes in NUE breeding programs must be phenotyped under both N conditions. More importantly, yield response to N fertilizer and all N-use traits are dependent on the field trial's low-N conditions. Although N fertilizer trials have an increased cost compared with a standard yield selection trial, the characterization of genotypes for N stress tolerance and their response to N fertilizer are important to a maize breeding program. Previous research has effectively integrated crop growth models and genotype × environment interaction effects in GS models (Cooper et al., 2016). Future research integrating environmental effects associated with maize N acquisition and assimilation may improve GS prediction accuracy.
Fig. 1. Principal component analysis (PCA) of 89 ex-Plant Variety Protection (ex-PVP) and two public (B73 and Mo17) inbred lines using 26,769 single-nucleotide polymorphisms. The main progenitor lines for each cluster are identified: B73 for the stiff-stalk synthetic (SSS), PH207 for the Iodent, and Mo17 for the Lancaster heterotic groups. Colors represent the origin of the different inbred lines.

Fig. 3. Relative contribution to total phenotypic variance of general combining ability [Var(GCA)], specific combining ability [Var(SCA)], genotype × environment interaction [Var(G × E)], and residual variances [Var(R)] for 12 different N-use traits averaged over 522 maize hybrids grown at low (0 kg N ha−1) or high N (252 kg N ha−1) from 2011 to 2016. The measured traits included yield at low N (Yield−N), yield at high N (Yield+N), harvest index at low N (HI−N), harvest index at high N (HI+N), N harvest index at low N (NHI−N), N harvest index at high N (NHI+N), grain protein concentration at low N (Protein−N), grain protein concentration at high N (Protein+N), N use efficiency (NUE), N uptake efficiency (NUpE), N utilization efficiency (NUtE), and genetic utilization (GU). Broad-sense heritabilities (H²) were estimated on an entry-mean basis for the same hybrids described above.

Fig. 4. Biplot of the loadings derived from principal component analysis of 12 phenotypic traits. Principal component analyses were performed comparing N-use phenotypic traits with 522 single-cross hybrids receiving either low N (0 kg N ha−1) or high N (252 kg N ha−1) fertilizer and averaged across 10 environments from 2011 to 2016. Phenotypic traits positively correlated with yield at low N and yield at high N are represented by blue and orange arrows, respectively. Phenotypic traits not positively correlated with yield (low or high N) are represented by black arrows. The measured traits included yield at low N (Yield−N), yield at high N (Yield+N), harvest index at low N (HI−N), harvest index at high N (HI+N), N harvest index at low N (NHI−N), N harvest index at high N (NHI+N), grain protein concentration at low N (Protein−N), grain protein concentration at high N (Protein+N), N use efficiency (NUE), N uptake efficiency (NUpE), N utilization efficiency (NUtE), and genetic utilization (GU).

Fig. 5. Prediction accuracy response to different training population sizes and compositions for 12 phenotypic traits of maize when grown with either low N (0 kg N ha−1) or high N (252 kg N ha−1) fertilization. Training composition schemes were categorized into T0 (hybrids in which none of the parents were included in the random subset of inbreds), T1 (hybrids in which one of their parents was included in the random subset of inbreds), and T2 (hybrids in which both of their parents were included in the random subset of inbreds). A total of 522 hybrids were used for the prediction of yield and grain protein, and 259 hybrids were used for harvest index (HI), N harvest index (NHI), N use efficiency (NUE), N uptake efficiency (NUpE), N utilization efficiency (NUtE), and genetic utilization (GU). Vertical bars represent the SD of the accuracy mean.

Table 1. List of 12 N-based phenotypic traits, units, abbreviations, and formulas.