Genome‐wide association analysis of natural variation in seed tocochromanols of barley

Tocochromanols (tocols for short), commonly called Vitamin E, are lipid‐soluble plant antioxidants vital for regulating lipid peroxidation in chloroplasts and seeds. Barley (Hordeum vulgare L.) seeds contain all eight different isoforms of tocols; however, the extent of natural variation in their composition and their underlying genetic basis is not known. Tocol levels in barley seeds were quantified in diverse H. vulgare panels comprising 297 wild lines from a diversity panel and 160 cultivated spring‐type accessions from the mini‐core panel representing the genetic diversity of the USDA barley germplasm collection. Significant differences were observed in the concentration of tocols between the two panels. To identify the genes associated with tocols, genome‐wide association analysis was conducted with single nucleotide polymorphisms (SNPs) from Illumina arrays for the mini‐core panel and genotyping‐by‐sequencing for the wild barley panel. Forty unique SNPs in the wild barley and 27 SNPs in the mini‐core panel were significantly associated with various tocols. Marker–trait associations (MTAs) were identified on chromosomes 1, 6, and 7 for key genes in the tocol biosynthesis pathway, which have also been reported in other studies. Several novel MTAs were identified on chromosomes 2, 3, 4 and 5 and were found to be in proximity to genes involved in the generation of precursor metabolites required for tocol biosynthesis. This study provides a valuable resource for barley breeding programs targeting specific isoforms of seed tocols and for investigating the physiological roles of these metabolites in seed longevity, dormancy, and germination.


INTRODUCTION
Vitamin E includes a family of eight compounds that are biochemically referred to as tocochromanols (tocols for short) (Munne-Bosch & Alegre, 2002). These tocols consist of a chromanol ring and a polyprenyl side chain. The chromanol ring is derived via the shikimate pathway from homogentisate. Depending on the origin of the prenyl side chain, tocols are divided into two subgroups: tocopherols and tocotrienols. In tocopherols, the 15-C tail attached to the chromanol ring is derived from phytyl diphosphate; in tocotrienols, it originates from geranylgeranyl diphosphate (GGDP). Further, depending on the number and position of the methyl groups on the chromanol ring, there are four different isoforms each of tocopherol and tocotrienol. The α isoforms have three methyl groups at positions 5, 7, and 8 in the chromanol ring, whereas the β and γ isoforms have a methyl group at the eighth position and a second methyl group at the fifth or seventh position, respectively. The δ isoforms have only one methyl group at the eighth position on their chromanol ring (Figure 1).
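The classification just described can be written down as a small lookup. The sketch below is only an illustration of the naming rule in the text; the dictionary and function names are hypothetical and do not come from the study.

```python
# Methyl-group positions on the chromanol ring for each isoform,
# exactly as listed in the paragraph above.
METHYL_POSITIONS = {
    "alpha": (5, 7, 8),
    "beta": (5, 8),
    "gamma": (7, 8),
    "delta": (8,),
}

# Prenyl donor of the side chain, which defines the subgroup.
SIDE_CHAIN_SOURCE = {
    "tocopherol": "phytyl diphosphate",
    "tocotrienol": "geranylgeranyl diphosphate",
}


def classify_isoform(methyl_positions):
    """Return the isoform name for a set of ring methylation positions."""
    wanted = tuple(sorted(methyl_positions))
    for name, positions in METHYL_POSITIONS.items():
        if positions == wanted:
            return name
    raise ValueError(f"no tocol isoform has methyl groups at {wanted}")


def all_tocols():
    """List the eight tocols as isoform-subgroup combinations."""
    return [f"{iso}-{group}"
            for group in SIDE_CHAIN_SOURCE
            for iso in METHYL_POSITIONS]
```

Four isoforms in each of two subgroups give the eight compounds referred to throughout the paper.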
Vitamin E compounds are lipid-soluble and synthesized in the plastids of plants, algae, and cyanobacteria (Schultz, Soll, Fiedler, & Schulzesiebert, 1985). Alpha-tocopherol (AT), a major tocol isoform in green plant tissues, is involved in preserving the integrity of photosynthetic membranes by preventing lipid peroxidation (Munne-Bosch, 2005). Tocopherols play an important role in seed longevity by limiting the oxidation of storage lipids in the seeds and protecting seeds from reactive oxygen species (Sattler, Gilliland, Magallanes-Lundback, Pollard, & DellaPenna, 2004). However, γ-tocopherol (GT) may negatively impact seed germination and seedling development through modulating nitric oxide signaling (Desel & Krupinska, 2005). In barley, seed longevity and germination are important factors that directly affect not only the growers but also their major stakeholders, the malting and brewing industries.

Core Ideas
• Extensive natural variation for seed tocochromanol levels exists in barley
• Alpha-tocopherol levels differed significantly between wild and domesticated barley
• Diverse panels & high-density genotyping enabled marker-trait associations (MTAs)
• MTAs were identified for all the primary tocochromanol biosynthesis pathway genes

The natural abundance and high Vitamin E activity of tocopherols, especially AT (Bieri & Evarts, 1974), combined with the discovery of the AT transfer protein in humans (Arita et al., 1995; Hosomi et al., 1997), have led to a biased focus on this particular isoform. This is even more evident in how the recommended dietary allowance for Vitamin E is issued only for AT, since it is the only form maintained in the plasma. Tocotrienols have recently received considerable attention for their various biological properties such as their antioxidative (Serbinova, Kagan, Han, & Packer, 1991), antihypercholesterolemic (Qureshi, Sami, Salser, & Khan, 2002), anticancer (Goh, Hew, Norhanom, & Yadav, 1994; Takahashi & Loo, 2004), and neuroprotective activity (Khanna et al., 2003). These findings suggest that tocotrienols have a wide range of physiological functions; therefore, developing plants that could synthesize high concentrations of these compounds would be useful for nutraceutical applications. Tocotrienols are found almost exclusively in the seeds of cereal grains and palm seeds (Khan, Ahsan, Siddiqui, & Siddiqui, 2015). All four isoforms of tocotrienol have been reported in developing barley grains (Falk, Krahnstover, van der Kooij, Schlensog, & Krupinska, 2004) and hence provide an excellent model for further analysis of these metabolites. The cloning of Vitamin E biosynthesis genes (VTE genes) in the model plant Arabidopsis thaliana (L.) Heynh. allowed the core tocol pathway to be engineered for improved nutritional content and composition in various plants (Collakova & DellaPenna, 2003; Hunter & Cahoon, 2007; Karunanandaa et al., 2005; Kumar et al., 2005; Lu, Rijzaani, Karcher, Ruf, & Bock, 2013; Savidge et al., 2002; Shintani & DellaPenna, 1998; Zhang et al., 2013). Efforts to identify the genetic loci associated with tocopherol biosynthesis have been reported for rice (Oryza sativa L.) (Sookwong et al., 2009), oats (Avena sativa L.) (Jackson et al., 2008), maize (Chander et al., 2008; Baseggio et al., 2019), and barley (Graebner et al., 2015; Oliver, Islamovic, Obert, Wise, & Herrin, 2014). In most of these studies, the major quantitative trait loci (QTLs) identified were found to be associated with one or two key genes in the tocol biosynthetic pathway.
In this study, we have analyzed the seed tocol concentrations from two diverse panels of H. vulgare: the Wild Barley Diversity Collection (WBDC) panel consisting of 297 H. vulgare ssp. spontaneum accessions (Sallam et al., 2017) and the mini-core panel consisting of 160 cultivated spring-type H. vulgare ssp. vulgare accessions that represent the genetic diversity of the USDA barley iCore collection (Munoz-Amatriain et al., 2013). A genome-wide association study (GWAS) was undertaken to identify the genetic loci associated with each of the eight different isoforms of Vitamin E, tocopherols (the total of the four tocopherol isoforms), tocotrienols (total of the four tocotrienol isoforms) and total tocols (the total of all eight isoforms). The vast phenotypic variability in the concentration of these metabolites in the two highly diverse germplasm panels, together with the use of high-density SNP marker mapping, enabled the identification of significant MTAs for genes involved in tocol biosynthesis. Several novel QTLs that accounted for a significant amount of the phenotypic variation of tocols were identified in close proximity to genes in the biosynthesis pathway of tocol precursor metabolites.
FIGURE 1 Structure of tocochromanols

Plant materials
The mini-core panel of cultivated barley comprises 184 accessions as previously described (Munoz-Amatriain et al., 2013). Of the 184 accessions in the mini-core panel, 160 spring types were selected. The seeds were obtained from the USDA-ARS National Small Grains Collection (Aberdeen, ID) and then grown in a greenhouse at the USDA-ARS Cereal Crops Research Unit in 2016 and 2017. Plants were grown in a potting mix consisting of sand, peat moss, and vermiculite (2:1:1). Osmocote 14-14-14 N-P-K (Scott's Co, Marysville, OH) was added to the potting mix (2 g per pot). Plants were watered regularly and maintained at 18 to 22 °C with 16 h of light (380 μmol m⁻² s⁻¹) until harvest. Spikes were harvested at maturity and the seeds were collected after threshing (LT15 laboratory thresher, Haldrup, Poneto, IN). The 297 WBDC accessions used in this study were collected from two field plantings (2005-2006 and 2015-2016) at the University of California, Davis. Seeds from these two field collections were stored in a cold room at the University of Minnesota and have previously been used in a GWAS of stem rust resistance (Sallam et al., 2017) and seed β-glucan content.

Tocochromanol assay and phenotypic data analysis
Barley seeds were ground in a Retsch ZM-1 mill (Retsch, Haan, Germany) and extracted via a modified hot saponification and extraction method (Fratianni, Caboni, Irano, & Panfili, 2002) (Supplementary File S1). Identification of individual forms was based on retention time, and quantification was based on standard curves developed from authentic tocopherols (Metraya LLC, Pleasant Gap, PA) (Supplemental Figure S1). Tocotrienols, which have essentially the same fluorescent properties as their corresponding tocopherols (Thompson & Hatina, 1979), were quantified with the same standard curves.
The eight tocols identified in this study were AT, β-tocopherol (BT), GT, δ-tocopherol (DT), α-tocotrienol (AT3), β-tocotrienol (BT3), γ-tocotrienol (GT3), and δ-tocotrienol (DT3). Additionally, the following three measurements were calculated with the data from the individual isoforms: total tocotrienols, total tocopherols, and total tocols. Within-line and between-line variances for each of the tocol traits from the two replications were used to estimate broad-sense heritability in the two populations.
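The text does not give the exact estimator, so the following is a minimal sketch that assumes a balanced one-way analysis of variance with lines as groups. Under that assumption the genetic variance is (MS_between − MS_within)/r and heritability on a line-mean basis is σ²g / (σ²g + σ²e/r), where r is the number of replications. The function name and the example values are hypothetical.

```python
from statistics import mean


def line_mean_heritability(reps_by_line):
    """Broad-sense heritability on a line-mean basis (balanced data).

    reps_by_line maps a line name to its replicate measurements.
    Assumes a one-way ANOVA model; this is an illustrative estimator,
    not necessarily the one used in the study.
    """
    r = len(next(iter(reps_by_line.values())))
    if r < 2 or any(len(v) != r for v in reps_by_line.values()):
        raise ValueError("need the same number (>= 2) of replicates per line")
    n = len(reps_by_line)
    if n < 2:
        raise ValueError("need at least two lines")

    line_means = {k: mean(v) for k, v in reps_by_line.items()}
    grand_mean = mean(line_means.values())

    # Mean squares between lines and within lines.
    ms_between = r * sum((m - grand_mean) ** 2
                         for m in line_means.values()) / (n - 1)
    ms_within = sum((x - line_means[k]) ** 2
                    for k, v in reps_by_line.items()
                    for x in v) / (n * (r - 1))

    var_g = max((ms_between - ms_within) / r, 0.0)  # genetic variance
    var_e = ms_within                               # residual variance
    total = var_g + var_e / r
    if total == 0:
        raise ValueError("no phenotypic variance in the data")
    return var_g / total


# Made-up values for three lines with two replications each.
example = {"line_A": [10, 12], "line_B": [20, 22], "line_C": [30, 32]}
h2 = line_mean_heritability(example)
```

With large differences between lines and small differences between replicates, as in the made-up example, the estimate approaches 1.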

Single nucleotide polymorphism genotyping
Approximately 200 mg of leaf tissue from each accession of the mini-core panel was used for DNA isolation via standard extraction methods in the presence of RNAse A. The integrity and concentration of DNA were analyzed with a Nanodrop 2000 (ThermoFisher, Waltham, MA). The 50k Illumina Infinium iSelect SNP genotyping array (Bayer et al., 2017) was used for genotyping the 160 barley mini-core accessions at the North Central Small Grains Genotyping facility (USDA-ARS Edward T. Schafer Agricultural Research Center, Fargo, ND). Single nucleotide polymorphism alleles were called by GenomeStudio Genotyping Module Version 2.04 (Illumina, San Diego, CA) with the default parameters for the de novo clustering algorithm. Markers with poorly resolved genotype clusters (<15%) were altered so that the cluster locations represented the proper genotype. Artifacts of heterozygous genotypes were manually converted to "missing" if A and B genotypes were present. For the WBDC panel, genotyping-by-sequencing marker data were generated as described previously (Sallam et al., 2017). For both the mini-core collection and the WBDC panel, physical marker positions were used (Mascher et al., 2017).

Mini-core panel structure analysis
Pairwise genetic distances among all accessions were calculated for all markers as 1 − identity-by-state. With the use of the distance matrix, K-means clustering was performed on the 160 accessions of the barley mini-core panel with 5,000 iterations and five subpopulations in line with the previous study (Munoz-Amatriain et al., 2013). To investigate the effect of population structure on GWAS, principal component analysis (PCA) and pairwise genetic similarity analysis were performed in R (http://www.R-project.org/, accessed 7 Aug. 2020) on the 50k SNP marker data. Subpopulations identified via K-means clustering were visualized on the PCA plot.
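A distance of 1 minus identity-by-state can be sketched as follows. The genotype coding (0, 1, or 2 copies of one allele, with None for a missing call) and the function names are assumptions made for illustration; the study performed this step in R.

```python
def ibs_distance(g1, g2):
    """Return 1 - identity-by-state for two genotype vectors.

    Genotypes are allele counts (0, 1, 2); None marks a missing call.
    Markers missing in either accession are skipped.
    """
    shared = [(a, b) for a, b in zip(g1, g2)
              if a is not None and b is not None]
    if not shared:
        raise ValueError("no markers genotyped in both accessions")
    # Per-marker IBS is 1 for identical genotypes and 0 for opposite
    # homozygotes; heterozygote comparisons score 0.5.
    ibs = sum(1 - abs(a - b) / 2 for a, b in shared) / len(shared)
    return 1 - ibs


def distance_matrix(genotypes):
    """Pairwise distances for a dict of accession name -> genotypes."""
    names = list(genotypes)
    return {a: {b: ibs_distance(genotypes[a], genotypes[b]) for b in names}
            for a in names}


# Hypothetical accessions scored at four markers.
panel = {
    "acc1": [0, 2, 2, None],
    "acc2": [0, 0, 2, 2],
}
d = distance_matrix(panel)
```

The resulting matrix is symmetric with zeros on the diagonal and could then be passed to a clustering routine.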

Genome-wide association study
To identify markers associated with the eight individual tocol isoforms, tocopherols, tocotrienols, and total tocols, GWAS was conducted by applying a mixed linear model that accounted for population structure and kinship in the rrBLUP package (Endelman, 2011). To account for structure, the first two principal components were used as covariates in the GWAS model. To account for genetic kinship, genetic relatedness was fitted as a random covariate in the mixed linear model. In order to reduce redundancy in declaring MTAs, several filters were included in this analysis. First, a highly stringent cutoff that used the lowest 0.0001 percentile of the distribution of p-values for each trait was tested. Secondly, haplotype blocks were generated from the adjacent markers' linkage disequilibrium (LD) (estimated as r²) in the significant segments, with an LD level of ≥0.2. Only the most significant SNP within a haplotype block was selected. Thirdly, to account for the large number of multiple tests involved in this analysis, a stringent false discovery rate threshold of 0.0005 was used as a cutoff for identifying significant markers. A Manhattan plot was generated for each tocol isoform and the threshold for detecting significant SNP markers was represented by a horizontal dashed red line. The phenotypic variance (R²) explained by each MTA was calculated as described earlier (Sallam et al., 2017). A sliding window of 50 adjacent markers was used to characterize the LD in the mini-core population. These sliding window LD estimates of r² were plotted against the physical position. A locally weighted scatter plot smoother was fitted in JMP Version 11.2 (SAS Institute Inc., Cary, NC) to visualize the LD changes with the physical positions.
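The text gives the false discovery rate threshold but not the procedure behind it. As an illustration only, the sketch below applies the Benjamini-Hochberg step-up procedure, which is one common way to control the false discovery rate; whether the authors used this procedure is an assumption, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, fdr=0.0005):
    """Return the indices of tests declared significant at the given FDR.

    Implements the Benjamini-Hochberg step-up rule: find the largest
    rank k such that p_(k) <= (k / m) * fdr and reject the k smallest
    p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * fdr:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical p-values for five markers.
p = [1e-8, 0.2, 2e-4, 1e-7, 0.5]
significant = benjamini_hochberg(p, fdr=0.0005)
```

Because the rule is step-up, a marker can pass even if a smaller-ranked p-value narrowly misses its own threshold, as long as a larger-ranked one passes.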

Gene annotations
Information about genes associated with the tocol biosynthetic pathway and their Ensembl gene identifiers were obtained from a recently published study (Schuy et al., 2019). The physical positions of the tocol biosynthesis pathway genes and the significantly associated SNP markers in proximity to these genes were determined at bp resolution and displayed on a physical barley chromosome map. On the basis of the LD decay patterns, proximity was defined as within 45 Mb for the mini-core population (Supplemental Figure S2) and within 13 Mb for the WBDC (Sallam et al., 2017). The significant marker loci identified by GWAS were called quantitative Vitamin E loci (QVEs) in line with the description for genetic loci associated with Vitamin E in A. thaliana (Gilliland et al., 2006). In order to identify other genes associated with the tocol pathway, the Ensembl database was used to retrieve all stable gene identifiers within a 45-Mb window surrounding each new QVE identified in the mini-core population and a 12-Mb window for the novel QVEs identified in the WBDC panel. Their corresponding protein sequences and protein domain information were retrieved from the Ensembl database (https://plants.ensembl.org/Hordeum_vulgare/Info/Index, accessed 5 Aug. 2020). On the basis of the protein domains, we manually curated several of the genes in these regions using information from the Interpro (https://www.ebi.ac.uk/interpro/, accessed 5 Aug. 2020), Uniprot (https://www.uniprot.org/, accessed 5 Aug. 2020), Conserved Domain Database (https://www.ncbi.nlm.nih.gov/cdd/, accessed 5 Aug. 2020), and Prosite (https://prosite.expasy.org/, accessed 5 Aug. 2020) databases. Protein sequence alignments were conducted by the TCOFFEE program (http://tcoffee.crg.cat/apps/tcoffee/do:expresso, accessed 5 Aug. 2020). The gene expression profiles of some of the interesting genes associated with significant MTAs were retrieved from the Barlex database (https://apex.ipk-gatersleben.de/apex/f?p=284:10::::::, accessed 7 Aug. 2020).
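The proximity search described above amounts to an interval lookup on physical coordinates. The sketch below is a simplified stand-in for the Ensembl queries: the gene identifiers and coordinates are made up, the function name is hypothetical, and the window is interpreted as a maximum distance on either side of the SNP, which is an assumption because the text does not say whether the window is centered.

```python
def genes_near_snp(snp_chrom, snp_pos, genes, window_bp):
    """List genes on the same chromosome within window_bp of a SNP.

    genes is an iterable of (gene_id, chrom, start, end) tuples in bp.
    Returns (gene_id, distance_bp) pairs sorted by distance; a SNP that
    lies inside a gene has distance 0.
    """
    hits = []
    for gene_id, chrom, start, end in genes:
        if chrom != snp_chrom:
            continue
        if start <= snp_pos <= end:
            distance = 0
        else:
            distance = min(abs(snp_pos - start), abs(snp_pos - end))
        if distance <= window_bp:
            hits.append((gene_id, distance))
    return sorted(hits, key=lambda hit: hit[1])


# Made-up gene models; real identifiers would come from Ensembl Plants.
gene_models = [
    ("geneA", "7H", 1_000_000, 1_002_000),
    ("geneB", "7H", 60_000_000, 60_001_000),
    ("geneC", "6H", 1_000_000, 1_002_000),
]
near_wild = genes_near_snp("7H", 1_500_000, gene_models, 12_000_000)
near_minicore = genes_near_snp("7H", 1_500_000, gene_models, 45_000_000)
```

A wider window returns more candidates per SNP, which is why the slower LD decay in the mini-core panel yields a much longer gene list per locus than in the WBDC panel.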

Phenotypic data
The extent of phenotypic variation in the total tocol levels between the wild barley and the mini-core panels was analyzed. Broad-sense heritability estimates on a line-mean basis for each of the tocol isoforms ranged from 0.93 to 0.98 for the mini-core population. In the WBDC panel, the average heritability ranged from 0.82 to 0.97 (Table 1). The two biological replications of the seed samples from the wild barley panel indicated that the median values between these replicates were very close (∼52-54 μg g⁻¹) (Figure 2a, Supplemental Table S1).
There were a few accessions from the 2016 season that showed levels above the upper quartile of the plot. Interestingly, in the mini-core panel, the median value of total tocols was lower (∼44 μg g⁻¹) than in the WBDC panel (Supplemental Table S2). The upper quartile of the distribution was also significantly lower (70 μg g⁻¹) than that of the WBDC's distribution (85-90 μg g⁻¹) (Figure 2a). The lower quartile of the distribution was similar between the two panels.
Next, we examined if the variation in the tocols in the two panels was attributable to tocotrienols or tocopherols. The median values of the distribution of tocotrienols between the panels as well as the upper and lower quartiles were not significantly different (∼32-34 μg g⁻¹) (Figure 2b). However, in the mini-core panel, the number of accessions that were above the upper extreme (represented by the whiskers in Figure 2) was higher than in the WBDC panel. In contrast, the tocopherol levels in the WBDC panel were nearly twice the amount observed in the mini-core panel (Figure 2c). A significant number of accessions in the WBDC panel had tocopherol levels that exceeded the upper extremes, compared with the mini-core panel.
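The quartiles, medians, and accessions beyond the whiskers discussed here are the standard elements of a box plot. The sketch below computes them with the Python standard library. It assumes the common convention that whiskers extend 1.5 times the interquartile range beyond the box, which the text does not state, and the function name and values are hypothetical.

```python
from statistics import quantiles


def box_plot_summary(values, whisker=1.5):
    """Quartiles, median, and points lying beyond the whisker fences."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    outliers = sorted(v for v in values
                      if v < low_fence or v > high_fence)
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


# Made-up concentrations with one extreme accession.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

Comparing such summaries between panels is what underlies statements about medians, quartiles, and the number of accessions above the upper extreme.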
The distribution of each isoform of tocopherol and tocotrienol was examined (Figure 3). The abundance of the tocopherol and tocotrienol isoforms in the barley seeds was in this order: α > γ > β > δ. The most abundant isoform was AT3 in both panels. The most significant difference between the panels was observed in AT, which had two distinct distribution patterns: the mini-core panel was skewed towards lower concentrations, whereas the WBDC panel exhibited higher concentrations of this isoform. This skewness in the distribution between the WBDC and mini-core panels was much less pronounced for other tocopherols. In contrast to AT, the distribution of AT3 between the WBDC and mini-core panels showed a significant overlap. Compared with the WBDC panel, the mini-core panel showed a slight right skew in the distribution of BT3 and DT3. Similar to the distribution of GT, the distribution of GT3 showed a significant overlap between the two panels.
FIGURE 2 Whisker plots of (a) total tocochromanols, (b) tocotrienols and (c) tocopherols in the mini-core panel (MC16) and the wild barley panel over two different growing seasons (WB06 and WB16). The box represents the upper and lower quartiles of the distribution and the thick line inside the box represents the median value. The dotted lines extending from the box (whiskers) represent the variability outside the upper and lower quartiles. The circles represent the accessions whose values are outside the range of the normal distribution
In order to get a better appreciation for the contribution of each of the tocol isoforms to total tocol levels, a correlation analysis was undertaken for the two panels separately (Figure 4). In this analysis, it was clear that the vast majority of the correlations were positive and only a few minor negative correlations were noted among BT, GT3, and DT in the mini-core panel and between DT3 and DT in the WBDC panel. The contributions of the isoforms to the total tocols in the two panels showed some subtle differences. In the WBDC panel, the relative levels of the tocol isoforms was Table 2).
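A pairwise correlation analysis of this kind can be sketched as below. The text does not name the correlation coefficient, so Pearson's r is an assumption, and the function names and trait values are made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant trait")
    return sxy / sqrt(sxx * syy)


def correlation_matrix(traits):
    """All pairwise correlations for a dict of trait name -> values."""
    names = list(traits)
    return {a: {b: pearson(traits[a], traits[b]) for b in names}
            for a in names}


# Hypothetical measurements for four accessions.
demo = {"AT": [1, 2, 3, 4], "GT": [2, 4, 6, 8], "DT": [4, 3, 2, 1]}
corr = correlation_matrix(demo)
```

Each panel would be analyzed separately, as in the text, by building one such matrix per panel.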

Genotypic analyses and LD of the mini-core panel
In the Infinium 50k arrays, 44,041 SNPs were screened after removing the genotype clusters with poor scores. In this set, 232 SNPs had no calls and were removed from further analysis. After screening for a minor allele frequency of 5%, 32,942 markers were used for the final analyses.
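The two filtering steps, dropping markers without calls and then applying the minor allele frequency threshold, can be sketched as follows. The genotype coding (allele counts 0, 1, 2 with None for missing), the marker names, and the choice to keep markers at exactly 5% are assumptions made for illustration.

```python
def minor_allele_frequency(calls):
    """Minor allele frequency from allele-count genotypes (0, 1, 2).

    None marks a missing call; returns None if no call is available.
    """
    observed = [g for g in calls if g is not None]
    if not observed:
        return None
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1 - freq)


def filter_markers(markers, min_maf=0.05):
    """Keep markers that have calls and reach the MAF threshold."""
    kept = {}
    for name, calls in markers.items():
        maf = minor_allele_frequency(calls)
        if maf is not None and maf >= min_maf:
            kept[name] = calls
    return kept


# Made-up markers: one common, one rare, one with no calls at all.
toy_markers = {
    "snp_common": [0, 0, 2, 2],
    "snp_rare": [0] * 39 + [2],
    "snp_nocall": [None] * 4,
}
passed = filter_markers(toy_markers)
```

In the toy data only the common marker survives: the rare marker has a minor allele frequency of 2.5% and the third has no usable calls.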
The LD was estimated from these SNP markers, together with the number of SNPs and the average adjacent-marker LD for each chromosome. The average LD for the chromosomes ranged from 0.42 (2H) to 0.51 (1H) with an overall average LD of 0.46 (Table 3). With a sliding window of 50 markers, LD as r² decayed to 0.22 at ∼45 Mb and never fell below 0.1 (Supplemental Figure S2).
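Adjacent-marker LD can be illustrated with the squared correlation between genotype vectors of neighboring markers. This is a minimal sketch assuming allele-count coding (0, 1, 2, with None for missing calls); for largely inbred lines the squared genotype correlation is a common stand-in for haplotype-based r², but the exact estimator used in the study is not stated. Function names and data are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors.

    Pairs with a missing call (None) in either marker are skipped.
    Returns None when either marker is monomorphic in the shared data.
    """
    pairs = [(a, b) for a, b in zip(x, y)
             if a is not None and b is not None]
    if not pairs:
        return None
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    if sxx == 0 or syy == 0:
        return None
    return sxy ** 2 / (sxx * syy)


def adjacent_ld(markers):
    """r² for each pair of neighboring markers, in physical order."""
    return [r_squared(markers[i], markers[i + 1])
            for i in range(len(markers) - 1)]


# Three made-up markers scored in four accessions.
ordered = [[0, 0, 2, 2], [0, 0, 2, 2], [0, 2, 0, 2]]
ld = adjacent_ld(ordered)
```

Averaging these values per chromosome, or within a sliding window along the chromosome, gives summaries of the kind reported in the text.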

Population structure and genetic relatedness in the mini-core panel
K-means clustering separated the 160 accessions of the mini-core panel into five subpopulations, as was previously reported for the larger iCore population (Munoz-Amatriain et al., 2013). This was congruent with the grouping revealed by the PCA. The PCA plot showed that Principal Component 1 explained 14.6% of the variability and Principal Component 2 explained 6.6% of the variability (Figure 5a). Furthermore, the genetic similarity based on kinship analyses was consistent with the PCA and revealed an especially close genetic relationship among accessions within Subpopulations 2, 4, and 5 (Figure 5b).
Distinct geographic distributions were identified for the subpopulations on the basis of the K-means, PCA, and genetic kinship analyses. According to the PCA plot, it appears that Subpopulation 1 accessions are equidistant and were mostly from the United States. Accessions in the Subpopulation 2 cluster were mostly from Asian
TABLE 2 Accessions from the Wild Barley Diversity Collection (WBDC) and mini-core panels with highest or lowest amounts of tocochromanol isoforms

Genome-wide association study for tocols in the WBDC and mini-core panels
In order to identify significant MTAs, a liberal criterion of the bottom 0.1 percentile for the distribution of p-values has been suggested (Chan, Rowe, & Kliebenstein, 2010; Pasam et al., 2012; Zhang et al., 2017). In our study, we implemented a more stringent cutoff by using the lowest 0.0001 percentile of the p-values' distribution. In the  Table S3). Thirteen of these markers were associated with more than one trait, thus bringing the number of unique QVEs to 40. Haplotype analysis surrounding these significant markers indicated there were 22 haplotype blocks.
Only the most significant marker within each block was selected for further analysis (Supplemental Table S4). When we used a highly stringent three-tiered approach for identifying MTAs, more than one significant marker could be identified for each of the tocols that also could account for 10% or more of the phenotypic variance (R²) (Table 4).
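Selecting one lead marker per haplotype block can be sketched as follows. The block definition used here, which chains neighboring significant SNPs whenever their adjacent-marker r² is at least 0.2, is a simplification assumed for illustration; the marker names and values are made up.

```python
def lead_snps(snps, adjacent_r2, min_r2=0.2):
    """Pick the most significant SNP in each haplotype block.

    snps: list of (name, p_value) ordered by position on one chromosome.
    adjacent_r2[i]: r² between snps[i] and snps[i + 1].
    Neighboring SNPs stay in the same block while r² >= min_r2.
    """
    if not snps:
        return []
    if len(adjacent_r2) != len(snps) - 1:
        raise ValueError("need one r² value per pair of neighbors")
    blocks = [[snps[0]]]
    for snp, r2 in zip(snps[1:], adjacent_r2):
        if r2 >= min_r2:
            blocks[-1].append(snp)   # extend the current block
        else:
            blocks.append([snp])     # LD broke down: start a new block
    return [min(block, key=lambda s: s[1]) for block in blocks]


# Four hypothetical significant SNPs forming two blocks.
hits = [("m1", 1e-6), ("m2", 1e-8), ("m3", 1e-5), ("m4", 1e-7)]
leads = lead_snps(hits, adjacent_r2=[0.9, 0.1, 0.5])
```

Keeping only the lead marker avoids counting several tightly linked SNPs as separate associations.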
Significant marker associations with the tocol traits were identified on all seven chromosomes (Figure 6). The largest number of significant SNPs (12) appeared on chromosome 7. Chromosome 2 had 11, and chromosome 5 had 10. Chromosomes 3 and 4 each had five significant SNPs; chromosome 6, with four significant SNPs, had the least. Nearly 77% of the variance in the amount of DT in the WBDC panel was accounted for by the seven significant SNP associations identified on six different chromosomes. More than 50% of the variation in the quantity of BT3 and AT was explained by five and six significant SNPs, respectively. Five significant markers each for BT, DT3, and GT3 accounted for 30 to 40% of the phenotypic variation in these traits. The lowest amount of phenotypic variation explained was for the quantity of AT3, for which the significant SNPs on chromosomes 2 and 7 accounted for 17%. The marker S3H_168655778 on chromosome 3 explained 22 and 18% of the phenotypic variation in the quantity of DT and GT, respectively. The marker S7H_639471725 on chromosome 7 was associated with AT3, BT3, and GT3, accounting for 7, 4, and 8%, respectively, of the variation for these traits (Supplemental Table S3).
In the mini-core panel, 34 significant SNP markers were identified that were associated with the tocol traits (Supplemental Table S3). Seven of these markers were associated with more than one trait, thus bringing the number of unique markers to 27. Haplotype analysis surrounding these significant markers indicated there were 15 haplotype blocks. Similar to the case of the wild barley panel, significant MTAs with the tocol traits were identified on all seven chromosomes for the mini-core panel (Figure 6). The highest number of significant SNPs (nine) was identified on chromosomes 2 and 7. Chromosome 3 had six significant SNPs, whereas chromosomes 4, 6, 5, and 1 harbored 4, 3, 2, and 1 SNPs, respectively. Nearly 57% of the phenotypic variation in DT levels in the mini-core panel was explained by markers on chromosomes 2, 3, and 6. Two markers, one at the proximal and the other at the distal end of chromosome 2, accounted for more than 14% each of the variation in DT. Only one marker (JHI-Hv50k-2016-510759 on chromosome 7) was identified as being significantly associated with AT3 and accounted for 23.5% of the variation in this isoform of Vitamin E. This marker was also found to be associated with GT3 and explained 21.5% of the variation in the total tocotrienol levels in the mini-core panel (Supplemental Table S3).

Candidate genes
All the genes associated with the central tocol biosynthesis pathway (VTE1-6) in plants have been identified (Munne-Bosch & Alegre, 2002) (Figure 7). The coordinates of each of these genes were retrieved from the barley database and cross-referenced against the location of the significant SNP markers from this analysis (Table 4). Coordinates for the genes associated with the methylerythritol phosphate (MEP) and shikimate pathways were retrieved from the barley Ensembl database with the enzyme names used as keywords and were cross-referenced to the location of the significant SNPs identified in this study. On the basis of the LD decay pattern for the mini-core population (Supplemental Figure S2), genes that were up to 45 Mb from the SNP were included. On the other hand, the rapid decay of LD in the WBDC panel (Sallam et al., 2017) indicated the inclusion of genes within a distance of 12 Mb from the SNP. All the genes in the primary tocol pathway were in the proximity of at least one SNP from our analysis and, in several instances, the mapping revealed clusters of significant SNPs in close proximity to pathway genes (Figure 8).
Single nucleotide polymorphism (SNP) markers in italics indicate those identified in the mini-core population; regular font represents the markers in the WBDC panel.
There are three main precursors for this pathway: homogentisic acid, phytyl diphosphate, and GGDP. Homogentisic acid, which forms the aromatic head group of all tocols, requires tyrosine that is usually produced via the shikimate pathway (Figure 7). Significant SNPs associated with two key enzymes involved in the tyrosine biosynthetic pathway were identified. Two SNPs in the mini-core population (JHI-Hv50k-2016-328186 and JHI-Hv50k-2016-351248) were 12 and 10.8 Mb from a 3-dehydroquinate synthase at the distal end of chromosome 5.
A SNP marker (S4H_591201688) in the WBDC panel that explained 14% of the variation in AT content was 5.4 Mb from another 3-dehydroquinate synthase gene. A SNP marker (S7H_23774978) on chromosome 7 that accounted for 11% of the variation in DT3 in the WBDC panel was 4 Mb from the key penultimate enzyme, 5-enolpyruvylshikimate-3-phosphate synthase, in the tyrosine pathway. The conversion of tyrosine to homogentisic acid involves two enzymes: tyrosine aminotransferase (TAT) and hydroxyphenylpyruvate dioxygenase (Figure 7). Four SNPs significantly associated with tocols were identified in proximity to three different genes coding for TAT. A SNP in the mini-core panel (JHI-Hv50k-2016-7394) accounting for 13% of the variation in BT3 and a SNP from the WBDC panel (S13H_31689162) accounting for 10% of the variation in BT were 36 and 11.3 Mb, respectively, from a TAT gene on the proximal end of chromosome 1. In the WBDC panel, a SNP (S1H_504493781) that accounted for 15% of the variation in DT3 on the distal end of chromosome 3 was less than 12 Mb from another TAT gene (Table 4, Figure 8). A SNP (JHI-Hv50k-2016-415781) identified in the mini-core panel that explained 13% of the variation in DT and mapped to the bottom of chromosome 6 was 11.5 Mb from the third TAT gene. Only a single gene coding for the hydroxyphenylpyruvate dioxygenase was identified in the barley genome on chromosome 6. A significant SNP for AT was identified in the WBDC panel, which, however, was located ∼30 Mb from this gene. Phytyl diphosphate, the precursor for the hydrophobic tail group for tocopherol biosynthesis, is derived from the chlorophyll breakdown pathway. A SNP marker on chromosome 5 (JHI-Hv50k-2016-328186) that accounted for 10% of the variation in BT was mapped close to a Stay-Green gene. The latter steps in the biosynthesis of phytyl diphosphate include phytol kinase (VTE5) and phytyl phosphate kinase (VTE6) (Figure 8).
Two markers on chromosome 2 identified in the mini-core panel were significantly associated with DT and GT3 and were 9 and 24 Mb, respectively, from the VTE5 gene (Figure 8). Another VTE5 gene was found in proximity to two markers on chromosome 3. A marker from the mini-core population (JHI-Hv50k-2016-205136) accounted for 15% of the variation in AT content and was 11 Mb from VTE5, whereas the marker S3H_624411648 in the WBDC panel that explained 12% of the variation in BT3 was 2.4 Mb from the VTE5 gene (Table 4, Figure 8). A marker on chromosome 3 (JHI-Hv50k-2016-185988) that accounted for 17% of the variation in DT3 was less than 12 Mb from the VTE6 gene. The identification of significant SNPs for tocols on chromosomes 2 and 3 and their proximity to key enzymes for the biosynthesis of tocols has not been reported in other barley QTL studies of Vitamin E.
Geranylgeranyl diphosphate is the precursor for the hydrophobic tail group for tocotrienol biosynthesis and is derived from the plastid-localized MEP pathway. The rate-limiting enzyme of the MEP pathway in plants is 1-deoxy-d-xylulose-5-phosphate synthase (DXP) (Estevez, Cantero, Reindl, Reichler, & Leon, 2001). Two DXP genes in the barley genome were identified that were in proximity to significant SNPs associated with tocols in both the wild barley and mini-core panels. Two adjacent SNPs on chromosome 1 belonging to two different haplotype blocks in wild barley were 1.36 and 2.91 Mb from the DXP gene (Table 4, Figure 8). In the mini-core panel, a SNP on chromosome 2 (JHI-Hv50k-2016-135562) was ∼20 Mb from a DXP gene (Figure 8, Table 4). A SNP on chromosome 3 (JHI-Hv50k-2016-205136) accounting for 15% of the variation in AT was 2.8 Mb from the 2-C-methyl-d-erythritol 4-phosphate cytidylyltransferase, the third enzyme in the MEP pathway (Phillips, Leon, Boronat, & Rodriguez-Concepcion, 2008). A SNP in the mini-core population (JHI-Hv50k-2016-185988) that explained 17% of the variation in DT3 content was ∼18 Mb from a prenyltransferase enzyme that leads to the generation of farnesyl diphosphate. Adjacent to this marker, another SNP in the mini-core population (JHI-Hv50k-2016-183011) that accounted for 19% of the variation in BT was found to be 15 Mb from a farnesyl diphosphate synthase, which is a key enzyme for the generation of GGDP. On the distal end of chromosome 5, a significant SNP (JHI-Hv50k-2016-351248) in the mini-core population that explained 10% of the variation in AT was ∼21 Mb from a geranylgeranyl diphosphate synthase gene. Geranylgeranyl diphosphate can be converted into phytyl diphosphate by the enzyme geranylgeranyl diphosphate reductase (GGDR), thus increasing the pool size for tocopherol biosynthesis.
In the mini-core panel, a significant SNP on chromosome 6 was identified 12 Mb from a gene annotated as GGDR (Table 4, Figure 8). The most significant MTA explaining 23.5% of the variation in the amount of AT3 in the mini-core panel was attributed to a SNP (JHI-Hv50k-2016-510759) on chromosome 7. In the WBDC panel, a marker (S7H_639471725) was identified in the same vicinity and was significantly associated with several tocotrienols (Table 4). These two markers are only 67 kb (in the mini-core panel) and 122 kb (in the WBDC panel) from homogentisate geranylgeranyl transferase (HGGT), the key enzyme for tocotrienol biosynthesis (Yang et al., 2011). A second marker (JHI-Hv50k-2016-507055) in the mini-core population accounted for 19% of the variation in GT and 14% for total tocopherols, and was 153 kb from homogentisate phytyltransferase, a key enzyme in the tocopherol biosynthetic pathway (Collakova & DellaPenna, 2001). Thus, a high-density SNP array in conjunction with a large and diverse panel facilitated the identification of two discrete QVEs that are associated with two key enzymes involved in tocopherol and tocotrienol biosynthesis.
The VTE3 gene that encodes a methyltransferase uses the products of the reactions catalyzed by HGGT and homogentisate phytyltransferase to generate dimethylgeranylgeranyl benzoquinols and dimethyl-phytyl benzoquinols. Three SNPs significantly associated with tocols were found in proximity to the VTE3 genes localized on three different chromosomes. The SNP markers on the proximal end of chromosomes 4 (S4H_6536060) and 5 (S5H_32887010) that were significantly associated with the variation in the amount of AT and DT3 in the WBDC panel were 6 and 12 Mb, respectively, from the VTE3 locus. The SNP (JHI-Hv50k-2016-207829) on the distal end of chromosome 3 in the mini-core panel was ∼40 Mb from a VTE3 (Table 4, Figure 8).
The VTE1 gene that encodes a tocopherol cyclase can use the byproducts of the HGGT and homogentisate phytyltransferase enzymes or the byproducts of VTE3 to produce either δ or γ tocols (Sattler, Cahoon, Coughlan, & DellaPenna, 2003). In barley, the amount of γ tocols is much greater than that of the δ isoforms (Table 2). A preference for the 2,3-dimethyl-geranylgeranyl or phytyl-benzoquinol products generated by the VTE3 enzyme may be one of the possible factors for the higher levels of γ tocols. A SNP (JHI-Hv50k-2016-351248) in the mini-core panel that explained 10% of the variation in AT was 15.6 Mb from a VTE1 gene (Figure 8).
The γ-tocopherol methyltransferase (TMT) enzyme encoded by the VTE4 gene is the last enzyme of the pathway that converts the γ or δ isoforms into α or β isoforms, respectively. In the WBDC panel, two significant SNPs on chromosome 6 associated with AT (S6H_479264279) and DT (S6H_483619550) were 5 and 9.6 Mb, respectively, from a VTE4 gene (Table 4, Figure 8).
We also identified eight SNPs in regions of the genome that were not identified in previous studies. Seven SNPs in the WBDC panel and one in the mini-core panel each accounted for >10% of the variation in the various seed tocol isoforms (Figure 8, Supplemental Table S5). For the SNP from the mini-core population, the 45-Mb region surrounding the SNP retrieved more than 820 proteins. In the 12-Mb region surrounding these seven SNPs in the WBDC panel, more than 700 protein-coding genes were retrieved. A number of proteins with kinase domains, methyltransferase domains, and various types of transcription factors were identified. However, domain-based annotation did not aid in precisely identifying any specific enzymes associated with the tocol pathway.
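The window-based retrieval of genes around a significant SNP can be illustrated with a minimal Python sketch. It is not the procedure used in this study. It assumes that WBDC marker names such as S7H_639471725 encode the chromosome and base-pair position, and that the stated window width (e.g., 12 Mb) is the total span centered on the SNP. The gene names and coordinates below are hypothetical placeholders, not annotations from the barley genome.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str   # hypothetical identifier
    chrom: str  # e.g. "7H"
    start: int  # bp
    end: int    # bp


def parse_marker(marker: str) -> tuple[str, int]:
    """Split 'S7H_639471725' into ('7H', 639471725).

    Assumption: the marker name encodes chromosome and bp position.
    """
    chrom, pos = marker.split("_")
    return chrom.lstrip("S"), int(pos)


def distance_bp(pos: int, gene: Gene) -> int:
    """Distance from a SNP to the nearest gene edge (0 if inside the gene)."""
    if gene.start <= pos <= gene.end:
        return 0
    return min(abs(pos - gene.start), abs(pos - gene.end))


def genes_in_window(marker: str, genes: list[Gene], window_bp: int) -> list[tuple[str, int]]:
    """Return (gene name, distance) pairs within window_bp / 2 of the SNP, nearest first."""
    chrom, pos = parse_marker(marker)
    half = window_bp // 2
    hits = [
        (g.name, distance_bp(pos, g))
        for g in genes
        if g.chrom == chrom and distance_bp(pos, g) <= half
    ]
    return sorted(hits, key=lambda h: h[1])


# Hypothetical annotation records, for illustration only.
genes = [
    Gene("geneA", "7H", 639_593_000, 639_596_000),
    Gene("geneB", "7H", 660_000_000, 660_002_000),
    Gene("geneC", "6H", 639_500_000, 639_501_000),
]
hits = genes_in_window("S7H_639471725", genes, 12_000_000)
print(hits)
```

Sorting by distance places the nearest annotated gene first, which is convenient when several hundred genes fall inside a window, as was the case here.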

Phenotypic variation for tocols
Vitamin E plays an important role in regulating lipid peroxidation, which aids in quelling oxidative damage during seed dormancy. This is supported by genetic mutants and transgenic lines with reduced Vitamin E exhibiting a significant reduction in their seed longevity (Chen, Li, Fang, Shi, & Chen, 2016; Sattler et al., 2004). In general, high germination rates in seeds are of paramount importance for farmers. Particularly for barley, for which a major use of the seeds is in the malting industry, germination represents a pivotal aspect of this process. Furthermore, levels of Vitamin E in processed malt were significantly correlated with their levels in unprocessed seeds (Do, Cozzolino, Muhlhausler, Box, & Able, 2015) and can be useful for supporting the viability of brewing yeasts during the fermentation process, which, in turn, is beneficial for producing higher levels of ethanol (Zhang, Qin, Lu, Wan, & Zhu, 2016). Thus, examining the genetic diversity of barley germplasm to identify accessions with high levels of tocols is important for developing varieties with good malting traits. The fact that the tocochromanol levels in processed malt are similar to the levels in unprocessed seeds (Do et al., 2015) suggests that these metabolites are very stable. This was also supported by our observation that the tocol profiles of seeds from the WBDC panel collected almost a decade apart were significantly correlated (Figure 2). Furthermore, the high heritability values for each of the tocol isoforms observed in both the WBDC and mini-core populations (Table 1) indicate that the phenotypic variation for tocol traits in barley is mostly under genetic control and can be harnessed in breeding programs.
To gain an appreciation for this vast diversity in the amount of seed tocols within these panels, one line with the highest or lowest concentrations was identified for each of the eight tocol isoforms, tocopherols, tocotrienols, and tocols (Table 2). The total tocols in the WBDC panel were more abundant (p < .05) than in the mini-core panel. It can be speculated that the wild barley species that have prolonged dormancy may require higher levels of tocols to keep their seeds viable. On the same grounds, cultivated barley, represented by the mini-core panel, has been selected over time for reduced dormancy levels, which could account for its lower tocol levels. The mean tocotrienol concentrations were almost identical between these two diverse panels. However, interesting differences could be discovered from examination of the individual isoforms (Figure 2, Figure 3, Figure 4, Table 2). The mean total tocopherol level in the WBDC population was more than 40% higher than the mean of the mini-core panel. This pattern was explained by the ∼50% higher levels of the AT isoform in the WBDC panel than in the mini-core panel. The levels of GT3 in the two panels were <2 μg apart, with the mini-core panel being at the higher end. Interestingly, the mean AT3 levels showed a difference of ∼4 μg between the two panels, but higher levels of this metabolite were found in the WBDC panel. It has been reported that in mature barley grains, 20% of the total tocopherol pool and 40% of the total tocotrienol pool are comprised of their respective γ-tocochromanol isoforms (Schuy et al., 2019). In our analysis, we found that 50% of the total tocopherol and total tocotrienol pools in both the WBDC and mini-core panels comprised the AT and AT3 forms, respectively. This pattern of distribution has also been observed in several other studies in barley (Do et al., 2015; Graebner et al., 2015; Panfili, Fratianni, & Irano, 2003).
In light of our study and the abovementioned reports, it can be suggested that the activity of γ-TMT, the final enzyme in the biosynthetic pathway of the α isoform, may not be a limiting factor in barley seeds and that, consequently, α isoforms dominate the Vitamin E profile in barley seeds.
Interestingly, the tocopherol concentrations showed a significant difference between the WBDC and mini-core panels. Though the average GT levels were nearly identical in these two diverse panels, the AT isoform, which is the ultimate product of this pathway, had nearly twice the concentration in the WBDC panel as in the mini-core panel. This suggests that the γ-TMT enzyme, which is important for the conversion from the γ to the α isoform, is more efficient in H. vulgare ssp. spontaneum than in H. vulgare ssp. vulgare. This merits further detailed investigation and could be harnessed for increasing the levels of tocopherols in the seeds of cultivated barley.
The δ isoform was the least abundant of the eight forms of Vitamin E in barley. The average concentration of the DT isoform was ∼30% higher in the WBDC panel than in the mini-core panel, whereas the pattern was reversed for the DT3 isoform, which showed a mean value that was ∼50% higher in the mini-core panel than in the WBDC panel (Table 2). The δ isoforms are the penultimate pathway products and thus could determine the amount of β isoforms. Average BT levels in the WBDC and mini-core panels were 50% higher than DT levels, whereas the levels of BT3 were nearly fourfold higher than those of DT3, which was consistent in both panels. On the basis of these observations, it can be speculated that enhancing the metabolic flux at the initial point in the pathway may not cause any bottlenecks in the conversion of δ to β isoforms in barley seeds.

Population structure in the mini-core panel
The mini-core panel is a subset of the USDA-ARS Barley Core Collection (N = 2417) and captures most of its allelic diversity (Munoz-Amatriain et al., 2013). In the 2013 study, which used 7,842 SNPs for genotyping, five major subpopulations within the core panel were reported. In our study using the 50k SNP genotyping platform, we identified five subpopulations within the mini-core panel supported by PCA. The same population stratification was also observed, based on the genetic kinship analysis of the lines in the mini-core panel (Figure 5). Consistent with these observations, the population structure was associated with the geographic distribution of the accessions. Accessions from Asia (Subpopulation 2) were clearly separated from the accessions from the Americas (Subpopulation 4) and those from Europe and South America (Subpopulation 5). These results are in line with other reports demonstrating a strong relationship between population structure and geography in barley (Russell et al., 2016; Sallam et al., 2017).

Linkage disequilibrium
The LD estimates for the mini-core panel were about fivefold higher than those reported for the WBDC panel (Table 3, Supplemental Figure S2) (Sallam et al., 2017).
In the previous study of the iCore population, great variability in the LD distribution was observed between the subpopulations, as well as significant differences in the distribution of LD across the seven barley chromosomes (Munoz-Amatriain et al., 2013). Other studies have reported that LD extends over a long range in cultivated barley and a very short range in wild barley but is intermediate in landraces (Caldwell, Russell, Langridge, & Powell, 2006; Morrell, Toleno, Lundy, & Clegg, 2005; Sallam, Endelman, Jannink, & Smith, 2015). Hordeum vulgare ssp. spontaneum has much lower LD than H. vulgare ssp. vulgare, and its LD is comparable with that of outcrossing species like maize. This may be caused by an accumulation of outcrossing events, given the long evolution of this species, the higher frequency of chiasmata formation in inbreeding species, and the relatively recent shift in fertilization from an outcrossing to a selfing system (Morrell et al., 2005). Given this markedly different distribution of LD between the two panels used for the GWAS of tocols, the significant MTAs were evaluated with this difference taken into consideration.
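As a point of reference for the LD comparison above, pairwise LD between two biallelic markers is commonly summarized as r², the squared correlation between allele states. The following Python sketch is illustrative only and assumes fully homozygous inbred lines with alleles coded 0 or 1. This section does not state which LD statistic or software was used for the two panels, so the function name and coding scheme are assumptions.

```python
def ld_r2(a: list[int], b: list[int]) -> float:
    """r^2 between two biallelic markers coded 0/1 across homozygous lines.

    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), where D = pAB - pA * pB.
    """
    if len(a) != len(b) or not a:
        raise ValueError("marker vectors must be non-empty and of equal length")
    n = len(a)
    p_a = sum(a) / n                                # frequency of allele 1 at marker A
    p_b = sum(b) / n                                # frequency of allele 1 at marker B
    p_ab = sum(x * y for x, y in zip(a, b)) / n     # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                            # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic marker")
    return d * d / denom


# Toy genotypes for four lines (hypothetical data).
print(ld_r2([0, 0, 1, 1], [0, 0, 1, 1]))  # markers in complete LD
print(ld_r2([0, 0, 1, 1], [0, 1, 0, 1]))  # markers in linkage equilibrium
```

In a panel with long-range LD, r² stays high over larger physical distances, so a significant SNP may tag a causal gene that lies farther away than it would in a panel where LD decays rapidly.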

Candidate genes
Three studies of the QTLs associated with tocols in barley have been reported (Graebner et al., 2015; Oliver et al., 2014; Schuy et al., 2019). All three studies indicated that the distal region of chromosome 7 harbors a major QTL associated with tocols in barley. Consistent with these reports, in the current study, one shared segment was identified in both populations on the basis of haplotype block analysis. This segment was associated with the HGGT gene, which encodes a key enzyme in tocotrienol biosynthesis, and the VTE2 gene, which is vital for tocopherol biosynthesis (Figure 7, Figure 8). The QVE7.4 locus located distally on the long arm of chromosome 7 explains 13% of the variation in DT.
In the vicinity of this QVE, phytoene synthase, the first committed enzyme of the carotenoid pathway that leads to the production of phytoene by combining two molecules of GGDP, was identified (Chamovitz, Misawa, Sandmann, & Hirschberg, 1992). It is interesting to note that the closely located QVE7.2 and QVE7.3 loci, both in proximity to VTE2 and the rate-limiting HGGT enzyme for tocotrienol biosynthesis, are in the same region of the genome. Enzymes that can channel the metabolite fluxes towards the production of tocopherols and carotenoids in close proximity provide a simple and efficient mechanism to respond rapidly to changes in external cues and/or internal developmental programs (Nutzmann, Huang, & Osbourn, 2016), as has been reported in carrot (Daucus carota L. var. sativus) (Koch & Goldman, 2005;Nutzmann et al., 2016). Two QTLs on chromosome 6 associated with tocols that were separated by ∼3 Mb have been reported earlier (Graebner et al., 2015;Oliver et al., 2014). In our GWAS analysis, we speculate that we have identified the same regions on chromosome 6 that are in proximity to the GGDR and TAT genes (Figure 7, Figure 8). These slightly discordant positions may be attributed to the differences observed between the individual genome builds used in each SNP mapping study.
In the current study, a cluster of significant SNPs mapped to the distal region of chromosome 1 in the WBDC panel, and their positions overlapped two QTLs from another study that were mapped adjacent to each other (Figure 8) (Oliver et al., 2014). In the WBDC panel, SNPs in proximity to TAT and DXP were identified, supporting the possibility that this chromosome region harbors a significant source of genetic material to modulate tocols in barley seeds. A significant marker associated with BT was also identified on chromosome 1 in a survey of 1536 spring barley lines from 10 different breeding populations (Graebner et al., 2015). On the basis of their reported coordinates, we speculate it may be associated with the same TAT gene identified in our analysis. Interestingly, a gene with a cellular retinaldehyde-binding domain, a triple functional domain (CRAL-TRIO), and a Golgi dynamics domain was identified in the same region (QVE1.5) (Figure 8). Recently, the first tocopherol-binding protein (TBP) identified in plants was reported in tomato (Solanum lycopersicum L.) (Bermudez et al., 2018); it contained the CRAL-TRIO-Golgi dynamics domain. The CRAL-TRIO domain is also found in mammalian AT-binding protein (Sato et al., 1993; Zimmer et al., 2000). Analysis of the barley protein sequence by ChloroP (http://www.cbs.dtu.dk/services/ChloroP/, accessed 5 Aug. 2020) indicated this protein had a chloroplast transit peptide (73 amino acids; score = 0.504) and showed 89% similarity to the tomato TBP and 74% homology to the human TBP (Supplemental Figure S3). The identification of TBP in plants suggests tocopherols play a role in inter- and/or intraorganellar communication (Munoz & Munne-Bosch, 2019), which warrants further investigation.
Apart from the previously reported QTLs associated with tocols on chromosomes 7, 6, and 1, our analysis provides the first insights into regions on chromosomes 2, 3, 4, and 5 that are associated with the precursors of tocol biosynthesis (Figure 7). The enzyme catalyzing the breakdown of chlorophyll in seed tissues to provide the phytyl moiety is still unknown (Pellaud & Mene-Saffrane, 2017; Zhang et al., 2014). The identification of a SNP marker on chromosome 5 in proximity to the StayGreen gene is noteworthy. Interestingly, the expression of this gene was higher in the inflorescence rachis, lemma, and palea tissues than in senescing leaves (Supplemental Figure S4), suggesting this gene may play a role in providing the phytyl moiety for tocol biosynthesis in seeds, which warrants further investigation. On chromosomes 3 and 5, MTAs in both the WBDC and mini-core panels were found in proximity to 2-c-methyl-d-erythritol 4-phosphate cytidylyltransferase and 3-dehydroquinate synthase, two upstream enzymes involved in the MEP and tyrosine pathways, respectively (Figure 8). The MTAs identified at the distal end of chromosome 4 and the proximal end of chromosome 7 were found to be in close proximity to geranylgeranyl diphosphate synthase and ESPS genes, respectively, which encode two enzymes that are important for the generation of the tocol precursors GGDP and tyrosine. Though these observations do not necessarily mean that these enzymes underlie these SNP-trait associations, consideration of the important pathways for the generation of precursor metabolites helped provide a biochemical context for several of the novel MTAs identified in this study.

DATA AVAILABILITY STATEMENT
All the marker data and phenotypic trait data associated with the tocol isoforms in the mini-core and WBDC panels