Variance component estimations and mega‐environments for sweetpotato breeding in West Africa

Abstract The current study was aimed at identifying mega‐environments in Ghana and evaluating adaptability of superior sweetpotato [Ipomoea batatas (L.) Lam.] genotypes from a targeted breeding effort. Three sets of genotypes were evaluated in multi‐environment trials (MET). Twelve sweetpotato varieties were evaluated across nine environments representing the main agro‐ecological zones in Ghana. MET analysis was conducted using a stage‐wise approach with the genotype × environment (G × E) table of means used as a starting point to model the G × E interaction for sweetpotato yield. Emphasis was given to the genetic correlation matrix used in a second‐order factor analytic model that accommodates heterogeneity of genetic variances across environments. A genotype main effect and G × E interaction of storage root yield explained 82% of the variation in the first principal component, and visualized the genetic variances and discriminating power of each environment and the genetic correlation between the environments. Two mega‐environments, corresponding to northern and southern trial sites, were delineated. Six breeding lines selected from the south and eight breeding lines selected from the north were tested and compared to two common check clones at five locations in Ghana. A Finlay–Wilkinson stability analysis resulted in stable performances within the target mega‐environment from which the genotypes were selected, but predominantly without adaptation to the other region. Our results provide a strong rationale for running separate programs to allow for faster genetic progress in each of these two major West African mega‐environments by selecting for specific and broad adaptation.


INTRODUCTION
Sweetpotato [Ipomoea batatas (L.) Lam.] is cultivated across a wide range of agro-ecological conditions, but storage root yields are low in many countries. Storage root yields can be improved through genetic improvement and replacement of old varieties with new ones, cultural practices such as timely planting, weed control, crop rotation and fertilizer input, and the use of pathogen-tested clean planting material (Grüneberg et al., 2015). As most farmers in Sub-Saharan Africa (SSA) do not have access to pathogen-tested planting material, they rely on "apparently" healthy planting material obtained by negative selection which, in some cases, can be as effective as using pathogen-tested planting material (Abidin, Akansake, Asare, Acheremu, & Carey, 2017). Good agricultural practices have been proven to boost sweetpotato productivity (Fuglie, Zhang, Salazar, & Walker, 1999; Lagnaoui, Alcázar, & Morales, 2000), but may not be accessible for smallholder farmers in SSA. The proportion of the yield gap that can be closed through improved varieties is generally estimated at 50%, but this is likely to be higher in West Africa where sweetpotato breeding is yet to be fully exploited (Andrade et al., 2009). The Food and Agriculture Organization of the United Nations (FAO) reports sweetpotato storage root yields across the world, with the lowest values (1.9 t ha⁻¹ in 2016) in Ghana. However, actual yields are likely to be five times higher, as reported by the Ghanaian Ministry of Food and Agriculture, and similar to the values reported by FAO for its neighboring country, Burkina Faso (11.0 t ha⁻¹; Ministry of Food and Agriculture, 2012).
Improving storage root yield through genetic improvement is a high priority in countries with low yields, although biotic and abiotic stresses can slow down genetic progress. An important biotic constraint in SSA is Sweetpotato Virus Disease (SPVD), especially in high virus-pressure areas. The SPVD is caused by synergistic co-infection by the whitefly-transmitted crinivirus, Sweetpotato Chlorotic Stunt Virus (SPCSV), and the aphid-transmitted potyvirus, Sweetpotato Feathery Mottle Virus (SPFMV; Clark et al., 2012; Gibson & Kreuze, 2015). Progress in breeding for SPVD resistance has been slow, most likely due to the hexaploid nature of sweetpotato and the reported recessive inheritance of resistance to SPCSV and SPFMV (Mwanga et al., 2002b; Mwanga, Yencho, & Moyer, 2002a). Furthermore, different strains of SPCSV are known to predominate in East and West Africa, and the implications of these differences for virus resistance breeding are not yet understood (Clark et al., 2012). Improved sweetpotato genotypes bred outside of Africa are usually unsuitable for SSA because in the majority of cases, they come from countries with lower virus pressure, and to a certain extent, better developed seed systems to manage virus pressure (Gibson & Kreuze, 2015). To be successful in West Africa, sweetpotato genotypes must have a good level of resistance to SPVD in the southern humid tropical agro-ecological zones (AEZs) where vector populations and host reservoirs can remain high year-round. They must also have a pronounced level of resistance to abiotic stresses in the drought-prone northern savannah areas (Mwanga et al., 2002a).
Most released sweetpotato varieties in Ghana have been exotic introductions, coming from the International Institute of Tropical Agriculture's (IITA) sweetpotato breeding program (which was phased out in 1987), East Africa (comprising a number of landraces), and Bangladesh. Sweetpotato introductions can bring about considerable genetic diversity because of the extreme heterozygosity of the crop (Grüneberg et al., 2015), but in countries with intensive production and established breeding programs, a large genetic diversity is already available for breeding (David et al., 2018). Expression of attributes and traits across environments can vary differentially by genotype, a phenomenon known as genotype × environment (G × E) interaction in multi-environmental trial (MET) series. For any breeding program to be successful, a thorough understanding of G × E patterns in the target population of environments is essential. In cases where the variance component estimates due to G × E ($\sigma^2_{G \times E}$) in METs are larger than the variance component estimates due to genotypes ($\sigma^2_G$), an analysis of G × E interactions should be undertaken (Fox, Crossa, & Romagosa, 1997). For sweetpotato storage root yield, the main commercial trait in this crop, large G × E interactions have been reported across environments in Kenya and Uganda (Grüneberg, Abidin, Ndolo, Pereira, & Hermann, 2004; Tumwegamire et al., 2016), over diverse environments in Peru (Grüneberg, Manrique, Zhang, & Hermann, 2005), in South Africa (Adebola, Shegro, Laurie, Zulu, & Pillay, 2013), and in Ethiopia (Gurmu, Hussein, & Laing, 2017). These studies highlight the importance of breeding regionally adapted material and testing new genotypes under conditions similar to the targeted population of environments. The target environment should be well-defined as a portion of the growing region with a fairly homogeneous environment that causes genotypes to perform similarly (Gauch & Zobel, 1997).
The technique of subdividing growing regions into mega-environments was first introduced by researchers at the International Maize and Wheat Improvement Center (CIMMYT), working with wheat (Triticum aestivum; Rajaram, 1994). The strategy of defining mega-environments allows a breeder to take specifically adapted genotypes into account and to identify locations suitable for field tests and selection. It is hypothesized that AEZs in southern and northern Ghana are contrasting, and exhibit a pronounced G × E interaction ($\sigma^2_{G \times E} > \sigma^2_G$) for storage root yield. It is also hypothesized that northern and southern environments in Ghana are representative of similar environments across the sub-region, which is broadly characterized by longer rainy seasons in the southern zones and longer dry seasons in the northern savanna zones. Information on sweetpotato G × E in Ghana is limited, and a better understanding of it is needed to make more informed choices in sweetpotato breeding for both Ghana and West Africa.

Crop Science
The International Potato Center's (CIP) sweetpotato breeding efforts in SSA are organized at a sub-regional level through support platforms to provide a foundation for prebreeding and long term population improvement and development of user-preferred varieties (Andrade et al., 2009). The Sweetpotato Support Platform for West Africa (SSP-WA) works closely with the Ghana National Agricultural Research and Extension Systems (NARES) to support breeding efforts in Ghana and across the region. This study had the following objectives: (i) to determine the magnitude of G × E interaction for storage root yield in multi-environmental trials (METs) across AEZs in Ghana, (ii) to define mega-environments for sweetpotato breeding in Ghana, and (iii) to identify superior breeding lines for each mega-environment.

MATERIALS AND METHODS
Three different datasets were used to address the three objectives. A total of 26 genotypes varying in origin and root flesh color (Table 1) were evaluated. The first dataset consisted of 10 germplasm introductions and landraces, most either released or currently important varieties in Ghana. The second dataset represented six advanced clones from the sweetpotato breeding program at SSP-WA selected in the South, while the third represented eight advanced clones selected in the North. Two check clones (CRI-Apomuden and CRI-Ligri) were common to all trials. Each dataset was evaluated at southern and northern locations in Ghana (Table 2; Figure 1). According to the FAO classification of soil groups in Africa, the southern locations are Acrisols whereas the northern locations are Lixisols (Grüneberg et al., 2019). All trials were planted in the main rainy season, and harvest was conducted 4-6 mo after planting (Table 2). The annual average rainfall ranges from 950 to 1200 mm. Good agricultural practices were used, with appropriate N/P/K fertilization rates (40:40:70 kg ha⁻¹) and regular weeding, and sweetpotato was grown in rotation with other crops.
Trials were planted using a randomized complete block design (RCBD) with two replicates, and plots consisting of four rows, each 5 m long, with a planting distance of 0.3 m between plants and 1 m between rows. The central 10 m² of every plot was harvested roughly 4 mo after planting. After harvesting all the available plants, all storage roots in a plot were counted to calculate number of roots per plant. Roots were separated into noncommercial (<100 g) and commercial (≥100 g) sizes, and yield was recorded for both fractions. Total storage root yield was expressed as tons per ha. Total biomass (fresh roots + vines) was weighed per plot and expressed as tons per ha. Harvest index (total root yield divided by biomass yield) was reported as a percentage. The SPVD was scored 8 wk after planting on a scale from 1 to 9, with 1 indicating no symptoms and 9 indicating severe symptoms. Phenotyping traits are fully described in Grüneberg et al. (2019).
Statistical analysis was based on linear mixed models and was performed using R (version 3.3.1) and ASReml. Using Dataset 1, variance components for each of the nine environments were estimated by fitting a simple mixed model to the single RCBD trial data, considering the genotype main effects as random. Broad-sense heritabilities ($H^2$) were estimated as the ratio of the estimated genotypic variance to the sum of the estimated genotypic variance and the estimated error variance divided by the number of replicates, that is, $H^2 = \sigma^2_G / (\sigma^2_G + \sigma^2_\varepsilon / r)$, with r the number of replicates.
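The heritability calculation described above can be sketched as follows. This is a minimal illustration, assuming the error variance (and not the whole sum) is divided by the number of replicates; the function name and the variance values are hypothetical and are not taken from Table 3.

```python
def broad_sense_heritability(var_genotype: float, var_error: float, n_reps: int) -> float:
    """Entry-mean heritability for one RCBD trial: var_G / (var_G + var_error / r).

    Assumption: only the error variance is divided by the number of replicates.
    """
    if n_reps < 1:
        raise ValueError("n_reps must be at least 1")
    denominator = var_genotype + var_error / n_reps
    if denominator <= 0:
        raise ValueError("variance components must give a positive denominator")
    return var_genotype / denominator


# Hypothetical variance components for a single trial with two replicates.
h2 = broad_sense_heritability(var_genotype=6.0, var_error=4.0, n_reps=2)
print(round(h2, 2))  # 6 / (6 + 4/2) = 0.75
```

With two replicates, as in these trials, halving the error variance in the denominator means that a trial with equal genotypic and error variance would reach a heritability of about .67.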
FIGURE 1 Map of Ghana with indication of trial sites and agro-ecological zones (modified from FAO, 2005)
For the MET analysis, a stage-wise approach was used as elaborated in Piepho, Möhring, Schulz-Streeck, and Ogutu (2012) and van Eeuwijk, Bustos-Korts, and Malosetti (2016). In the first stage, the same nine models as above were fitted but with the genotype effect considered fixed, to obtain the best linear unbiased estimators (BLUEs) of the genotypes for each individual environment. This leads to a G × E table of BLUE means, which is used as a starting point to model the G × E interaction in the second stage, as described in Malosetti, Ribaut, and van Eeuwijk (2013). To predict the performance of a genotype in a given environment, Piepho (1994) and Piepho, Möhring, Melchinger, and Büchse (2008) show that the best linear unbiased predictor (BLUP) often outperforms other procedures in terms of predictive accuracy. To this end, a mixed model considering genotype as a random effect was fitted in the second stage by residual maximum likelihood (REML). In the presence of G × E, the more realistic models often allow for heterogeneity of genetic variances and covariances across environments (Malosetti et al., 2013). The best fitting model was chosen by Akaike's information criterion (AIC) and took the following form: $\mu_{ij} = \mu + E_j + G_{ij} + \varepsilon_{ij}$, where $\mu$ is a fixed intercept, $E_j$ is a fixed environment effect, and $G_i = (G_{i1}, \ldots, G_{i9})^\top \sim N(0, \Sigma)$ and $\varepsilon_{ij} \sim N(0, \sigma^2_\varepsilon)$ are normally distributed and independent random effects. The variance-covariance matrix $\Sigma$ of the random genetic effect $G_i$ is parametrized using the second-order factor analytic model that accommodates heterogeneity of genetic variances and genetic covariances across environments in a parsimonious manner (Piepho, 1998). The correlation matrix corresponding to this estimated 9 by 9 variance-covariance matrix $\Sigma$ for storage root yield is given in Table 4.
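As a rough illustration of the factor analytic structure described above, the sketch below builds a genetic variance-covariance matrix of the form Σ = ΛΛᵀ + Ψ with two factors and converts it into a genetic correlation matrix of the kind reported in Table 4. The loadings and specific variances are hypothetical values for four environments, not estimates from this study, and the function names are illustrative only.

```python
import numpy as np


def fa_covariance(loadings: np.ndarray, specific_var: np.ndarray) -> np.ndarray:
    """Factor analytic covariance: Sigma = Lambda Lambda^T + diag(psi)."""
    return loadings @ loadings.T + np.diag(specific_var)


def cov_to_corr(sigma: np.ndarray) -> np.ndarray:
    """Scale a covariance matrix to a correlation matrix."""
    sd = np.sqrt(np.diag(sigma))
    return sigma / np.outer(sd, sd)


def n_fa_parameters(n_env: int, n_factors: int) -> int:
    """Free variance parameters of a factor analytic model with the usual rotational constraint."""
    return n_env * n_factors + n_env - n_factors * (n_factors - 1) // 2


# Hypothetical loadings (environments x factors) and specific variances.
loadings = np.array([[2.0, 0.5], [1.8, 0.4], [0.3, 1.5], [0.2, 1.2]])
specific_var = np.array([0.5, 0.4, 0.6, 0.3])

sigma = fa_covariance(loadings, specific_var)
corr = cov_to_corr(sigma)
print(np.round(corr, 2))

# Parameter counts for nine environments, as in the first dataset.
print(n_fa_parameters(9, 2), 9 * (9 + 1) // 2)  # 26 versus 45 for an unstructured matrix
```

The parameter count shows why the structure is considered parsimonious: each environment keeps its own genetic variance, while the covariances are driven by only two latent factors.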
Since we are interested in determining mega-environments in Ghana, we made a graphical representation of the genotype main effects and the G × E effects (GGE), considering the following fixed GGE model in the second stage: $\mu_{ij} = \mu + E_j + b_{i1} z_{j1} + b_{i2} z_{j2} + \varepsilon_{ij}$. The model describes the response variable, that is, the BLUE of genotype i in environment j, $\mu_{ij}$, as the result of the common fixed intercept term $\mu$, a fixed environmental main effect corresponding to environment j, $E_j$, plus two multiplicative terms, $b_{i1} z_{j1}$ and $b_{i2} z_{j2}$, approximating (G + G × E), and finally the random term, $\varepsilon_{ij}$, representing the error term. Because we are working on a two-way table of BLUEs, we cannot straightforwardly separate G × E from error. More details can be found in Yang, Crossa, Cornelius, and Burgueño (2009) and Malosetti et al. (2013). The results of the GGE analysis for storage root yield are presented in a biplot graph capturing genotype and environment variation and covariation of the trials.
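One common way to obtain the two multiplicative terms of a GGE model is a singular value decomposition of the environment-centered table of means. The sketch below follows that route under stated assumptions: the table of means is hypothetical, the symmetric scaling of the scores is an arbitrary choice, and the function name is illustrative; it is not the software routine used in this study.

```python
import numpy as np


def gge_scores(means: np.ndarray, n_terms: int = 2):
    """Genotype and environment scores from a genotypes x environments table.

    Column-centering removes mu + E_j, so the remainder holds G + G x E.
    """
    centered = means - means.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    root_s = np.sqrt(s[:n_terms])          # symmetric scaling (assumption)
    geno_scores = u[:, :n_terms] * root_s  # b_i1, b_i2
    env_scores = vt[:n_terms].T * root_s   # z_j1, z_j2
    explained = s**2 / np.sum(s**2)        # share of G + G x E per axis
    return geno_scores, env_scores, explained[:n_terms]


# Hypothetical table of BLUEs: 4 genotypes (rows) x 3 environments (columns).
table = np.array([
    [12.0, 10.0, 4.0],
    [9.0, 11.0, 6.0],
    [5.0, 6.0, 8.0],
    [7.0, 5.0, 9.0],
])

geno_scores, env_scores, explained = gge_scores(table)
print(np.round(explained, 3))
```

Plotting the rows of `geno_scores` and `env_scores` on the same axes gives the biplot; environments whose score vectors form a small angle rank the genotypes similarly.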
A similar linear first-stage model was applied to each environment in the second and third datasets, considering genotype as a fixed effect and resulting in a G × E table of the estimated means. These means were then used to fit the Finlay-Wilkinson model (Finlay & Wilkinson, 1963) with a single regression line on the environmental quality in the model $\mu_{ij} = \mu + G_i + E_j + b_i E_j + \varepsilon_{ij}$. The slope $b_i$ captures the environmental sensitivity or adaptability (Malosetti et al., 2013).
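The regression on environmental quality can be sketched with ordinary least squares on a complete table of means. In this minimal example, the environmental quality is taken as the environment mean minus the grand mean, which is an assumption of the sketch; the table is simulated from known, hypothetical values so that the recovered slopes can be checked.

```python
import numpy as np


def finlay_wilkinson(means: np.ndarray):
    """Fit mu_ij = mu + G_i + E_j + b_i * E_j on a genotypes x environments table.

    E_j is taken as the environment mean minus the grand mean (assumption).
    Returns genotype effects G_i, environment index E_j and slopes b_i.
    """
    grand_mean = means.mean()
    geno_effect = means.mean(axis=1) - grand_mean
    env_index = means.mean(axis=0) - grand_mean
    # Interaction residual after removing both main effects.
    interaction = means - grand_mean - geno_effect[:, None] - env_index[None, :]
    # Per-genotype regression of the interaction on the environment index.
    slopes = interaction @ env_index / np.sum(env_index**2)
    return geno_effect, env_index, slopes


# Hypothetical, noise-free table built from known effects.
mu = 10.0
g = np.array([1.0, -1.0, 0.0])
e = np.array([-2.0, 0.0, 2.0])
b_true = np.array([0.5, -0.5, 0.0])
table = mu + g[:, None] + (1.0 + b_true)[:, None] * e[None, :]

geno_effect, env_index, slopes = finlay_wilkinson(table)
print(slopes)
```

In this parametrization a slope near zero indicates average sensitivity, a positive slope indicates a genotype that responds more than average to better environments, and a negative slope indicates a less responsive genotype.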

First dataset
The nine environments of the first dataset expressed a large variability. Therefore, the estimated means were calculated for each environment separately following a two-stage analysis strategy and these estimated means served as the basis for the G × E study. Estimated means for storage root yield ranged from 3 to 15.5 t ha⁻¹ across environments, when averaged over the genotypes in the first dataset (Table 3). The $\sigma^2_G$ variance component differed widely among locations for all traits. The magnitude of the variance component $\sigma^2_G$ resulted in high heritabilities ($H^2$), which were greater than .75 for most traits and locations. A graphical presentation of storage root yield BLUPs shows a clear crossover between genotype estimates across environments (Figure 2). As expected, low-yielding locations expressed a lower $\sigma^2_G$ component than higher-yielding locations, as visualized in Figure 2 for storage root yield.
TABLE 3 Means, coefficient of variation, variance components and heritability for observed traits at nine locations (obtained from single trial analysis with random genotype effect)
The genetic correlations between the nine environments, given in Table 4, are also reflected in the GGE plot (Figure 3; Table 5). Northern locations (Botanga, Nyankpala, and Tono) were highly correlated, as indicated by the small angle between their environmental vectors. The southern locations grouped according to year. The GGE plot visualizes the genotypic adaptation: the genotypes on the corners of the polygon are the most adapted to the environments whose vectors point in that direction. The length of the environmental vectors gives an indication of the genetic variances in each trial, with Botanga, Nyankpala, and Ohawu having the smallest genetic variance, as also shown in Figure 2. This means that these locations had the least discriminating power.

Second and third dataset
Trait estimates of advanced southern-selected breeding clones were compared with the estimates of advanced breeding clones from successive northern breeding trials in Table 6. Storage root yield was consistently higher for the northern-selected advanced clones compared to the southern-selected advanced clones, evaluated across the country. Other traits were comparable between the two groups of genotypes. Check clones CRI-Apomuden and CRI-Ligri were common to both datasets, but their performance varied between years.
The Finlay-Wilkinson stability model used the relative root yield of each genotype compared to the mean of the two check clones. The subdivision of G × E sum of squares (Table 7) showed that the regression explained about three-fourths of the total G × E interactions for storage root yield. The performance of the advanced clones is expected to be higher than the mean of the check clones in the region of selection. Indeed, in Figure 4a, most genotypes performed better than the mean of the two checks when they were evaluated in the southern locations (Fumesua and Ohawu). This result is in line with the expectation because these southern clones have survived earlier selection steps performed in the South, mainly based on superior storage root yield, high number of roots per plant, and low incidence of SPVD. Evaluating the southern clones in the northern locations (Bawku, Nyankpala, and Wa) revealed only one genotype better than the mean of the checks. The genotype PGA14011-13 can therefore be seen as a broadly adapted genotype, associated with a small slope. All other southern clones fell below the reference line, meaning that their performance was below the mean of the two check clones. Evaluation of northern clones (Figure 4b) showed a good performance in the northern locations, except for PGN16130-4, which had been selected for its particular purple flesh color. Four northern clones had a positive performance (better than the mean of the two check clones) in both southern locations.
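The subdivision of the G × E sum of squares into a part explained by the regression on environmental quality and a remaining deviation part can be sketched as below. This is an illustration on a hypothetical, unreplicated table of means, not a reconstruction of Table 7; in such a table the deviation part also contains error, and the function name is illustrative.

```python
import numpy as np


def partition_gxe(means: np.ndarray):
    """Split the G x E sum of squares into regression and deviation parts."""
    grand = means.mean()
    g = means.mean(axis=1, keepdims=True) - grand  # genotype main effects
    e = means.mean(axis=0, keepdims=True) - grand  # environment main effects
    interaction = means - grand - g - e
    ss_gxe = float(np.sum(interaction**2))
    # Per-genotype slope of the interaction on the environment index.
    slopes = (interaction @ e.ravel()) / np.sum(e**2)
    ss_reg = float(np.sum(slopes**2) * np.sum(e**2))
    residual = interaction - slopes[:, None] * e
    ss_dev = float(np.sum(residual**2))
    return ss_gxe, ss_reg, ss_dev


# Hypothetical table of means: 3 genotypes (rows) x 4 environments (columns).
table = np.array([
    [4.0, 9.0, 14.0, 6.0],
    [5.0, 8.0, 10.0, 7.0],
    [3.0, 10.0, 17.0, 5.0],
])

ss_gxe, ss_reg, ss_dev = partition_gxe(table)
fraction = ss_reg / ss_gxe
print(round(fraction, 3))
```

A fraction close to one indicates that most of the interaction is captured by differences in sensitivity between genotypes, which is the situation in which a Finlay-Wilkinson plot is most informative.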

DISCUSSION
Total storage root yield across all three datasets ranged from 3 to 22 t ha⁻¹. The high yields corresponded with the yields found by Grüneberg et al. (2004)  consistent across the three datasets. Storage root yield was reported in this study as the sum of commercial and noncommercial root weight, but the need to measure both commercial and noncommercial root weight is debatable (Grüneberg et al., 2015). Harvest index can be a predictor of yield stability as evidenced in wheat and maize (Zea mays; Bolaños & Edmeades, 1993). Intuitively, genotypes with a high harvest index are preferred, although farmers growing sweetpotato in low-yielding environments do not desire a very high harvest index because aboveground biomass is needed as planting material, or in some cases for use as vegetable or fodder.
Screening genotypes in multi-environmental trials requires an understanding of the G × E effects and yield stability across the environments. A two-stage mixed model analysis handled the variation within trials in a first step, during the calculation of the G × E prediction means. In the second step, the prediction means from all trials (each with a specific genotypic variation) were combined to fit the GGE model. The GGE model is preferred over an additive main effects and multiplicative interaction (AMMI) model because it does not partition the genotype main effect G and the G × E effect, GE (Yan, Kang, Ma, Woods, & Cornelius, 2007). Yan et al. (2007) point out that the effect G is always specific to the environment in which it is estimated. Moreover, GE becomes G when environments are subdivided into mega-environments. Sweetpotato root yields are highly affected by environment, as also indicated by previous G × E studies (Adebola et al., 2013; Grüneberg et al., 2004, 2005; Gurmu et al., 2017; Tumwegamire et al., 2016). In agreement with a study from Papua New Guinea (Wera et al., 2018), the calculated mega-environments corresponded to the major AEZs in which the trials were conducted. Furthermore, broad and specific adaptation of breeding clones from a targeted breeding effort in each mega-environment was confirmed after field evaluation in both the targeted and nontargeted breeding environments.
The division between South and North Ghana makes sense for agricultural reasons. Sweetpotato production in the southern parts of Ghana suffers from a high virus pressure compared to the northern parts of Ghana, due to the longer rainy season in the South which supports virus vector and host populations, while the long dry season in the North tends to reduce virus pressure. Differences in sweetpotato performance between the North and South may also be explained by differences in soil classification of the two mega-environments. The mega-environments seem to coincide with the tropical forest environments in the South, and savannahs in the North which extend across West Africa. However, more studies will help to determine the representativeness of our mega-environments to the rest of the West African region.
Subdividing a growing region implies more work for the plant breeder and seed producers, but it also implies faster progress and higher yields (Gauch & Zobel, 1997 in the northern environments, with lower virus pressure, in which case, we have demonstrated that more progress can be made by conducting separate breeding programs for low and high virus-pressure environments, coinciding with the two mega-environments. As sweetpotato in the North was generally lower yielding, we expect that the farmers in such environments would be the main beneficiaries from a separate breeding program, which will allow us to select strongly for yield and preferred quality attributes, without the constraint presented by the need to select for high levels of virus resistance. This study uses one set of check clones for both the South and North, but the need for specific check clones for each mega-environment may arise. The commercial genotype CRI-Apomuden is a highly demanded released variety in Ghana because of its high content of β-carotene (correlated with the orange flesh color) and high yield. The sweetpotato genotype CRI-Ligri, also known as Cemsa-74-228, originated from Cuba and has been officially released in Ghana. CRI-Ligri is also used as a globally adapted check genotype in CIP's sweetpotato breeding program, with sites in Ghana, Mozambique, Uganda, and Peru. CRI-Ligri had a high total root yield in East Africa (33 t ha⁻¹ reported by Grüneberg et al., 2004) and Abidin et al. (2017) reported a total root yield of 15.4 t ha⁻¹ of CRI-Ligri grown in northern regions in Ghana, which is in line with our results.
FIGURE 4 Finlay-Wilkinson stability plot for storage root yield (t ha⁻¹) of (a) six advanced southern clones and (b) eight advanced northern clones evaluated in southern sites (Fumesua and Ohawu) and in northern sites (Nyankpala, Bawku, and Wa), relative to mean of the two check clones, CRI-Apomuden and CRI-Ligri
Despite the differences in soil type and soil fertility, each trial in both North and South Ghana received the same amount of fertilizers to keep the differentiation between environments.
Differences among genotypes were larger in high-yielding environments than in low-yielding environments, as observed in this study and confirmed by Grüneberg et al. (2005), which makes selection preferable in high-yielding environments. Indeed, the high-yielding environments are the discriminating environments in the GGE biplot. On the other hand, a challenging environment (e.g., high virus pressure) creates high variation in performance, which helps in selecting the materials that stand out. Both favorable and less-favorable environments should be used in the early yield testing stages of a sweetpotato breeding program to select for genotypes with wide adaptation. Breeding for high-yielding genotypes with wide adaptation is possible, but there is no guarantee that these genotypes will outperform the genotypes with specific adaptation to marginal environments when grown in these marginal environments (Grüneberg et al., 2005). Therefore, check clones in a breeding program should include widely adapted genotypes (with low contribution to G × E) and specifically adapted genotypes that perform well in the targeted environment.
Until 2016, sweetpotato breeding in Ghana was largely conducted in the South, with national releases made after limited testing in the North, resulting in some releases that were not adapted to the northern savannah environments at all. Our study confirmed that targeted breeding efforts in northern Ghana resulted in specifically adapted genotypes for the northern conditions. On the other hand, selection in the South is needed to find specifically adapted genotypes for the southern conditions. Both selection procedures have led to broadly adapted genotypes for all locations in Ghana.
In conclusion, improving sweetpotato root yield through breeding in Ghana can be achieved by selecting for broadly or specifically adapted genotypes. We have characterized the trial sites used for breeding purposes in Ghana and found a clear difference between northern and southern locations. Therefore, our results support the breeding strategy of selecting superior genotypes in both regions independently, while routinely evaluating for broad adaptation across northern and southern sites as part of the screening process. This will enable faster genetic progress in these two major West African mega-environments.