Multivariate Bayesian Analysis of On-Farm Trials with Multiple-Trait and Multiple-Environment Data

Agronomy Journa l • Volume 111, I s sue 6 • 2019 Outcomes or crop yields of on-farm trials are affected by environmental factors such as climate, weather, soil, and topography, as well as by organisms that live in the system where the trials are conducted. To farm successfully in a complex and constantly changing environment, it is important for farmers to be open to new ideas and for management practices to be more efficient. Until recently, farmers often conducted experiments to test new techniques that would improve their production systems. However, farmers understand that to draw conclusions with lower probability of error, it is crucial to make correct use of statistical science. Statistical science helps scientists and farmers determine whether the differences they observe are due to the different treatments (practices) or merely the result of chance alone, within certain levels of probability. For example, when farmers test new practices on their farms, they need to know whether the observed effects on crop yield or quality, soil organic matter, or water infiltration are the result of natural variation that occurs within the field or of the tested treatments. In general, the on-farm research process uses clearly defined statistical methods to identify and isolate the effects of natural variation so that the true effects of changing practices are more clearly detectable. For this reason, it is very important for breeding scientists and farmers to be well trained in statistical methods (or have good collaboration with statisticians) to be able to use the appropriate statistical techniques. Farmers often need to know the genetic properties of some target traits to select the best crop genetics. Since some traits are associated with each other, the selection of a trait may affect other traits, either negatively or positively (Cruz et al., 2012). Thus, it is important to determine the correlations between traits of interest, which may arise from gene links or pleiotropy. Multivariate Bayesian Analysis of On-Farm Trials with Multiple-Trait and Multiple-Environment Data


O
utcomes or crop yields of on-farm trials are affected by environmental factors such as climate, weather, soil, and topography, as well as by organisms that live in the system where the trials are conducted.To farm successfully in a complex and constantly changing environment, it is important for farmers to be open to new ideas and for management practices to be more efficient.
Until recently, farmers often conducted experiments to test new techniques that would improve their production systems.However, farmers understand that to draw conclusions with lower probability of error, it is crucial to make correct use of statistical science.Statistical science helps scientists and farmers determine whether the differences they observe are due to the different treatments (practices) or merely the result of chance alone, within certain levels of probability.For example, when farmers test new practices on their farms, they need to know whether the observed effects on crop yield or quality, soil organic matter, or water infiltration are the result of natural variation that occurs within the field or of the tested treatments.In general, the on-farm research process uses clearly defined statistical methods to identify and isolate the effects of natural variation so that the true effects of changing practices are more clearly detectable.For this reason, it is very important for breeding scientists and farmers to be well trained in statistical methods (or have good collaboration with statisticians) to be able to use the appropriate statistical techniques.
Farmers often need to know the genetic properties of some target traits to select the best crop genetics.Since some traits are associated with each other, the selection of a trait may affect other traits, either negatively or positively (Cruz et al., 2012).Thus, it is important to determine the correlations between traits of interest, which may arise from gene links or pleiotropy.
Direct and indirect selection for a trait can be implemented to obtain genetic gain, since genetic correlations between traits indicate the direction and magnitude of correlated responses to selection.However, univariate selection criteria may not generate satisfactory genetic gains for all traits assessed when the traits are correlated.This is because the traits are evaluated simultaneously and brought together in a single genotype, which makes selection more difficult, since it may be influenced by the correlations between traits, whose direction and magnitude depend on the association between them (Berilli et al., 2013).For this reason, it is also very important to perform indirect selection and calculate optimal multiple trait selection indices (Falconer and Mackay, 1996).Therefore, multivariate analysis is a useful approach to select genotypes holding a set of desirable traits and obtain favorable genetic gains.
Many plant breeding scientists, when conducting on-farm studies, measure and collect multiple response variables to increase their understanding of the complex nature of particular phenomena under field conditions.However, most of the analyses reported by breeding scientists and farmers are univariate (only one response at a time), which does not take advantage of correlated traits to increase the precision of parameter estimates or prediction accuracy.Two decades ago, using the univariate analysis paradigm was justified by the complexity of multivariate analyses and the lack of user-friendly software.Nowadays, there is much more user-friendly software for multivariate analysis (some of it is open access); however, the univariate analysis paradigm continues to dominate among breeders and farmers.For this reason, we believe that the reluctance to adopt the multivariate paradigm for analyzing on-farm and breeding data is mainly due to lack of appropriate training in multivariate statistical techniques.
Multivariate analysis refers to a broad category of methods when multiple response variables are measured on a set of experimental units or sampling objects.In statistical analysis, all random variables measured on each experimental or sampling unit are analyzed simultaneously.Multivariate methods have been described in crop, soil, and agronomic literature (Yeater et al., 2014).The use of multivariate methods is relatively limited, despite evidence that multivariate analyses are powerful tools for selecting cultivars in plant breeding programs and reducing the probability of wrong decisions when decisions are made based on the output of experiments conducted by breeders or farmers.Also, many studies that compare the use of multivariate mixed model analysis and traditional univariate analysis for selecting and identifying superior maize (Zea mays L.) lines have shown that the multivariate approach is better because it identifies the existing (co)variation between the response variables (Hair et al., 2007).
Multivariate analysis also improves accuracy when classifying and identifying superior genetic components (Coimbra et al., 2007;Schmit et al., 2016) and helps scientists understand the type of allele interaction involved in heredity and the relationships between specific traits (Bertan et al., 2009;Castro et al., 2013;Huang et al., 2015).
In this paper, we propose a multivariate Bayesian analysis to estimate random effects for genotype × environment and genotype × environment × trait combinations and to estimate genotypic and phenotypic correlations among traits.The data came from CIMMYT's Global Wheat Program and were collected in farmers' fields in the Yaqui Valley during three cropping seasons (2012-2013, 2013-2014, and 2015-2016) (Vargas Hernández et al., 2018).Four farmers participated in the study; each planted own trials in different parts of the field using own agronomic practices including irrigation.The five traits (grain characteristics) analyzed in this study were thousand-grain weight (W1000), grain moisture (MOIST), grain yield (GY), "black point" disease scores (PPB), and grain quality scores of yellow berry (PYB).
Note that in the paper of Vargas Hernández et al. (2018), the authors studied the interaction between wheat lines with environments in on-farm trials of bread and durum wheat in the Yaqui Valley region of southern Sonora, México, using a linear mixed model with the Factor Analytic (FA) model for a univariate trait, grain yield.This study used the same data sets of Vargas Hernández et al. (2018) where an environment was defined as the combination of farmer (and place within a farmer)-irrigation (full and reduced irrigations)-cropping season (yr).It should be noted that the same farmer could have planted trials in different parts of his/her farm, which were therefore considered different environments.

Phenotypic data
Bread and durum wheat lines were evaluated in on-farm trials for grain yield and other grain characteristics during three cropping seasons (2012-2013, 2013-2014, and 2015-2016) in the Yaqui Valley of southern Sonora, México.The genotypes belong to two crop types: bread wheat and durum wheat.In the first cropping season (2012-2013), 33 genotypes (17 bread and 16 durum wheats) were evaluated; in the second (2013-2014), 33 genotypes (14 bread and 19 durum wheats) were evaluated; finally, in the third cropping season (2015)(2016), only 24 genotypes were evaluated (12 bread and 12 durum wheat) (Vargas Hernández et al., 2018:Table 1).The complete data had a treatment (genotype) imbalance structure because the genotypes evaluated were different in all three cropping seasons.In total, there were 53 genotypes for bread and durum wheats.In addition, the number of on-farm trials was different in each cropping season: Six environments/locations in the 2012-2013 and 2015-2016 cropping seasons, and five in the 2013-2014 cropping season.The frequency of each genotype evaluated in each environment (trial) within each cropping season and the imbalance patterns are shown in Table 1.In each trial, the wheat lines were planted using a randomized complete block design with two replicates.Each experimental unit was six beds wide, and the bed size ranged 75-80 cm wide × 8 m long.
The notations of the individual environments were as follows: the letter F followed by a number stands for the different farms (F1 for farm 1, F2 for farm 2, and so on); the next letter represents the irrigation scheme (F for full irrigation, or R for reduced irrigation) applied in the same or different farms.Finally, lower-case letters 'a' or 'b' indicate different places where the trial on one farm was established under the same irrigation scheme, that is, one farmer using irrigation may have established two trials in different places on the same farm (this represents another environment within that particular farm).
In the combined analysis, we simply added two digits to indicate the cropping seasons (12 for 2012-2013; 13 for 2013-2014; and 15 for 2015-2016).For example, 12F1Fa stands for 'a' Table 1.Number of bread wheat (BW) and durum wheat (DW) lines evaluated in the different trials in each of the three cropping seasons (2012-2013, 2013-2014, 2015-2016).Each trial between cropping season corresponds to a different farmer and irrigation regime.Empty cells indicate an imbalance in the data (from Vargas Hernández et al., 2018)  plot (place on the farm of farmer 1 in the 2012-2013 cropping season under full irrigation).Different traits were measured: W1000 (gr), MOIST (%), GY (tons ha -1 ), PPB (%), and PYB (%).The experimental design in each environment (farmer, irrigation level, and cropping season) was a randomized complete block design with three replicates.
First, the best linear unbiased estimations (BLUEs) of genotypes were computed for each trait based on a two-stage analysis approach (Piepho et al., 2012).These BLUEs were obtained with the lme4 library of the R statistical software (R Core Team, 2018) under a frequentist framework.In the first stage, BLUEs were obtained by fitting the following mixed model in each trial (environment) in each cropping season: Trait ~μ + Replicates + Genotype + error, where the common mean (μ) and genotypes were treated as fixed factors, whereas replicates and model error were assumed as random.The replicate implies an independent repetition of the basic experiment which, as mentioned above, was a randomized complete block design.This implied that each genotype (treatment [TRT]) was applied in two independent experimental units.Afterward, BLUEs computed in this first stage for each environment were combined together for each cropping season.This means that for each cropping season under study, we built a data set composed of information on environments, crop type, genotypes, and one column for each trait included in the study.
The BLUEs across crop types were obtained using the following model: Trait ~μ + crop-type + Genotypes + error, where crop type and genotypes were assumed fixed and only the model error was assumed random.With the resulting BLUEs, for each cropping season we built a data set with information on environments, genotypes, and traits; with the information from each cropping season, we implemented the multivariate Bayesian model described in the next section.We attempted to implement a combined multivariate Bayesian analysis with the information from the three cropping seasons, but due to the large Genotype or TRT imbalance, this was not possible.For this reason, to be able to do a combined analysis, we adjusted the BLUEs from the first step also by cropping season, using the following mixed model: Trait ~μ + crop-type + Cropping season + Genotypes + error, where only the model error was assumed random and the other terms were assumed as fixed effects; these BLUEs were used for the multivariate Bayesian combined analysis.

Multivariate statistical Model
To jointly model multiple-trait and multiple-environment data, we propose the following mixed model: where Y is the matrix of response variables of order n × L, where L represents the number of traits, X is the incidence matrix of environments of order n × I, β is the fixed effect of environment× traits of order I × L , Z 1 is the incidence matrix of genotypes of order n × J , b 1 is the random effect of genotype × traits of order J × L , Z 2 is the incidence matrix of genotype × environment of order n × IJ, b 2 is the random effect of genotype × environment × traits of order IJ × L and e is the residual matrix of order n × L. Also, b 1 is distributed under a matrix variate normal distribution as NM J×L (0, G g , Σ t ) with Σ t an unstructured covariance matrix of order L × L, ( ) , all independent of each other, where ⊗ denotes a Kronecker product, with Σ E an unstructured covariance matrix of order I × I. g G is the genomic relationship matrix.The matrix variate normal distribution is a generalization of multivariate normal distribution.In particular, the (n × p) random matrix, M, is distributed under a matrix variate normal distribution denoted as M ~ NM n×p (H,Ω,Σ), if and only if, the (np × 1) random vector vec(M) ~ N n×p (H,Σ ⊗ Ω), where NM n×p denotes the (n × p) dimensional matrix variate normal distribution, H is a (n × p) location matrix, Σ is a (p × p) first covariance matrix, and Ω is a (n × n) second covariance matrix (Srivastava and Khatri, 1979).The first matrix corresponds to the rows and the second to the columns of the matrix normal distribution M. vec (.) and ⊗ are the standard vectorization operator and Kronecker product, respectively.
The Bayesian multiple-trait and multiple-environment (BMTME) model resulting from Eq. [1] was implemented by Montesinos-López et al. (2016).Here we provide a modified version of the original BMTME model proposed by Montesinos-López et al. (2016).The modified BMTME model assumes as prior for the covariance matrices of traits and environments an inverse Wishart (IW) distribution which is a natural conjugate prior for the covariance matrix of a multivariate normal distribution.This distribution has two parameters (Σ denotes a positive definite scale matrix and ν denotes the degrees of freedom) and it is the multivariate extension of the inverse-γ distribution used as prior for variances.Two reasons for using IW distribution are (i) it is a conjugate prior for the covariance matrix of multivariate normal distributed variables, which implies that when it is combined with the likelihood function, it will result in a posterior distribution that belongs to the same distributional family; and (ii) it ensures positive definiteness of the covariance matrix.Its main disadvantage is that this distribution is quite informative when variances are small, resulting in a strong effect of the prior distribution on the parameter estimates.The modified full conditionals of this modified BMTME model, which supports the Gibbs sampler given above, is given in Appendix A.
Below we outline the Gibbs sampler for estimating parameters of interest in the BMTME model.The ordering of draws is somewhat arbitrary; however, we suggest the following order: Step The Gibbs sampler of the BMTME was implemented in the R-software (R Core Team, 2018).A total of 20,000 iterations were performed with a burn-in of 15,000 and a thinning of 5, so that 1000 effective samples were used for inference that was used to eliminate potential problems due to the autocorrelation.The convergence of the Markov chain Monte Carlo chains was monitored using trace plots, autocorrelation function (ACF), and Gelman-Rubin diagnostics.The uncertainty of all posterior parameters was evaluated with the posterior standard deviation of each posterior parameter.

resuLts
The results are given in four sections.In the first section, we give the results of the analysis of cropping season 2012, in the second section, the results for cropping season 2013, and in the third section, the results for cropping season 2015.Finally, the results of the combined analysis (information on all cropping seasons) are given in section four.For each section we provide the posterior parameter estimates of the β coefficients, the genetic correlations between traits and environments, and the residual correlation between traits, resulting from fitting the Bayesian multivariate model given in Eq. [1].

cropping season 2012
Table 2 gives the posterior means and posterior standard deviations (SDs) for the β coefficients (β).For the GY variable, differences in posterior means were small across six environments, where the minimum was 7.408 (in environment F1Ra) and the maximum was 9.127 (in fully irrigated environment F2F).For MOIST, the differences in the estimates of the β coefficients were also small since the minimum was 2.827 (in environment F3F), while the maximum was 4.946 (in environment F1Fb).Moderate differences were observed in the posterior means for W1000 and PYB, since for W1000 the minimum value was 41.925 (in environment F1Ra) and the maximum was 50.136 (in environment F3F), while for PYB, the minimum was 0.9 observed in environment F3F and the maximum was 7.898 in environment F1Fb.However, the largest differences between the β coefficients were observed for PPB, with a minimum value of 0.991 (in environment F1Fa), while its largest β coefficient was 20.182 (in environment F3F).It is important to note that each trait was measured on a different scale; for this reason, the observed differences between the coefficients of the traits were expected.However, the analysis can also be performed with standardized variables without any significant loss of information.
Table 2 also presents the posterior means and posterior SDs of variance-covariance (and correlations) components of traits and environments.First, note that the genetic covariances (and correlations) of traits are very low since the correlation with the largest absolute value was 0.270 between PPB and W1000.These low correlations can also be observed in Fig. 1, which shows that MOIST and PYB have large positive loadings in component 1 (which explains 29.76% of total variability).While traits PPB and W1000 also have negative loadings in component 1, trait GY has the largest positive loading in component 2 (this component explains 24.74% of total variability).In contrast, the genetic covariance (and correlation) of environments is from moderate to high, since all the correlations were larger than 0.59.It is very important to point out that some genetic correlations between traits are positive while others negative; this means that, when consider jointly, traits varied differently (Fig. 1). Figure 1 also shows that the environments are positively correlated and that they form three groups.
With regard to the residual covariances (and correlations) between traits, most of them are close to zero, except for the correlation between traits W1000 and GY, which was 0.283, and the correlation between traits PPB and W1000, which was 0.329.On the other hand, with regard to the similarity between the observed and predicted values of the whole data set, the lowest prediction accuracy in terms of Pearson's correlation and mean square error of prediction (MSEP) was observed in trait PYB, the best in terms of Pearson's correlation was observed in trait PPB, and the best in terms of MSEP was in trait MOIST.In general, the similarity between the observed and predicted values was very good.

cropping season 2013
Table 3 shows the posterior means and posterior SDs for the β coefficients and variance-covariance components for cropping season 2013.There are no large differences in the β coefficients for all traits in the 5 environments.For trait MOIST, the minimum was 3.374 (in environment F2F) and the maximum was 4.432 (in environment F1Ra).For trait W1000, the minimum was 77.151 (in environment F1Rb) and the maximum 79.225 (in environment F1Fb).For trait PPB, the minimum was 19.849 (in environment F1Rb) and the maximum was 21.040 (in environment F1Fb).Finally, for trait PYB, the minimum was 11.215 (in environment F2F) and the maximum was 12.966 (in environment F1Ra).Again, since each trait was measured on a different scale, the differences between the coefficients between traits were expected.
Table 3 also shows the posterior means and posterior SDs of variance-covariance (and correlations) components of traits and environments for cropping season 2013.First, note that 6 of the 10 genetic covariances (and correlations) of the trait components are moderate or high, while 4 out of 10 are low.The two higher correlations belong to GY versus MOIST and between PPB and W1000.Also, the biplot in Fig. 2 confirms the high correlation between MOIST and GY (which form a group by grouping component 2) and between W1000 and PPB (they are also in the same group in Fig. 2a); here the first principal component explains 63.14% of total variability, while the second principal component explains 36.79%.The genetic covariances (and correlations) of environments were high, higher than 0.827.Also, it is very important to point out that all the genetic correlations between environments are positive.However, not all environments are very similar, which can be corroborated in Fig. 2, which shows that there are three groups.In the residual covariances (and correlations) between traits, 4 out of 10 correlations were high (those between GY and MOIST; PPB and W1000; PYB and W1000, and PYB and PPB), and 5 were low, with correlations of less than 0.352.Finally, with regard to the similarity between the observed and predicted values for the whole data set, two out of five traits were in good agreement in terms of  Pearson's correlation and MSEP, while the agreement was low for the other 3 traits, since the Pearson's correlations were lower than 0.541 and higher than 1.944 in terms of MSEP.In general, the similarity between the observed and predicted values was very good for two traits and not good for the other three traits.

croPPInG seAson 2015
The posterior means and posterior SDs for the β coefficients and variance-covariance components for cropping season 2015 are given in Table 4.There are moderate differences between the β coefficients for trait GY in the 5 environments, since the minimum was 6.311 (in environment F1Fa) and the maximum was 9.181 (in environment F1Rb).For trait MOIST, the minimum was 3.527 (in environment F1Ra), while the maximum was 4.579 (in environment F1Fb).For trait W1000, there were also moderate differences in the β coefficients between environments, since the minimum was 42.326 (in environment F2F) and the maximum was 47.657 (in environment F1Fa).For trait PPB, there were large differences between environments, since the minimum was 1.083 (in environment F1Ra) and the maximum was 22.873 (in environment F1Fb).Finally, for trait PYB, large differences between environments were also observed in the β coefficients, since the minimum was 2.003 (in environment F1Ra) and the maximum was 13.765 (in environment F2F).Again, since each trait was measured on a different scale, the observed differences between the coefficients between traits were expected.
The posterior means and posterior SDs of variance-covariance (and correlations) components of traits and environments for cropping season 2015 are given in Table 4.Only 2 out of 10 genetic covariances (and correlations) of the trait components are moderate and 4 correlations are close to zero.The two larger correlations belong to GY versus MOIST and PYB versus MOIST.Figure 3 shows that the first component explains 43.83% of the total variance and the second component explains 26.58%.The genetic covariances (and correlations) of environments are from moderate to high, since all the correlations were higher than 0.584.All the genetic correlations between traits are positive; however, not all environments are very similar, which can be observed in Fig. 3.For the residual covariances (and correlations) between traits, 2 out of 10 correlations were high (those between GY and MOIST, and between PYB and MOIST) and 7 were low, with correlations less than 0.32.Finally, with regard to the similarity between the observed and predicted values for the whole data set, in five traits there was good agreement in terms of Pearson's correlation and MSEP, since the Pearson's correlations were larger than 0.961 and the MSEP were lower than 3.798.In general, the similarity between the observed and predicted values was very good for the five traits.

combined cropping seasons
For the combined information of the three cropping seasons, Table 5 gives the posterior means and posterior SDs for the β coefficients.There are no large differences in the β coefficients for trait GY between the 5 environments, since the minimum was 7.205 (in environment F1Ra) and the maximum was 8.344 (in environment F1Rb).For trait MOIST, the differences in the estimates of the β coefficients are also small, since the minimum was 3.375 (in environment F1Fa), while the maximum was 4.413 (in environment F1Fb).Also, small differences were observed in the estimates of the β coefficients for traits W1000 and PYB, since for W1000, the minimum value was 55.334 (in environment F1Ra) and the maximum was 58.417 (in environment F1Fa), while for PPB large differences were observed, where the minimum was 8.101 in environment F1Fa and the maximum was 16.466 in environment F1Fb.For PYB, large differences were also observed between environments, since the minimum β coefficient was 5.790 and the maximum was 10.280.It is important to recall that each trait was measured on a different scale, and therefore, the differences between the coefficients between traits were expected.
The posterior means and posterior SDs of variance-covariance (and correlations) components of traits and environments are given in Table 5.First note that most genetic covariances (and correlations) of the traits are low, since only 2 out of 10 are moderate or large.Figure 4 (biplot of traits) shows that the first principal component explains 39.90% of the total variance, while the second principal component explains 32.78%.Figure 4 shows three groups of traits, one composed of traits MOIST and GY, another composed of trait PPB and the last one formed by traits W1000 and PYB.In contrast, the genetic covariances (and correlations) of environments are from moderate to high, since all the correlations were larger than 0.662.It is very important to point out that all the genetic correlations between environments are positive and more than two groups are formed (Fig. 4).With regard to the residual covariance (and correlations) between traits, most of them are close to zero, with the exception of the correlation between traits MOIST and GY, which was 0.532, and the correlation between traits PPB and W1000, which was 0.667.On the other hand, with regard to the similarity between the observed and predicted values for the whole data set, there was good agreement, since all the traits had a Pearson's correlation of at least 0.873, while the largest MSEP was 4.133.In general, the similarity between the observed and predicted values was very good.dIscussIon Posterior means and posterior standard deviations of β coefficients, genetic covariances (for trait and for environments) and residual covariance (for traits) are very important for breeders to better understand the genetic and residual architecture of lines for specific traits of interest.For this reason, in this paper we propose using multivariate models to estimate these parameters in the multi-trait and multi-environment context.The advantage of using the proposed multivariate analysis to estimate covariance parameters is that it significantly helps to increase the precision of the parameters because it takes into account the genetic and residual correlation between the traits and environments under study.Because the BMTME uses IW distributions as priors for covariance matrices, it is important to point out that the IW distribution is a key element to estimate the genetic (of traits and environment) and residual covariance (correlation) matrices, given that the posterior distributions of the covariance matrices also have IW distributions with augmented parameters that combine the information of the prior distribution and the likelihood information (Alvarez et al., 2014).
Implementation of multivariate analysis is not rare in agricultural research.For example, Schulthess et al. (2017) reported that multivariate analysis improves parameter estimates.Authors like Calus and Veerkamp (2011), Jia and Jannink (2012), Jiang et al. (2015), Montesinos-López et al. (2016), He et al. (2016) and Schulthess et al. (2017) reported better prediction accuracy of multivariate analysis compared to univariate analysis.These authors also agree that the genetic correlation between traits is the basis for the benefit of multivariate analysis; that is, that the larger the degree of relatedness between traits, the greater the benefit of multivariate analysis.
However, multivariate analysis is computationally more demanding than univariate analysis, and for this reason, many times its implementation is not practical.The computational cost is greater when the Bayesian approach is implemented.However, the multivariate Bayesian approach has the advantage that it is more parsimonious and provides a more informative and powerful analysis, since (i) it allows prior information to be incorporated; (ii) it does not need good starting values for parameters of interest such as the restricted maximum likelihood; (iii) it increases the precision of parameter estimates (smaller standard errors); (iv) conclusions can be drawn about the correlations between the dependent variables, notably, the extent to which the correlations depend on the individual and on the group level; (v) testing whether the effect of an explanatory variable on dependent variable Y1 is larger than its effect on Y2, when Y1 and Y2 data were observed (totally or partially) on the same individuals, is possible only by means of a multivariate analysis; (vi) if one wishes to carry out a single test of the joint effect of an explanatory variable on several dependent variables, then a multivariate analysis is also required; such a single test can be useful, e.g., to avoid the danger of chance capitalization which is inherent to carrying out a separate test for each dependent variable; and (vii) it does not have strong problems of identifiability.
It is important to point out that the multivariate approach proposed in this paper is different from the conventional multitrait (or multi-environment) approach, since the proposed BMTME model makes it possible to simultaneously estimate the genetic correlation of traits and genetic correlation of environments, which is not possible in a conventional multi-trait analysis performed by the standard software packages that only estimates the genetic correlation of traits and only works well for multi-trait data.The standard software packages can only accommodate a covariance matrix as the Kronecker product of two covariance matrices (for example, pedigree matrix, and genetic covariance of traits), while the BMTME can accommodate a covariance matrix as the Kronecker product of three valid covariance matrices (for example, pedigree matrix, genetic covariance of traits and genetic covariance of environments) and this covariance matrix corresponds to the three-way interaction of the genotype × environment × trait term.Another advantage of the proposed BMTME model is that the probability of having convergence problems decreases even when more correlated traits are added to the multivariate analysis.
For this reason, the proposed multivariate model allows increasing prediction accuracy and parameter estimates; however, it is documented that the increase in accuracy is dependent on the absolute difference between the genetic and residual correlations between traits, i.e., the larger the differences, the greater the gain in accuracy.A second advantage of the proposed multivariate analysis is that it obtains unbiased estimates of secondary traits selected under indirect selection.A model including information on the correlated trait on which selection was based, is able to correct for this type of selection.Multivariate analysis also allows predicting a trait when lines (genotypes) have been measured for other traits, especially in situations where missing information occurs due to, for example, line damage or practical and technical problems during data collection.The proposed model also improves the precision of the selection index because the optimal genetic correlation and random genetic effects of each trait of the lines are calculated simultaneously.
However, the proposed BMTME model also has disadvantages; some of them are (i) if the parameter estimates of genetic correlations are very different from the unknown true values (highly biased), then a multivariate analysis could do as much harm as it might do good; and (ii) multivariate analyses require more computing time and increased computer memory to analyze the data.Also, more storage and verification is needed.
Finally, we found that the proposed multivariate Bayesian multi-trait and multi-environment model successfully fitted three cropping season data sets (2012, 2015, and combined) out of four.By successful fitting we mean that the predicted values resulting from the BMTME model are very close to the real (observed) phenotypes, since the correlation between the observed and predicted values was larger than 0.84 for all traits under study in three data sets.However, for the cropping season 2013 data set, in three out of five traits, the correlation between the observed and predicted values was lower than 0.6.We also found that the environments in the four data sets under study  In the simplification of some calculations, the following properties were involved: ( ) ( ) ( ) ( ) ( ) Full conditional for Σ t  [A4] Full conditional for Σ E

Table 2 .
Cropping season 2012.Posterior means and posterior standard deviations (SDs) of beta coefficients, variance-covariance (upper diagonal), and correlation (lower diagonal) components for the genetic (traits and environments) and residual (traits) components.†

Table 3 .
Cropping season 2013.Posterior means and posterior standard deviations (SDs) of beta coefficients, variance-covariance (upper diagonal) and correlation (lower diagonal) components for the genetic (traits and environments) and residual (traits) components.†

Table 4 .
Cropping season 2015.Posterior means and posterior standard deviations (SDs) of beta coefficients, variance-covariance (upper diagonal) and correlation (lower diagonal) components for the genetic (traits and environments) and residual (traits) components.†

Table 5 .
All cropping seasons.Posterior means and posterior standard deviations (SDs) of beta coefficients, variance-covariance (upper diagonal) and correlation (lower diagonal) components for the genetic (traits and environments) and residual (traits) components.†