Optimum and Decorrelated Constrained Multistage Linear Phenotypic Selection Indices Theory

Some authors have evaluated the unconstrained optimum and decorrelated multistage linear phenotypic selection indices (OMLPSI and DMLPSI, respectively) theory. We extended this index theory to the constrained multistage linear phenotypic selection index context, where we denoted OMLPSI and DMLPSI as OCMLPSI and DCMLPSI, respectively. The OCMLPSI (DCMLPSI) is the most general multistage index and includes the OMLPSI (DMLPSI) as a particular case. The OCMLPSI (DCMLPSI) predicts the individual net genetic merit at different individual ages and allows imposing constraints on the genetic gains to make some traits change their mean values based on a predetermined level, while the rest of them remain without restrictions. The OCMLPSI takes into consideration the index correlation values among stages, whereas the DCMLPSI imposes the restriction that the index correlation values among stages be null. The criteria to evaluate OCMLPSI efficiency vs. DCMLPSI efficiency were that the total response of each index must be lower than or equal to the single-stage constrained linear phenotypic selection index response and that the expected genetic gain per trait values should be similar to the constraints imposed by the breeder. We used one real and one simulated dataset to validate the efficiency of the indices. The results indicated that OCMLPSI accuracy when predicting the selection response and expected genetic gain per trait was higher than DCMLPSI accuracy when predicting them. Thus, breeders should use the OCMLPSI when making a phenotypic selection.

J. Jesus Cerón-Rojas, Fernando H. Toledo, and Jose Crossa

In a two-stage context, Cerón-Rojas et al. (2019) described and evaluated the unconstrained optimum and decorrelated multistage linear phenotypic selection indices (OMLPSI and DMLPSI, respectively) theory and concluded that OMLPSI efficiency when predicting the net genetic merit was higher than DMLPSI efficiency and that breeders should use the OMLPSI when making phenotypic selection. The main difference between the two indices is that, whereas the OMLPSI takes into consideration the correlation values among stages when predicting the net genetic merit, the DMLPSI imposes the restriction that the correlation values among stages be null. The main characteristic of the OMLPSI (DMLPSI) in a two-stage context is that at Stage 1 it is a partial index, but at Stage 2 it is a complete index. This selection procedure is called the part and whole index selection method (Young, 1964; Saxton, 1983) and is valid for any number of stages. The OMLPSI (DMLPSI) is more efficient than the independent culling method because it uses all available information at each stage and incorporates the genetic correlations between traits in the prediction. The OMLPSI (DMLPSI) combines the single-stage linear phenotypic selection index (LPSI) theory (Smith, 1936; Hazel, 1943) with the independent culling selection method (Cochran, 1951; Young, 1964; Cunningham, 1975; Xu and Muir, 1992) and is useful for selecting more than one trait in the multistage selection context. Breeders apply the OMLPSI (DMLPSI) mainly in animal and tree breeding where, due to early culling, it is a cost-saving strategy for improving several traits because breeders do not need to measure all traits at each stage. The OMLPSI (DMLPSI) increases selection intensity on traits measured at an earlier age, and, with fixed facilities, this index selects a greater number of individuals at an earlier age (Xu and Muir, 1991, 1992).
The OMLPSI values may have a non-normal distribution after the first selection stage, and to derive selection intensities for more than two stages, this index requires numeric multiple integration techniques. To solve this problem, the DMLPSI minimizes the mean squared difference between the index and the net genetic merit at each stage under the restriction that the covariance between the DMLPSI values at different stages be zero, thus preventing the correlation between DMLPSI values at different stages. Under this restriction, truncation points and selection intensities can be determined for a fixed total proportion before the breeder carries out selection, and the selected individual index values after the first selection stage may be normally distributed (Xu and Muir, 1992). Nevertheless, due to the indicated restriction, the DMLPSI selection response and accuracy after the first stage could be lower than the OMLPSI selection response.
One additional problem with the OMLPSI (DMLPSI) expected genetic gain per trait (or multitrait selection response) is that its values can increase or decrease in a positive or negative direction without control. In the single-stage context, Kempthorne and Nordskog (1959) developed the restricted LPSI that allows imposing restrictions equal to zero on the expected genetic gain of some traits. Other authors (Mallard, 1972;Harville, 1975;Tallis, 1985) extended the Kempthorne and Nordskog (1959) approach and developed a single-stage constrained LPSI (SCLPSI) that attempts to make some traits change their expected genetic gain values based on a predetermined level while the rest of the traits remain without restrictions. Itoh and Yamada (1987) showed that in reality there is only one optimum SCLPSI; that is, the Mallard (1972), Harville (1975), and Tallis (1985) indices are the same.  and Cerón-Rojas and Crossa (2018, Chapter 9) extended the DMLPSI and OMLPSI to the constrained context, respectively. The  index, however, is not an optimum constrained multistage index because their approach is based on the single-stage Tallis (1962) constrained index theory, which is not an optimum index (see Cerón-Rojas and Crossa, 2018, Chapter 3, for details).
Based on the Mallard (1972) constrained phenotypic single-stage index theory, which is an optimum single-stage constrained index (see Cerón-Rojas and Crossa, 2018, Chapter 3, for details), in this work we extend the OMLPSI and DMLPSI to the constrained multistage selection context. We denote the constrained OMLPSI and DMLPSI as OCMLPSI (optimum constrained multistage LPSI) and DCMLPSI (decorrelated constrained multistage LPSI), respectively. The main difference between the OCMLPSI and the DCMLPSI is that the OCMLPSI imposes only one restriction when solving the OMLPSI equations to obtain its vector of coefficients, whereas the DCMLPSI imposes two restrictions. The OCMLPSI solves the OMLPSI equations subject to the restriction that the covariance between the OCMLPSI and some linear combinations of the genotypes involved be equal to a vector of predetermined proportional gains (or constraints) imposed by the breeder, whereas the DCMLPSI imposes the additional restriction that the covariance between DCMLPSI values at different stages be zero. This additional restriction negatively affects the DCMLPSI selection response and expected genetic gain values per trait after the first stage.
One of the purposes of conducting multistage selection is to reduce cost and still obtain a reasonable gain. This means that the OCMLPSI and DCMLPSI could also be optimized with respect to aggregated economic gain and the cost associated with obtaining measurements on each trait, but we did not consider that problem in this work. In a two-stage context, Namkoong (1970) detailed how this problem could be solved for the OMLPSI, whereas Xu and Muir (1992) described it in the DMLPSI context.
We compared the relative efficiency of OCMLPSI and DCMLPSI under the assumption that the net genetic merit and the OCMLPSI and DCMLPSI values have a joint multivariate normal distribution. We corroborated the normality assumption at Stage 2 using graphical methods and normality tests (Shapiro and Wilk, 1965; Mardia, 1980). Under this assumption, the regression of the net genetic merit on any linear function of the phenotypic values is linear (Kempthorne and Nordskog, 1959) and the selection response and expected genetic gain per trait results for two or more stages can be summarized arithmetically (Cochran, 1951; Young, 1964). We used two criteria to compare the efficiency of both indices. The first criterion was that the total selection response of each index must be lower than or equal to the SCLPSI selection response (Young, 1964; Saxton, 1983; Cerón-Rojas et al., 2019). The second criterion was that the expected genetic gain per trait values should be similar to the predetermined gains or constraints imposed by the breeder. We used one real and one simulated dataset, each with four traits, to validate OCMLPSI efficiency vs. DCMLPSI efficiency. The results of both datasets indicated that the OCMLPSI is the most efficient index for predicting the net genetic merit, and its accuracy when predicting the selection response and estimating the expected genetic gain per trait was higher than the DCMLPSI accuracy. Thus, breeders should use OCMLPSI when making a constrained phenotypic selection. The results of this study are the first to compare (with real and simulated data) the relative efficiencies of the OCMLPSI vs. DCMLPSI using the total selection response and expected genetic gain per trait as the main criteria.

Objectives of the Constrained Multistage Linear Phenotypic Selection Indices
Let $\mu_j$ be the population mean of the jth trait before selection. One of the main OCMLPSI (DCMLPSI) objectives is to change $\mu_j$ to $\mu_j + d_j$, where $d_j$ is the jth (j = 1, 2, …, r; r = the number of constrained traits) constraint, or predetermined proportional gain, imposed by the breeder on the OCMLPSI (DCMLPSI) expected genetic gain per trait (Mallard, 1972; Cerón-Rojas and Crossa, 2018, Chapter 3). Additional OCMLPSI (DCMLPSI) objectives are (i) to maximize the selection response; (ii) to predict the net genetic merit ($H = \mathbf{w}'\mathbf{g}$, where $\mathbf{w}' = [w_1\ w_2\ \dots\ w_n]$ and $\mathbf{g}' = [g_1\ g_2\ \dots\ g_n]$ are $1 \times n$ vectors of economic weights and true unobservable breeding values, respectively); and (iii) to select individuals with the highest H values as parents of the next generation.

The Part and Whole Phenotypic Index Selection Method
Let $\mathbf{y}' = [y_1\ y_2\ \dots\ y_n]$ be a $1 \times n$ vector of scores for n traits and assume that we can select only $n_i$ of them at Stage i (i = 1, 2, …, N; N = number of stages) such that after N stages, $n = n_1 + n_2 + \dots + n_N$, where $n_i < N < n$. We can partition $\mathbf{y}$ into N subvectors as $\mathbf{y}' = [\mathbf{x}_1'\ \mathbf{x}_2'\ \dots\ \mathbf{x}_N']$, where $\mathbf{x}_i' = [y_1\ y_2\ \dots\ y_{n_i}]$ is the subvector of $\mathbf{y}$ at Stage i (i = 1, 2, …, N). This means that at this stage, the ith index is $I_i = \mathbf{b}_i'\mathbf{x}_i$. With a suitable transforming matrix, an index can then be constructed for each stage (Xu and Muir, 1992; Cerón-Rojas et al., 2019). This last result indicates that until Stage N − 1, each index is partial, but at Stage N, $I_N = \mathbf{b}_1'\mathbf{x}_1 + \mathbf{b}_2'\mathbf{x}_2 + \dots + \mathbf{b}_N'\mathbf{x}_N$ is a whole index. Young (1964) called the foregoing procedure the part and whole index selection method. Xu and Muir (1992) called it selection index updating because, as traits become available, each subsequent index contains all traits available up to that stage. This method is more efficient than the independent culling selection method because it uses the genetic correlation among traits and all available information at each stage to predict the net genetic merit (Saxton, 1983). In addition, the independent culling selection method cannot impose constraints on the expected genetic gain of each trait, as the constrained index does.

Genotypic and Phenotypic Covariance Matrices
Let $\mathbf{g}' = [g_1\ g_2\ \dots\ g_n]$, $\mathbf{x}_i' = [y_1\ y_2\ \dots\ y_{n_i}]$, and $\mathbf{y}' = [y_1\ y_2\ \dots\ y_n]$ be vectors, as defined in the above subsections. The genotypic covariance matrix of vectors $\mathbf{x}_i$ and $\mathbf{g}$ for N stages is $\mathbf{G}$, whereas the phenotypic covariance matrix of vector $\mathbf{y}$ is $\mathrm{Var}(\mathbf{y}) = \mathbf{P}$. To obtain the OCMLPSI (DCMLPSI) parameters, we need the matrices $\mathbf{Q}_{ii}$ and $\mathbf{A}_i$ used in the following subsections, which are submatrices of $\mathbf{P}$ and $\mathbf{G}$, respectively. In Appendix A (Eq. [A1] to [A3]), we describe a method to estimate $\mathbf{P}$ and $\mathbf{G}$. Now suppose that the number of traits selected up to Stage i − 1 is $n_{i-1}$ and that at Stage i we select $n_i$ traits, such that $n_i \le n_{i-1}$ (or $n_{i-1} < n_i$). Then, according to the part and whole index selection method, at Stage i, we shall have $n_{i-1} + n_i$ traits. This means that the phenotypic covariance matrix [$\mathbf{Q}_{(i-1)i}$] obtained with the $n_{i-1}$ traits selected at Stage i − 1 and the total $n_{i-1} + n_i$ traits will be of size $n_{i-1} \times (n_{i-1} + n_i)$ and can be written as
$$\mathbf{Q}_{(i-1)i} = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1(n_{i-1}+n_i)} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2(n_{i-1}+n_i)} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n_{i-1}1} & \sigma_{n_{i-1}2} & \cdots & \sigma_{n_{i-1}(n_{i-1}+n_i)} \end{bmatrix} \qquad [1b]$$
where $\sigma_{jc}$ is the jcth phenotypic covariance value for j = 1, 2, …, $n_{i-1}$ and c = 1, 2, …, ($n_{i-1} + n_i$). In addition, $n_{i-1}$ and ($n_{i-1} + n_i$) are the numbers of rows and columns of matrix $\mathbf{Q}_{(i-1)i}$, respectively. Equation [1b] indicates that $\mathbf{Q}_{(i-1)i}$ is a nonsquare and nonsymmetric matrix. Matrix $\mathbf{Q}_{(i-1)i}$ is useful for imposing the restrictions that make the DCMLPSI values independent among stages (see Cerón-Rojas et al., 2019, for details).

Selection Response at Stage i
At Stage i, the selection response ($R_i$) is the mean of the net genetic merit ($H = \mathbf{w}'\mathbf{g}$) of the selected population and can be written as
$$R_i = k_i \sigma_H \rho_{HI_i} \qquad [2]$$
where $k_i$ is the selection intensity (Xu and Muir, 1992; Cerón-Rojas et al., 2019), $\sigma_H = \sqrt{\mathbf{w}'\mathbf{C}\mathbf{w}}$ is the standard deviation of $H = \mathbf{w}'\mathbf{g}$, $\mathrm{Var}(\mathbf{g}) = \mathbf{C}$ is the covariance matrix of $\mathbf{g}$, and $\rho_{HI_i}$ is the correlation between $H = \mathbf{w}'\mathbf{g}$ and the index at Stage i ($I_i = \mathbf{b}_i'\mathbf{x}_i$). For N stages, the total selection response is $R_t = R_1 + R_2 + \dots + R_N$ (Cochran, 1951; Young, 1964). Equation [2] indicates that the genetic gain that can be achieved in $R_i$ by selecting for several traits simultaneously within a population of animals or plants is the product of $k_i$, $\sigma_H$, and $\rho_{HI_i}$ (Kempthorne and Nordskog, 1959). Selection intensity is limited by the rate of reproduction of each species, whereas $\sigma_H$ is beyond human control; hence, the greatest opportunity for increasing selection progress is by ensuring that $\rho_{HI_i}$ is as large as possible (Hazel, 1943). Equation [2] is a useful criterion for comparing the efficiency of different types of indices to predict the net genetic merit (e.g., OCMLPSI efficiency vs. DCMLPSI efficiency). We would expect that the greater the value of Eq. [2], the more effective the OCMLPSI (DCMLPSI) is at predicting $H = \mathbf{w}'\mathbf{g}$. In the multistage selection index context, however, one main restriction is that the whole OCMLPSI (DCMLPSI) selection response be lower than or equal to the SCLPSI response (Saxton, 1983; Cerón-Rojas et al., 2019).
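The relationship among the selection intensity, the standard deviation of the net genetic merit, and the index correlation can be illustrated with a minimal sketch. The economic weights, covariance matrix, intensities, and correlations below are hypothetical values chosen only to show the arithmetic; they are not taken from the datasets in this work.

```python
import math

def net_merit_sd(w, C):
    """Standard deviation of H = w'g, i.e. sqrt(w' C w)."""
    n = len(w)
    return math.sqrt(sum(w[j] * C[j][c] * w[c] for j in range(n) for c in range(n)))

def stage_response(k_i, sigma_H, rho_HI):
    """Selection response at one stage: product of k_i, sigma_H, and rho_HIi."""
    return k_i * sigma_H * rho_HI

# Hypothetical two-trait example (illustrative values only).
w = [1.0, -1.0]                      # economic weights
C = [[4.0, 1.0], [1.0, 9.0]]         # genotypic covariance matrix Var(g)
sigma_H = net_merit_sd(w, C)         # sqrt(4 - 1 - 1 + 9) = sqrt(11)

R1 = stage_response(1.30, sigma_H, 0.50)   # Stage 1 response
R2 = stage_response(1.37, sigma_H, 0.30)   # Stage 2 response
R_total = R1 + R2                          # total response over both stages
print(round(sigma_H, 4), round(R_total, 4))
```

Because $\sigma_H$ is fixed for a given population, the sketch makes it easy to see that only the intensities and the correlations differ between stages.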

Expected Genetic Gain per Trait at Stage i
The expected genetic gain per trait at Stage i ($\mathbf{E}_i$, or multitrait selection response) is the covariance between the true breeding value vector ($\mathbf{g}$) and the $I_i = \mathbf{b}_i'\mathbf{x}_i$ values, weighted by the standard deviation of $I_i$ ($\sigma_{I_i}$) and multiplied by the selection intensity ($k_i$), so that
$$\mathbf{E}_i = k_i \frac{\mathrm{Cov}(\mathbf{g}, I_i)}{\sigma_{I_i}} \qquad [3]$$
We defined all the parameters of Eq. [3] previously. In the univariate and single-stage breeding scheme, Eq. [3] is the same as the selection response. For N stages, the total expected genetic gain per trait is $\mathbf{E}_t = \mathbf{E}_1 + \mathbf{E}_2 + \dots + \mathbf{E}_N$ (Cochran, 1951; Young, 1964).
In the OCMLPSI context, we minimize the mean squared difference between the net genetic merit $H = \mathbf{w}'\mathbf{g}$ and the index $I_i = \mathbf{b}_i'\mathbf{x}_i$ [i.e., $E[(H - I_i)^2]$] with respect to the vector of coefficients $\mathbf{b}_i$ (i = 1, 2, …, N) under the restriction that the Eq. [3] values be equal to the $d_j$ (j = 1, 2, …, r; r = number of constraints) values imposed by the breeder. The resulting vector of coefficients ($\mathbf{b}_i$) should maximize Eq. [2] and make the Eq. [3] values close to the $d_j$ values. In the DCMLPSI context, however, it is necessary to impose the additional restriction that the DCMLPSI values among stages be independent, as we shall see in the next two subsections.
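As a rough illustration of the expected genetic gain per trait, the sketch below evaluates the covariance between the breeding values and the index, divided by the index standard deviation and multiplied by the selection intensity. It assumes that the covariance can be computed as $\mathbf{A}'\mathbf{b}$ with $\mathbf{A} = \mathrm{Cov}(\mathbf{x}_i, \mathbf{g})$ and that the index variance is $\mathbf{b}'\mathbf{Q}\mathbf{b}$ with $\mathbf{Q} = \mathrm{Var}(\mathbf{x}_i)$; these matrix forms, the function name, and all numbers are assumptions made for the example.

```python
import math

def expected_gain(k_i, A, Q, b):
    """k_i * Cov(g, I_i) / sigma_Ii, assuming Cov(g, I_i) = A'b and Var(I_i) = b'Qb.

    A: rows index the traits in x_i, columns index the traits in g.
    """
    m = len(b)
    var_I = sum(b[j] * Q[j][c] * b[c] for j in range(m) for c in range(m))
    cov_gI = [sum(A[j][t] * b[j] for j in range(m)) for t in range(len(A[0]))]
    return [k_i * v / math.sqrt(var_I) for v in cov_gI]

# Hypothetical two-trait values (illustrative only).
Q = [[10.0, 2.0], [2.0, 5.0]]   # phenotypic covariance of the traits in the index
A = [[4.0, 1.0], [1.0, 2.0]]    # covariance between index traits and breeding values
b = [1.0, 1.0]                  # index coefficients
E1 = expected_gain(1.0, A, Q, b)

# Univariate, single-stage case: the gain equals k * var_g / sd_p,
# which is the usual single-trait selection response.
E_uni = expected_gain(2.0, [[3.0]], [[12.0]], [1.0])
print(E1, E_uni)
```

The univariate call illustrates the statement that, with one trait and one stage, the expected genetic gain coincides with the selection response.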

The OCMLPSI Vector of Coefficients at Stage i
Let $\mathbf{d}' = [d_1\ d_2\ \dots\ d_r]$ be a $1 \times r$ vector of constraints, or predetermined proportional gains per trait, imposed by the breeder, where $d_j$ is the jth element of $\mathbf{d}'$. In addition, let $\mathbf{U}'$ be a Kempthorne and Nordskog (1959) matrix of size $(n - r) \times n$ (n = number of traits and r = number of constraints) of 1s and 0s, where 1 indicates that the trait is constrained and 0 indicates that the trait has no constraints (see Cerón-Rojas and Crossa, 2018, Chapter 3, for details). According to the single-stage Mallard (1972) constrained index theory, to obtain the OCMLPSI vector of coefficients at Stage i, we need to minimize the mean squared difference between the net genetic merit $H = \mathbf{w}'\mathbf{g}$ and the index $I_i = \mathbf{b}_i'\mathbf{x}_i$ subject to the constraints imposed by the breeder. Suppose that matrices $\mathbf{Q}_{ii}$, $\mathbf{U}$, and $\mathbf{A}_i'$ and vectors $\mathbf{d}$ and $\mathbf{w}$ are known at Stage i; then, it is necessary to minimize the corresponding constrained function with respect to the vector of coefficients $\mathbf{b}_i$ and the vector of Lagrange multipliers $\mathbf{u}' = [u_1\ u_2\ \dots\ u_{r-1}]$. The resulting OCMLPSI vector of coefficients at Stage i is given by Eq. [5], in which $\mathbf{I}_i$ is an identity matrix of the same size as $\mathbf{Q}_{ii}$. When $\mathbf{D} = \mathbf{U}$, the vector of coefficients of Eq. [5] imposes null restrictions, and when $\mathbf{D} = \mathbf{U}$ and $\mathbf{U}$ is a null matrix, Eq. [5] is equal to $\mathbf{b}_i = \mathbf{Q}_{ii}^{-1}\mathbf{A}_i\mathbf{w}$, the vector of coefficients of the OMLPSI (Cerón-Rojas et al., 2019). Thus, the OCMLPSI is more general and includes the multistage null phenotypic restricted index (Kempthorne and Nordskog, 1959) and the OMLPSI as particular cases (Cerón-Rojas and Crossa, 2018, Chapter 9).
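The unconstrained special case mentioned above, in which the vector of coefficients reduces to $\mathbf{Q}_{ii}^{-1}\mathbf{A}_i\mathbf{w}$, can be sketched as follows. This is only the OMLPSI limit, not the constrained solution of Eq. [5]; the small linear solver, the function names, and the numeric matrices are assumptions introduced for illustration.

```python
def solve(M, v):
    """Solve M x = v by Gauss-Jordan elimination with partial pivoting."""
    n = len(v)
    aug = [list(map(float, M[r])) + [float(v[r])] for r in range(n)]
    for col in range(n):
        # Move the row with the largest pivot candidate into position.
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[r][n] / aug[r][r] for r in range(n)]

def omlpsi_coefficients(Q, A, w):
    """Unconstrained coefficients Q^{-1} A w, computed without forming the inverse."""
    Aw = [sum(a * wt for a, wt in zip(row, w)) for row in A]
    return solve(Q, Aw)

# Hypothetical two-trait values (illustrative only).
Q = [[10.0, 2.0], [2.0, 5.0]]
A = [[4.0, 1.0], [1.0, 2.0]]
w = [1.0, -1.0]
b = omlpsi_coefficients(Q, A, w)
print(b)
```

Solving the linear system directly, rather than inverting $\mathbf{Q}_{ii}$, is a common numerical choice and does not change the result.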

The DCMLPSI Vector of Coefficients at Stage i
Let $I_{D(i-1)} = \mathbf{b}_{i-1}'\mathbf{x}_{i-1}$ and $I_{Di} = \mathbf{b}_i'\mathbf{x}_i$ be the DCMLPSIs at Stages i − 1 and i, respectively. We shall obtain the DCMLPSI vector of coefficients at Stage i with the additional restriction that the covariance between the DCMLPSI values up to Stage i − 1 and the $I_{Di} = \mathbf{b}_i'\mathbf{x}_i$ values be null. Let $\mathbf{J}_{i-1}' = [I_{D1}\ I_{D2}\ \dots\ I_{D(i-1)}]$ be the vector of DCMLPSI values up to Stage i − 1, such that the covariance between $I_{Di}$ and $\mathbf{J}_{i-1}$ is null. Xu and Muir (1992) showed that the null covariance between $I_{Di}$ and $I_{D(i-1)}$ can be written as $\mathbf{b}_{i-1}'\mathbf{Q}_{(i-1)i}\mathbf{b}_i = 0$, where $\mathbf{b}_{i-1}$ is the DCMLPSI vector of coefficients at Stage i − 1, $\mathbf{Q}_{(i-1)i}$ was defined in Eq. [1b], and $\mathbf{b}_i$ is the DCMLPSI vector of coefficients at Stage i. Thus, to obtain $\mathbf{b}_i$, we need to minimize the mean squared difference between $H = \mathbf{w}'\mathbf{g}$ and $I_{Di}$ under both restrictions. Assume that matrices $\mathbf{Q}_{ii}$, $\mathbf{Q}_{i(i-1)}$, $\mathbf{U}$, and $\mathbf{A}_i$ and vectors $\mathbf{d}$ and $\mathbf{w}$ are known. The constrained function is then minimized with respect to the vector of coefficients $\mathbf{b}_i$ and the vector of Lagrange multipliers. The only difference from the OCMLPSI case is the term associated with the additional null-covariance restriction.
The maximized OCMLPSI and DCMLPSI selection responses at Stage i are $R_{Oi}$ (Eq. [8a]) and $R_{Di}$ (Eq. [8b]), respectively. Although in Eq. [2] the selection response can take any value, in Eq. [8a] and [8b], $R_{Oi}$ and $R_{Di}$ give the maximum value of Eq. [2] for the OCMLPSI and DCMLPSI, respectively. In addition, in practice $k_{Oi}$ and $k_{Di}$ are obtained with different methods; therefore, their values are generally different (i.e., $k_{Oi} \ne k_{Di}$). In this work, we obtained the $k_{Oi}$ values in a two-stage breeding scheme according to Eq. [A6] (Appendix B), whereas we obtained the $k_{Di}$ values according to the Xu and Muir (1992) method.

Efficiency when Predicting the Net Genetic Merit
According to Lande and Thompson (1990) and Moreau et al. (1998), the efficiency of the indices when predicting the net genetic merit, in percentage terms, is
Ø = 100(T − 1)   [10]
where $T = R_{Oi}/R_{Di}$, $R_{Oi}$ denotes the OCMLPSI selection response, and $R_{Di}$ denotes the DCMLPSI selection response. Therefore, when Ø is null, the efficiency of both indices is the same; when Ø > 0, the efficiency of the OCMLPSI is higher than that of the DCMLPSI; and when Ø < 0, DCMLPSI efficiency is higher than OCMLPSI efficiency for predicting the net genetic merit. An additional criterion for comparing the indices' efficiency is that the total selection response $R_t = R_1 + R_2$ of each index should be lower than or equal to the single-stage constrained index selection response ($R = k\sigma_I$), i.e., $R_t \le R$ (see Cerón-Rojas et al., 2019, for details).

Adjusting the OCMLPSI Covariance Matrices at Stage 2
At Stage 2, the phenotypic and genotypic covariance matrices are affected by prior selection on $I_1 = \mathbf{b}_1'\mathbf{x}_1$. It is thus necessary to adjust them to take into consideration the effects of $I_1 = \mathbf{b}_1'\mathbf{x}_1$ on them. According to Cochran (1951) and Cunningham (1975), both matrices can be adjusted for this prior selection. In the adjusted matrices $\mathbf{P}^*$ and $\mathbf{G}^*$, $a = k_{O1}(k_{O1} - u)$, $k_{O1}$ is the selection intensity at Stage 1, $u$ is the truncation point when $I_1 = \mathbf{b}_1'\mathbf{x}_1$ is applied, $\mathbf{P}_1 = \mathrm{Var}(\mathbf{x}_1)$, and $\mathbf{G}_1 = \mathrm{Cov}(\mathbf{x}_1, \mathbf{g})$.

Test of the OCMLPSI (DCMLPSI) Normality Assumption
Several authors (Shapiro and Wilk, 1965; Mardia, 1980; Mohd-Razali and Bee-Wah, 2011; Rani Das and Rahmatullah Imon, 2016) have given details of how to perform a normality test on a dataset, and many statistical packages provide graphs and normality tests.
We corroborated the OCMLPSI (DCMLPSI) normality assumption at Stage 2 with a simulated dataset using a graphical method (histograms) and analytical test procedures (the Shapiro-Wilk and Kolmogorov-Smirnov normality tests). The corroboration procedure was as follows. In a two-stage context, let $p = q_1 q_2$ be the fixed total proportion retained, where $q_1$ and $q_2$ denote the proportions selected at Stages 1 and 2, respectively, and let n be the size of the simulated dataset at Stage 1; then, $nq_1$ is the number of individuals selected at Stage 1. We used the information on these $nq_1$ individuals at Stage 2 to construct graphs and perform statistical tests to corroborate the OCMLPSI (DCMLPSI) normality assumption.
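A rough sketch of the analytical part of such a check is given below. It computes only the Kolmogorov-Smirnov distance between a sample and a normal distribution fitted with the sample's own mean and standard deviation; no p-value is produced, and the Shapiro-Wilk test is not implemented. The index values are random draws standing in for Stage 2 index values, and the values n = 500 and $q_1$ = 0.22, the seed, and the function name are assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean, stdev

def ks_statistic_normal(values):
    """Kolmogorov-Smirnov distance between the empirical distribution of
    `values` and a normal distribution with the sample mean and sd."""
    xs = sorted(values)
    n = len(xs)
    ref = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        cdf = ref.cdf(x)
        # Compare the fitted CDF with the empirical CDF just after and just before x.
        d = max(d, abs((i + 1) / n - cdf), abs(cdf - i / n))
    return d

random.seed(1)
n, q1 = 500, 0.22
n_stage2 = round(n * q1)   # individuals retained after Stage 1

# Stand-in index values: one normal sample and one clearly skewed sample.
index_values = [random.gauss(0.0, 1.0) for _ in range(n_stage2)]
skewed_values = [random.expovariate(1.0) for _ in range(n_stage2)]

d_normal = ks_statistic_normal(index_values)
d_skewed = ks_statistic_normal(skewed_values)
print(n_stage2, round(d_normal, 3), round(d_skewed, 3))
```

A smaller distance indicates closer agreement with normality. Because the mean and standard deviation are estimated from the same sample, standard Kolmogorov-Smirnov critical values would be too lenient, so in practice a packaged test would be preferable.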

Real Dataset
The number of genotypes in this real dataset was 3330, and the vector of economic weights was $\mathbf{w}' = [19.54\ \ {-3.56}\ \ 17.01\ \ {-2.51}]$. This dataset comes from a commercial egg poultry line (Akbar et al., 1984), and we used it to illustrate the theoretical results obtained in this work. The estimated phenotypic ($\hat{\mathbf{P}}$) and genotypic ($\hat{\mathbf{C}}$) covariance matrices among the rate of lay (RL, number of eggs), age at sexual maturity (SM, d)

Simulated Datasets
These datasets are available in the Application of a Genomics Selection Index to Real and Simulated Data repository, at http://hdl.handle.net/11529/10199. They were simulated by Cerón-Rojas et al. (2015) with QU-GENE software (Podlich and Cooper, 1998) using 2500 molecular markers and 315 quantitative trait loci (QTLs) for eight phenotypic selection cycles (C0-C7), each with four traits ($T_1$, $T_2$, $T_3$, and $T_4$), 500 genotypes, and four replicates for each genotype. The authors distributed the markers uniformly across 10 chromosomes and the QTLs randomly across the 10 chromosomes to simulate maize (Zea mays L.) populations. A different number of QTLs affected each of the four traits: 300, 100, 60, and 40, respectively. The common QTLs affecting the traits generated genotypic correlations of −0.5, 0.4, 0.3, −0.3, −0.2, and 0.1 between $T_1$ and $T_2$, $T_1$ and $T_3$, $T_1$ and $T_4$, $T_2$ and $T_3$, $T_2$ and $T_4$, and $T_3$ and $T_4$, respectively. The economic weights for $T_1$, $T_2$, $T_3$, and $T_4$ were 1, −1, 1, and 1, respectively. We used four phenotypic selection cycles (C1-C4) with p = 0.01, 0.10, and 0.30 in each cycle. At Stage 1 we selected $T_1$, $T_2$, and $T_3$, where $T_1$ and $T_2$ were constrained with vector $\mathbf{d}' = [5\ \ {-2}]$.

Real Data
Truncation Points, Proportion Retained, and Selection Intensities for Two Stages
Figure 1 shows the relationship among the truncation points ($u_1$ and $u_2$), the total proportion retained ($p = q_1 q_2$), and the heights of the ordinate of the normal curve, $z(u_1)$ and $z(u_2)$. We found the OCMLPSI selection intensities for Stages 1 [$k_1 = z(u_1)/q_1$] and 2 [$k_2 = z(u_2)/q_2$] according to Eq. [A6] (Appendix B) as follows. For a fixed value of $p = q_1 q_2$ (e.g., p = 0.05), we used an iterative process implemented in R code. By successively changing the possible values of $q_1$ ($q_2 = p/q_1$), $u_1$, and $u_2$, we found the maximum value of the estimated total OCMLPSI (DCMLPSI) selection response, $\hat{R}_t = \hat{R}_1 + \hat{R}_2$ (Fig. 2). For example, for the real dataset and p = 0.05, the maximum estimated total OCMLPSI selection response was $\hat{R}_t = 69.75$. Thus, for this dataset, the values of the truncation points ($u_1 = 0.71$ and $u_2 = 0.81$), proportions retained ($q_1 = 0.24$ and $q_2 = 0.21$), and selection intensities ($k_1 = 1.30$ and $k_2 = 1.37$) at both stages were those associated with this maximum. Table 1 presents additional truncation points, proportions retained, and selection intensities for $p = q_1 q_2$ = 0.10, 0.20, and 0.30, associated with the OCMLPSI and DCMLPSI selection responses.
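The relation between truncation point, proportion retained, and selection intensity can be checked with a short sketch. It evaluates $k = z(u)/q$ with $z(u)$ taken as the standard normal ordinate, using the truncation points and proportions reported for p = 0.05. It does not reproduce the iterative search based on Eq. [A6]; the function name is an assumption.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def selection_intensity(u, q):
    """k = z(u) / q, where z(u) is the height of the standard normal curve at u."""
    return STD_NORMAL.pdf(u) / q

# Truncation points and proportions retained reported for p = 0.05.
u1, q1 = 0.71, 0.24
u2, q2 = 0.81, 0.21

k1 = selection_intensity(u1, q1)
k2 = selection_intensity(u2, q2)
p = q1 * q2   # total proportion retained over the two stages

print(round(k1, 2), round(k2, 2), round(p, 4))
```

The computed intensities are close to the reported values of 1.30 and 1.37, and the product of the two proportions is close to the fixed total proportion of 0.05.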

Estimated OCMLPSI Selection Response for Two Stages
In the one-stage case, the selection intensity for p = 0.05 was k = 2.063, and the SCLPSI selection response was R = 71.66 (see Cerón-Rojas and Crossa, 2018, Chapter 3, for details). According to the results detailed in the paragraph above and to Young (1964) and Saxton (1983)  selection response explained 90.72, 89.40, and 88.30%, respectively, of the estimated SCLPSI selection response.
The results of the last two subsections indicate that the average of the estimated total DCMLPSI and OCMLPSI selection responses explained 90 and 97.60%, respectively, of the average estimated SCLPSI selection response for all p values. This means that the average of the estimated total OCMLPSI selection response was 7.60% closer to the estimated SCLPSI selection response than the average of the estimated total DCMLPSI selection response. We explain the loss of DCMLPSI efficiency by noting that when DCMLPSI obtained its vector of coefficients, it incorporated an additional restriction, which made the DCMLPSI values independent at different stages. Xu and Muir (1992) and  indicated that the loss of efficiency is justified because their method for obtaining the selection intensities and total responses gives the breeder the opportunity to implement an unlimited number of selection stages, which otherwise would be very difficult or impossible to do.
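The comparison against the single-stage index can be illustrated with the p = 0.05 values reported for the real dataset (total OCMLPSI response 69.75, SCLPSI response 71.66, single-stage intensity 2.063). The sketch below is only a check of that arithmetic; the function names are hypothetical, and the single-stage intensity is computed under the usual normal-theory truncation selection assumption.

```python
from statistics import NormalDist

def percent_of_single_stage(R_total, R_single):
    """Share of the single-stage response explained by the multistage total, in percent."""
    return 100.0 * R_total / R_single

def meets_first_criterion(R_total, R_single):
    """First criterion: the total multistage response must not exceed the single-stage response."""
    return R_total <= R_single

def single_stage_intensity(p):
    """Selection intensity for proportion p retained under truncation selection."""
    std = NormalDist()
    u = std.inv_cdf(1.0 - p)      # truncation point
    return std.pdf(u) / p

pct = percent_of_single_stage(69.75, 71.66)
ok = meets_first_criterion(69.75, 71.66)
k_single = single_stage_intensity(0.05)
print(round(pct, 2), ok, round(k_single, 3))
```

For p = 0.05 the total OCMLPSI response is about 97.3% of the SCLPSI response, which is in line with the average of 97.60% reported over all p values.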

Estimated OCMLPSI Expected Genetic Gain per Trait for Two Stages
Let p = 0.05 ($k_1 = 1.30$ and $k_2 = 1.37$); then, the estimated OCMLPSI expected genetic gains per trait (Appendix A,

Fig. 2. Distribution of the total estimated optimum and decorrelated constrained multistage linear phenotypic selection index (OCMLPSI and DCMLPSI, respectively) selection response values for a real dataset, and the fixed total proportion retained (p) = 0.05 and 0.10.

Table 1. Real data for total proportion (p) retained, estimated optimum and decorrelated constrained multistage linear phenotypic selection index (OCMLPSI and DCMLPSI, respectively) truncation points ($u_1$ and $u_2$), proportions retained ($q_1$ and $q_2$), selection intensities ($k_1$ and $k_2$), and selection responses ($\hat{R}_1$, $\hat{R}_2$, and $\hat{R}_t = \hat{R}_1 + \hat{R}_2$) for Stages 1 and 2. Values of R correspond to the one-stage estimated constrained linear phenotypic selection index selection response.

Table 2 presents additional estimated expected genetic gains per trait for RL, SM, EW, and BW for both stages and p = 0.10, 0.20, and 0.30. For p = 0.10, 0.20, and 0.30, the estimated total DCMLPSI expected genetic gains for traits RL and SM explained 60.33, 47.67, and 39% of each $\mathbf{d}' = [3\ \ {-1}]$ value, respectively. Thus, for this dataset, the estimated expected genetic gains per trait underestimated the $\mathbf{d}' = [3\ \ {-1}]$ values. We explain the loss of DCMLPSI accuracy by noting that when the DCMLPSI obtained its vector of coefficients, it incorporated an additional restriction to make the DCMLPSI values independent among stages. The average estimated DCMLPSI expected genetic gain per trait efficiency was 54.67%, whereas the average estimated OCMLPSI expected genetic gain per trait efficiency was 85% for all p values. Thus, the average of the estimated OCMLPSI accuracy associated with the $\mathbf{d}' = [3\ \ {-1}]$ values was 35% higher than the average of the estimated DCMLPSI accuracy associated with $\mathbf{d}' = [3\ \ {-1}]$ for the real dataset.
The results of the above four subsections indicate that the accuracy of both indices was higher when they predicted the selection response than when they estimated the expected genetic gain per trait. However, for the real data, OCMLPSI efficiency when predicting the selection response and estimating the expected genetic gain per trait was higher than DCMLPSI efficiency.

OCMLPSI Efficiency vs. DCMLPSI Efficiency to Predict the Net Genetic Merit
Equation [10] is a tool for determining OCMLPSI efficiency vs. DCMLPSI efficiency when predicting the net genetic merit, in percentage terms. The estimated average OCMLPSI efficiency is 100(97.604/90.019 − 1) = 8.426%, where 97.604 and 90.019 are the averages of the estimated total OCMLPSI and DCMLPSI selection responses (Table 1) for all p values, respectively. Thus, for the Akbar et al. (1984) real dataset, the estimated average OCMLPSI efficiency was 8.426% higher than the estimated average DCMLPSI efficiency for predicting the net genetic merit.
The results in this section indicate that although the average of the total OCMLPSI selection response, for all p values, overestimated the average of the SCLPSI by 0.73%, the average of the total DCMLPSI selection response, for all p values, underestimated the average of the SCLPSI by 4.30%. Thus, for this simulated dataset, the OCMLPSI was the best predictor of the net genetic merit, and its accuracy when predicting the selection response was higher than the DCMLPSI accuracy.
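The relative efficiency reported for the real dataset can be reproduced with a few lines. The sketch applies Eq. [10] to the two averages quoted in the text, 97.604 and 90.019; the function name is an assumption.

```python
def relative_efficiency(R_O, R_D):
    """Eq. [10]: 100 * (T - 1), with T = R_O / R_D."""
    return 100.0 * (R_O / R_D - 1.0)

# Averages quoted for the OCMLPSI and DCMLPSI on the real dataset.
eff = relative_efficiency(97.604, 90.019)
print(round(eff, 3))

# Equal responses give zero efficiency difference, and a smaller OCMLPSI
# response gives a negative value, matching the interpretation of Eq. [10].
same = relative_efficiency(50.0, 50.0)
lower = relative_efficiency(45.0, 50.0)
```

A positive value favors the OCMLPSI, a value of zero indicates equal efficiency, and a negative value favors the DCMLPSI.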

Estimated OCMLPSI and DCMLPSI Expected Genetic Gains per Trait
for four simulated selection cycles and p = q 1 q 2 = 0.30 in a two-stage context. Each t ′ E value was associated with the mean values of traits T 1 , T 2 , T 3 , and T 4 . In addition, in both indices, traits T 1 and T 2 were constrained by vector d¢ = [5 −2] values. The average of the estimated total OCMLPSI t ′ E values associated with traits T 1 and T 2 (5.76 and −2.30, respectively) overestimated the d¢ = [5 −2] values by 15.20%. However, the average of the estimated total DCMLPSI t ′ E values associated with traits T 1 and T 2 (5.05 and −2.02, respectively) overestimated the d¢ = [5 −2] values by only 1.0%. Nevertheless, note that at Stage 2, the averages of the estimated total DCMLPSI expected genetic gains per trait associated with traits T 1 , T 2 , and T 3 (0.02, −0.01, and 0.0, respectively) were practically null. This means that DCMLPSI efficiency occurred at Stage 1, when restriction matrix S (i -1)i was null [S (i -1)i = 0] and 1 1 ′ = b b . Thus, for this dataset, the average of the estimated total DCMLPSI expected genetic gains per trait was more efficient for predicting the d¢ = [5 −2] values than the OCMLPSI, but the highest DCMLPSI efficiency occurred at Stage 1, when 1 1 ′ = b b and the estimated standard deviations of OCMLPSI and DCMLPSI values were the same.
We also estimated the total expected genetic gains per trait of both indices for p = 0.01, 0.10, and 0.20 (data not shown); however, in all cases, those values were higher than the $\mathbf{d}' = [5 \;\; {-2}]$ values. For example, for p = 0.10, the averages of the estimated total OCMLPSI and DCMLPSI expected genetic gains per trait associated with $\mathbf{d}' = [5 \;\; {-2}]$ were 8.78 and −3.51, and 7.58 and −3.03, respectively.
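To make the comparison with the constraints concrete, the sketch below expresses each averaged expected genetic gain quoted in the text as a percentage deviation from the corresponding element of the constraint vector [5 −2]. The helper `percent_deviation` and the dictionary labels are ours, used only for illustration.

```python
def percent_deviation(gains, constraints):
    """Per-trait deviation of expected gains from the constraints, in percent."""
    return [100.0 * (g / d - 1.0) for g, d in zip(gains, constraints)]


d = [5.0, -2.0]  # constraint vector for traits T1 and T2

# Averages of the estimated total expected genetic gains quoted in the text.
cases = {
    "OCMLPSI, p = 0.30": [5.76, -2.30],
    "DCMLPSI, p = 0.30": [5.05, -2.02],
    "OCMLPSI, p = 0.10": [8.78, -3.51],
    "DCMLPSI, p = 0.10": [7.58, -3.03],
}

deviations = {label: percent_deviation(gains, d) for label, gains in cases.items()}
for label, dev in deviations.items():
    print(label, [f"{x:+.1f}%" for x in dev])
```

Positive deviations indicate gains larger in magnitude than the constraint, in the direction the breeder requested.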
The difference between the OCMLPSI and DCMLPSI expected genetic gains per trait is due to the different number of genotypes used to estimate the parameters. That is, in the real dataset, the number of genotypes was 3330, but in the simulated data, the number of genotypes was only 500, which is only 15% of the number of genotypes used in the real dataset to estimate the parameters of the indices. This means that the number of genotypes used to estimate the indices' parameters was an important factor for both indices in the real and simulated data. The results of the real and simulated datasets indicated that the OCMLPSI is the most efficient index for predicting the net genetic merit, and its accuracy when predicting the selection response and estimating the expected genetic gain per trait was higher than the DCMLPSI accuracy.
Table 3. Simulated data for total proportion retained $p = q_1 q_2$ = 0.01, 0.10, and 0.30, and estimated optimum and decorrelated constrained multistage linear phenotypic selection indices (OCMLPSI and DCMLPSI, respectively) responses ($\hat{R}_1$, $\hat{R}_2$, and $\hat{R}_t = \hat{R}_1 + \hat{R}_2$) and single-stage constrained linear phenotypic selection index (SCLPSI) responses ($\hat{R}_{0.01}$, $\hat{R}_{0.10}$, and $\hat{R}_{0.30}$) for four simulated selection cycles in a two-stage breeding scheme.

Normality Test for the Estimated OCMLPSI and DCMLPSI Values at Stage 2
We used the simulated dataset in Cycle 2 to test the normality assumption of the estimated OCMLPSI and DCMLPSI values at Stage 2. In Cycle 1, the number of genotypes was 500. For $p = q_1 q_2$ = 0.05 and 0.30, the $q_1$ values for OCMLPSI were 0.22 and 0.55, whereas those values for DCMLPSI were 0.06 and 0.31, respectively. Then, at Stage 2, (0.22)(500) = 110 and (0.55)(500) = 270 were the numbers of genotypes for OCMLPSI, whereas for DCMLPSI, the numbers of genotypes were (0.06)(500) = 30 and (0.31)(500) = 155. We used these last numbers of genotypes to construct histograms (Fig. 3) of the estimated OCMLPSI and DCMLPSI values at Stage 2.
According to the histograms constructed for the estimated OCMLPSI values, when the number of genotypes changed from 110 (Fig. 3a) to 270 (Fig. 3b), the estimated OCMLPSI values were closer to the normal distribution. The same was true for the estimated DCMLPSI values (Fig. 3c and 3d).
We now describe the Shapiro-Wilk and Kolmogorov-Smirnov normality test results for the estimated OCMLPSI and DCMLPSI values at Stage 2 (Cycle 2) when the number of genotypes was 110 and 270 for OCMLPSI, and 30 and 155 for DCMLPSI. With the simulated dataset, we tested the null hypothesis that the estimated OCMLPSI and DCMLPSI values at Stage 2 follow a normal distribution.
The Shapiro-Wilk test statistic should be close to 1.0 to accept the null hypothesis, whereas the Kolmogorov-Smirnov test statistic should be close to 0.0 to accept the null hypothesis (Rani Das and Rahmatullah Imon, 2016). In the present case, for the numbers of genotypes associated with OCMLPSI (110 and 270), the Shapiro-Wilk statistics were 0.958 and 0.989, whereas the Kolmogorov-Smirnov statistics were 0.080 and 0.044, respectively. Thus, we considered that the null hypothesis held for the estimated OCMLPSI values. In a similar manner, for the numbers of genotypes associated with DCMLPSI (30 and 155), the Shapiro-Wilk statistics were 0.967 and 0.991, whereas the Kolmogorov-Smirnov statistics were 0.094 and 0.029, respectively. We again accepted that the estimated DCMLPSI values approach the normal distribution.
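To illustrate why a Kolmogorov-Smirnov statistic near 0.0 points toward normality, the following minimal sketch computes the statistic against a normal distribution fitted with the sample mean and standard deviation. It uses only the Python standard library and simulated draws, not the index values analyzed here; the function name, the seed, and the sample size of 110 are assumptions chosen for illustration, and this is not necessarily the exact procedure used to obtain the statistics reported above.

```python
import random
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(values):
    """Largest distance between the empirical CDF of `values` and the CDF of
    a normal distribution with the sample mean and standard deviation."""
    xs = sorted(values)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d_max = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d_max = max(d_max, i / n - cdf, cdf - (i - 1) / n)
    return d_max


rng = random.Random(2019)  # arbitrary seed, for reproducibility only
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(110)]
skewed_sample = [rng.expovariate(1.0) for _ in range(110)]

d_normal = ks_statistic_normal(normal_sample)
d_skewed = ks_statistic_normal(skewed_sample)
print(f"KS statistic, normal draws:      {d_normal:.3f}")
print(f"KS statistic, exponential draws: {d_skewed:.3f}")
```

The normal draws should give the smaller statistic, whereas the strongly skewed exponential draws should give a clearly larger one.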

DISCUSSION

Criteria Used to Evaluate the Relative Efficiency of the Indices
A criterion used to evaluate OCMLPSI efficiency vs. DCMLPSI efficiency when predicting the net genetic merit was that the estimated total OCMLPSI and DCMLPSI selection responses must be lower than or equal to the estimated single-stage SCLPSI selection response. Additional criteria were the ratio of the OCMLPSI selection response over the DCMLPSI selection response and the estimated expected genetic gain per trait or multitrait selection response. The estimated total selection response of both indices predicted the mean value of the net genetic merit in the progeny population, whereas the estimated expected genetic gain values indicated how close the estimated mean values of the traits are to the predetermined proportional gains (or constraints) imposed by the breeder in each selection cycle. Both parameters are good criteria for comparing the efficiency of the indices, depending on the method used to estimate the vector of coefficients of each index.
for Stages 1 and 2 in four simulated selection cycles with total proportion retained $p = q_1 q_2 = 0.30$. Traits $T_1$ and $T_2$ were constrained with vector $\mathbf{d}' = [5 \;\; {-2}]$ (a vector of constraints or predetermined proportional gains per trait imposed by the breeder) on the two indices.

Selection Intensities
The selection intensities ($k_1$ and $k_2$) of both indices had three main parts: the proportions retained ($q_1$ and $q_2$), the truncation points ( Saxton (1983), who used a numerical integration method to obtain truncation points, proportion retained, and selection intensities in a two-stage context. Saxton (1983) applied a two-stage selection scheme in two ways: first, by selecting three traits and then two traits; and second, by first selecting the last two traits and later the first three traits. Under the first scheme, Saxton (1983) found that the estimated total selection response overestimated the single-stage LPSI response by 3.8%, but under the second, he found that the estimated total selection response overestimated the single-stage LPSI response by only 1.5%. These results were very similar to the results obtained by Cerón-Rojas et al. (2019) when they used real data. This means that, at least in a two-stage context, Equation [A6] was a good method to obtain the truncation points, proportion retained, and selection intensities.
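The link among proportion retained, truncation point, and selection intensity is easiest to see in the single-stage case, where truncation selection on a standard normal index gives u = Φ⁻¹(1 − q) and k = φ(u)/q. The sketch below covers only that single-stage case, as an illustration; it does not reproduce Equation [A6] or the two-stage numerical integration, and the function name is ours.

```python
from statistics import NormalDist


def single_stage_intensity(proportion_retained: float):
    """Return (truncation point, selection intensity) for truncation selection
    on a standard normal index: u = Phi^{-1}(1 - q), k = phi(u) / q."""
    if not 0.0 < proportion_retained < 1.0:
        raise ValueError("proportion_retained must lie strictly between 0 and 1")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    u = std_normal.inv_cdf(1.0 - proportion_retained)
    k = std_normal.pdf(u) / proportion_retained
    return u, k


for q in (0.05, 0.10, 0.20, 0.30):
    u, k = single_stage_intensity(q)
    print(f"q = {q:.2f}: truncation point = {u:.3f}, intensity = {k:.3f}")
```

In a two-stage scheme, the Stage 2 truncation point and intensity also depend on the correlation between the index values at the two stages, which is why a numerical method is needed there.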

Number of Restrictions Imposed on the Indices
The OCMLPSI solved the OMLPSI equations subject to the restriction that the covariance between the OCMLPSI and some linear combinations of the genotypes involved be equal to a vector of predetermined proportional gains (or constraints) imposed by the breeder. However, in addition to the latter restriction, the DCMLPSI imposed the restriction that the covariance between DCMLPSI values at different stages be zero. The latter restriction decreased DCMLPSI efficiency after Stage 1, and as a result, its selection response and expected genetic gain were lower than the OCMLPSI selection response and expected genetic gain for the real and simulated datasets at Stage 2. Muir (1991, 1992) and  indicated that the loss of DCMLPSI efficiency after Stage 1 is justified because their method for obtaining the selection intensities and total responses gives the breeder the opportunity to implement an unlimited number of selection stages, which would otherwise be very difficult or impossible to do. At Stage 1, when the additional DCMLPSI restriction was null, the DCMLPSI and OCMLPSI vectors of coefficients were the same, as we would expect. Incidentally, this corroborated that both indices were applications of the SCLPSI to the multistage context.
Fig. 3. Histograms of the estimated optimum and decorrelated constrained multistage linear phenotypic selection index (OCMLPSI and DCMLPSI, respectively) values at Stage 2 for the simulated dataset in Cycle 2, and the fixed total proportion retained $p = q_1 q_2$ = 0.05 and 0.30 when the number of genotypes was 110 (Fig. 3a) and 270 (Fig. 3b) for OCMLPSI, and 30 (Fig. 3c) and 155 (Fig. 3d) for DCMLPSI.
According to Muir (1991, 1992), the restriction that made the covariance between DCMLPSI values at different stages be zero is similar to the Kempthorne and Nordskog (1959) restriction imposed on the expected genetic gain per trait, which makes some traits not change their mean values while the rest of the trait means remain without restrictions. In effect, the DCMLPSI used a projector matrix (e.g., $\mathbf{K}_{Di}$) to project the OMLPSI vector of coefficients ($\mathbf{d}_i$) into a space smaller than the original space of $\mathbf{d}_i$, whereas Kempthorne and Nordskog (1959) used a projector matrix to project the single-stage LPSI vector of coefficients into a space smaller than the original space of the LPSI vector of coefficients. The reduction of the space into which the Kempthorne and Nordskog (1959) matrix projects the LPSI vector of coefficients is equal to the number of zeros that appear in the expected genetic gain per trait, and the selection response and accuracy decrease as the number of restrictions increases (Cerón-Rojas and Crossa, 2018, Chapter 3). Nevertheless, it is not clear whether under the Xu and Muir (1992) restrictions the expected genetic gain per trait, the selection response, and the accuracy decrease as the number of stages increases. If this is true, the Xu and Muir (1992) method could not give the breeder the opportunity to implement an unlimited number of stages, because the expected genetic gain per trait, the selection response, and the accuracy will decrease as the number of stages increases and soon would be null.
In the DMLPSI context,  compared the estimated single-stage LPSI selection response with the estimated DMLPSI selection response for two and three stages and found that at Stages 2 and 3, the estimated total DMLPSI selection response explained only 92 and 87%, respectively, of the estimated LPSI selection response. That is, at Stage 3, the estimated total DMLPSI selection response explained 5 percentage points less of the LPSI selection response than at Stage 2.

Another Way of Writing the OCMLPSI and DCMLPSI Vectors of Coefficients
We wrote the OCMLPSI and DCMLPSI vectors of coefficients ($\mathbf{b}_i = \mathbf{K}_{Oi}\mathbf{d}_i$ and $\mathbf{b}_i = \mathbf{K}_{Di}\mathbf{d}_i$, respectively) as projections of the OMLPSI vector of coefficients ($\mathbf{d}_i = \mathbf{Q}_{ii}^{-1}\mathbf{A}_i\mathbf{w}$) into a space that is perpendicular to the space generated by the columns of matrix $\mathbf{M}_i$ ($\mathbf{V}_i$). The projections are made by the projector matrices $\mathbf{K}_{Oi}$ and $\mathbf{K}_{Di}$, which are idempotent ($\mathbf{K}_{Oi} = \mathbf{K}_{Oi}^2$ and $\mathbf{K}_{Di} = \mathbf{K}_{Di}^2$). This is the simplest way of writing the OCMLPSI and DCMLPSI vectors of coefficients. However, there is another way of writing the OCMLPSI and DCMLPSI vectors of coefficients based on the Tallis (1985) approach.
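The idempotency and perpendicularity properties mentioned above can be verified in a few lines. The sketch below assumes, purely for illustration, that a projector of this kind has the generic form K = I − N with N = Σ⁻¹M(M′Σ⁻¹M)⁻¹M′, where Σ is a symmetric positive definite matrix and M has full column rank. The symbols Σ and N are ours; the exact definitions of the OCMLPSI and DCMLPSI projectors are those given in the equations of the paper and are not reproduced here.

```latex
\begin{align*}
\mathbf{N}^2
  &= \boldsymbol{\Sigma}^{-1}\mathbf{M}
     (\mathbf{M}'\boldsymbol{\Sigma}^{-1}\mathbf{M})^{-1}
     \underbrace{\mathbf{M}'\boldsymbol{\Sigma}^{-1}\mathbf{M}
     (\mathbf{M}'\boldsymbol{\Sigma}^{-1}\mathbf{M})^{-1}}_{=\,\mathbf{I}}
     \mathbf{M}'
   = \mathbf{N}, \\
\mathbf{K}^2
  &= (\mathbf{I}-\mathbf{N})^2
   = \mathbf{I} - 2\mathbf{N} + \mathbf{N}^2
   = \mathbf{I} - \mathbf{N}
   = \mathbf{K}, \\
\mathbf{M}'\mathbf{K}
  &= \mathbf{M}' - \mathbf{M}'\boldsymbol{\Sigma}^{-1}\mathbf{M}
     (\mathbf{M}'\boldsymbol{\Sigma}^{-1}\mathbf{M})^{-1}\mathbf{M}'
   = \mathbf{0}.
\end{align*}
```

Under this assumed form, the last line shows that any projected vector is orthogonal to every column of M, which is the sense in which the projection lands in a space perpendicular to the one generated by those columns.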
The Tallis (1985) approach requires a proportionality constant which, according to Itoh and Yamada (1987), represents the regression coefficient of the net genetic merit ($H = \mathbf{w}'\mathbf{g}$) on $\mathbf{Q}_{ii}$ where $\mathbf{d}_0$ is the DCMLPSI vector of predetermined restrictions. There are some problems associated with the proportionality constant. For example, if the proportionality constant is positive, it is appropriate for the DCMLPSI (OCMLPSI) vector of coefficients, and there is no problem; however, if the proportionality constant is negative, the indices will move the population means in the opposite direction to the predetermined desired direction.  developed a constrained multistage selection index as an extension of the DMLPSI developed by Xu and Muir (1992) based on the Tallis (1962) index. Using the Akbar et al. (1984) real data, we found that the  index was not optimum. The average of the estimated  selection response for p = 0.05, 0.10, 0.20, and 0.30 explained only 68.55% of the one-stage SCLPSI, whereas the average of the estimated total OCMLPSI and DCMLPSI index selection responses explained 97.60 and 90%, respectively, of the estimated SCLPSI. Similarly, the estimated total  expected genetic gain values per trait explained only 10.17% of the vector $\mathbf{d}' = [3 \;\; {-1}]$ values for both stages. These results indicated that, in effect, the  index is not optimum and breeders should not use it.

Another Constrained Multistage Index
Cerón-Rojas and Crossa (2018) applied the OCMLPSI to the Hicks et al. (1998) dataset, but they used the Young (1964) method to obtain the selection intensities for two stages; thus, their results were approximations because the Young (1964)

The Multivariate Normality Assumption of Both Indices
The multivariate normality assumption of the estimated OCMLPSI (DCMLPSI) values was the basis for developing the OCMLPSI (DCMLPSI) theory. Under this assumption, the total OCMLPSI selection response and expected genetic gain per trait for two or more stages are the sums of the responses and expected genetic gains per trait obtained at each stage. We corroborated the normality assumption at Stage 2 using histograms and normality tests. When the number of genotypes at Stage 1 was 500 and the total proportion retained was 5 or 30%, the estimated OCMLPSI values at Stage 2 approached the normal distribution in a similar manner to the estimated DCMLPSI values. These results indicate that the correlations between the estimated OCMLPSI values did not affect the normality of the estimated OCMLPSI values, at least for the simulated dataset. This suggests that when the population at Stage 1 is large (e.g., 500 or more), the correlations between the estimated OCMLPSI values do not noticeably affect the normality of the estimated OCMLPSI values in a two-stage context.

CONCLUSIONS
We described the OCMLPSI and DCMLPSI theory and evaluated it in a two-stage context. Based on the estimated total selection response and the total expected genetic gain per trait of each index, we determined their efficiency using a real and a simulated dataset. We found that the OCMLPSI is the most efficient index for predicting the net genetic merit, and its accuracy for predicting the selection response and estimating the expected genetic gain per trait was higher than DCMLPSI accuracy for predicting the selection response and estimating the expected genetic gain per trait. Thus, breeders should use the OCMLPSI when making a selection, not the DCMLPSI.

APPENDIX A

The Phenotypic Model to Estimate the Variance Components
In this work, we estimated matrices P and G (Eq. [1a]) using restricted maximum likelihood (REML) because this estimation method does not require a specific design or balanced data and can be used to estimate genetic and residual variance and covariance in any arbitrary pedigree of individuals. In addition, the expectation and maximization algorithm allows computing the REML for the variance components (Lynch and Walsh, 1998, Chapter 27; Cerón-Rojas and Crossa, 2018, Chapter 2). Let $\mathbf{y}_q = \mathbf{1}\mu_q + \mathbf{Z}\mathbf{g}_q + \mathbf{e}_q$ be the phenotypic model, where $\mathbf{y}_q$ is a $g \times 1$ ($g$ = the number of genotypes in the population) vector of phenotypic averages, which has a multivariate normal distribution (NMV) with mean $\mathbf{1}\mu_q$ and covariance matrix $\mathbf{V}_q$; $\mathbf{1}$ is a $g \times 1$ vector of ones, $\mu_q$ is the mean of the qth trait, $\mathbf{Z}$ is a $g \times g$ identity matrix; $\mathbf{g}_q \sim$ NMV(0,

Estimating Matrices G and P using the Expectation and Maximization Algorithm
The expectation and maximization algorithm allows computing the REML for the variance components $\sigma^2_{g_q}$ and $2\mathbf{C}_{iq} = 2\mathbf{A}\sigma_{g_{iq}} + 2\mathbf{I}\sigma_{e_{iq}} = 2\mathrm{Cov}(\mathbf{y}_i, \mathbf{y}_q)$ is the covariance of $\mathbf{y}_q$ and $\mathbf{y}_i$, and $\sigma_{g_{iq}}$ and $\sigma_{e_{iq}}$ are the additive and residual covariances, respectively, associated with the covariance of $\mathbf{y}_q$ and $\mathbf{y}_i$. Thus, one way of estimating $\sigma_{g_{iq}}$ and $\sigma_{e_{iq}}$ is using the following equation: and