Volume 2, Issue 1 190003 p. 1-8
Original Research
Open Access

Lidar and RGB Image Analysis to Predict Hairy Vetch Biomass in Breeding Nurseries

Nicholas P. Wiering

Dep. of Agronomy and Plant Genetics, Univ. of Minnesota, 1991 Upper Buford Circle, St. Paul, MN, 55108

Corresponding author ([email protected]).

Nancy J. Ehlke

Dep. of Agronomy and Plant Genetics, Univ. of Minnesota, 1991 Upper Buford Circle, St. Paul, MN, 55108

Craig C. Sheaffer

Dep. of Agronomy and Plant Genetics, Univ. of Minnesota, 1991 Upper Buford Circle, St. Paul, MN, 55108

First published: 01 August 2019


Core Ideas

  • Early-season biomass is conventionally phenotyped by subjective visual estimates.
  • RGB image data are highly predictive of biomass in hairy vetch breeding plots.
  • Lidar and RGB image data can be combined to accurately predict sward biomass.
  • Remote sensing could increase genetic gain potential for biomass in cover crops.

Abstract

Hairy vetch (Vicia villosa Roth) is an annual legume grown as a forage and cover crop. To improve cover crop function, traits such as biomass production are of high interest to cover crop breeders. However, direct phenotypic methods for biomass production are destructive. Breeders have thus relied on subjective, visual scoring of biomass, which is generally correlative but not quantitative or absolute. We evaluated two low-cost remote sensing tools, lidar and red–green–blue (RGB) image analysis, for their potential to predict biomass in vivo. We evaluated these tools in two common forage breeding scenarios, spaced-plant and sward-plot nurseries, at three Minnesota locations following the winter of 2016–2017. Ground cover from RGB image binarization had a significant and linear relationship with aboveground biomass in spaced plants (R2 = 0.93) and sward plots (R2 = 0.89). However, once the image area from sward plots became saturated with vegetative pixels, the relationship became near-exponential. Because of the prostrate growth habit of hairy vetch, RGB image analysis was more appropriate at lower plant densities, such as in spaced-plant nurseries. Conversely, the dimensionality of lidar sensing gave it greater predictive ability at higher plant densities, where RGB analysis could not detect vertical increases in biomass. Lidar measures of sward-plot height were also linearly and strongly related to dry-matter biomass in sward plots (R2 = 0.80). When we combined RGB and lidar data to predict sward-plot biomass in a multiple mixed-effect regression model, we were able to explain more biomass variation than with either phenotypic tool as a single predictor (R2 = 0.94).


Abbreviations: DM, dry matter; RGB, red–green–blue.

    Winter annual legume cover crops, such as hairy vetch, provide soil coverage to stabilize the topsoil and provide biologically fixed N2 for use by subsequent crops in a rotation (17). Hairy vetch has exhibited high tolerance to winter stress (8; 43) and high biomass production (7; 39), equating to greater fixed N2 potential. Breeding efforts are ongoing to improve both the winter survivability and biomass yield of hairy vetch, an outcrossing species, for cover cropping in the midwestern United States (43). Plant breeders have traditionally relied on visual approximations to phenotype biomass in vivo (11) because it is nondestructive and generally time efficient.

    Forage crop breeders often phenotype biomass in two contrasting nursery designs at various cycles within their breeding programs (11). The first is spaced-plant nurseries, where segregating genotypes are evaluated individually in a grid or lattice design, usually separated by 0.3 to 1.0 m in any direction. Spaced-plant designs, without a companion species, do not incorporate plant-to-plant competition that exists at higher plant densities, such as in a production setting. However, they do enable the evaluation and selection of individual genotypes. Spaced-plant nurseries can also provide greater selection pressure for winter-hardiness, as plants are more susceptible without the insulating effect of neighboring cover. Conversely, sward plots encompass many genotypes within a given area (1–10 m2) at planting densities that reflect the end use of the crop. These plots can enable the evaluation of heterogeneous breeding populations and have been shown to provide greater heritability potential than spaced plants for biomass improvement (6; 30).

    Direct measures of biomass, which are destructive, limit germplasm advancement for crop improvement if genotypes cannot be maintained by alternative means. Although visual estimates of biomass have exhibited moderate to high correlations with fresh biomass weights for various forage species (38; 33), these measures are still categorical and subjective. To optimize gains in a legume cover crop such as hairy vetch, breeders would benefit from quantitative, reproducible, yet nondestructive measures of ground cover and biomass. Two potential approaches to nondestructively measure ground cover and biomass are lidar and red–green–blue (RGB) image analysis.

    Lidar is a remote sensing system that sends and receives a pulsed near-infrared laser signal (23). The time necessary for this signal to reflect from a surface and return to the sensor determines the distance between the sensor and surface. By scanning across a physical area with multiple sensors or by grid-serpentine sampling with a single sensor, a three-dimensional point cloud can be obtained. Lidar has demonstrated accurate predictions (R2 > 0.80) for crop density in wheat (Triticum aestivum L.; 36; 18); crop height in Miscanthus giganteus (J.M. Greef & Deuter ex Hodkinson & Renvoize; 44), rice (Oryza sativa L.; 40), and maize (Zea mays L.; 20); canopy volume in tree species (34); and biomass in maize (20), alfalfa (Medicago sativa L.; 24), and wheat (19). Numerous other forage crops have exhibited moderate to high correlations between lidar and biomass (15; 37; 16).
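
    The time-of-flight principle described above reduces to a one-line calculation. The following is a minimal sketch, not the sensor's firmware: the function name and the use of the vacuum speed of light are assumptions made for illustration.

```python
# Speed of light; the vacuum value is assumed here because the difference
# in air is negligible over a distance of about 1 m.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_round_trip(round_trip_seconds: float) -> float:
    """Return the one-way distance (m) for a pulse that travels out and back."""
    if round_trip_seconds < 0:
        raise ValueError("round-trip time cannot be negative")
    # The pulse covers the sensor-to-surface distance twice, hence the division by 2.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


# A surface 1 m away returns the pulse after roughly 6.67 nanoseconds.
round_trip = 2 * 1.0 / SPEED_OF_LIGHT_M_PER_S
print(f"{distance_from_round_trip(round_trip):.3f} m")
```

    The nanosecond scale of the round trip is why distance resolution depends on the timing precision of the sensor electronics.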

    An RGB image analysis involves extracting red, green, and blue color values from individual pixels within an image and numerically or spatially analyzing the data. Using these raw data, numerous computational methods can partition pixels as vegetative or non-vegetative, a process referred to as binarization. The proportion of pixels classified as vegetative can then be used as a metric for fractional vegetation cover, ground cover, canopy cover, leaf area index, light interception, etc. This method is now a conventional practice for determining vegetative cover in field crop research. It has provided accurate measures of canopy cover in tomato (Solanum lycopersicum L.; 10), soybean [Glycine max (L.) Merr.; 28], wheat (25), and various turf species (32). However, predicting aboveground biomass with canopy cover information has shown varied success. A previous study (35) compared RGB-derived canopy cover to biomass measures in nine cover crop species and found highly variable correlation values. The strongest positive correlation was found in Lupinus albus L. (white lupine) and Vicia sativa L. (common vetch; R2 = 0.76). Poor predictions for biomass in certain species in that study, such as Lens culinaris Medik. (lentil; R2 = 0.07) and Pisum sativum L. (pea; R2 = 0.06), may be due to the difficulty in capturing vertical growth with RGB image data. Methods do exist, however, to derive three-dimensional information from sequentially taken two-dimensional images using techniques known as Structure from Motion photogrammetry (41). This technique has shown moderate to high prediction accuracies for biomass in various plant species (13; 9; 1), but that topic was not a focus of this study.
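
    Binarization and the resulting ground cover percentage can be illustrated with a small sketch. It classifies a pixel as vegetative when its red/green and blue/green ratios are low and its excess green index (2G − R − B) is high. The three threshold values, the function names, and the tiny example image are assumptions chosen for illustration and would need tuning for a given camera and crop.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255

# Assumed thresholds for illustration only.
MAX_RED_GREEN_RATIO = 0.95
MAX_BLUE_GREEN_RATIO = 0.95
MIN_EXCESS_GREEN = 20


def is_vegetative(pixel: Pixel) -> bool:
    """Classify one pixel as vegetative (True) or non-vegetative (False)."""
    red, green, blue = pixel
    if green == 0:
        return False  # avoids division by zero; a pixel with no green is not vegetation
    excess_green = 2 * green - red - blue
    return (
        red / green < MAX_RED_GREEN_RATIO
        and blue / green < MAX_BLUE_GREEN_RATIO
        and excess_green > MIN_EXCESS_GREEN
    )


def ground_cover_percent(image: List[List[Pixel]]) -> float:
    """Percentage of all pixels in the image classified as vegetative."""
    total = sum(len(row) for row in image)
    if total == 0:
        raise ValueError("image has no pixels")
    vegetative = sum(is_vegetative(p) for row in image for p in row)
    return 100.0 * vegetative / total


# Hypothetical 2 x 2 image: two leaf pixels, one soil pixel, one white pixel.
image = [
    [(40, 160, 30), (120, 100, 80)],
    [(250, 250, 250), (60, 140, 50)],
]
print(ground_cover_percent(image))
```

    Because the result is a fraction of image area, it cannot exceed 100% however much biomass accumulates above a closed canopy.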

    In the previous decade, considerable effort has gone toward improving the accuracy, precision, and throughput of phenotypic methods for crop species (2), shaping the phenomics movement as we know it. Such improvements could increase the genetic gain per unit of time, which is generally the goal for breeding programs. The remote sensing technologies utilized in such high-throughput phenotyping schemes, including unmanned aerial vehicles and hyperspectral cameras, can enable the efficient capture of a wealth of information from any specified wavelengths that may correlate with a trait of interest. Such equipment can be costly, however, and requires vast computational power and expertise to process the data.

    Lidar and RGB image analytics are both relatively low-cost remote sensing options for plant or plot measures. Currently, a single lidar sensor can be acquired for less than US$120. In comparison, RGB analysis can be executed on image files from even the most basic of digital cameras. Neither of these technologies has been evaluated for efficacy as a phenotypic tool for the improvement of hairy vetch. If the application proves successful for biomass estimation, breeders could cull undesirable phenotypes before the onset of flowering. This ability to control pollen flow, coupled with greater phenotypic accuracy, could cost-effectively increase the genetic gain per unit of time for biomass.

    Hairy vetch has a prostrate growth habit at early vegetative stages. As such, lidar would probably not provide meaningful biomass estimates if the plant samples were flat against the soil surface. Ground cover determined from image analysis, however, would be an intuitive option for biomass estimation of a seemingly two-dimensional subject. To our knowledge, these remote sensing technologies have not been evaluated individually, or in combination, to estimate the biomass of a plant species with a growth habit comparable to hairy vetch.

    Our objectives were to: (i) determine winter survival differences, if any, between spaced-plant and sward-plot nursery environments; (ii) evaluate the relationship between biomass and ground cover from RGB image analytics in both spaced plants and sward plots; (iii) evaluate the relationship between biomass and lidar-derived measures of plot height in sward plots; and (iv) evaluate a linear model that uses both RGB-derived ground cover and lidar as covariates to predict biomass in sward plots.

    Materials and Methods

    Hairy vetch germplasm was seeded in the first week of September of 2016 at Becker (45.39° N, 93.89° W), Rosemount (44.68° N, 93.07° W), and St. Paul (45.00° N, 93.17° W), MN. The respective soil types at these locations are Hubbard–Mosford loamy sand (sandy, mixed, frigid Entic Hapludolls), Tallula silt loam (coarse-silty, mixed, superactive, mesic Typic Hapludolls), and Waukegan silt loam (fine-silty over sandy or sandy-skeletal, mixed, superactive, mesic Typic Hapludolls). The Becker location, however, experienced near-complete stand loss due to winter stress and so was excluded from the majority of analyses. The experiment was a split-plot design, with planting arrangement as the whole-plot factor and accession as the subplot factor. Each location included four blocks containing all treatments. The six hairy vetch accessions included a commercial check (Purple Prosperity), early- and late-flowering commercial variety-not-stated (VNS) accessions (labeled V15 and V07, respectively), and three experimental breeding populations from the University of Minnesota. All of these were open-pollinated accessions (composite cultivars) with comparable growth characteristics but variable winter-hardiness scores. Planting arrangements included spaced plants (1 seed m−2) and sward plots (36 seeds m−2). Emergence was noted 1 mo after planting. At spring green-up (April), winter survival was assessed on an individual plant basis.

    At the first sign of flowering (25 May), a common timing for cover crop termination, spaced plants and sward plots were imaged using an Olympus TG-2 digital camera with factory settings. Images were captured from a fixed 2-m height. A white polyvinyl chloride reference quadrat was placed at ground level around each plot. Images were cropped to the margins of the reference quadrat and were analyzed using the well-established Canopeo application (27) in MATLAB (The Mathworks). This program determines ground cover percentage per image area based on a vegetative pixel selection criterion (Fig. 1) calculated from ratios of red/green and blue/green (26) and the excess green index (31). Lidar measurements of sward plots were recorded the same day that digital images were captured. A single LiDAR-Lite (v2; PulsedLight, Inc.) sensor, which operated at a wavelength of 905 nm, was used to obtain measures of vegetative height (cm). Sensing was initialized using an Arduino Uno circuit board, and the data were recorded with a tablet computer. A custom platform was constructed to facilitate measurement from a fixed height (1 m) in a consistent pattern that would best represent the square-meter sward-plot area (Fig. 2). The lidar sensor was manually moved along a wire rail on our mobile platform. The rail pattern on our platform consisted of five diagonal crossings of the plot area in zigzag fashion. Approximately 30 pulsed samples from the discrete return lidar system were recorded within each diagonal section. The distance from the sensor to the plant material was subtracted from 100 cm (the height of the platform) to determine the vegetative height at that particular point. A total plot mean was obtained by averaging the mean vegetative heights of the five diagonal sections (Fig. 3).
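
    The height calculation described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' processing script: the function names and example readings are hypothetical, and clamping negative heights to zero (to absorb sensor noise over bare soil) is an added assumption.

```python
from statistics import mean
from typing import List

PLATFORM_HEIGHT_CM = 100.0  # sensor height above the soil surface


def vegetative_heights(distances_cm: List[float]) -> List[float]:
    """Convert sensor-to-canopy distances (cm) into heights above the soil (cm)."""
    # Assumption: a reading slightly longer than the platform height is treated
    # as bare soil (height 0) instead of a negative height.
    return [max(0.0, PLATFORM_HEIGHT_CM - d) for d in distances_cm]


def plot_mean_height(sections_cm: List[List[float]]) -> float:
    """Mean of the per-section mean heights, one section per diagonal pass."""
    section_means = [mean(vegetative_heights(section)) for section in sections_cm]
    return mean(section_means)


# Hypothetical distance readings (cm) for five diagonal sections; a real
# section would hold roughly 30 readings.
sections = [
    [80.0, 82.0, 78.0],
    [75.0, 77.0],
    [90.0, 88.0, 92.0],
    [70.0, 72.0],
    [85.0, 85.0, 85.0],
]
print(round(plot_mean_height(sections), 1))
```

    Averaging the section means, instead of pooling all readings, gives each diagonal pass equal weight even when the number of readings per pass differs.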

    Within 24 h of lidar recording and image capture, the aboveground biomass was harvested from each plot and immediately weighed. A subset of material from each planting arrangement was dried by forced air (35°C) for final dry matter (DM) weight determination.

    Fig. 1. An RGB image binarization of spaced-plant hairy vetch from the Canopeo software application.

    Fig. 2. Custom platform for measuring plant height of sward plots via lidar sensing: (A) data collection system for plant height estimates; (B) 1-m-tall polyvinyl chloride platform to systematically subsample the 1-m2 sward-plot area; and (C) the lidar sensor was manually guided along a wire rail while data points were simultaneously collected.

    Fig. 3. Lidar sensor collection pattern: (A) the lidar sensor was manually moved along a wire rail, indicated by the diagonal black lines, to effectively represent vegetative height within the sward plot area; (B) equation to derive mean vegetative height of the sward plot.

    Statistical Methods

    The data were analyzed in the R statistical environment (v3.4.2; 29). Winter survival variation between the spacing treatments was evaluated using a generalized linear model with a logit link. To evaluate the relationship between RGB-derived ground cover and DM biomass for spaced plants, a linear mixed-effect model was fit with the lme4 package (5) using square-root transformations to account for variance heterogeneity:

    Equal variance was only achieved when both the response and fixed predictor were square-root transformed. To allow comparisons among models by marginal R2 values, the response variable DM biomass was transformed across all model fits.

    To determine the relationship between RGB-derived ground cover and DM biomass in sward plots, a piecewise regression model was fit (12), combining two separate linear mixed-effect models. The breakpoint (94% ground cover) was determined using the segmented package (21). Only the St. Paul location had ground cover values exceeding 94%. The segmented mixed-effect model was defined as:

    To evaluate the relationship between lidar-derived vegetative height and DM biomass of sward plots, a linear mixed-effect model was fit:

    Lastly, a multiple linear regression model was fit using both lidar height and RGB-based ground cover as explanatory variables:
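
    One way to write the four models, using the fixed effects listed in Table 1, is sketched below. The notation is an assumption made for illustration and is not the authors' own: DM is dry matter biomass, GC is RGB-derived ground cover (%), H is lidar-derived height (cm), and the random terms for location ($l_j$) and block within location ($b_{k(j)}$) are an assumed random-effect structure. The use of $\sqrt{\mathrm{GC}}$ in the upper segment of Model B is likewise assumed.

```latex
\begin{align}
\text{A (spaced plants):}\quad
  \sqrt{\mathrm{DM}_{ijk}} &= \beta_0 + \beta_1 \sqrt{\mathrm{GC}_{ijk}}
    + l_j + b_{k(j)} + \varepsilon_{ijk} \\
\text{B (sward plots):}\quad
  \sqrt{\mathrm{DM}_{ijk}} &=
  \begin{cases}
    \beta_0^{(1)} + \beta_1^{(1)} \sqrt{\mathrm{GC}_{ijk}}
      + l_j + b_{k(j)} + \varepsilon_{ijk}, & \mathrm{GC}_{ijk} \le 94\% \\
    \beta_0^{(2)} + \beta_1^{(2)} \sqrt{\mathrm{GC}_{ijk}}
      + l_j + b_{k(j)} + \varepsilon_{ijk}, & \mathrm{GC}_{ijk} > 94\%
  \end{cases} \\
\text{C (sward plots):}\quad
  \sqrt{\mathrm{DM}_{ijk}} &= \beta_0 + \beta_1 H_{ijk}
    + l_j + b_{k(j)} + \varepsilon_{ijk} \\
\text{D (sward plots):}\quad
  \sqrt{\mathrm{DM}_{ijk}} &= \beta_0 + \beta_1 H_{ijk}
    + \beta_2 \sqrt{\mathrm{GC}_{ijk}} + l_j + b_{k(j)} + \varepsilon_{ijk}
\end{align}
```

    In each case the response is on the square-root scale, so a fitted value must be squared to return to the original biomass units.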

    Spearman's correlation coefficient was used to evaluate the strength of the relationship between the ranked variables for each phenotypic approach by location. Wald's chi-square test was used to evaluate the significance of any one predictor in the fitted model. Marginal R2 values, calculated according to the method in (22) with the MuMIn package (4), were used to determine the amount of biomass variation explained by the fixed predictor variable in each fitted mixed-effect model. Graphical representations of the data were made on the original scale for biological interpretation.
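
    Spearman's coefficient is the Pearson correlation computed on ranks, which is why it captures monotonic relationships that are not linear. The sketch below is a plain-Python illustration and not the R code used in the study; the function names and example values are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over every value tied with the one at `start`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical ground cover (%) and biomass values, for illustration only.
ground_cover = [12, 25, 40, 58, 71]
biomass = [3.1, 8.4, 7.9, 21.0, 30.2]
print(round(spearman_rho(ground_cover, biomass), 2))
```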


    Results

    Winter survival was highly variable across the three Minnesota locations (Becker, 4.0%; St. Paul, 68.4%; Rosemount, 41.5%). Plants in sward plots had significantly greater winter survival than spaced plantings (p = 0.001). Spearman's rank correlation coefficient (ρ) between spaced-plant and sward-plot biomass, according to hairy vetch accession, was 0.63 (p < 0.001). Survival percentage for hairy vetch accessions was positively correlated with DM biomass on an individual plant basis at the time of harvest (r = 0.55; p < 0.001), which probably contributed to the positive correlation between spaced-plant and sward-plot biomass.

    Vegetative ground cover of spaced plants, determined from RGB image analysis, had a strong and linear relationship with DM biomass at the time of imaging (ρ = 0.97 in St. Paul; ρ = 0.95 in Rosemount; Fig. 4). Ground cover was a significant predictor of biomass in the mixed-effect model, explaining 93% of the variation in DM biomass (p < 0.001; R2 = 0.93). Coefficient estimates for fixed effects for all fitted models can be found in Table 1.

    Table 1. Estimated regression coefficients for fixed effects among biomass prediction models.

    Model  Plot type     Fixed effect         Estimate  SE    95% lower limit  95% upper limit  t value
    A      spaced plant  √(RGB canopy cover)  1.43      0.03  1.36             1.50             41.79***
    B      sward plot    √(RGB canopy cover)  1.66      0.07  1.52             1.80             24.18***
    C      sward plot    lidar height         1.03      0.07  0.89             1.17             15.11***
    D†     sward plot    lidar height +       0.72      0.07  0.57             0.86             9.96***
                         √(RGB canopy cover)  1.08      0.17  0.74             1.42             6.42***

    • *** Significant at a level of 0.001.
    • † Coefficient estimates from a multiple regression model.

    Ground cover on sward plots exhibited a strong and segmented relationship with DM biomass (ρ = 0.97 in St. Paul; ρ = 0.98 in Rosemount; Fig. 5). The relationship was linear up to approximately 94% ground cover. As the image area became saturated with vegetative pixels (>94%), DM biomass increased sharply in a near-exponential manner. For ground cover values <94%, RGB-based ground cover explained 89% of the variation in DM biomass (p < 0.001; R2 = 0.89). For ground cover values >94%, RGB-based ground cover explained 76% of the variation (p < 0.001; R2 = 0.76).

    Fig. 4. Relationship between dry matter (DM) biomass and RGB-derived ground cover of spaced plants. Points and linear regression lines are plotted by location (RSM = Rosemount; STP = St. Paul, MN).

    Lidar approximations of sward-plot vegetative height had a strong and linear relationship with DM biomass (ρ = 0.93 in St. Paul; ρ = 0.91 in Rosemount; Fig. 6). From the mixed-effect model, lidar-derived height (cm) explained a significant amount of variation in DM biomass (p < 0.001; R2 = 0.80). When both RGB-derived ground cover and lidar-derived vegetative height were added as covariates in a multiple regression model, the marginal R2 value increased from 0.80 to 0.94.

    Fig. 5. Relationship between dry matter (DM) biomass and RGB-derived ground cover in sward plots. Points and linear regression lines are plotted by location (RSM = Rosemount; STP = St. Paul, MN).


    Discussion

    Predicting Biomass of Spaced Plants

    This study demonstrates the usefulness of RGB image analysis and lidar for nondestructive estimates of spring biomass in hairy vetch within two common forage breeding scenarios. The first scenario is spaced-plant selection nurseries, where breeders evaluate and select individual genotypes. For overwintering plants, this environment makes plants more vulnerable to winter-kill because no nearby vegetation or residue is present to catch snow or provide insulation. The increased survival of hairy vetch in sward plots supports this notion.

    Fig. 6. Relationship between dry matter (DM) biomass and lidar-derived vegetative height in sward plots. Points and linear regression lines are plotted by location (RSM = Rosemount; STP = St. Paul, MN).

    Hairy vetch has a prostrate morphology early in its vegetative life. Unless competing for light with neighboring plants, hairy vetch will grow horizontally while indeterminately producing vine-like stems from its central crown. This growth seems to continue until the ground in the vicinity is saturated with photosynthetic tissue. Hairy vetch plants then invest in vertical growth, aided by tendrils, which provide support for their flimsy stems. Spaced plants were imaged and harvested just prior to the onset of this vertical growth. Intuitively, the image-derived ground cover correlated highly with the harvested biomass of the prostrate plants. Because of this early-growth characteristic of hairy vetch, lidar measurements would probably not provide accurate estimates of vegetative height or biomass for spaced plants.

    Predicting Biomass of Sward Plots

    The next breeding scenario we investigated was sward plots. This environment is more reflective of eventual cultivar use and encompasses competition among plants, which is essentially absent in spaced-plant nurseries. Due to the increased competition in this environment, plants take to vertical growth sooner than they would without competition. In our study, image-based ground cover of sward plots was linearly related to harvested biomass up to a point, approximately 94% ground cover. Biomass then increased near-exponentially until the image area was virtually saturated with vegetative pixels. Increases in biomass beyond this point cannot be predicted accurately with RGB analysis because the data cannot account for vertical growth.

    Lidar sensing is more appropriate when predicting biomass at high plant densities in sward plots. We found the relationship between harvested biomass and lidar-derived vegetative height to be linear and highly correlative. Early vigor and biomass production are sought-after traits for hairy vetch, but as an N2–fixing plant, the final biomass at the time of termination is additionally important because it generally equates to the amount of N fixed and stored in the tissue. To make predictions during this critical period, lidar sensing would be more meaningful than RGB-based ground cover.

    Multiple Regression Analysis with RGB and Lidar Data

    In addition to comparing lidar and ground cover for biomass predictions in sward plots, we fit a model including both as covariates. By combining them, we were able to account for more variation in our response variable than we could with either variable as a lone predictor, as evidenced by the marginal coefficients of determination. The RGB-based ground cover was a better predictor of lower sward-plot biomass values when plants were still prostrate or the canopy had yet to close. Complementing this, lidar sensing was more predictive at higher sward-plot biomass values, post canopy closure.

    Considerations and Limitations

    Aside from the indirect measures of in situ biomass discussed, the mass of dried aboveground biomass at the time of physiological maturity (i.e., seed harvest) is often used as a proxy for the maximum amount of biomass produced per plant or per plot. Hairy vetch maturation and senescence can be highly indeterminate, however. Some determinate-like genotypes readily senesce and drop foliage shortly after seedpods are mature, while some genotypes flower for an extended period of time, where an individual plant can simultaneously be vegetative, flowering, and have mature seedpods among nodes of the same stem. In a preliminary evaluation of RGB-based image analysis in a spaced-plant hairy vetch nursery at Rosemount, MN, in 2016 (unpublished data), images were taken of spaced plants just prior to the onset of flowering (imaged 15 May), when plant biomass production is approaching its maximum. The RGB-derived ground cover values were compared with DM biomass at seed maturity (19 July). The RGB-based ground cover values at imaging time explained virtually none of the variation in DM biomass at seed maturity (R2 = 0.016). A selection scheme where individuals are selected based on DM biomass at physiological maturity would probably favor the selection of indeterminate genotypes, which are not necessarily desirable depending on breeding objectives. Given this comparison, we would not recommend using direct measures of DM biomass at the time of seed maturity as a proxy for in-season biomass production.

    As is evident in Fig. 4 to 6, location had a significant effect on accrued biomass at the time of sampling. The differences in winter survival between the Rosemount and St. Paul locations probably explain most of this variation. Rosemount had lower survival than St. Paul, and this lower survival was accompanied by more winter injury. The higher winter stress and subsequent injury probably explain the lower biomass values obtained from spaced plants in Rosemount.

    RGB image analysis and lidar sensing can provide low-cost, quantitative, and objective estimates of biomass at critical periods of plant or stand development of hairy vetch, but each technology has limitations. If spaced or sward plants were grown with a companion crop, or if weed presence was substantial, simple binarization of pixels based on RGB values would not be useful. More advanced computational techniques would be necessary to filter out the presence of other species (14; 3). The same scenarios would complicate lidar sensing as well, however, as no downstream analysis could determine whether height data points were due to a companion or weedy species.


    Conclusions

    Obtaining reliable and nondestructive estimates of vegetative biomass in a cover crop breeding program, for both spaced and sward plots, is necessary to increase genetic gain potential. Traditionally, breeders have relied on subjective visual scores to approximate vigor and biomass. The prostrate growth habit of hairy vetch at an early vegetative stage makes it a favorable crop for soil coverage and makes phenotyping the biomass at this stage an easy target for RGB image analysis. Our results demonstrated a strong relationship between biomass and image data captured from a consumer-grade digital camera for spaced plants and sward plots. Lidar measures were more promising for biomass estimation as sward-plot densities increased. Predictions were most accurate in sward plots when using both RGB-derived ground cover and lidar-derived height as covariates in a multiple regression model. In closing, these established technologies provide accurate and quantitative estimates of vegetative biomass, a prioritized trait for cover crop species, in two commonly used breeding nursery designs.

    Data Availability

    Data are available from the Dryad Digital Repository (42).


    Acknowledgments

    Funding for this study was provided through the Agricultural Growth, Research, and Innovation Program sponsored by the Minnesota Department of Agriculture. We additionally want to thank Austin Dobbels and Dr. Reagan Noland for technical assistance devising the lidar–Arduino data collection system.