Volume 51, Issue 3 p. 377-388
TECHNICAL REPORT
Open Access

Effect of water sampling strategies on the uncertainty of phosphorus load estimation in subsurface drainage discharge

Babak Dialameh

Dep. of Biosystems and Agricultural Engineering, Michigan State Univ., East Lansing, MI, 48824 USA

Contribution: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Visualization, Writing - original draft

Ehsan Ghane

Corresponding Author

Dep. of Biosystems and Agricultural Engineering, Michigan State Univ., East Lansing, MI, 48824 USA

Correspondence

Ehsan Ghane, Dep. of Biosystems and Agricultural Engineering, Michigan State Univ., East Lansing, MI, 48824, USA.

Email: [email protected]

Contribution: Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Writing - review & editing

First published: 23 February 2022
Citations: 3

Assigned to Associate Editor Wei Zheng.

Abstract

Accurate phosphorus (P) load estimation in subsurface drainage water is critical to assess the field-scale efficacy of conservation practices. The HydroCycle-PO4 instrument measures real-time total reactive P (TRP) concentration without the need for sample filtration, thereby enabling comparative evaluation of different sampling strategies. The main objective of this study was to evaluate the effects of water sampling strategies on the uncertainty of P load estimation. Hourly TRP concentration and hourly drainage discharge measurements formed the reference P load dataset. Four hypothetical water sampling strategies were evaluated: (a) time-proportional discrete sampling, (b) time-proportional composite sampling, (c) flow-proportional discrete sampling, and (d) flow-proportional composite sampling. All sampling strategies underestimated TRP load compared with the reference dataset. Total reactive P load underestimation changed from 0.2 to 51% as time-proportional discrete sampling intervals increased from 3 h to 14 d. Total reactive P load underestimation changed from 12 to 43% as the time-proportional compositing scenario increased from 1 to 7 d, each with one aliquot per day. In the flow-proportional discrete sampling scenario, the lowest (0.6%) and the highest (–5.1%) uncertainties were observed when 1- and 5-mm flow intervals were used. The relative error based on the results provided by the flow-proportional composite sampling ranged from 0.2% when using 1-mm flow interval to –6.7% when using 5-mm flow interval. In conclusion, the flow-proportional sampling strategies provided a more accurate estimate of cumulative P load with fewer samples because a greater portion of samples were taken at higher flow rates compared with time-proportional sampling strategies.

Abbreviations

  • DRP: dissolved reactive phosphorus
  • TRP: total reactive phosphorus

    1 INTRODUCTION

    Subsurface drainage has enhanced economical crop production in poorly drained soils (Evans & Fausey, 1999). However, drainpipes can rapidly transfer nutrients to surface water bodies (Fausey et al., 1995; Ghane & Askar, 2021). Phosphorus (P) is a key nutrient transported by subsurface drainage that causes eutrophication and overstimulates the growth of harmful algal blooms (Chen et al., 2018; Ding et al., 2018; Wilson et al., 2019). There is an urgent need to reduce P transport from subsurface-drained fields to downstream water bodies. Edge-of-field monitoring of conservation practices is the first step to develop strategies that reduce P load from subsurface drainage discharge. However, edge-of-field monitoring requires efficient water sampling strategies to accurately calculate P load (Daniels et al., 2018; Dantas Mendes, 2020; Harmel et al., 2018).

    The water sampling strategy affects uncertainty in P load estimates (Birgand et al., 2011; Moatar et al., 2013; Shih et al., 1994). There are different water sampling strategies for drainage discharge (e.g., water sampling on a daily, weekly, fortnightly, and monthly basis), which are selected depending on budget and equipment limitations (Supplemental Table S1). Long sampling intervals might be sufficient for some indicators, contaminants, or purposes, but they can fail to account for short-term P variability (Bowes et al., 2015; Fones et al., 2020). Long sampling intervals fail to capture P concentration fluctuations, particularly during precipitation events when drainage discharge rapidly rises and recedes (Luo et al., 2012). Selecting the most appropriate sampling strategy is essential for accurate P load calculation in drainage discharge with temporal P variations (Villa et al., 2019). Therefore, there is a need to evaluate the uncertainty associated with different water sampling strategies in estimating P load in drainage discharge.

    For subsurface drainage application, we found only one study (Williams et al., 2015) that had investigated the effect of sampling strategies on the accuracy of dissolved reactive P (DRP) load estimation in drainage discharge. The authors used sub-daily (2–4 h) intervals during storm events at some sites; other sites were sampled daily, and the daily concentrations were interpolated to estimate hourly DRP concentrations. The authors recommended a minimum time interval to accurately estimate the annual DRP load when using a time-proportional sampling strategy. However, to our knowledge, there has been no study that has evaluated the accuracy of P load estimation in drainage discharge using flow-proportional sampling strategies. Flow-proportional sampling strategies are commonly used at edge-of-field monitoring. Therefore, there is a need to evaluate the accuracy of P load estimation of flow-proportional sampling strategies.

    In situ sensors have been developed for measuring high-frequency nutrient concentration. Liu et al. (2020) used such sensors for measuring high-resolution nitrate concentration, but we did not find any study that had used high-frequency P sensors in subsurface drainage application.

    This study aimed to evaluate the effects of different sampling strategies on P load estimation uncertainty: time-proportional discrete, time-proportional composite, flow-proportional discrete, and flow-proportional composite sampling. The value of this study is that it informs decision-making about the suitable sampling strategies needed to evaluate the P-reducing benefits of conservation practices.

    Core Ideas

    • The purpose of the monitoring project should dictate the sampling strategy.
    • All sampling strategies underestimated TRP load compared with the reference dataset.
    • An increase in time-proportional discrete sampling interval increased TRP load error.
    • The flow-proportional sampling strategies needed fewer samples.
    • The flow-proportional sampling strategy was more advantageous than other strategies.

    2 MATERIALS AND METHODS

    2.1 Study site

    This study was conducted on a private farm located in Lenawee County, Michigan, USA, from January 2019 to July 2020 (Figure 1). The drainage area was 22.5 ha. The dominant soil type was Blount loam (fine, illitic, mesic Aeric Epiaqualfs), which was classified as a poorly drained soil. Subsurface drainage pipes were installed at an average 0.75-m depth with 12-m drain spacing. This system was under conventional free drainage during the study period.

    FIGURE 1. Geographic location and drainage layout of the study site

    The cropping system during the study period was corn (Zea mays L.)–soybean [Glycine max (L.) Merr.] rotation with cereal rye as a winter cover crop. The conservation tillage practice was no-till. Manure was surface broadcasted at a rate of 43.2 kg P ha−1 in January 2019 and 5.3 kg P ha−1 in December 2019. Commercial fertilizer (product formulation 9–18–9 with sulfur) containing 2.9 kg P ha−1 was applied in May 2020.

    2.2 Precipitation data

    Precipitation data were collected using the microclimate sensor suite ATMOS-41 (METER Group, Inc.). This device is built-in with a high-resolution precipitation sensor (0.017-mm resolution). This sensor only measures rainfall; therefore, we used the snow water equivalent data from the National Oceanic and Atmospheric Administration weather station located 13.5 km away at the Adrian Lenawee County Airport to complete the precipitation data.

    2.3 Measurement of drainage discharge and P concentration

    2.3.1 Hourly drainage discharge measurement

    We combined two methods to measure hourly drainage discharge from January 2019 to July 2020. The first method measured drainage discharge with a metal-edge sharp-crest 45° V-notch weir (Agri Drain Corp.), which was installed in a 25-cm Agri-Drain water-level control structure. This method was used only when two conditions were satisfied: (a) water flowed inside of the V-notch (i.e., did not exceed the weir capacity), and (b) water level in the structure's downstream chamber did not exceed the V-notch apex height. A HYDROS-21 water-depth sensor (METER Group) measured the head of water inside the V-notch weir every hour. Then, a calibrated V-notch equation was used to calculate the hourly drainage discharge based on the head. The second method measured hourly drainage discharge using a TIENET-350 area-velocity sensor (Teledyne ISCO), which was installed inside a pipe located downstream of the control structure. This method was used when either of these two conditions was met: (a) water flowed over the V-notch weir (exceeding the weir capacity), or (b) water level in the downstream chamber of the structure exceeded the height of the V-notch apex. In both methods, the drainage discharge rate was estimated using hourly area-velocity and water-depth measurements. The area-velocity sensor provided high flow rates, and the V-notch weir provided low flow rates.

    2.3.2 High-frequency P concentration measurement using the HydroCycle-PO4

    We used an in situ HydroCycle-PO4 (Sea-Bird Scientific) to measure high-resolution P concentration at a 2-h interval (Supplemental Figure S1). The HydroCycle-PO4 measures P concentration based on a heteropoly molybdenum-blue complex with phosphate that can be detected colorimetrically (Murphy & Riley, 1962). Because the P concentration measurement is conducted on an unfiltered sample, the values represent total reactive P (TRP) (Rice et al., 2017). The term “reactive” refers to the inorganic form of P that is readily bioavailable for uptake.

    The HydroCycle-PO4 has an average accuracy of −5 to 20% at the lower and upper ends of the operating range, respectively (Snazelle, 2018). Johengen et al. (2017) performed an assessment of precision of the HydroCycle-PO4 by computing the standard deviations and coefficients of variation of the five replicate measurements for five different concentration trials (from 0.01 to 0.4 mg L−1). The standard deviation of the mean ranged from 0.0005 to 0.0020 mg L−1 across the five trials, and the coefficient of variation ranged from 0.14 to 5.78 percent. Also, the sensor has a maximum and minimum P detection limit of 1.2 and 0.002 mg L−1, respectively.

    The TRP concentration in drainage discharge was measured from January 2019 to July 2020. Total reactive P concentration measurements were retrieved from the data logger every week and processed using CycleHost software (Sea-Bird Scientific). Finally, the 2-h interval TRP concentrations were linearly interpolated to obtain hourly TRP concentrations.
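The interpolation step described above can be illustrated with a minimal sketch. It assumes a gap-free, non-empty list of readings spaced exactly 2 h apart; the function name `interpolate_to_hourly` and the sample values are hypothetical and are not taken from the study's software.

```python
def interpolate_to_hourly(conc_2h):
    """Linearly interpolate readings taken every 2 h onto an hourly grid.

    conc_2h[k] is the reading at hour 2*k (assumed gap-free and non-empty).
    The result holds one value per hour from hour 0 to hour 2*(len-1).
    """
    hourly = []
    for left, right in zip(conc_2h, conc_2h[1:]):
        hourly.append(left)                # measured value at the even hour
        hourly.append((left + right) / 2)  # midpoint estimate at the odd hour
    hourly.append(conc_2h[-1])             # final measured value
    return hourly


# Hypothetical TRP readings (mg/L) at hours 0, 2, and 4.
example = interpolate_to_hourly([0.10, 0.20, 0.14])
```

Because the odd hours sit exactly halfway between two readings, linear interpolation reduces to the mean of the neighbouring measurements.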

    Although the sensor used a single copper screen mesh with 7.5-μm onboard filters to reduce sediment intake, we observed sediment accumulation inside the unit. Thus, we performed weekly cleaning of the instrument to maintain high-quality data. New cartridges were installed every 3–4 mo to maintain proper functioning of the HydroCycle-PO4.

    2.4 Calculating the reference hourly TRP load

    Hourly TRP concentrations (Section 2.3.2) and hourly drainage discharge measurements (Section 2.3.1) were used to calculate the reference hourly TRP load as
    \begin{equation}\mathrm{Load}_{\mathrm{ref}} = 4.44 \times 10^{-5} \left( \sum_{i=1}^{n} Q_i C_i \right)\end{equation} (1)
    where Loadref is the reference TRP load (kg ha−1), 4.44 × 10−5 is a conversion factor to adjust for units, Qi is the hourly drainage discharge (m3 h−1), and Ci is the hourly TRP concentration (mg L−1). Any day that had missing data for any number of hours was eliminated from the analysis.
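As a hedged illustration of Equation 1, the sketch below sums hourly products of discharge and concentration. The conversion factor appears to be consistent with 10^-3 (grams to kilograms, since 1 m3 h-1 × 1 mg L-1 over one hour is 1 g) divided by the 22.5-ha drainage area; that derivation is an inference from the reported units, not something the text states. The function name and input values are hypothetical.

```python
DRAINAGE_AREA_HA = 22.5  # drainage area reported in Section 2.1

# Q (m3/h) * C (mg/L) over one hour yields grams (1 m3 = 1000 L, 1000 mg = 1 g).
# Multiplying by 1e-3 gives kg; dividing by the area gives kg/ha.
# Assumption: this is how the 4.44e-5 factor arises.
CONVERSION = 1e-3 / DRAINAGE_AREA_HA


def reference_load(discharge_m3_per_h, conc_mg_per_l):
    """Equation 1: sum of hourly Q_i * C_i, scaled to kg/ha."""
    if len(discharge_m3_per_h) != len(conc_mg_per_l):
        raise ValueError("series must have the same length")
    total = sum(q * c for q, c in zip(discharge_m3_per_h, conc_mg_per_l))
    return CONVERSION * total


# Three hypothetical hours of data.
load = reference_load([10.0, 20.0, 5.0], [0.1, 0.3, 0.2])
```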

    2.5 Description of sampling strategies for TRP load estimation

    2.5.1 Time-proportional discrete sampling

    The reference hourly TRP concentration was subsampled to create eight hypothetical time-proportional discrete frequencies: 3-h, 6-h, 12-h, 1-d, 2-d, 3-d, 7-d, and 14-d intervals. The load for each sampling frequency was estimated by multiplying the TRP concentration of the selected sample and the cumulative drainage discharge during the relevant sampling interval as follows:
    \begin{equation}\mathrm{Load}_{\mathrm{est}} = 4.44 \times 10^{-5} \times \left( \sum_{j=1}^{n} Q_j C_j \right)\end{equation} (2)
    where Loadest is the estimated load (kg ha−1), Qj is the cumulative drainage discharge during the sampling interval (m3 h−1), and Cj is the TRP concentration in the middle of the sampling interval (mg L−1).

    The starting point for our artificial subsampling was 11 Jan. 2019, which was the day that we started collecting TRP concentration data. The TRP load for each sampling strategy was estimated with 24 iterations due to the possibility of different starting times during a day (from 00:00 to 23:00). The calculation uncertainty for each load is minimized by performing several iterations, as outlined in Williams et al. (2015).
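The subsampling procedure can be sketched roughly as follows. This is an illustration under stated assumptions rather than the authors' code: the series is hourly and gap-free, each interval is represented by the concentration at its middle hour, and a partial interval at either end of the record uses the middle of whatever hours it contains. The function names are hypothetical.

```python
def time_proportional_load(q, c, interval_h, start_h=0, conversion=1e-3 / 22.5):
    """Equation 2 for one start time: mid-interval concentration times
    cumulative discharge in each sampling interval.

    q, c: hourly discharge (m3/h) and concentration (mg/L) lists.
    """
    n = len(q)
    total = 0.0
    # Shift the interval grid by the start time; begin one interval early so
    # the hours before the first full interval are still covered.
    first = (start_h % interval_h) - interval_h
    for raw_start in range(first, n, interval_h):
        lo = max(raw_start, 0)
        hi = min(raw_start + interval_h, n)
        if lo >= hi:
            continue
        mid = (lo + hi - 1) // 2  # hour at which the sample is assumed taken
        total += c[mid] * sum(q[lo:hi])
    return conversion * total


def loads_for_all_starts(q, c, interval_h, conversion=1e-3 / 22.5):
    """Repeat the estimate for the 24 possible starting hours of a day."""
    return [time_proportional_load(q, c, interval_h, s, conversion)
            for s in range(24)]


# Hypothetical six-hour record with a concentration peak at hour 2.
q_demo = [1.0, 1.0, 4.0, 1.0, 1.0, 1.0]
c_demo = [0.1, 0.1, 0.5, 0.1, 0.1, 0.1]
missed_peak = time_proportional_load(q_demo, c_demo, 3, 0, conversion=1.0)
caught_peak = time_proportional_load(q_demo, c_demo, 3, 1, conversion=1.0)
hourly_sum = time_proportional_load(q_demo, c_demo, 1, 0, conversion=1.0)
```

In this toy record the same 3-h interval either misses or captures the peak depending on the start hour, which is why the estimate is repeated over several start times.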

    There were some gaps in the TRP concentration dataset. Malfunctioning and removal of the instrument for regular maintenance accounted for 211 d of no concentration data. No-flow conditions also occurred for 10 d due to freezing of water inside the control structure during winter and drying of water inside the control structure. As a result, 305 d of flow data out of 566 d were used in the analysis.
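One way to apply the rule that any day with missing hours is excluded (Section 2.4) is sketched below. The record layout of `(day, hour, value)` tuples, with `None` marking a missing reading, is a hypothetical choice made for illustration.

```python
from collections import defaultdict


def complete_days(records):
    """Return the days that have all 24 hourly values present.

    records: iterable of (day, hour, value); value is None when missing.
    """
    by_day = defaultdict(list)
    for day, _hour, value in records:
        by_day[day].append(value)
    return sorted(
        day for day, values in by_day.items()
        if len(values) == 24 and all(v is not None for v in values)
    )


# Day 1 is complete, day 2 has one missing value, day 3 has only 23 hours.
demo = [(1, h, 0.1) for h in range(24)]
demo += [(2, h, None if h == 5 else 0.1) for h in range(24)]
demo += [(3, h, 0.1) for h in range(23)]
kept = complete_days(demo)
```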

    2.5.2 Time-proportional composite sampling

    Time-proportional composite sampling involves the collection of numerous aliquots collected at regular intervals over a specified period. These aliquots form a composite sample that is representative of the sampling period. To investigate the effect of time-proportional compositing scenarios on the accuracy of TRP load estimation, we estimated the daily TRP load with various compositing scenarios based on the reference dataset described in Section 2.4.

    The reference hourly TRP concentration dataset was subsampled to create 20 hypothetical time-proportional composite sampling scenarios: 1-, 2-, 3-, and 7-d composites, each with one, two, four, six, and eight aliquots per day. The TRP concentration in the composite sample was considered equal to the average of the TRP concentrations over the sampling interval. Then, we calculated the TRP load by multiplying the average TRP concentration with its associated drainage discharge during the sampling interval. We did not account for any potential effect of bottles remaining in the automated sampler until retrieval and any effect that delayed filtering may have on P concentration.
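A minimal sketch of the compositing calculation follows. It assumes aliquots are evenly spaced within each day, that the composite concentration is the arithmetic mean of the aliquot concentrations, and that the record is hourly and gap-free. The function name and parameters are hypothetical.

```python
def composite_load(q, c, days, aliquots_per_day, start_h=0,
                   conversion=1e-3 / 22.5):
    """Load from time-proportional composites.

    Each composite spans `days` days; aliquots are drawn every
    24 // aliquots_per_day hours, starting at hour start_h of the period.
    """
    step = 24 // aliquots_per_day  # hours between aliquots
    period = 24 * days             # hours covered by one composite
    total = 0.0
    for lo in range(0, len(q), period):
        hi = min(lo + period, len(q))
        picks = [c[t] for t in range(lo + start_h, hi, step)]
        if not picks:
            continue
        mean_conc = sum(picks) / len(picks)
        total += mean_conc * sum(q[lo:hi])
    return conversion * total


# Hypothetical two-day record: constant flow, one high reading at hour 0.
q_demo = [1.0] * 48
c_demo = [0.5] + [0.1] * 47
one_aliquot = composite_load(q_demo, c_demo, 1, 1, conversion=1.0)
two_aliquots = composite_load(q_demo, c_demo, 1, 2, conversion=1.0)
two_day = composite_load(q_demo, c_demo, 2, 1, conversion=1.0)
```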

    2.5.3 Flow-proportional discrete sampling

    Flow-proportional sampling is a strategy widely used in water quality monitoring programs (Kladivko et al., 1991; Schleppi et al., 2006; Stone et al., 2000; Ulén & Persson, 1999; Wang & Kladivko, 2003). In this strategy, the water sample is collected when a specified volume or depth of water passes the monitoring point. Four volumetric depths (1.0, 2.0, 3.0, and 5.0 mm) were selected as flow intervals. These hypothetical manual flow-proportional discrete sampling scenarios were implemented on the reference hourly TRP concentration dataset.
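The triggering logic can be illustrated with the sketch below, which takes a sample whenever cumulative drainage depth reaches the next multiple of the flow interval. The text does not spell out how each sample was weighted in the load, so the second function makes an assumption: each sample represents the discharge that passed since the previous sample, and flow after the last sample is ignored. Function names are hypothetical.

```python
def flow_proportional_samples(depth_mm, interval_mm):
    """Hour indices at which cumulative depth reaches each interval multiple.

    At most one sample is taken per hour, even if several thresholds are
    crossed within that hour.
    """
    sample_hours = []
    cumulative = 0.0
    next_trigger = interval_mm
    for hour, depth in enumerate(depth_mm):
        cumulative += depth
        if cumulative >= next_trigger:
            sample_hours.append(hour)
            while next_trigger <= cumulative:
                next_trigger += interval_mm
    return sample_hours


def flow_proportional_load(q, c, sample_hours, conversion=1e-3 / 22.5):
    """Assumed weighting: sample concentration times the discharge that
    passed since the previous sample."""
    total, prev = 0.0, 0
    for h in sample_hours:
        total += c[h] * sum(q[prev:h + 1])
        prev = h + 1
    return conversion * total


# Hypothetical hourly depths (mm) with a storm pulse at hour 6.
depth_demo = [0.25, 0.25, 0.25, 0.25, 1.0, 0.0, 2.5, 0.25]
c_demo = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.4, 0.1]
hours = flow_proportional_samples(depth_demo, 1.0)
demo_load = flow_proportional_load(depth_demo, c_demo, hours, conversion=1.0)
```

Because the trigger depends on accumulated flow, the samples cluster in the hours with the most discharge.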

    2.5.4 Flow-proportional composite sampling

    The reference hourly TRP concentration was the basis to create hypothetical flow-proportional composites. Because 1-L bottles are commonly used in automated samplers, each bottle can hold six 150-ml aliquots without risk of bottle overflow, for a total volume of 900 ml per bottle. To estimate the TRP concentration for each sample bottle, the reference hourly TRP concentration was subsampled for each aliquot based on four flow depth intervals (1.0, 2.0, 3.0, and 5.0 mm). Then, the six aliquots per bottle were averaged to generate the TRP concentration for each sample bottle.
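The bottle-averaging step might look like the sketch below. It assumes the aliquot hours come from a flow-proportional trigger and that a final, partly filled bottle is still averaged over the aliquots it holds; the handling of partial bottles is an assumption. The function name is hypothetical.

```python
ALIQUOT_ML = 150
ALIQUOTS_PER_BOTTLE = 6  # 6 x 150 ml = 900 ml in a 1-L bottle


def bottle_concentrations(c, aliquot_hours, per_bottle=ALIQUOTS_PER_BOTTLE):
    """Average the concentrations of consecutive aliquots, bottle by bottle."""
    bottles = []
    for i in range(0, len(aliquot_hours), per_bottle):
        group = aliquot_hours[i:i + per_bottle]
        bottles.append(sum(c[h] for h in group) / len(group))
    return bottles


# Hypothetical concentrations (mg/L) and eight aliquot hours:
# one full bottle plus a partial bottle with two aliquots.
c_demo = [h / 10 for h in range(13)]
bottles = bottle_concentrations(c_demo, [0, 1, 2, 3, 4, 5, 6, 7])
```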

    2.6 Calculating uncertainty of TRP load estimation

    The relative error quantifies the uncertainty in load estimates (Harmel et al., 2006; Williams et al., 2015). The relative error was calculated as
    \begin{equation}e\left(\%\right) = \left( \frac{\mathrm{Load}_{\mathrm{est}} - \mathrm{Load}_{\mathrm{ref}}}{\mathrm{Load}_{\mathrm{ref}}} \right) \times 100\end{equation} (3)
    where e is the uncertainty (%), Loadest is the estimated TRP load based on the specific sampling interval or compositing scenario (kg ha−1), and Loadref is the reference hourly TRP load (kg ha−1).

    The difference between the estimated load and reference load was calculated after each iteration of the TRP load estimation, which resulted in a distribution of uncertainty values. The bias and precision of the P load estimate were determined from this distribution. We used the distribution median (e50) as a measure of bias and computed precision as the difference between the 95th (e95) and 5th (e5) percentiles of the distribution (Williams et al., 2015).
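Equation 3 and the percentile-based indicators can be sketched as follows. The percentile interpolation method is not stated in the text, so the choice of the "inclusive" method of Python's `statistics.quantiles` is an assumption, and the function names are hypothetical.

```python
import statistics


def relative_error(load_est, load_ref):
    """Equation 3: relative error in percent."""
    return (load_est - load_ref) / load_ref * 100.0


def bias_and_precision(errors):
    """Return (e50, e5, e95, e95 - e5) for a distribution of errors."""
    e50 = statistics.median(errors)
    # n=20 gives 19 cut points: the first is the 5th percentile and the
    # last is the 95th percentile.
    cuts = statistics.quantiles(errors, n=20, method="inclusive")
    e5, e95 = cuts[0], cuts[-1]
    return e50, e5, e95, e95 - e5


# 14-d interval load versus the reference load, both as reported in Table 1.
err_14d = relative_error(0.67, 1.37)

# Hypothetical distribution of errors from repeated iterations.
e50, e5, e95, width = bias_and_precision(list(range(-20, 1)))
```

With the loads reported in Table 1 for the 14-d interval (0.67 versus 1.37 kg ha−1), the function returns about −51%, in line with the tabulated relative error.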

    3 RESULTS AND DISCUSSION

    3.1 Relationship between hourly drainage discharge and hourly TRP concentration

    The hourly TRP concentrations in our study varied from 0.007 to 1.161 mg L−1, with an average of 0.136 mg L−1. These hourly TRP concentrations were generally higher than those reported by previous studies, which used larger sampling intervals ranging from daily to monthly (Daigh et al., 2015, 2017; Tiemeyer et al., 2009). Our higher TRP concentrations may be explained by the finer hourly dataset that did not require immediate sample filtration (Harmel et al., 2006, 2018; Massri et al., 2021), as opposed to the coarser temporal resolution used in previous studies. The previous studies collected samples with longer sampling intervals that could have missed higher P concentrations occurring at peak flows. Hourly drainage discharge rates in our study varied between 0.016 and 0.062 mm h−1. Differences in soil, climate, drainage system, and agronomic practices also may explain the difference in P concentration between our study and previous studies (King et al., 2015).

    Temporal variations in hourly drainage discharge and hourly TRP concentration indicated that TRP concentration tended to increase during high flow events, whereas it remained steady during baseflow (Figure 2). A similar relationship was reported by other studies (Bende-Michl et al., 2013; Stamm et al., 1998; Vidon & Cuadra, 2010). Our combined results show the importance of high-frequency P sampling to accurately measure P concentration variation during high flow events. In turn, an accurate measure of P concentration variation leads to an accurate evaluation of the P transport dynamics (i.e., rapid P changes during storm events, event hysteresis pattern, flushing, and flashiness). The hourly TRP concentration and drainage discharge measurements during two other periods are provided in Supplemental Figures S2 and S3.

    FIGURE 2. Hourly total reactive P (TRP) concentration and drainage discharge in a subset of the reference dataset from 18 Mar. 2020 to 17 May 2020

    3.2 Effect of time-proportional discrete sampling frequency on TRP load estimation

    The accuracy of TRP load estimation was considerably affected by sampling frequency (Table 1; Supplemental Figure S4). The estimated TRP load decreased as sampling interval increased from 3 h to 14 d, which led to underestimation of TRP load compared with the reference hourly load (Table 1). This underestimation became considerable when the sampling interval was longer than 1 d. For example, the estimated TRP load based on the 14-d sampling interval was 0.67 kg ha−1, which was 51% less than the reference TRP load. Sampling intervals longer than 1 d often miss sharp increases in P concentration during the rising limb of the event flow hydrograph and the peak P concentration during storm events; thus, they do not accurately represent P variation (Section 3.1). Therefore, the shorter the sampling interval, the better the representation of the rapid variation of P concentration during storm events (Supplemental Figure S5).

    TABLE 1. Uncertainty indicators (relative error, bias, and precision) for total reactive P (TRP) loads estimated using different time-proportional discrete sampling intervals for the entire period of the study

    | Sampling interval | Common sampling method | Total number of samples to be analyzed | TRP load (kg ha−1) | Relative error in TRP load (%) | Bias, e50 (%) | Precision, e5/e95 (%) |
    | --- | --- | --- | --- | --- | --- | --- |
    | 1 h (reference) | automated | 7,320 | 1.37 | | | |
    | 3 h | automated | 2,440 | 1.36 | −0.2 | −0.6 | −1/−0.3 |
    | 6 h | automated | 1,220 | 1.34 | −1.4 | −1.5 | −2.8/−1.3 |
    | 12 h | automated | 610 | 1.29 | −5.3 | −4.8 | −7.8/−2.6 |
    | 24 h | automated | 305 | 1.20 | −12.2 | −12.0 | −18.8/−5.6 |
    | 48 h | manual | 154 | 1.10 | −19.2 | −20.4 | −34.5/−2 |
    | 72 h | manual | 100 | 0.99 | −27.2 | −28.3 | −40.4/−13.7 |
    | 7 d | manual | 44 | 0.78 | −42.7 | −41.1 | −59/−27.7 |
    | 14 d | manual | 22 | 0.67 | −51.0 | −47.8 | −72/−36 |

    Note. e50, median representing bias; e5 and e95, 5th and 95th percentiles representing precision, respectively.

    Bias increased from −0.6 at the 3-h sampling interval to −47.8 at the 14-d sampling interval (Figure 3; Table 1). The precision of P load estimation generally decreased as the sampling interval increased, which was due to high temporal variations in P concentration in drainage discharge. Because P concentration dramatically varies from baseflow to event flows, different sampling intervals might produce considerably different P load estimates, especially when using long sampling intervals.

    FIGURE 3. Bias and precision of total reactive P load estimation using different sampling intervals (e50, median representing bias; e5 and e95, 5th and 95th percentiles representing precision, respectively)

    The total number of collected samples varied from only 22 samples when using a 14-d interval to 2,440 samples when using a 3-h interval (Table 1). The decrease in the total number of collected samples from 2,440 to 22 resulted in an increase of error from −0.2 to −51.0%. Therefore, the total number of collected samples in the time-proportional discrete sampling strategy considerably affected the P load estimates.

    3.3 Effect of time-proportional composite sampling scenarios on TRP load estimation

    The number of aliquots did not considerably affect precision and bias in P load estimation (Table 2), which was consistent with the study of Harmel and King (2005). Compositing scenarios using one and eight aliquots had the lowest and highest precision, respectively. Precision did not considerably differ for composite samples from 1- to 7-d intervals. By contrast, bias was considerably affected by compositing interval. The 1-d composite with one aliquot is analogous to discrete sampling with 1-d sampling interval. The 1-d composite had the lowest average bias of −12.5, whereas the 7-d composite had the highest average bias of −42.8 across varying numbers of aliquots. These results indicate that the 1-d composite with 1 aliquot per day is a reliable and cost-effective strategy for P load estimation.

    TABLE 2. Uncertainty indicators (bias and precision) for the total reactive P (TRP) loads estimated by different time-proportional composite sampling scenarios for the entire period of the study

    | Compositing scenario | Number of aliquots per day | Total number of samples to be analyzed | TRP load (kg ha−1) | Relative error in TRP load (%) | Bias, e50 (%) | Precision, e5/e95 (%) |
    | --- | --- | --- | --- | --- | --- | --- |
    | Reference | | 7,320 | 1.36 | | | |
    | 1-d composite | 1 | 305 | 1.20 | −12.4 | −12.0 | −18.8/−5.6 |
    | 1-d composite | 2 | 305 | 1.20 | −12.4 | −12.4 | −14.3/−11.1 |
    | 1-d composite | 4 | 305 | 1.20 | −12.4 | −12.6 | −12.9/−12.2 |
    | 1-d composite | 6 | 305 | 1.20 | −12.4 | −12.7 | −12.9/−12 |
    | 1-d composite | 8 | 305 | 1.20 | −12.4 | −12.6 | −12.7/−12.5 |
    | 2-d composite | 1 | 154 | 1.10 | −19.7 | −19.2 | −23/−16.1 |
    | 2-d composite | 2 | 154 | 1.10 | −19.7 | −20.0 | −20.3/−18.9 |
    | 2-d composite | 4 | 154 | 1.10 | −19.7 | −19.9 | −20.3/−18.9 |
    | 2-d composite | 6 | 154 | 1.10 | −19.7 | −20.0 | −20.3/−18.9 |
    | 2-d composite | 8 | 154 | 1.10 | −19.7 | −19.7 | −20.1/−19.5 |
    | 3-d composite | 1 | 100 | 0.99 | −27.7 | −28.1 | −30/−23.7 |
    | 3-d composite | 2 | 100 | 0.99 | −27.7 | −27.8 | −28.0/−26.9 |
    | 3-d composite | 4 | 100 | 0.99 | −27.7 | −27.7 | −28.2/−26.6 |
    | 3-d composite | 6 | 100 | 0.99 | −27.7 | −27.8 | −28.0/−26.9 |
    | 3-d composite | 8 | 100 | 0.99 | −27.7 | −27.4 | −27.9/−27.3 |
    | 7-d composite | 1 | 44 | 0.78 | −43.0 | −42.7 | −45.6/−33.6 |
    | 7-d composite | 2 | 44 | 0.78 | −43.0 | −42.7 | −43.0/−42.5 |
    | 7-d composite | 4 | 44 | 0.78 | −43.0 | −43.1 | −43.6/−42.8 |
    | 7-d composite | 6 | 44 | 0.78 | −43.0 | −42.7 | −43.0/−42.5 |
    | 7-d composite | 8 | 44 | 0.78 | −43.0 | −42.8 | −42.8/−42.6 |

    Note. Relative errors calculated for the TRP loads were averaged across different numbers of aliquots (n = 1, 2, 4, 6, and 8 aliquots per day). e50, median representing bias; e5 and e95, 5th and 95th percentiles representing precision, respectively.

    The 1-d composite had the lowest uncertainty in TRP load estimation compared with 2-, 3-, and 7-d composites (Table 2). The highest relative error (−43.0%) was observed for the 7-d composite, suggesting that longer compositing intervals considerably underestimated TRP load. The total number of composite samples varied from 305 to 44 when implementing 1- and 7-d composite scenarios, respectively (Table 2). The decrease in the total number of collected samples from 305 to 44 resulted in an increase of error from −12.4 to −43%. Therefore, the total number of collected samples in the time-proportional composite sampling strategy considerably affected the P load estimates.

    The results presented in this section assume there is no error from delayed filtering of samples. We used the HydroCycle-PO4 instrument, which provides real-time P concentration, whereas under field conditions, samples remain in the automated sampler until they are retrieved. Thus, the delay in sample filtration after sample collection generates uncertainty in DRP concentration measurements (Harmel et al., 2006, 2018; Massri et al., 2021). Therefore, the actual error from the time-proportional composite sampling strategy is expected to be slightly higher than the errors reported in this study because P concentration has been shown to decrease over time if not filtered.

    3.4 Effect of flow-proportional discrete sampling on TRP load estimation

    The analysis of flow-proportional discrete sampling scenarios showed that the flow interval can affect the accuracy of TRP estimations. However, this effect may not be substantial. The shorter 1-mm flow interval scenario underestimated the TRP load by 0.2%, whereas there was a 5.1% underestimation in TRP load when using the longer 5-mm flow interval scenario (Table 3). All scenarios underestimated the TRP load for the study period. Overall, the accuracy of flow-proportional discrete sampling in TRP load estimation was acceptable with any sampling interval. High accuracy is obtained by using this sampling strategy because a greater portion of samples is taken at higher flow rates (Ulén & Persson, 1999) (Supplemental Figure S6). Therefore, the shorter the flow interval, the better the representation of the variation of P concentration during a storm event.

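    TABLE 3. Uncertainty indicators (relative error, bias, and precision) for total reactive P (TRP) loads estimated using different flow-proportional sampling scenarios (discrete sampling and six-aliquot compositing)

    | Sampling scenario | Flow interval (mm) | Total number of samples to be analyzed | TRP load (kg ha−1) | Relative error in TRP load estimation (%) |
    | --- | --- | --- | --- | --- |
    | Reference | | 7,320 | 1.37 | |
    | Discrete sampling | 1 | 389 | 1.37 | −0.2 |
    | Discrete sampling | 2 | 190 | 1.33 | −3.1 |
    | Discrete sampling | 3 | 127 | 1.32 | −2.9 |
    | Discrete sampling | 5 | 85 | 1.30 | −5.4 |
    | Six-aliquot compositing | 1 | 66 | 1.36 | −0.5 |
    | Six-aliquot compositing | 2 | 34 | 1.27 | −7.2 |
    | Six-aliquot compositing | 3 | 24 | 1.30 | −4.8 |
    | Six-aliquot compositing | 5 | 16 | 1.34 | −1.9 |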

    Even though there were no major differences among relative errors of the flow-proportional discrete sampling strategies, the numbers of samples that needed to be analyzed were considerably different, from 389 samples when using the 1-mm flow interval to 85 samples when using the 5-mm flow interval. The six-aliquot compositing followed the same trend, requiring more samples for the 1-mm flow interval (66) than for the 5-mm flow interval (16). Therefore, the larger sampling intervals can provide reasonably good estimates of P load while costing much less for water analysis.

    3.5 Effect of flow-proportional compositing sampling scenario on TRP load estimation

    The accuracy of TRP load estimation was not sensitive to the flow interval when using flow-proportional composite sampling. Generally, no trend was observed between the relative error in TRP load estimation and flow interval (Table 3). The highest error (7.2% underestimation) was produced by the 2-mm flow interval with 34 composite samples. The least error was observed when using the 1-mm flow interval with 66 analyzed samples.

    3.6 Practical application of the findings

    According to the results, the total number of samples that needed to be collected and analyzed was substantially different when selecting various sampling intervals (Tables 1–3). Although fewer samples were collected and analyzed in flow-proportional sampling strategies than time-proportional sampling strategies, lower uncertainty in TRP load estimation was observed when using flow-proportional sampling strategies. For example, if 5% error in P load estimation is the target, 85 and 610 samples need to be analyzed when using flow-proportional and time-proportional sampling strategies, respectively. The flow-proportional sampling strategy best represents the cumulative P load in drainage discharge because a greater portion of samples is taken at higher flow rates (Figure 4); however, time-proportional sampling strategies are simpler to implement (Harmel et al., 2003). Therefore, the flow-proportional sampling strategies provided a more accurate estimate of cumulative P load at a lower analysis cost compared with time-proportional sampling strategies.

    FIGURE 4. An example of the difference in sample timing between time-proportional and flow-proportional sampling. TRP, total reactive P

    During the decision-making process, the suitable sampling strategy is the one that provides a balance between the purpose of the study and the budget. Flow-proportional sampling strategies can estimate cumulative P loss during a certain period with fewer water samples compared with time-proportional sampling strategies. However, they fail to record the P concentration in drainage discharge at each time step, especially during storm events when P concentration fluctuates rapidly. Generally, a shorter flow interval is suitable for smaller drainage areas, when P load is the objective, because a smaller drainage area conveys a lower volume of water. Similarly, a longer flow interval is suitable for larger drainage areas.

    A high-frequency time-proportional discrete sampling strategy is needed if P transport dynamics (rapid P changes during storm events, event hysteresis pattern, flushing, and flashiness) are of interest to the monitoring program. If other low-frequency sampling strategies are used to assess P transport dynamics, they cannot capture rapid changes in P concentration during peak flow, thereby creating bias in the results. If accurate P transport dynamics are required, automated samplers or real-time sensors like the HydroCycle-PO4 are needed. Automated samplers have limited capacity and cannot hold enough bottles for long unattended periods. Therefore, water samples need to be retrieved from the field frequently, especially during storm events or when the time or flow interval is short. Furthermore, real-time sensors eliminate any error related to delayed filtering.

    Under field conditions, depending on the selected sampling frequency, time-proportional discrete sampling can be performed either with manual grab sampling (i.e., sampling intervals of a day or longer) or with automated samplers (i.e., sub-daily frequencies). An automated sampler can be used for all four sampling strategies. However, flow-proportional strategies rely on an external flow/depth sensor to send a signal to the data logger, which then sends another signal to the automated sampler to trigger sampling. Because more parts are involved in the sampling process, there is a greater chance of losing concentration data due to failure or malfunction of one of the sensors. A time-proportional strategy has a lower chance of losing the concentration data because sampling does not rely on the flow/depth sensor.

    A simple diagram showing the cost and relative error of the four sampling strategies for the study period is provided in Figure 5. For all sampling strategies, except for the 1-to-14-d discrete time-proportional sampling, an automated sampler is needed, so we included the cost of one automated sampler (assuming US$5,000) for each of those strategies. A standard flow sensor is required to estimate the P load under every strategy, so we also included the cost of one flow sensor (assuming US$5,000) for each strategy. We assumed US$10 per sample as the cost of chemical analysis. The flow-proportional sampling strategies produced almost the same accuracy for estimating P load as the high-resolution time-proportional sampling strategies (3, 6, and 12 h) at a lower cost (Figure 5). The cost analysis is presented in Supplemental Table S2.

    A simple diagram of the differences in cost and relative error of the four sampling strategies over 305 d. All sampling strategies included a cost for an automated sampler, except for the 1-to-14-d discrete time-proportional sampling. The cost of US$10 per sample was included in the analysis. The cost of a flow sensor was included for all strategies
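
    The cost assumptions stated above can be written as a short Python sketch. The unit prices are the assumed values given in the text; the function name and the `needs_sampler` flag are hypothetical, and the two sample counts are the ones quoted earlier for a 5% error target. This is an illustration of the arithmetic only and may not match the itemization in Supplemental Table S2.

```python
SAMPLER_COST = 5000          # US$, one automated sampler (assumed in the text)
FLOW_SENSOR_COST = 5000      # US$, one flow sensor (assumed in the text)
ANALYSIS_COST_PER_SAMPLE = 10  # US$ per sample for chemical analysis


def monitoring_cost(n_samples, needs_sampler=True):
    """Equipment cost plus analysis cost for one sampling strategy.

    needs_sampler=False corresponds to manual grab sampling, where no
    automated sampler is purchased.
    """
    equipment = FLOW_SENSOR_COST + (SAMPLER_COST if needs_sampler else 0)
    return equipment + ANALYSIS_COST_PER_SAMPLE * n_samples


# Sample counts quoted in the text for a 5% error target.
flow_cost = monitoring_cost(85)    # flow-proportional
time_cost = monitoring_cost(610)   # time-proportional

print(f"flow-proportional: US${flow_cost:,}")
print(f"time-proportional: US${time_cost:,}")
print(f"difference:        US${time_cost - flow_cost:,}")
```

    Under these assumptions the equipment cost is the same for both strategies, so the difference comes entirely from the number of samples analyzed.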

    4 CONCLUSIONS

    It is critical to obtain accurate P load estimates from subsurface-drained fields to comparatively evaluate conservation practices. In this study, we evaluated the accuracy of the following sampling strategies in P load estimation in drainage discharge: (a) time-proportional discrete sampling, (b) time-proportional composite sampling, (c) flow-proportional discrete sampling, and (d) flow-proportional composite sampling. Our study resulted in the following key conclusions:
    • All sampling strategies underestimated TRP load compared with the reference dataset. This underestimation should be considered in P budget calculations.
    • As time-proportional discrete sampling intervals increased from 1 to 14 d, underestimation of TRP load changed from 12 to 51%. The uncertainty in TRP load estimation declined as the sampling interval decreased.
    • The underestimation of TRP load changed from 12 to 43% as the time-proportional compositing period increased from 1 to 7 d, each with one aliquot per day.
    • The number of aliquots (n = 1, 2, 4, 6, and 8 aliquots per day) collected for the 1- to 7-d time-proportional composite did not substantially affect the accuracy of load estimations.
    • In the case of flow-proportional sampling strategies, both discrete and composite sampling produced accurate results (relative error from −0.2 to −7.2%).
    • The flow-proportional sampling strategies provided a more accurate estimate of cumulative P load at a lower analysis cost compared with time-proportional sampling strategies.

    The purpose of the monitoring project should dictate the sampling strategy. If calculating the cumulative P load during a certain period is the main purpose of a monitoring program, flow-proportional sampling strategies (either discrete or composite) can provide fairly accurate results with less underestimation of the P budget. If P transport dynamics (rapid P changes during storm events, event hysteresis patterns, flushing, and flashiness) are of interest, high-frequency time-proportional sampling strategies are recommended. These high-frequency time-proportional sampling strategies can be performed with automated samplers or real-time sensors. This study provides new insight into the accuracy of each sampling strategy as a stand-alone method, thereby helping the user choose the most appropriate sampling strategy. Even though this study deals with TRP, the findings apply to DRP because both consist of dissolved forms of P.

    ACKNOWLEDGMENTS

    The authors express gratitude to Jason Piwarski for assistance with data collection and creating the map. Funding for this study was provided by the Michigan Department of Agriculture and Rural Development (Grant 791N7700580).

      AUTHOR CONTRIBUTIONS

      Babak Dialameh: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Software; Visualization; Writing – original draft. Ehsan Ghane: Conceptualization; Formal analysis; Funding acquisition; Investigation; Methodology; Project administration; Resources; Supervision; Validation; Writing – review & editing.

      CONFLICT OF INTEREST

      The authors declare that there is no conflict of interest.