A Global Perspective on Integrated Strategies to Manage Soil Phosphorus Status for Eutrophication Control without Limiting Land Productivity.

Abstract
Unnecessary accumulation of phosphorus (P) in agricultural soils continues to degrade water quality and linked ecosystem services. Managing both soil loss and soil P fertility status is therefore crucial for eutrophication control, but the relative environmental benefits of these two mitigation measures, and the timescales over which they occur, remain unclear. To support policies toward reduced P loadings from agricultural soils, we examined the impact of soil conservation and lowering of soil test P (STP) in different regions with intensive farming (Europe, the United States, and Australia). Relationships between STP and soluble reactive P concentrations in land runoff suggested that eutrophication control targets would be more achievable if STP concentrations were kept at or below the current recommended threshold values for fertilizer response. Simulations using the Annual P Loss Estimator (APLE) model in three contrasting catchments predicted total P losses ranging from 0.52 to 0.88 kg ha⁻¹ depending on soil P buffering and erosion vulnerability. Drawing down STP in all catchment soils to the threshold optimum for productivity reduced catchment P loss by between 18 and 40%, but this would take between 30 and 40+ years. In one catchment, STP drawdown was more effective in reducing P loss than erosion control, but combining both strategies was always the most effective and more rapid than erosion control alone. By accounting for both soil P buffering interactions and erosion vulnerability, the APLE model quickly provided reliable information on the magnitude and time frame of P loss reduction that can be realistically expected from soil and STP management. Greater precision in the sampling, analysis, and interpretation of STP, and more technical innovation to lower agronomic optimum STP concentrations on farms, are needed to foster long-term sustainable management of soil P fertility.

Paul J. A. Withers,* Peter A. Vadas, Risto Uusitalo, Kirsty J. Forber, Murray Hart, Robert H. Foy, Antonio Delgado, Warwick Dougherty, Harri Lilja, Lucy L. Burkitt, Gitte H. Rubaek, Dan Pote, Kirsten Barlow, Shane Rothwell, and Phillip R. Owens

Discovered in 1669 by Hennig Brandt and recognized in the 19th century as a critical nutrient for crop growth, phosphorus (P) is now a widely used input into modern agriculture, largely in the form of inorganic fertilizers and recycled manures (Ashley et al., 2011; Sharpley et al., 2018). From the mid-20th century, agricultural intensification across the developed world has been transformed by a rapid increase in fertilizer use, which in turn has led to considerable accumulation of P in soils due to an excess of P inputs over crop P removals (Li et al., 2015; Weaver and Wong, 2011). This build-up of soil P is now recognized as a historical legacy with lasting and complex impacts on ecosystem functioning (Kleinman, 2017; Macdonald et al., 2016; Sharpley et al., 2013). The accelerated and now endemic loss of soil P in surface runoff and drainage contributes to eutrophication, which has become such a major environmental issue that significant reductions in P loadings from agricultural land to inland and coastal waters are required (Carpenter, 2005; Fischer et al., 2017). Phosphorus losses from agricultural soils are hard to mitigate because they occur in both dissolved and particulate form and are transported from highly variable source areas depending on soil type, the extent of soil P accumulation, erosion vulnerability, and hydrological connectivity to the waterbody (Kleinman et al., 2011; Withers and Bowes, 2018). While soils are only one of multiple nutrient sources that affect freshwater and marine ecosystems, managing soils and soil P status represents an important strategy for the mitigation of eutrophication.
Modern agriculture has come to rely on maximizing the availability of P in soils through "insurance" application rates of highly soluble P fertilizers to build up soil P fertility and minimize risks to productivity (Withers et al., 2014). This makes it harder to manage resulting soil P losses. While recommended threshold soil test P (STP) values based on routine soil sampling and analysis have been identified to help gauge the likely yield response to applied P on farms (Bai et al., 2013; Nawara et al., 2017; Speirs et al., 2013), many regions with intensive agriculture have STP concentrations above the threshold values required for optimum agricultural crop production (here these threshold values are termed the agronomic optimum: Gourley et al., 2015; IPNI, 2015; Tóth et al., 2014). This accumulation even extends into subsoils in some regions (Rubaek et al., 2013). Maintaining high STP concentrations confers no agronomic benefit; it therefore not only wastes P resources but also risks further environmental damage through increased loss of soil P (Valkama et al., 2011; Withers et al., 2017). Monitoring STP status is especially relevant to eutrophication risk because it governs the release of dissolved inorganic P (here termed soluble reactive P, SRP) in runoff from soils, which is highly bioavailable to aquatic biota (Ekholm et al., 2009; Reynolds and Davies, 2001). In contrast, STP status has been shown to have less impact on the P content of suspended particles in land runoff, which is already high (typically 1 g kg⁻¹) at low STP levels (e.g., Withers et al., 2017). Additional management strategies are consequently required to mitigate the transfer of particulate P (PP) due to erosion (Dodd and Sharpley, 2016).
Managing soil P loss must therefore rely on reducing both transport (runoff and erosion) risk and STP concentrations. However, the scale of necessary soil P controls to achieve the desired improvements in water quality, and the timescales over which they must be operative, have not been clarified. Legislation to help reduce STP levels to the agronomic optimum has been generally slow to implement despite widespread eutrophication problems, and there is a lack of general guidelines on ways to manage a reduction in STP and the amount by which to reduce it. Even if soil P concentrations were reduced to the agronomic optimum, it is not clear whether this would lower P concentrations in runoff enough to help alleviate eutrophication problems (Cassidy et al., 2017; Duncan et al., 2017). The concentrations of SRP or total P required to limit algal growth in flowing and standing waters are extremely low (20-100 µg L⁻¹) in relation to the variably high concentrations of P (often >1-2 mg L⁻¹) in runoff from agricultural soils (Chambers et al., 2012; Withers and Bowes, 2018). Ideally, any reduction in P inputs to lower STP status warranted for environmental gain must not negatively affect crop productivity. Producers require evidence-based advice on appropriate management of soil P to ensure the future sustainable intensification of agriculture. This paper considers the scale of the challenge to reverse eutrophication from legacies of soil P accumulation without affecting agricultural output. We examine the relationship between STP and SRP concentrations in natural and simulated runoff, and in surface and subsurface flow, from different soils and landscapes across three continents (North America, Europe, and Australia) to provide the context for the need to reduce STP for environmental gain.
Using the Annual P Loss Estimator (APLE) model (Vadas et al., 2018), we predict the relative impact of STP reductions and erosion control measures on dissolved P and PP transfer in three contrasting catchments (Finland, United Kingdom, and United States) to help guide mitigation policies and the timescales required for them to take effect. We identify how far STP levels need to fall to minimize eutrophication risk and how an integrated approach to soil P management will deliver most benefits to water quality and related ecosystem services. APLE modeling was conducted previously for a portion of the United States Chesapeake Bay catchment and performed well against measured STP drawdown field data (Vadas et al., 2018). It is included here as a comparison to the UK and Finland catchments.

Soil Test Phosphorus versus Runoff Soluble Reactive Phosphorus Relationships
National datasets from Australia, Finland, Ireland, Spain, the United Kingdom, and the United States were collated and analyzed to determine the effect of soil P fertility on SRP concentrations in runoff from agricultural land (Table 1). Soil P fertility was measured by the standardized STP method used in each country for determining P fertilizer needs (Australia: sodium bicarbonate at pH 8.5 with an extraction time of 16 h [P-Colwell, Colwell, 1963] Olsen, Olsen et al., 1954]; United States: acidic [pH 2.5] extraction with ammonium fluoride, acetic acid, ammonium nitrate and nitric acid for 2 h [P-Mehlich 3, Mehlich, 1984]) and reported in milligrams per kilogram of soil or, for Finland, Ireland and some datasets in the United Kingdom, milligrams per liter of soil. These methods cover either quantity- or intensity-type measurements (Nawara et al., 2017). Soil sampling depth was 0-5 to 0-20 cm for cultivated soils and 0-1 to 0-5 cm for grassland soils. The SRP in runoff was measured colorimetrically after filtering through a 0.45-µm cellulose or 0.2-µm polycarbonate filter and reported in milligrams per liter. The datasets included both surface runoff and subsurface flow (drainage) hydrological pathways generated by either natural rainfall, simulated rainfall in situ (arable or grassland), or simulated rainfall indoors using repacked soils in runoff boxes or undisturbed soil monoliths (30 cm diam., 40 cm deep) for simulated drainage (Table 1). Drain flow data generated under natural rainfall can be considered representative of runoff P delivered to a watercourse or connecting ditch. All other data represent P mobilized in surface runoff but not necessarily delivered to a watercourse due to the possibility of further P retention in the landscape.
The field sites selected for runoff monitoring in each country were either replicated field experiments with a gradient of soil P status, or soils collected from them, or fields on farms within catchment monitoring programs and varying in soil type and STP status. The datasets therefore included data subsets, allowing both individual field site and cross-site relationships within a country. Any sites with recent (<6 mo) inputs of fertilizer or manure were excluded from the analysis as far as possible. Collectively, the field sites represent a wide range of climatic conditions, soil types, hydrological conditions, and farming systems across the developed world, where eutrophication of freshwaters linked to intensive agriculture is a particular concern. Monitoring periods varied from site to site. Where flow was monitored continuously, runoff P concentrations were normalized according to the amount of flow (i.e., reported as flow-weighted SRP concentrations). For discrete runoff sampling protocols (e.g., repeated grab samples from drains), runoff P concentrations were reported as the arithmetic mean. Further details of the datasets and experimental conditions used in each country are given in the Supplemental Material.

Evaluation of Soluble Reactive Phosphorus Response
For ease of comparison, individual site data within each country were grouped and categorized into six subset types representing the different hydrological pathways and monitoring procedures: natural surface runoff (natural surface), natural subsurface runoff (natural drain), simulated surface runoff in boxes (simulated boxes), simulated subsurface runoff (simulated drainage), simulated in situ runoff from arable land (simulated arable) and simulated in situ runoff from grassland (simulated grass). Not all countries had all data subsets (Table 1). For each dataset, SRP response to increasing STP was fitted by either a linear function ($\mathrm{SRP} = a + b \times \mathrm{STP}$) or an exponential function ($\mathrm{SRP} = a \times e^{k \times \mathrm{STP}}$), where the terms a, b, and k are fitted parameters and e is Euler's number, as the preferred models. Selection of the preferred model was based on Akaike's information criterion (Akaike, 1974). Fitting and comparisons were made using Prism 8 software (GraphPad, Inc.).
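The model selection step can be illustrated with a minimal sketch. It is not the Prism 8 procedure used in the study: the exponential model is fitted here by regressing ln(SRP) on STP, a simplification of true nonlinear least squares, and the least-squares form of Akaike's criterion, AIC = n ln(RSS/n) + 2K, is assumed. All function names and data values are hypothetical.

```python
import math

def fit_linear(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def fit_exponential(x, y):
    """Fit y = a*exp(k*x) by regressing ln(y) on x (a simplification; needs y > 0)."""
    ln_a, k = fit_linear(x, [math.log(yi) for yi in y])
    return math.exp(ln_a), k

def aic(rss, n, n_params):
    """Least-squares AIC; the +1 counts the residual variance as a parameter."""
    return n * math.log(max(rss, 1e-300) / n) + 2 * (n_params + 1)

def preferred_model(stp, srp):
    """Return 'linear' or 'exponential', whichever has the lower AIC."""
    n = len(stp)
    a, b = fit_linear(stp, srp)
    rss_lin = sum((y - (a + b * x)) ** 2 for x, y in zip(stp, srp))
    a_exp, k = fit_exponential(stp, srp)
    rss_exp = sum((y - a_exp * math.exp(k * x)) ** 2 for x, y in zip(stp, srp))
    return "linear" if aic(rss_lin, n, 2) <= aic(rss_exp, n, 2) else "exponential"

# Hypothetical STP (mg/kg) and SRP (mg/L) values, for illustration only
stp = [10, 20, 30, 40, 50, 60]
srp_straight = [0.031, 0.049, 0.072, 0.089, 0.111, 0.129]
srp_curved = [0.027, 0.036, 0.050, 0.066, 0.090, 0.121]
print(preferred_model(stp, srp_straight), preferred_model(stp, srp_curved))
```

Because both candidate models have two fitted parameters, the AIC comparison in this sketch reduces to comparing residual sums of squares.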
As linear or exponential fitting did not always allow prediction of runoff SRP concentrations at low STP due to large negative intercepts, all datasets were also fitted in R by a second-order polynomial model, $\mathrm{SRP} = a + b \times \mathrm{STP} + c \times \mathrm{STP}^{2}$, where a, b, and c are fitted parameters, wherever this was significantly (P < 0.05) preferred over a linear fit. The SRP values corresponding to 0.5, 1, and 2 times the range in agronomic optimum STP concentrations recommended in each country were then estimated from the different response functions fitted. The threshold STP ranges chosen for each country represented national recommendations appropriate for the soil types and crop types examined (Australia: 29-55 mg kg⁻¹ [Gourley et al., 2007]; Finland: 6-10 mg L⁻¹ [Valkama et al., 2011]; Ireland: 5-10 mg L⁻¹ [Wall and Plunkett, 2016]; Spain: 8-20 mg kg⁻¹ [Delgado et al., 2016]; United Kingdom: 16-25 mg kg⁻¹ [Defra, 2010]; and United States: 30-80 mg kg⁻¹ [IPNI, 2015]).
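As a sketch of how SRP can be read off a fitted polynomial at 0.5, 1, and 2 times the agronomic optimum range, the example below uses the UK range of 16-25 mg kg⁻¹ quoted above. The coefficients are hypothetical placeholders, not fitted values from this study, and the function names are illustrative only.

```python
def srp_polynomial(stp, a, b, c):
    """Second-order polynomial response: SRP = a + b*STP + c*STP^2."""
    return a + b * stp + c * stp ** 2

def srp_at_optimum_multiples(opt_low, opt_high, coeffs, multiples=(0.5, 1.0, 2.0)):
    """Predict SRP at the lower and upper bound of the scaled optimum range."""
    predictions = {}
    for m in multiples:
        low = srp_polynomial(m * opt_low, *coeffs)
        high = srp_polynomial(m * opt_high, *coeffs)
        predictions[m] = (low, high)
    return predictions

# Hypothetical coefficients (a, b, c); NOT taken from the Supplemental tables
coeffs = (0.01, 0.002, 0.00005)

# UK agronomic optimum range for P-Olsen (mg/kg), as cited in the text
uk_predictions = srp_at_optimum_multiples(16, 25, coeffs)
for multiple, (low, high) in uk_predictions.items():
    print(f"{multiple} x optimum: SRP {low:.3f} to {high:.3f} mg/L")
```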
Soluble reactive P response to increasing STP across countries was also compared using a common STP method: P-Olsen. Country-specific STP data for Finland and Ireland were all converted to P-Olsen based on conversion equations using detailed method comparisons: a set of 268 Finnish soil samples determined for both P-Acetate and P-Olsen gave a nonlinear relationship, $\text{P-Olsen} = 54.9 \times \text{P-Acetate}^{0.2824} - 56.9$ (R² = 0.77) (Uusitalo, unpublished data, 2016); a subset of 199 Irish soils gave a relationship $\text{P-Olsen} = 5.96 \times \text{P-Morgan}^{0.773}$ (R² = 0.74) (Foy et al., 1997). For Australia and the United States, experimental datasets where P-Olsen was measured directly and available were used. As before, the SRP response was fitted by a preferred exponential function where this was statistically equal to or better than a linear fit. We further tested if one curve would adequately describe some, or all, subset data combined by comparing R-squared, sums-of-squares and model Sy.x statistics. A standardized range in agronomic optimum P-Olsen concentrations of between 10 and 40 mg kg⁻¹ was chosen for predicting SRP concentrations in relation to eutrophication control targets.
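The two conversion equations above can be written as small helper functions. This is a sketch only: the function names are illustrative, and it assumes P-Acetate and P-Morgan are given in mg L⁻¹ and P-Olsen is returned in mg kg⁻¹, in line with the reporting units described earlier.

```python
def p_olsen_from_p_acetate(p_acetate):
    """Finnish soils: P-Olsen = 54.9 * P-Acetate^0.2824 - 56.9 (R2 = 0.77)."""
    return 54.9 * p_acetate ** 0.2824 - 56.9

def p_olsen_from_p_morgan(p_morgan):
    """Irish soils: P-Olsen = 5.96 * P-Morgan^0.773 (R2 = 0.74)."""
    return 5.96 * p_morgan ** 0.773

# National agronomic optimum ranges quoted in the text (mg/L)
finland_range = (6, 10)
ireland_range = (5, 10)

finland_olsen = [p_olsen_from_p_acetate(v) for v in finland_range]
ireland_olsen = [p_olsen_from_p_morgan(v) for v in ireland_range]
print("Finland optimum as P-Olsen:", [round(v, 1) for v in finland_olsen])
print("Ireland optimum as P-Olsen:", [round(v, 1) for v in ireland_olsen])
```

Note that the Finnish equation returns negative values at very low P-Acetate (below roughly 1.1 mg L⁻¹), so it should only be applied within the range of the calibration data.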

APLE Model Predictions
APLE is a field-scale, annual time-step, spreadsheet model that simulates soil P dynamics in two topsoil layers, and dissolved and sediment-bound P loss with surface water runoff and erosion (Vadas et al., 2009). APLE requires the following inputs: initial P-Mehlich-3 STP (mg kg⁻¹); soil clay and organic matter (%); annual P export in harvested crops (kg ha⁻¹); annual P application rates, types, and methods of application for manure and fertilizer; annual degree of soil mixing of topsoil layers; annual precipitation and runoff (cm); and annual erosion (kg ha⁻¹). We used APLE to simulate P loss from agricultural land in three contrasting catchments as a function of different STP levels over time as well as P transport rates of runoff and erosion. To enable scenarios of STP drawdown, soil P levels were set by first simulating an initial 10-yr period of P addition so that STP at the end of 10 yr was at a given target agronomic optimum level, and then adjusting annual P additions to maintain STP at the target level for 40 more years. We set annual crop P export to typical local rates for all 50 yr. This then provided the current optimum default for the model runs.
To compare the effect of lowering P transport rates relative to P drawdown, erosion rates were reduced by 2.5% yr⁻¹ during the STP rundown period to represent incremental implementation of more aggressive soil conservation measures throughout the catchment. This is a theoretical and conservative rate of implementation, but consistent with recent data from Europe (Kertész and Madarász, 2014) and concerns over a potential future ban on the use of glyphosate to control weeds in no-till or reduced-till systems. We also incrementally reduced soil mixing of the two simulated layers along with erosion to represent an increase in soil P stratification and greater dissolved P loss that comes with no or minimum tillage (Dodd and Sharpley, 2016).
A description of the catchments, selection of model parameters for each catchment, and the catchment categories of STP, runoff, and erosion applied in the model are given in Supplemental Material and summarized in Tables 2 and 3. The APLE model was then run with the following scenarios:
1. Current STP: This represented P loss for current STP and P transport conditions. Simulated STP levels were based on measured soil data collected in the different catchments and estimates of runoff and erosion rates for current land use and soil types (see Table 3).
2. Optimum scenario: Same as the "Current" scenario, but assumed target STP levels for all cropland were at local agronomic optimum. These simulations represented P loss from agricultural land that is managed to maintain the target agronomic STP and assumed modern crop production output would be optimum. This corresponds to management strategy A in Rowe et al. (2016).
3. Below-optimum scenario: Same as the "Optimum" scenario, but assumed target STP levels for all cropland were below the currently recommended agronomic optimum. These simulations represented a lower limit of P loss associated with cropping systems that were agro-engineered to more efficiently access and assimilate available soil P and give similar yield performance. This corresponds to management strategy B in Rowe et al. (2016).
4. Drawdown scenario: Same as the "Current" scenario, but assumed STP levels are drawn down from current levels over a period of 40 yr. We simulated P drawdown by setting up the default STP simulations but eliminated annual P additions when soil P was above agronomic optimum. If STP decreased below the target agronomic optimum during drawdown, we simulated P additions to maintain it at the agronomic optimum.
5. Drawdown-transport scenario: Same as the "Drawdown" scenario, but also reduced erosion annually by 2.5% from the previous year during the 40-yr P drawdown period. Annual runoff was not changed.
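Two of the scenario rules above are simple enough to sketch directly: the compounding 2.5% annual erosion reduction of the drawdown-transport scenario, and the rule that withholds P additions while STP exceeds the agronomic optimum. The sketch below does not reproduce APLE's soil P equations; the function names, the initial erosion rate, and the maintenance rate are hypothetical.

```python
def erosion_schedule(initial_erosion, years=40, annual_reduction=0.025):
    """Erosion (kg/ha) for years 0..years, each year 2.5% lower than the previous."""
    return [initial_erosion * (1 - annual_reduction) ** t for t in range(years + 1)]

def annual_p_addition(stp, optimum_stp, maintenance_rate):
    """Drawdown rule: no P while STP is above the optimum, else the maintenance rate."""
    return 0.0 if stp > optimum_stp else maintenance_rate

# Hypothetical starting erosion rate of 500 kg/ha
schedule = erosion_schedule(500.0)
remaining = schedule[-1] / schedule[0]
print(f"Erosion after 40 yr is {remaining:.0%} of the starting rate")

# Hypothetical field: STP of 60 mg/kg against an optimum of 40 mg/kg
print(annual_p_addition(stp=60, optimum_stp=40, maintenance_rate=20.0))
```

Because the reduction compounds from the previous year, erosion at the end of the 40-yr period is roughly 36% of its starting value rather than zero, as a flat 2.5% cut of the initial rate each year would imply.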
In each scenario, one APLE simulation represented "fields" of a discrete combination of STP level and P transport (erosion and runoff). For example, in the Paimionjoki catchment, there were six STP levels and six P transport categories providing a total of 36 combinations of soil STP and transport for each scenario above. Each STP level and transport category had an assigned catchment area (ha) that it represented (Table 3). For example, 30.7% of the catchment had an STP level of 35 mg kg⁻¹ and 50.5% of the catchment had a transport category of 100 kg ha⁻¹ of erosion and 7.5 cm of runoff. This particular combination of STP level and transport category thus represented 15.4% of the catchment (or 6816 ha). We then multiplied average annual P loss (kg ha⁻¹) for each of the 36 combinations by the catchment surface (in ha) represented by that simulation. The sum of the P loss products represented estimated total P loss from the entire catchment.
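The area-weighted aggregation described above can be sketched as follows. As in the text, the area of each combination is taken as the product of the STP-class and transport-class fractions. The two-by-two example, its area fractions, its P loss values, and the catchment area are all hypothetical and much smaller than the 36 combinations used for the Paimionjoki catchment.

```python
def catchment_p_loss(stp_fractions, transport_fractions, loss_kg_ha, area_ha):
    """Sum P loss (kg) over all STP x transport combinations, weighted by area."""
    total_kg = 0.0
    for i, stp_share in enumerate(stp_fractions):
        for j, transport_share in enumerate(transport_fractions):
            # Area represented by this combination of STP level and transport category
            combo_area = stp_share * transport_share * area_ha
            total_kg += loss_kg_ha[i][j] * combo_area
    return total_kg

# Hypothetical inputs: 2 STP levels x 2 transport categories
stp_fractions = [0.6, 0.4]          # share of catchment area in each STP level
transport_fractions = [0.5, 0.5]    # share of catchment area in each transport category
loss_kg_ha = [[0.2, 0.6],           # simulated annual P loss per combination (kg/ha)
              [0.5, 1.5]]
area_ha = 1000.0

total = catchment_p_loss(stp_fractions, transport_fractions, loss_kg_ha, area_ha)
print(f"Total: {total:.0f} kg P, mean: {total / area_ha:.2f} kg/ha")
```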

Results and Discussion
Our analysis spanned three continents and catchments with different agro-hydrochemical functioning. Biomes included a wide range of soil types, from young soils on glacial till in Europe to old, Fe-rich ferrosols in Australia, and spanned temperate to Mediterranean climates with marked seasonal differences in rainfall patterns. Arable and pastoral farming systems and associated P recommendation systems also varied in the range of crops being grown, length of growing season, and need for irrigation (Table 2, Supplemental Material). The study areas were therefore expected to represent variable runoff response, soil P release patterns, and erosion vulnerability; monitoring protocols covering both surface and subsurface hydrological routing; and storm intensities spanning natural rainfall events in Europe to more intense events under simulated rainfall in the United States and Australia.

Soluble Reactive Phosphorus Response to Soil Test Phosphorus
All datasets showed a highly significant positive effect of soil P status on runoff SRP concentrations at all sites regardless of national STP methods used, and for all experimental conditions and hydrological pathways (Fig. 1). A majority of the variance was accounted for by fitting either a linear or nonlinear (exponential or polynomial) function, with no indication of any particular bias in the statistically preferred function. Detailed parameters from the preferred linear and exponential models are given in Supplemental Table S1. A linear model was preferred in 10 of the 16 datasets. An exponential response conforms more to soil P sorption theory, which predicts that SRP desorbs more readily into solution as soil P accumulates due to a reduction in P sorption strength and buffering power (Barrow, 2015). A linear relationship might be expected where the range in STP is small, P diffusion gradients into runoff do not reach equilibrium (wide soil-to-runoff ratios and limited soil-water contact time), or the STP method only extracts a small proportion of soil P (Koopmans et al., 2002; Neyroud and Lischer, 2003). For example, nonlinearity in SRP response in Finnish soils notably increased when converting P-Acetate (I-based method) to P-Olsen (Q-based method) (Supplemental Fig. S1). Both linear and nonlinear relationships have been widely reported in the literature (e.g., McDowell et al., 2001; Vadas et al., 2005).
Predicted SRP concentrations across the range in recommended threshold STP concentrations typically used in each country were quite similar regardless of which preferred response function was used (Supplemental Table S2). However, only the polynomial model enabled SRP values at 0.5, 1.0 and 2.0 times the full range in nationally recommended agronomic optimum STP concentrations to be predicted across all datasets. The SRP concentrations varied from 0.02 to 0.37 mg L⁻¹ at the agronomic optimum. The highest SRP values were obtained in surface runoff from simulated grassland in Australia and the United States, while the lowest values were obtained in surface runoff under simulated arable and grassland in Ireland and in natural drainage in the United Kingdom (Fig. 2). A generally larger variability in SRP response in pasture soils (e.g., simulated studies in the United States) has been noted previously (Vadas et al., 2005) and may be related to variable SRP release from the permanent vegetation, enhanced biological processing of soil organic P (e.g., grass root debris), and/or a seasonal effect of soil moisture on SRP diffusion rates (Pote et al., 1999). Lower SRP concentrations recorded in flow through field drains under natural rainfall in the United Kingdom are also in line with previous studies (Haygarth et al., 1998; Withers et al., 2009b), although this trend was not observed under Finnish conditions. This may be due to the more strongly structured clay soils in Finland, where rapid flow along cracks and biopores decreases the contact time between percolating water and subsoil.
In Finland and the United Kingdom, where datasets covered both natural rainfall and simulated rainfall, SRP concentrations were consistently higher under natural rainfall. This may be due to (i) residues from recent fertilizer and manure inputs, although these were avoided in site selection as much as possible (e.g., Hart et al., 2004); (ii) SRP release from wetting and drying cycles that occur naturally in soils (Peltovuori and Soinne, 2005) but would be absent in rainfall simulation studies, in which soils were almost always prewetted before simulation; and/or (iii) a longer soil-water contact time than would occur under the generally higher and standardized rainfall intensities used in rainfall simulation studies (Dougherty et al., 2008).
Predicted SRP concentrations ranged up to 0.69 mg L⁻¹ at twice the agronomic optimum, and up to 0.28 mg L⁻¹ at half the agronomic optimum (Fig. 2). Predicted SRP concentrations in runoff from Australia and the United States were nearly always well above the target SRP concentrations required for eutrophication control (i.e., >0.1 mg L⁻¹) when STP concentrations were at or exceeded the agronomic optimum. For the United States, the high SRP predictions may be at least partly due to the wide range in agronomic optimum STP values chosen. With the exception of Australia, where runoff SRP concentrations were always above 0.1 mg L⁻¹, eutrophication control targets could only be reliably met when STP was less than that recommended for optimum crop yields (Fig. 2).

Standardized P-Olsen Comparisons
Standardization of STP using P-Olsen for all datasets provided the opportunity to compare SRP response across a common agronomic optimum STP range (10-40 mg kg⁻¹). Under natural rainfall, predicted SRP concentrations were lower in Finland and Ireland than in the United Kingdom (Fig. 3A): for example, values at the agronomic optimum were 0.029 to 0.095 mg L⁻¹ in Finland compared with 0.070 to 0.262 mg L⁻¹ in the United Kingdom. The UK dataset was dominated by silty soils with lower P buffering capacity than the datasets from Ireland and Finland that were dominated by more clayey soils. The observed site convergence of fitted SRP values in natural surface runoff at higher STP levels also reinforces the notion of a stronger P buffering effect at lower STP (Fig. 3A; Barrow, 2015). This difference in P buffering was notably absent in natural flow through field drains (Fig. 3B), presumably due to the damping effect of P filtering and sorption during subsurface routing of the runoff. Hence, a common line fitted natural drainage data from both Finland and the United Kingdom, predicting SRP concentrations of 0.025 to 0.105 mg L⁻¹ at the optimum P-Olsen range (10-40 mg kg⁻¹).
Simulated rainfall studies using runoff boxes and in situ on arable fields showed very similar SRP responses (Fig. 3C and D). Across Europe, the Finnish, UK, and Spanish datasets could all be fitted by a common exponential model predicting SRP concentrations of 0.033 to 0.082 mg L⁻¹ at the agronomic optimum. Only the Irish surface box data did not quite fit the common pattern, but this was a much smaller dataset. Predicted SRP values for US simulated boxes and simulated arable datasets were very similar (0.059-0.161 and 0.031-0.157 mg L⁻¹, respectively) but increased more rapidly than European soils as STP increased. In contrast, runoff SRP concentrations from Australian soils in boxes were considerably higher at all P-Olsen levels (Fig. 3), with a predicted SRP concentration of 0.4 mg L⁻¹ at a P-Olsen of 40 mg kg⁻¹.
Fig. 3. Relationships between soil test P concentrations and soluble reactive P concentrations in surface and subsurface runoff under different monitoring protocols when standardized to P-Olsen across different countries and continents. The statistically preferred linear or exponential fitted lines are also shown. Some fitted lines are common to more than one dataset. Fitted parameters are given in Supplemental Table S3.
The trend for much greater SRP release from Australian soils was also apparent in simulated grassland experiments: SRP concentrations in simulated runoff were 0.186 to 0.340 mg L⁻¹ from Australian soils but only 0.097 to 0.155 mg L⁻¹ from US soils at the agronomic optimum (Fig. 3F). Ireland's grassland data were limited but gave an SRP concentration of 0.083 mg L⁻¹ at 20 mg kg⁻¹ P-Olsen. Similarly high SRP release on P-fertile pasture soils was noted in other Australian studies under both simulated rainfall (Burkitt et al., 2010) and natural rainfall including irrigation (Barlow et al., 2005). This suggests that Australian soils have much lower P buffering power than EU and US soils despite their more extensive weathering, which may be due to the type of Fe and Al oxides and hydroxides present in their soils (Burkitt et al., 2010; Dougherty, 2006). When all predicted SRP values across the agronomic optimum range are compared (Fig. 4), it becomes clear that most soils in Europe and the United States with ≤20 mg kg⁻¹ P-Olsen equivalent have a greater chance of meeting the low target SRP concentrations required for eutrophication control. Threshold STP concentrations nearer 10 mg kg⁻¹ P-Olsen equivalent will lower eutrophication risk further. Additional measures appear necessary on Australian soils.
Differences in soil P buffering capacity were also evident for two Australian sites that were excluded from the initial grouping of site data because of their deviant behavior. For example, SRP concentrations increased much more slowly on highly buffered basaltic soils and much more rapidly on poorly buffered coarse sands than for other soils receiving simulated rainfall (Supplemental Fig. S3). Hart and Cornish (2012) observed the same effect when pasture soils received simulated rainfall in situ, and similar site specificity in SRP response on more extreme soil types has been widely reported in other studies (McDowell et al., 2003). This highlights the important role soil P buffering plays in determining agronomic and environmental thresholds for management, and additional P buffering or P saturation indicators may be required to identify deviant sites requiring more or less sensitive management (Ehlert et al., 2003; Nair and Harris, 2014). For example, a simple single-point P sorption test to quantify a P buffering index is now performed routinely on soils in Australia to guide both the selection of agronomic optimum STP level and the amount of fertilizer required to raise STP (Burkitt et al., 2002; Gourley et al., 2007).

Wye Catchment
For the Wye Catchment on the border of England and Wales, modeling results show that total P loss in runoff ranged from 0.19 kg ha -1 at low STP and transport (runoff + erosion) combinations, to 3.2 kg ha -1 at high combinations. At the catchment scale, total P loss was 0.52 kg ha -1 and SRP loss dominated (60% of total P loss) because erosion P transport was much lower than STP-driven transport. These results agree closely with other estimates of catchment-scale P loss for the River Wye. A tributary headwater (<1 km 2 ) representing high combinations of STP and runoff risk monitored from 1994 to 2000 showed annual P loss rates varying up to 4 kg P ha -1 depending on rainfall and river flow, with typically 35% in dissolved form (Hodgkinson and Withers, 2007; Withers and Hodgkinson, 2009). Monitoring of small headwaters (<10 km 2 ), larger lowland subcatchments (41-90 km 2 ), the upper Wye (1283 km 2 ), and the lower Wye (4812 km 2 ) over a 2-yr period representing all region classes showed average total P loss rates from diffuse sources of approximately 0.5 kg ha -1 , with >60% in SRP form (Jarvie et al., 2008, 2010). Previous model output for the whole of the Wye catchment estimated an average total P transfer to watercourses of 0.54 kg ha -1 , with 50% in dissolved form (Strömqvist et al., 2008).
For any rate of runoff and erosion, as STP increased, the percentage contribution of sediment P loss decreased. This is a function of the APLE equations, which predict that the pool of available P that feeds simulated SRP loss increases faster than total soil P, which feeds the sediment P loss (i.e., more of the total soil P remains as available P). This trend held in all three simulated catchments and is fully consistent with experimental data; for example, for Wye catchment soils, Withers et al. (2009a) found that SRP increased 50% faster than PP in runoff as STP increased. APLE results also show that total catchment P loss could be reduced by 28% if all soils were at optimum STP levels and by 55% if soils were maintained at below-optimum levels (Fig. 5). However, reducing all soils to optimum STP to fully realize this 28% reduction would take 40 yr under current management. Compared with drawdown results alone, APLE results showed a somewhat limited ability of conservation practices to further reduce P loss (from 28 to only 36%), mostly because simulated erosion rates delivering sediment from the field were already fairly low (100-500 kg ha -1 ).
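The direction of this trend can be illustrated with a deliberately simplified sketch. It does not reproduce the APLE equations: it assumes, purely for illustration, that SRP loss is proportional to an available P pool, that sediment P loss is proportional to total soil P, and that total soil P is a fixed background plus a multiple of available P. All four coefficients are hypothetical values chosen only to show the shape of the response.

```python
# Illustrative only: NOT the APLE equations; every coefficient is hypothetical.
BACKGROUND_P = 400.0        # mg/kg total P not tied to available P (assumed)
TOTAL_PER_AVAILABLE = 3.0   # mg/kg total P gained per mg/kg available P (assumed)
RUNOFF_COEFF = 0.01         # SRP loss per unit available P (assumed, arbitrary units)
EROSION_COEFF = 0.001       # sediment P loss per unit total P (assumed, arbitrary units)


def sediment_share(available_p: float) -> float:
    """Return sediment P loss as a fraction of total (SRP + sediment) P loss."""
    total_soil_p = BACKGROUND_P + TOTAL_PER_AVAILABLE * available_p
    srp_loss = RUNOFF_COEFF * available_p
    sediment_loss = EROSION_COEFF * total_soil_p
    return sediment_loss / (srp_loss + sediment_loss)


if __name__ == "__main__":
    # Doubling available P less than doubles total soil P, so the sediment
    # share of the loss falls as available P rises.
    for available_p in (10.0, 20.0, 40.0, 80.0):
        share = sediment_share(available_p)
        print(f"available P {available_p:5.1f} mg/kg -> sediment share {share:.2f}")
```

Because the background term does not grow with available P, the ratio of sediment P loss to SRP loss declines as available P increases, whatever positive coefficients are chosen.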

Paimionjoki Catchment
For the Paimionjoki catchment in southwest Finland, total P loss in runoff ranged from 0.25 kg ha -1 at low STP and transport combinations to 16.8 kg ha -1 at high combinations. At the catchment scale, total P loss from agricultural fields was 0.88 kg ha -1 and sediment P loss dominated (83% of total P loss) because erosion P transport was much greater relative to SRP transport. These results are in excellent agreement with Tattari et al. (2017), who estimated a mean 1.1 kg ha -1 total P loss from arable land, with 84% as PP, in the nearby Savijoki catchment (a tributary of the River Aurajoki), which is similar in soils and farming intensity but smaller (15 km 2 ) in size. For Paimionjoki, Vuorenmaa et al. (2002) and Ekholm et al. (2015) estimated mean catchment total P losses of 0.6 to 0.8 kg ha -1 , with >85% derived from agricultural land.
APLE results also show that P loss could be reduced by 18% if all soils were at optimum STP levels and by 31% if soils were at below-optimum levels (Fig. 5). However, reducing all soils to optimum STP to fully reach this 18% reduction would take much longer than 40 yr (only a 10% reduction would be realized in 40 yr). This is a much longer drawdown period than for the Wye catchment because the soils in the Paimionjoki have much greater total P per unit of STP due to their greater clay and organic matter content and therefore a greater ability to buffer the exploitation of available STP pools through time. Compared with drawdown results alone, APLE results showed a very high potential for conservation practices to further reduce P loss (from 18 to 31%), mostly because relatively high simulated erosion rates (100-7200 kg ha -1 ) offered an opportunity to reduce P transport caused by erosion. In addition, combining P drawdown together with soil conservation offered as much opportunity to reduce total P loss as reducing all soils to below-optimum STP levels.
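The link between soil P buffering and drawdown time can be sketched with a simple mass balance. This is a hedged illustration rather than the APLE algorithm: it assumes STP falls linearly with cumulative net P removal, and the function name, the `p_per_unit_stp` buffering coefficient, and all numerical inputs in the demonstration are hypothetical.

```python
def years_to_target(stp_now, stp_target, p_per_unit_stp, net_removal):
    """Estimate years of net P removal needed to lower STP to a target.

    stp_now, stp_target : soil test P (mg/kg)
    p_per_unit_stp      : kg P/ha that must be removed to lower STP by 1 mg/kg
                          (hypothetical buffering coefficient; larger means
                          more total P held per unit of STP)
    net_removal         : crop P offtake minus P inputs (kg P/ha/yr), must be > 0
    """
    if net_removal <= 0:
        raise ValueError("net_removal must be positive for drawdown to occur")
    excess_stp = max(stp_now - stp_target, 0.0)  # no drawdown needed below target
    return excess_stp * p_per_unit_stp / net_removal


if __name__ == "__main__":
    # Same STP excess and net removal, two hypothetical buffering coefficients.
    for label, coeff in (("weakly buffered", 4.0), ("strongly buffered", 12.0)):
        years = years_to_target(40.0, 20.0, coeff, 10.0)
        print(f"{label}: {years:.0f} yr")
```

Under these assumptions the drawdown time scales in direct proportion to the buffering coefficient, which is the qualitative reason a soil holding more total P per unit of STP takes longer to reach the optimum.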

Chesapeake Catchment
For the Chesapeake catchment on the east coast of North America, total P loss in runoff ranged from 0.20 kg ha -1 at low STP and transport combinations to 8.71 kg ha -1 at high combinations. At the catchment scale, total P loss was 0.84 kg ha -1 and sediment P loss dominated (67% of total P loss) because erosional P transport was much greater relative to SRP transport. APLE results showed that P loss could be reduced by 40% if all soils were at optimum STP levels, but this would take at least 30 yr to fully realize. Catchment P loss could be reduced by 55% if all soils were at below-optimum STP levels. Compared with drawdown results alone, APLE results showed a high potential for conservation practices to further reduce P loss (from 40 to 62%) because fairly high simulated erosion rates (112-4483 kg ha -1 ) gave an opportunity to reduce P transport through erosion. In fact, combining P drawdown with soil conservation offered as much P loss reduction in 10 yr as would take 30 yr to achieve through P drawdown alone (Fig. 5).
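The catchment results reported above can be put side by side with a short arithmetic sketch. The input values are transcribed from the text. The Wye sediment share (40%) is inferred from its 60% SRP share, which assumes total P loss is the sum of SRP and sediment P, and the percentage reductions are assumed to apply to the baseline catchment total. Variable and function names are illustrative.

```python
# name: (baseline total P loss in kg/ha, sediment P share of that loss,
#        reduction with all soils at optimum STP,
#        reduction with drawdown plus conservation practices)
CATCHMENTS = {
    "Wye": (0.52, 0.40, 0.28, 0.36),          # sediment share inferred from 60% SRP
    "Paimionjoki": (0.88, 0.83, 0.18, 0.31),
    "Chesapeake": (0.84, 0.67, 0.40, 0.62),
}


def summarize(name):
    """Split the baseline loss into components and apply the reported reductions."""
    total, sediment_share, drawdown_cut, combined_cut = CATCHMENTS[name]
    return {
        "sediment_loss": total * sediment_share,
        "srp_loss": total * (1.0 - sediment_share),
        "loss_after_drawdown": total * (1.0 - drawdown_cut),
        "loss_after_both": total * (1.0 - combined_cut),
        # extra reduction from conservation, in percentage points
        "extra_points": round((combined_cut - drawdown_cut) * 100),
    }


if __name__ == "__main__":
    for name in CATCHMENTS:
        s = summarize(name)
        print(
            f"{name}: drawdown -> {s['loss_after_drawdown']:.2f} kg/ha, "
            f"drawdown + conservation -> {s['loss_after_both']:.2f} kg/ha "
            f"(+{s['extra_points']} points)"
        )
```

Laid out this way, the additional gain from conservation practices is smallest in the Wye (8 percentage points) and largest in the Chesapeake (22 percentage points), in line with the differences in simulated erosion rates.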

Optimizing Soil and Soil Phosphorus Management for Eutrophication Control
Phosphorus management must become more efficient and sustainable if food is to be produced with adequate supplies of P but without impairing water quality. The eutrophication of inland and coastal waters is a costly and growing societal issue, and global agriculture is a significant source of P loadings (Dodds et al., 2009; Fink et al., 2018). Soil conservation practices to control runoff and erosion, and maintaining soil P fertility no higher than the agronomic optimum, are recommended measures (among many others) for limiting the source and transfer of particulate and soluble P in field runoff from agricultural soils (Sharpley et al., 2000; Schoumans et al., 2014). However, expectations of their relative impact at the field and catchment scale are rarely explored. The modeling outputs presented here suggest that the relative P loss reductions of these two key measures, and the timescale they require to take effect, will vary between catchments. In catchments with high STP levels and only modest soil erosion rates (e.g., Wye), management should focus on reducing STP in addition to erosion control. Conversely, in catchments where erosion rates are high and drawdown of STP is much slower due to soil P buffering (e.g., Paimionjoki), soil conservation and hydrological controls to reduce runoff will give the most P loss reduction. Models like APLE that can represent the interactions of soil P buffering and erosion vulnerability at catchment scale, and which are easy to parameterize, can therefore help catchment managers and policymakers prioritize these management strategies and give rapid and reliable information about the magnitude and timeframe of P loss reduction that can be realistically expected.
Although the APLE simulations represent only what is lost from agricultural soil at the edge of a field, and do not account for "incidental" P loss from freshly applied fertilizers or manures (i.e., a potential underestimate) or P retention within the wider landscape (i.e., a potential overestimate), they gave realistic diffuse P export rates in the study catchments without recourse to the model calibration or optimization typical of more sophisticated process-based models with large data input requirements. Model parameters and algorithms have been previously tested against long-term field data on soil P dynamics and drawdown rates (Vadas et al., 2012, 2018). Previous detailed sensitivity and uncertainty analysis for the APLE model showed that P loss predictions might vary by 6 to 20% for ±5% difference and by 14 to 24% for ±15% difference in P input parameters (Bolster et al., 2016). Application of these uncertainty bands still produces P loss predictions that are in line with measured P exports in the three study catchments. The STP drawdown rates (given a certain set of soil properties) are a function of average annual crop P removal (i.e., mass balance) rather than the declines in soil P by P loss in runoff and erosion. Average crop P removal rates are much more certain, and any variation in these rates is unlikely to affect how long a specific P mass balance will take. For example, in applying the APLE model to the Chesapeake catchment, Vadas et al. (2018) found that estimates of the time to achieve a certain STP drawdown using measured crop P removal rates each year were no different from those using a "best-fit" average rate.
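As a rough worked example of applying these uncertainty bands, the sketch below widens each simulated catchment total by the largest variation quoted above (24%). Treating that figure as a symmetric band around the baseline is a simplifying assumption made here, not the procedure of the cited uncertainty analysis, and the names used are illustrative.

```python
# Simulated catchment-scale total P losses reported in this study (kg/ha).
BASELINE_LOSS = {"Wye": 0.52, "Paimionjoki": 0.88, "Chesapeake": 0.84}

# Upper end of the 14 to 24% variation quoted for a ±15% input difference.
MAX_RELATIVE_VARIATION = 0.24


def loss_band(baseline, relative_variation=MAX_RELATIVE_VARIATION):
    """Return (low, high) after applying a symmetric relative band (assumed)."""
    return baseline * (1.0 - relative_variation), baseline * (1.0 + relative_variation)


if __name__ == "__main__":
    for name, baseline in BASELINE_LOSS.items():
        low, high = loss_band(baseline)
        print(f"{name}: {low:.2f} to {high:.2f} kg/ha")
```

For the Wye, for instance, the resulting band of roughly 0.40 to 0.64 kg ha -1 still brackets the approximately 0.5 kg ha -1 measured from diffuse sources and the 0.54 kg ha -1 from earlier modeling.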
The model baseline for P transport did not take into account erosion control measures already applied in each catchment and assumed all soils were under inversion tillage. Model output therefore likely overestimates the P loss reductions achievable due to erosion control if applied where soil conservation measures are in place, such as grassland buffer strips on steeply sloping land. The benefits of conservation agriculture on soil functions will also likely vary substantially depending on site conditions (Ghaley et al., 2018). In addition, APLE model outputs assume that SRP and PP have equal eutrophication impact. In reality, at the same level of P load mitigation, SRP controls are likely to lead to lower eutrophication risk than PP controls because of the greater bioavailability of SRP to aquatic biota and the complex processing of PP in receiving waters (Ekholm and Lehtoranta, 2012). This has been demonstrated for lotic and lentic waters representative of the Finnish catchment (Ekholm and Krogerus, 2003), and is supported by strong links between riverine SRP concentrations and biotic abundance and diversity in the Wye catchment (Defra, 2008) and by the successful ecological restoration of eutrophied freshwaters following reductions in SRP-rich effluent P loadings from wastewater centers or industry (Schindler et al., 2016).
Policy strategies toward control of agricultural eutrophication have tended to focus more on runoff and erosion prevention and limiting farm surpluses than active management of STP. Soil test P analysis was never designed to inform P loss risk to waterbodies, and STP level is only one of a number of factors influencing catchment P export. A statistical linkage between regional variation in STP levels and catchment P flux is therefore difficult to establish (Ekholm et al., 2015; McDowell et al., 2019). However, given the variable and often short-term effectiveness of soil conservation, buffer strips, and wetlands (Dodd and Sharpley, 2016), more active drawdown of high STP soils would in theory help deliver more long-term and long-lasting environmental gains by reducing source P mobilization: the APLE model predicted 18 to 55% P loss reductions were potentially achievable in our study catchments. These long-term environmental gains are particularly pertinent to the many intensively farmed regions that have a high percentage of soils with unnecessarily high STP concentrations; for example, in the context of this study, 80% of Australian dairy pasture soils (Gourley et al., 2015; Hart and Cornish, 2016), 50% of Finnish soils, 40% of UK soils (PAAG, 2016), and 34% of US soils (IPNI, 2015) have STP levels that exceed the agronomic optimum. Drawdown of soil P fertility to realign P supply to match more closely recommended crop P needs is also a key 5R (Re-align P inputs, Reduce P losses, Recycle P in bioresources, Recover P in wastes, and Redefine P in food systems) strategy for global sustainable P management that helps to conserve global reserves of phosphate rock and close the P cycle by recirculating unused legacy P resources held in the soil (Rowe et al., 2016; Withers et al., 2015).
Although clear guidance exists on managing soil P status to build up and maintain soil P fertility, management guidelines on STP drawdown are notably lacking. Tensions understandably exist over how soil P status can be reduced without sacrificing crop productivity (Buckley and Carney, 2013; Wall et al., 2013). While current research suggests that P inputs can be safely omitted in the short term at many sites (typically up to 10 yr), there is significant site and seasonal variation in crop reactions to lack of freshly applied P, which is poorly understood (Rowe et al., 2016). The APLE model outputs presented here suggest that P loss reductions achievable from STP drawdown will also be slow (30-40+ yr); such long-term and variable timespans for STP drawdown have been reported elsewhere (Sharpley et al., 2013; Wall et al., 2013). Management options for more aggressive drawdown of STP seem limited because agricultural crops are not P accumulators.
Not all farmers are concerned with P efficiency, and many do not regularly undertake STP analysis. There are also inherent limitations in the accuracy of STP sampling and analysis, as well as inconsistencies in its interpretation (Jordan-Meille et al., 2012), which confound its more widespread adoption and usefulness as a management tool. Recent trends toward minimum- or no-tillage cultivation systems add further confusion over the value of conventional STP protocols. Given the clear environmental justification for more sensitive soil P management, an overhaul of current P recommendation systems for agriculture seems warranted, including greater precision and consistency in STP sampling, analysis procedures, and interpretation. For example, Sylvester-Bradley et al. (2019) recommended not only standardizing soil sampling procedures on neighboring farms but also better recording of crop yields and P offtakes so that changes in STP can be linked to field P balances, allowing anomalous results to be identified more easily. Additional indicators of P buffering and/or degree of P saturation may also be required to identify sites at very high or very low risk of P loss to aid soil P management, as is already practiced in some regions (Gourley et al., 2007) and widely recommended (Kleinman, 2017).
Our analysis of SRP response to STP clearly shows that SRP mobilization in runoff is likely to be much closer to the challenging target levels required for effective eutrophication control when STP is at the lower end of the recommended agronomic optimum range set for different crops (Fig. 4). Advances in precision farming and integrated management practices that enhance the largely unutilized biochemical synergistic interactions that occur between soil and crop (e.g., through microbial, rhizosphere, and crop engineering) may enable successful crop production at lower threshold STP levels than are currently recommended (Rowe et al., 2016). Lower threshold STP levels would also reduce site differences in SRP mobilization rates and reduce the need for additional soil P tests (Withers et al., 2017). Future sustainable P management within the food system must move away from old philosophies that place undue emphasis on maintaining an artificially high level of soil P fertility dependent on inputs of highly soluble manufactured fertilizers and toward new philosophies that concentrate on precision feeding of the crop, not the soil (Withers et al., 2014; Sylvester-Bradley et al., 2019). Drawdown may not be feasible on some farms due to the P loading pressures imposed by high livestock intensities and the need to recycle livestock manures. Improved governance of P at catchment and regional scales will be needed in these situations to help balance out the distribution of these valuable secondary resources more evenly. There is also still a need to increase P fertilizer applications in certain regions limited by P (Macdonald et al., 2011; Sharpley et al., 2018); therefore, reducing unnecessary P inputs in regions with already excessive soil P fertility potentially frees up P to move to P-deficit regions without exhausting finite and critical global P reserves (Dumas et al., 2011).

Concluding Remarks
Soil test P is unnecessarily high in many intensively farmed regions due to past overuse of P inputs. Our analysis reaffirms that this soil P accumulation poses an accelerated risk to freshwater eutrophication and needs to be addressed through drawdown of STP and controls over soil runoff and erosion. The variable efficacy of these measures in P loss reduction and the timespans for them to take effect suggest models like APLE that are easy to parameterize are needed to guide effective catchment management of P. Although STP analysis was developed primarily as a management tool to guide agronomic performance, it proved a useful indicator of the mobilization risk of dissolved P loss in field runoff, and more policy effort is needed to encourage its more rigorous implementation for more sustainable soil P fertility management, especially if combined with additional P buffering or saturation indexes in highly eutrophic catchments. Better guidance on ways to improve precision in sampling, analysis, and interpretation of STP to enhance P use efficiency on farms is needed.
Our meta-analysis across three continents suggests agronomic optimum STP concentrations need to be in the range of 10 to 20 mg kg -1 P-Olsen equivalents to match the very challenging eutrophication control targets set for freshwaters. Drawdown of STP will take many years and is clearly a long-term mitigation option, but one that is fully consistent with developing more sustainable management of P use in the food chain by helping to close the P cycle through P reuse. Farmers are naturally concerned that omitting P fertilizer will compromise crop productivity. Future research must therefore provide them with the evidence base to guide the management of STP drawdown so that productivity is not compromised. Better appreciation of the environmental benefit of controlling STP would encourage policymakers to work with industry to foster expansion of voluntary soil testing and adherence to recommended threshold levels for productivity, or where that fails, exert more regulatory pressure as part of more efficient and sustainable P management practice. As such, soil and soil P management represent an important component of the suite of multiple interventions required to mitigate P losses from agriculture and make food production more sustainable in the future.

Supplemental Material
Supplemental material includes a description of the datasets selected in each country to investigate STP versus SRP relationships together with tables detailing the fitted parameters from the linear and exponent model fits (Tables S1 and S3), and the range in runoff SRP concentrations at the nationally recommended agronomic optimum STP concentrations predicted by linear/exponent or polynomial models (Table S2). Supplemental Fig. S1 gives an example of the increased non-linearity in SRP response when STP is measured by a Q-based method rather than an I-based method. Supplemental Fig. S2 shows the polynomial model fits used to predict SRP concentrations at 0.5, 1, and 2 times the agronomic optimum STP level according to country-specific STP methods. Supplemental Fig. S3 gives examples of deviant site behavior in SRP response. A summary description of each of the three catchments together with the derivation of the parameter settings used in the APLE model is also given.

Conflicts of Interest
The authors declare no conflicts of interest.