Are USDA Soybean Yield Forecasts Getting Better or Worse over Time?

Sep 04, 2014

Scott Irwin and Darrel Good

Department of Agricultural and Consumer Economics
University of Illinois

Dwight Sanders

Department of Agribusiness Economics

In a farmdoc daily article last week (August 29, 2014) we examined trends in the accuracy of USDA yield forecasts for corn and reported evidence that the accuracy of USDA forecasts has improved over time, particularly since 2011. We also noted that an unusually large August forecast error this year would be counter to the trend towards increasingly accurate USDA corn forecasts over time. The purpose of this article is to conduct a similar analysis of trends in the accuracy of USDA soybean yield forecasts in recent years. The analysis should provide additional perspective on the debate about the U.S. soybean yield in 2014. Two previous farmdoc daily articles (August 19, 2011; August 20, 2011) and a research report earlier this year examined the accuracy of USDA soybean yield forecasts. The analysis in those publications is updated here.

USDA Forecasting Procedures

As a starting point, it is important to understand the different procedures used by the USDA to generate soybean yield forecasts over the May through November forecasting cycle. The May, June, and July soybean yield forecasts are prepared by the World Agricultural Outlook Board (WAOB) of the USDA and the basic methodology used by this agency has been one of trend yield plus an adjustment for current year conditions and the timeliness of planting. The WAOB began using a more formal crop weather regression model as the benchmark for these forecasts in 2013. The August, September, October and November soybean yield forecasts are prepared by the National Agricultural Statistics Service (NASS) of the USDA and this agency uses a large-scale survey methodology. The key point is that soybean yield forecasts for May-July are model-based while the August-November forecasts are survey-based (the same is true for corn). We focus in this article on the survey-based forecasts generated for August through November.

As with corn, NASS uses two types of surveys to collect data for the monthly soybean production forecasts in August through November. These are referred to as the Agricultural Yield Survey (or the farmer-reported survey) and the Objective Yield Survey (or the field measurement survey). The sample of farm operations for the Agricultural Yield Survey (AYS) is drawn from those who responded to the survey of planted acreage in June. The sampling design to select the operations to be surveyed uses multiple control items, such as number and type of commodities planted and desired sample size for each commodity, to determine the probability of selecting a particular operation. The same operations are interviewed each month from August through November. Most of the survey data are collected in electronic form using computer-assisted telephone interviewing. Each state in the survey is expected to achieve a minimum response rate of 80 percent.

The monthly AYS data are reviewed for consistency with previous surveys for the individual respondents and an across-record review is conducted to identify any extreme values that need to be re-checked. A summary program which accounts for sampling weights and includes an adjustment for non-respondents is used to generate an indication of expected average yield for Agricultural Statistics Districts (regions within states) and for each state surveyed. The yield indications from the survey reflect the judgment of respondents (farmers) and historical relationships indicate that respondents tend to be conservative in estimating final yields (under-estimate yield potential) particularly under drought conditions. This tendency is quantified and factored into the official yield forecasts.

The Objective Yield Survey (OYS) is designed to generate yield forecasts based on actual plant counts and measurements, eliminating some of the biases associated with the farmer-reported yields. The sample of fields for the OYS is drawn from farms that reported soybeans planted or to be planted in the June survey of acreage. Records from the June survey are sorted by state, district, county, segment, tract, crop, and field. A random sample of fields is drawn with the probability of selection of any particular field being proportional to the size of the tract. Two counting areas, or plots, are randomly selected in each field. Objective measurements (such as counts of plants and pods) are made for each plot each month during the survey cycle. Once the crop is mature, and just before the field is harvested, both plots are hand harvested and weighed by the enumerator, and yield is calculated based on actual production minus an allowance for harvest loss. At maturity, the gross yield of soybeans is calculated as the number of pods with beans per 18 square feet times bean weight per pod and then converted to bushels per acre. Harvest loss is measured in separate units near the yield plots.
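The gross yield arithmetic described above can be sketched in a few lines of Python. This is an illustration only, not NASS code. It assumes that bean weight per pod is recorded in grams and uses the standard 60-pound soybean bushel. The function names are hypothetical, and any moisture adjustment is ignored.

```python
SQ_FT_PER_ACRE = 43_560.0
GRAMS_PER_POUND = 453.592
POUNDS_PER_BUSHEL = 60.0   # standard soybean bushel weight
PLOT_SQ_FT = 18.0          # plot area named in the article


def gross_yield_bu_per_acre(pods_per_plot, grams_per_pod):
    """Pods with beans per 18 sq ft times bean weight per pod, in bushels per acre.

    Assumes bean weight is in grams (an assumption, not stated in the article).
    """
    grams_per_plot = pods_per_plot * grams_per_pod
    grams_per_acre = grams_per_plot * SQ_FT_PER_ACRE / PLOT_SQ_FT
    return grams_per_acre / GRAMS_PER_POUND / POUNDS_PER_BUSHEL


def net_yield_bu_per_acre(gross_yield, harvest_loss):
    """Net yield is gross yield minus the measured harvest loss (both in bu/acre)."""
    return gross_yield - harvest_loss


# Made-up plot values for illustration, not survey data.
gross = gross_yield_bu_per_acre(pods_per_plot=1000, grams_per_pod=0.5)
net = net_yield_bu_per_acre(gross, harvest_loss=1.5)
print(f"gross {gross:.2f} bu/acre, net {net:.2f} bu/acre")
```

Each 18-square-foot plot is scaled up by a factor of 2,420 (43,560 / 18), so small counting errors at the plot level are magnified considerably at the per-acre level.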

Prior to maturity and harvest, the OYS soybean yield forecast requires a forecast of the number of plants per 18 square feet, the number of pods with beans per plant, and bean weight per pod. Forecasts are based on conditions as of the survey date and projected assuming normal weather conditions for the remainder of the growing season. The state average gross yield for the OYS is the simple average of the gross yields for all the sample fields. In addition, a state yield forecast is also made by first averaging the forecast or actual yield factors (such as pod counts and pod weights) and then forecasting the state average yield directly from these averages. This forecast is based on a regression analysis of the historical relationship (15 years) between the yield factors and the state average yield. Historical relationships indicate that OYS yield indications tend to over-estimate yield potential when estimating final yields. This tendency is quantified and factored into the official yield forecasts.

The survey and forecasting procedures described here produce a number of indicators of the net yield of soybeans. In August these indicators include: i) average field level yields from the OYS, ii) average state level counts from the OYS, and iii) the average yield reported by farm operators in the AYS. After harvest begins, yields reported by farmers are also included as an indicator of final yield. Each of the indicators results in a point yield forecast for which forecast errors are computed based on the historical relationships between forecasts and actual yield. The range of yields is evaluated relative to all of the pieces of available data to assist in the selection of the official yield forecast. This process is completed independently in each state and at the national level. A formal Agricultural Statistics Board (ASB) consisting of 7 to 10 statisticians is convened to review regional yield indicators and determine an official yield forecast. Data for the final estimates released in January are collected in the December Agricultural Survey in which respondents report actual acres harvested and the actual yield or production. More detailed discussions of USDA/NASS crop yield forecasting procedures can be found in this farmdoc daily article (August 28, 2013), a Marketing and Outlook Brief, and in USDA publications.

USDA Forecast Accuracy

To evaluate the historical accuracy of USDA soybean yield forecasts, the August, September, October, and November forecasts are compared to the "final" yield estimate released in January (we say "final" because January estimates are sometimes revised based on the September grain stocks estimates at the end of the marketing year and the Agricultural Census conducted every five years). The differences between the forecasts and the final estimates in percentages over 1990-2013 are presented in Figures 1 through 4. When interpreting the errors, note that a positive error implies an under-estimate on the part of USDA and a negative error implies an over-estimate. As is well-known, errors associated with the USDA soybean yield forecasts are occasionally very large, such as in 2003 and 2012. These large errors are not surprising given the unusual insect and weather events that occurred in those years. Soybean forecast errors in recent years have been within the historical range, except for September and October 2012. However, there is also a general tendency for soybean forecasts to be conservative, in the sense of under-estimating the final yield. This is most pronounced over 2002-2013, when about three-quarters of the forecasts across all four months under-estimated final yield.
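The sign convention for forecast errors can be made concrete with a short Python sketch. The article does not state the denominator used for the percentage, so dividing by the final estimate is an assumption here, and the function names are hypothetical.

```python
def percent_error(forecast, final):
    """Percent forecast error, positive when USDA under-estimated final yield.

    Assumption: the error is expressed relative to the final (January) estimate.
    """
    return 100.0 * (final - forecast) / final


def share_under_estimated(errors):
    """Fraction of forecasts with a positive error (an under-estimate)."""
    return sum(1 for e in errors if e > 0) / len(errors)


# Made-up yields in bushels per acre, for illustration only.
print(percent_error(forecast=44.0, final=45.0))   # positive: under-estimate
print(percent_error(forecast=46.0, final=45.0))   # negative: over-estimate
```

Applied to a series of errors, `share_under_estimated` gives the kind of summary quoted above, such as the share of forecasts in a sub-period that came in below the final estimate.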

The USDA soybean yield forecasts are examined for both bias and changes in accuracy through time. The average percent errors for each forecast month for the entire sample period and for sub-samples formed from 1990-2001 and 2002-2013 are presented in Table 1. The average error calculations presented in Table 1 show that the soybean yield forecasts have a consistent downward bias in nearly all forecast months and subsamples. For example, the October yield forecast was on average too low by 0.89 percent from 1990-2001 and 1.26 percent too low from 2002-2013. The October bias across the entire sample period (1.07 percent) is statistically different from zero at the 5 percent level. A statistically significant bias is also found in the September and November yield forecasts. The August forecasts are downward biased, but the bias is not statistically different from zero. Overall, the results show a clear statistical tendency for USDA to under-estimate soybean yield. The magnitude of the downward bias, while not large, is also non-negligible. For example, the 2.56 percent downward bias in September soybean forecasts over 2002-2013 is equivalent to about 1.1 bushels per acre.
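A minimal sketch of a bias check follows, assuming a one-sample t-test of the mean percent error against zero. The article does not say which test the authors used, so this is one common choice rather than their method. The function names are hypothetical, and the error series is made up for illustration.

```python
import math
import statistics


def bias_t_stat(errors):
    """Mean percent error and its t-statistic against a null of zero bias."""
    n = len(errors)
    mean = statistics.mean(errors)
    std_err = statistics.stdev(errors) / math.sqrt(n)  # sample std. dev. / sqrt(n)
    return mean, mean / std_err


def percent_to_bushels(percent, base_yield):
    """Convert a percent bias into bushels per acre at a given yield level."""
    return percent / 100.0 * base_yield


# Made-up percent errors, not the Table 1 data.
mean_error, t_stat = bias_t_stat([1.0, 2.0, 3.0])
print(f"mean error {mean_error:.2f} percent, t = {t_stat:.2f}")

# 1.1 / 0.0256 implies a base yield near 43 bu/acre; this is a
# back-calculation, not a figure stated in the article.
print(f"{percent_to_bushels(2.56, 43.0):.1f} bushels per acre")
```

With 24 annual observations per month over 1990-2013, the t-statistic would be compared with a critical value near 2.07 for a two-sided test at the 5 percent level.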

Table 2 presents the average absolute percent errors in the soybean yield forecasts for each month and sample period. The average absolute percent error declines as the growing season moves from pre-harvest through harvest. For the entire sample period, for example, the absolute error averaged 5.00 percent for the August forecast and only 0.76 percent for the November forecast. There is not a uniform tendency for errors to get either bigger or smaller over the sample period. For the two subsamples, the average absolute percent error increased for the August and November forecasts, but declined for the September and October forecasts. The difference in the average absolute error for the sub-periods was not statistically different from zero for any month. Changes through time are also examined by regressing the absolute percent errors against a constant and a linear time trend:

|USDA Percent Error_{m,t}| = a + b Trend_t + e_t

where m is the forecast month, Trend_t is a time trend variable for crop year t that takes a value of 1 in 1990, 2 in 1991, and so on, and e_t is a standard, normal error term. The estimated coefficients on the trend variable show some tendency for declining errors over time across all months. For example, the October absolute percent error declined by 0.06 percent per year on average, but this was a very small average change and it was not a statistically significant trend. Generally, there is little evidence that USDA soybean yield forecast accuracy, in an absolute sense, has changed through time.
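The trend regression can be sketched with ordinary least squares in plain Python. This is an illustrative implementation of the equation above, fit separately for each forecast month. The function name is hypothetical, and the input series is made up rather than taken from the article's data.

```python
import math


def fit_trend(abs_errors):
    """OLS fit of |percent error| on a constant and a linear trend (1, 2, 3, ...).

    Returns (a, b, se_b): intercept, trend coefficient, and the standard
    error of the trend coefficient. Needs at least three observations.
    """
    n = len(abs_errors)
    trend = range(1, n + 1)               # 1 for the first crop year, and so on
    t_bar = sum(trend) / n
    y_bar = sum(abs_errors) / n
    sxx = sum((t - t_bar) ** 2 for t in trend)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(trend, abs_errors))
    b = sxy / sxx
    a = y_bar - b * t_bar
    ssr = sum((y - (a + b * t)) ** 2 for t, y in zip(trend, abs_errors))
    se_b = math.sqrt(ssr / (n - 2) / sxx)
    return a, b, se_b


# Made-up absolute percent errors for three crop years.
a, b, se_b = fit_trend([1.0, 3.0, 2.0])
print(f"a = {a:.2f}, b = {b:.2f}, se(b) = {se_b:.3f}")
```

A negative b indicates that absolute errors shrank over time. Dividing b by se_b gives the t-statistic used to judge whether the trend is statistically different from zero.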

Implications

Our analysis of USDA yield forecasts for soybeans over 1990-2013 indicates there is a general tendency for soybean forecasts to be conservative, in the sense of under-estimating the final yield. The downward bias is statistically significant in September, October and November for the entire sample period and especially pronounced in 2002-2013, when about three-quarters of the forecasts across all four months under-estimated final yield. The magnitude of the downward bias, while certainly not large, is also non-negligible. For example, the 2.56 percent downward bias in September soybean forecasts over 2002-2013 is equivalent to about 1.1 bushels per acre. Forecast accuracy, in the sense of absolute percent errors, did not change markedly during the sample period. What, if anything, do these results imply about the direction of USDA soybean yield forecasts in remaining Crop Production reports during 2014? Since there has been a strong tendency towards under-estimation in recent years, it would come as no surprise if the USDA's August forecast of 45.4 bushels per acre increases in upcoming reports.

Source: Farmdocdaily