Teaching forecasting and prediction

Ceyhun Ozgur*, Sanjeev Jha and Madison Wallner

*Correspondence:
Ceyhun Ozgur,
ceyhun.ozgur@valpo.edu

Received: 28 August 2022; Accepted: 03 September 2022; Published: 04 October 2022.

This manuscript describes how to teach forecasting and prediction to undergraduate students. There are two modes of thinking, and the discussion covers how forecasts are determined and how forecast errors are measured. To determine the best forecast, we minimize the sum of the forecast errors, a methodology that is discussed further in this manuscript. One example is single exponential smoothing, where the difficult part is determining the smoothing constant alpha. In management and administrative situations, the need for planning is great because the lead time for decision-making ranges from several years to a few days or even a few hours. The ability to predict many types of events seems as natural today as the accurate forecasting of weather conditions will be in a few decades.

Keywords: teaching forecasting, exponential smoothing, forecast errors, prediction

1. Introduction

The word “predict” comes from Latin and means to say or do in advance (1). The word “forecast” comes from a Germanic root in English and means to calculate some future event or condition (2). Together, forecasting is the ability to predict the future. Teaching forecasting and prediction draws on the latest and most cost-efficient financial tools, which organizations use to optimize their financial costs and to predict their financial situation so that they can flourish (3). There are two modes of thinking. The first mode is System one, which is automatic and fast and makes constant low-level predictions. The second mode is System two, which is effortful and performs complex computations. In this manuscript, we discuss how to calculate the forecast error and how to teach forecasting using simple exponential smoothing and a linear regression procedure.

2. Forecast error

Analyzing the accuracy of forecasts over a set period of a time series is of great importance, because a forecasting function may be evaluated with inappropriate techniques, which in turn changes the link to economic evaluation (4). To determine the best forecast, we have to minimize the sum of the forecast errors. We first come up with the forecasted demand ($F_i$) and compare it with the actual demand ($A_i$). The difference is called the “forecast error,” which is $e_i = A_i - F_i$. Once we have the forecast errors, we can determine their sum, which is given by $\sum_i (A_i - F_i)$. The firm or organization provides us with the actual demand, but it is up to us to establish the predicted demand. The method used by each business or organization to determine the anticipated demand varies. For instance, $F_i = A_{i-1}$ means that the predicted demand is identical to the actual demand from the prior period.
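The definitions above can be made concrete with a short R sketch. The demand values and the naive rule (forecast equals the prior period's actual demand) are illustrative assumptions. The absolute and squared summaries are common companions to the plain sum, added here because positive and negative errors can cancel in a simple sum; they are not measures named in this section.

```r
# Illustrative actual demand (assumed values, not from a real firm)
actual <- c(10, 12, 14, 15)

# Naive forecast: F_i = A_{i-1}; the first period has no prior value
forecast <- c(NA, head(actual, -1))

# Forecast error e_i = A_i - F_i
error <- actual - forecast

# Sum of forecast errors, skipping the undefined first period
sum_error <- sum(error, na.rm = TRUE)

# Companion summaries in which errors of opposite sign cannot cancel
sum_abs_error <- sum(abs(error), na.rm = TRUE)
sum_sq_error <- sum(error^2, na.rm = TRUE)

print(data.frame(actual, forecast, error))
cat("Sum of errors:         ", sum_error, "\n")
cat("Sum of absolute errors:", sum_abs_error, "\n")
cat("Sum of squared errors: ", sum_sq_error, "\n")
```

Comparing these totals across candidate forecasting rules is one simple way to let students see which rule fits a given demand history best.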

3. Single exponential smoothing

Single exponential smoothing belongs to a class of forecasting methods, built on exponential smoothing, that includes some of the most successful forecasting methods. Several methods can be considered exponential smoothing; what they share is that forecasts are weighted combinations of past observations. This type of forecasting is based on determining the smoothing constant alpha (5, 6). The equation is $F_{i+1} = \alpha A_i + (1-\alpha) F_i$, where $A_i$ is the actual demand and $F_i$ is the forecast for period $i$. The hardest part of using single exponential smoothing in forecasting and prediction is determining the smoothing constant alpha. The larger the value of α (the closer to 1), the less smoothed the forecast is, and the smaller the value of α (the closer to 0), the more smoothed it is. Since the aforementioned smoothing calculations are straightforward and only a minimal amount of historical information needs to be preserved from one prediction to the next, it is possible to employ higher-order polynomials as forecasting models (7).
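To see why the forecasts are weighted combinations of earlier observations, the smoothing recursion can be unrolled. The following is a standard derivation offered as a sketch, writing $A_i$ for the actual demand in period $i$, $F_i$ for the forecast for period $i$, and $F_1$ for the initial forecast.

```latex
\begin{align*}
F_{i+1} &= \alpha A_i + (1-\alpha) F_i \\
        &= \alpha A_i + \alpha (1-\alpha) A_{i-1} + (1-\alpha)^2 F_{i-1} \\
        &= \alpha \sum_{k=0}^{i-1} (1-\alpha)^k A_{i-k} + (1-\alpha)^i F_1
\end{align*}
```

The weight $\alpha (1-\alpha)^k$ declines geometrically with the age $k$ of the observation, which is why a value of α close to 1 follows recent demand closely and a value close to 0 smooths heavily.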

Example 3.1: Consider the following demand values for 2019 through 2022: 10, 12, 14, and 15, with smoothing constant α = 0.1. Let us determine the forecasted demand with these given values. Assume that the first forecast is the same as the actual demand, which is 10.

The forecast for 2019 is 10 and the actual demand for 2019 is also 10, so the forecast for 2020 is (0.1*10) + (0.9*10) = 10. In general:

Forecast i + 1 = (α*Actuali) + (1–α)*Forecasti

Forecast 20 = (α*Actual19) + (1–α)*Forecast19

F20 = (0.1*10) + (1–0.1)*10 = 10

A20 = 12

The Forecast for 2021 is:

Forecast i + 1 = (α*Actuali) + (1–α)*Forecasti

Forecast 21 = (α*Actual20) + (1–α)*Forecast20

F21 = (0.1*12) + (1−0.1)*10 = 10.2

The Forecast for 2022 is:

Forecast i + 1 = (α*Actuali) + (1–α)*Forecasti

Forecast 22 = (α*Actual21) + (1–α)*Forecast21

F22 = (0.1*14) + (1−0.1)*10.2 = 10.58

The Forecast for 2023 is:

Forecast i + 1 = (α*Actuali) + (1–α)*Forecasti

Forecast 23 = (α*Actual22) + (1–α)*Forecast22

F23 = (0.1*15) + (1−0.1)*10.58 = 11.02
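The hand calculation in Example 3.1 can be checked with a few lines of R. This is a minimal sketch: `ses_forecast` is a hypothetical helper name, and choosing alpha by a grid search on the sum of squared errors is an assumption made for illustration, since other error measures could be used instead.

```r
# Single exponential smoothing: F_{i+1} = alpha * A_i + (1 - alpha) * F_i
# `ses_forecast` is a hypothetical helper name used only for this sketch.
ses_forecast <- function(actual, alpha, initial = actual[1]) {
  n <- length(actual)
  f <- numeric(n + 1)  # forecasts for periods 1..n plus one step ahead
  f[1] <- initial
  for (i in seq_len(n)) {
    f[i + 1] <- alpha * actual[i] + (1 - alpha) * f[i]
  }
  f
}

demand <- c(10, 12, 14, 15)  # actual demand, 2019 to 2022
fc <- ses_forecast(demand, alpha = 0.1)
names(fc) <- 2019:2023
print(round(fc, 2))

# Choose alpha by a simple grid search on the sum of squared errors.
# This criterion is an assumption; other error measures are possible.
sse <- function(alpha) {
  f <- ses_forecast(demand, alpha)
  sum((demand - f[seq_along(demand)])^2)
}
grid <- seq(0, 1, by = 0.1)
best_alpha <- grid[which.min(sapply(grid, sse))]
cat("Alpha with the smallest SSE on the grid:", best_alpha, "\n")
```

For this short, steadily rising series, the search favors the largest alpha on the grid because heavy smoothing lags behind a trend. That is a property of this example, not a general recommendation.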

4. Regression analysis

We can also determine the forecast error by determining the forecast with a different procedure. The most common procedure that is used is the linear regression procedure.

Regression analysis is one of the most widely used techniques for analyzing multi-factor data. Its broad appeal and usefulness come from the conceptually logical process of using an equation to express the relationship between a variable of interest and a set of related predictor variables (8). In a linear regression procedure, we have to come up with the equation for $F_i$. This can be stated as $Y = a + bX$, where $a$ is the y-intercept, $b$ is the slope, $X$ is the value of the predictor variable (for example, the time period), and $Y$ is the forecasted value of demand, which can also be written as $F$.
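A small R sketch shows how the intercept and slope of Y = a + bX are obtained and used for forecasting. The demand values and the choice of the time period as the predictor X are illustrative assumptions; the closed-form least-squares formulas are compared against R's built-in `lm()`.

```r
# Illustrative demand by period (assumed values for this sketch)
period <- 1:4
demand <- c(10, 12, 14, 15)

# Least-squares estimates from the closed-form formulas
b <- sum((period - mean(period)) * (demand - mean(demand))) /
  sum((period - mean(period))^2)
a <- mean(demand) - b * mean(period)

# The same fit with R's built-in lm()
fit <- lm(demand ~ period)
print(coef(fit))

# Forecast for the next period (X = 5) and the in-sample forecast errors
f5 <- predict(fit, newdata = data.frame(period = 5))
errors <- demand - fitted(fit)
cat("Forecast for period 5:", f5, "\n")
cat("Sum of in-sample errors:", round(sum(errors), 10), "\n")
```

Because a least-squares line with an intercept makes the in-sample errors sum to zero, students can see why squared or absolute errors are usually examined alongside the plain sum.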

Along with this, there is also robust regression. It follows the same idea as linear regression but is more “robust” to departures from the model assumptions. Two properties that are important in robust regression analysis are protection against distortion by anomalous data and good efficiency when the data come from the ideal Gaussian model, as well as from a range of other similar models (9). Regardless of which regression we decide to use, the procedures are very similar to one another when it comes to forecasting and prediction.

Example 4.1: The waiting time (in minutes) for bank customers wanting to sign up for a loan was measured at four banks for all customers who arrived between 2:00 and 2:30 p.m. on a certain Wednesday. The results are shown in Table 1.

Example 4.2: A real estate appraiser is developing a regression model to predict the selling price in a certain neighborhood. She is planning to use square footage as a variable, but she also wishes to incorporate the type of house (Colonial = 0, Tudor = 1, or Contemporary = 1) and whether the house has a brick front (brick = 1, no brick = 0). She obtains the following data.
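One way to prepare categorical predictors such as house type for a regression is to expand them into 0/1 indicator columns. The sketch below uses hypothetical house records (the square footage values are invented for illustration and are not the appraiser's data) and assumes Colonial is treated as the baseline level, so that the three house types are represented by two separate indicators.

```r
# Hypothetical house records used only to show the coding
houses <- data.frame(
  sqft  = c(1800, 2200, 2600, 2000),
  type  = factor(c("Colonial", "Tudor", "Contemporary", "Colonial"),
                 levels = c("Colonial", "Tudor", "Contemporary")),
  brick = c(1, 0, 1, 0)
)

# model.matrix() expands the three-level factor into two 0/1 indicators,
# with Colonial as the baseline level
X <- model.matrix(~ sqft + type + brick, data = houses)
print(X)
```

With real selling prices added as a column, the same formula terms could be passed to `lm()` to estimate the regression.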

Example 4.3: We use the air passengers data to show an example of forecasting using R. The advantage of using R or Python is that the syntax can be reused with few changes for new data or new cases. Comment lines are prefixed with a # sign.

Table 1. Bank waiting time (minutes).

Table 2. Analysis of variance.

Table 3. Model summary.

Table 4. Means (Figure 1).

Table 5. Regression analysis coefficients.

Table 6. Regression equation.

Table 7. Coefficients.

Table 8. Model summary.

Table 9. Analysis of variance.

Table 10. Fits and diagnostics for unusual observations.

Figure 1. Interval plot of comparison of means.

# Linear regression analysis of the AirPassengers dataset
ap = AirPassengers

# Install the forecast package (needed only once)
install.packages("forecast")

# Load the forecast package
library(forecast)

# Structure of the dataset
str(ap)

# View the dataset
View(ap)

# Plot the raw time series data (Figure 2)
plot(ap, main = "Airpassengers Over Time", ylab = "Passengers")

Figure 2. Air passengers over time.

# Create the training dataset (through December 1959)
train = window(ap, end = c(1959, 12))

# Start and end dates of the training data
start(train)
end(train)

# Create the test dataset (January 1960 onward)
test = window(ap, start = c(1960, 1))
test
length(test)

# Model 1: Additive trend and seasonality (Figure 3)
# Plot the complete dataset
plot(ap, main = "Airpassengers Over Time", ylab = "Passengers")
model1 = tslm(train ~ trend + season)
summary(model1)
lines(model1$fitted.values, col = "red", lwd = 2)

# Forecast the next 12 months
pred1 = forecast(model1, h = 12)
pred1
lines(pred1$mean, col = "blue", lwd = 2)

# Accuracy on the training and test data
accuracy(pred1, test)

Figure 3. Additive trend and seasonality.

# Model 2: Multiplicative trend and seasonality (Figure 4)
# Plot the complete dataset
plot(ap, main = "Airpassengers Over Time", ylab = "Passengers")
# lambda = 0 applies a log transformation before fitting
model2 = tslm(train ~ trend + season, lambda = 0)
summary(model2)
lines(model2$fitted.values, col = "red", lwd = 2)

# Forecast the next 12 months
pred2 = forecast(model2, h = 12)
pred2
lines(pred2$mean, col = "blue", lwd = 2)

# Accuracy on the training and test data
accuracy(pred2, test)

Figure 4. Multiplicative trend and seasonality.
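For teaching purposes, it can help to show what the additive model does without the forecast package. The following base-R sketch assumes that the `trend` term is a linear time index and that `season` is a factor for the month of the year. It then computes the forecast errors on the held-out year by hand; the data frame and variable names are chosen for this sketch only.

```r
# Base-R sketch of the additive trend and seasonality model
ap <- AirPassengers
train <- window(ap, end = c(1959, 12))
test <- window(ap, start = c(1960, 1))

# Training data: linear time index plus a month-of-year factor
train_df <- data.frame(
  y = as.numeric(train),
  trend = seq_along(train),
  season = factor(as.integer(cycle(train)), levels = 1:12)
)
fit <- lm(y ~ trend + season, data = train_df)

# The time index simply continues into the test year
test_df <- data.frame(
  trend = length(train) + seq_along(test),
  season = factor(as.integer(cycle(test)), levels = 1:12)
)
pred <- predict(fit, newdata = test_df)

# Forecast errors on the held-out year: e = actual - forecast
e <- as.numeric(test) - pred
cat("RMSE:", sqrt(mean(e^2)), "\n")
cat("MAE: ", mean(abs(e)), "\n")
cat("MAPE:", mean(abs(e / as.numeric(test))) * 100, "\n")
```

Working through the error measures by hand ties the package output back to the forecast error e = A − F defined in Section 2.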

5. Discriminant analysis

Discriminant analysis measures the differences between the groups defined by a categorical variable in terms of a set of related variables. Essentially, discriminant analysis involves pattern recognition, which is key to forecasting and prediction (10). There are, however, two extremes in discriminant analysis. At one end of the scale, descriptive discriminant analysis aims to understand and explain the factors that determine group separation, on the basis of a set of variables measured on entities whose group membership is already known. At the other end, prescriptive discriminant analysis focuses on the classification of future entities whose group membership is unknown. Everyday discriminant analysis lies between these two extremes, usually involving no assignment or allocation when the groups are known to the researcher (11). When using discriminant analysis as a forecasting method, it is important to note error rates, posterior probabilities, and basic notation. Familiar techniques such as linear and quadratic discriminant rules are built on estimates of the mean and covariance parameters (12).
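The ideas of posterior probabilities and error rates can be illustrated with a linear discriminant rule in R. This sketch assumes the MASS package is available (it is usually installed alongside R) and uses the built-in iris data purely as a stand-in, since this section does not provide a dataset of its own.

```r
library(MASS)  # provides lda(); assumed to be installed

# iris is a built-in dataset used here only as a stand-in example
fit <- lda(Species ~ ., data = iris)
pred <- predict(fit)

# Posterior probabilities of group membership for the first few flowers
print(round(head(pred$posterior), 3))

# Apparent (resubstitution) error rate: share of misclassified observations
error_rate <- mean(pred$class != iris$Species)
cat("Apparent error rate:", error_rate, "\n")
```

The apparent error rate is computed on the same data used to fit the rule, so it tends to be optimistic; a held-out sample gives a fairer estimate when classifying entities whose group membership is unknown.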

6. Teaching forecasting and prediction over the years

The lead time for decision-making can be as short as a few hours or as long as days or weeks. This is why the need for planning is great in management and administrative situations. Older methods such as time series and ratio methods are classified as “passive” methods with no control (13). By contrast, regression and simulation are often “active” methods, although they can also be “passive” at times. It should be noted that over the last few centuries, prediction has made significant advances. Advances in research have expanded our awareness of diverse environmental factors, which has boosted our capacity to forecast many different events. Today, the capacity to foresee a variety of occurrences appears as natural as accurate weather forecasting will be in a few decades (14). Despite these improvements, two crucial points should be remembered. The first is that managers and others may not always immediately benefit from accurate forecasting, and the second is the distinction between external events that are out of one’s control and internal ones that one can influence. Both kinds of events are necessary for a business to succeed, but predicting is only relevant to the former while decision-making is only relevant to the latter. The connection that joins them is planning.

7. Efficient use of resources

Scheduling production, transportation, money, employees, and other factors is necessary for the efficient utilization of resources. A key component of such scheduling is forecasts of the level of demand for products, materials, labor, funding, or services. The lead time for purchasing new machinery and equipment, hiring new employees, and sourcing raw materials might range from a few days to many years. A review of scheduling by optimal control (15) had two objectives. The first was to identify significant contributions, application areas, constraints, and suggestions for future research and application in the area. The second was to explain control models in terms of production management.

Forecasting is crucial for examining future resource requirements. Long-term and short-term forecasting differ among organizations, and each organization must determine which resources it will need in the long term. Such decisions depend on market opportunities, environmental factors, and the internal development of financial, human, product, and technological resources. These determinations all require good forecasts and managers who can interpret the predictions and make appropriate decisions. Figure 5 shows the flow of resource use from planning, to action, to the future, and finally to its impact.

Figure 5. Use of resource flow chart.

Regardless of the problem being studied, it is important to provide an efficient algorithm or an algorithm with worst-case and performance analyses. Computational experiments demonstrate that the heuristics developed in this way are capable of generating near-optimal solutions (16). Together with new models and new solution strategies, this can be extremely helpful for decision-making and for future research in production scheduling and planning. It also supports decision makers by improving their resource planning and scheduling systems (17). If forecasting is to be successful, it is essential to recognize the strong dependency that exists between the predictions of different divisions or departments. Errors in sales predictions may, through a chain of events, affect operational costs, cash flows, inventory levels, and price forecasts. Similarly, budgeting mistakes in estimating the amount of funding available to each division will have an impact on product development, equipment modernization, staff employment, and advertising expenses. In turn, this will influence, if not dictate, how much money flows in and out of the business. Clearly, the many forecasting functions within an organization are highly interdependent.

8. Difference between artificial intelligence and machine language

Artificial intelligence tries to determine the forecasted demand using models inspired by the human brain, while machine language can also determine the forecast error by using a computer program that functions like a machine. Artificial intelligence tries to develop learning models using the idea of the human brain, in particular the ideas of neurons and connections, such as perceptrons. It is also an area of computer science that emphasizes the creation of intelligent machines that work and react like humans. With this, prediction is faster and cheaper, and it is a fundamental input to decision-making and a relatively new business strategy. Machine learning is the study of algorithms and statistical models that a computer system uses to carry out a specific intended task. Essentially, it can be seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model based on sample data in order to make predictions or decisions without being explicitly programmed to perform such a task (18). Machine learning tries to develop learning models primarily based on optimization methods such as gradient boosting. Machine languages, in contrast, are essentially computer programs that function at the bit/byte level, stressing machine components such as logic gates. The goal is to ensure that all computer programs are able to solve problems and achieve success (19). Machine language or artificial intelligence is utilized to make predictions.

9. Human error and society in forecasting and prediction

Humans naturally perform worse when forecasting and predicting involve heuristics and biases, large datasets, or complex interactions. There is nothing new about tragic accidents being caused by human error. For example, human error complicates forecasting because natural disasters often occur alongside human-made disasters. Today, the nature and scale of potentially hazardous technology mean that human error can have adverse effects around the globe, including when working with large datasets (20). However, humans perform better with small datasets, such as a child learning to recognize a face or people recalling the Challenger space shuttle, as these are easy things to remember. In society, forecasting and prediction can be used to anticipate job losses and the rise of inequality, in which a few large companies and a few countries hold an advantage. Human error is the primary cause of defects in forecasting and is therefore the key to understanding, predicting, and avoiding them. Little research has been done on predicting defects in forecasting on the basis of the cognitive errors that cause them (21).

10. Division of labor: humans vs. machines

Utilizing machines for prediction provides a better, faster, and cheaper prediction. Humans create their own predictions involving strategies such as anticipatory shipping for companies and organizations. However, there are precautions for both types of divisions of labor.

The division of labor between humans and computer systems has changed along both technical and human dimensions. There has been a shift away from technologies of automation, whose aim was to disallow human intervention at nearly all points in the system, toward technologies of “heteromation” that push critical tasks to human end users (22). For machines, it is important to note that datasets may be biased, small, or incomplete and that artificial intelligence is built for specific tasks. For humans, the value of human judgment is enhanced. Computers are only capable of solving tasks within practices in which rules are specified by humans or patterns are found in datasets. Humans have the capacity to change practices through creativity and innovation, as these processes are affective in nature. The paradox is that even though computers make rapid change possible, human intervention is still a necessity (23).

11. Conclusion

In teaching forecasting and analytics along with prediction, there are many similarities. It is critical that academics understand the content that is currently being taught in these courses and develop the most optimal teaching model. These findings illustrate a current gap between teaching analytics and teaching forecasting with a significant variance among the topics taught across programs.

References

1. Merriam-Webster Dictionary. Predict. The Merriam-Webster.Com Dictionary. (2022). Available online at: https://www.merriam-webster.com/dictionary/predict

2. Merriam-Webster Dictionary. Forecast. The Merriam-Webster.Com Dictionary. (2022). Available online at: https://www.merriam-webster.com/dictionary/forecast

3. Mishra S. Financial management and forecasting using business intelligence and big data analytic tools. Int J Financ Eng. (2018) 5:1850011.

4. Davydenko A, Fildes R. Forecast Error Measures: Critical Review and Practical Recommendations. Business Forecasting: Practical Problems and Solutions. (2016).

5. Hyndman R, Khandakar Y. Automatic time series forecasting: the forecast package for R. J Stat Softw. (2008) 26:1–22. doi: 10.18637/jss v027

6. Hyndman R, Koehler AB, Ord JK, Snyder RD. Forecasting With Exponential Smoothing: The State Space Approach. Berlin: Springer Science & Business Media (2008).

7. Brown RG, Meyer RF. The fundamental theorem of exponential smoothing. Operat Res. (1961) 9:673–85.

8. Montgomery DC, Peck EA, Vining GG. Introduction to Linear Regression Analysis. Hoboken, NJ: John Wiley & Sons (2021).

9. Li G. Robust regression. Explor Data Tables Trends Shapes. (1985) 281:U340.

10. McLachlan GJ. Discriminant Analysis and Statistical Pattern Recognition. Hoboken, NJ: John Wiley & Sons (2005).

11. Silva A, Stam A. Discriminant Analysis. (1995).

12. Rayens WS. Discriminant Analysis and Statistical Pattern Recognition. (1993).

13. Sinuany-Stern Z. Forecasting Methods in Higher Education: An Overview. Handbook of Operations Research and Management Science in Higher Education. (2021).

14. Suhardi S, Widyastuti T, Bisri B, Prabowo W. Forecasting analysis of new students acceptance using time series forecasting method. Jurnal Akrab Juara. (2019) 4:10–23.

15. Dolgui A, Ivanov D, Sethi SP, Sokolov B. Scheduling in production, supply chain and industry 4.0 systems by optimal control: fundamentals, state-of-the-art and applications. Int J Prod Res. (2019) 57:411–32.

16. Chen ZL, Vairaktarakis GL. Integrated scheduling of production and distribution operations. Manage Sci. (2005) 51:614–28.

17. Haase K. Lotsizing and Scheduling for Production Planning. (Vol. 408). Berlin: Springer Science & Business Media (2012).

18. Khanzode KCA, Sarode RD. Advantages and disadvantages of artificial intelligence and machine learning: a literature review. Int J Library Inform Sci. (2020) 9:3.

19. McCarthy J. What is Artificial Intelligence. (2004). Available online at: http://www-formal.stanford.edu/jmc/whatisai.html

20. Reason J. Human Error. Cambridge: Cambridge University Press (1990).

21. Strigini L, Huang F. HEDP: A Method for Early Forecasting Software Defects Based on Human Error Mechanisms. (2021).

22. Ekbia H, Nardi B. Heteromation and its (dis) Contents: The Invisible Division of Labor Between Humans and Machines. First Monday. (2014).

23. Hammershøj LG. The new division of labor between human and machine and its educational implications. Technol Soc. (2019) 59:101142.