


















These notes cover a short introduction to statistical modelling, a comparison between time series and cross-sectional data, classical time series decomposition, exploratory data analysis of time series data, and a sample implementation in R.
Typology: Lecture notes
Figure 1: The interconnection of process, realization, and model.
A process is the true mechanism or system that generates a data set. The data generated by the process are called a realization. This realization is then used as the basis for constructing a (mathematical) model, which represents or approximates the system that generates the data. The cycle does not end with the model: the model can be used as a basis for intervening in the process. The updated process yields a new realization, and the new realization leads to a new model.
Statistical modeling aims at providing more flexible tools for data analysis.
Objectives for Modeling a Structure of a System
Areas in Statistical Modeling
A time series is a sequence of observations that are arranged according to the time of their outcome. Simply, it pertains to a series of data points ordered in time.
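As a small illustration of "data points ordered in time", the sketch below attaches a monthly time index to a plain numeric vector using base R's `ts()`. The values and the start date are made up for illustration and are not taken from these notes.

```r
# Hypothetical monthly readings (made-up values, for illustration only)
y <- c(12.1, 12.4, 12.9, 13.3, 13.0, 12.6)

# Attach a time index: monthly data (frequency = 12) starting January 2020
y_ts <- ts(y, start = c(2020, 1), frequency = 12)

start(y_ts)      # first time point as (year, month)
frequency(y_ts)  # observations per year
time(y_ts)       # time index in decimal years
```

The same accessor functions work on built-in series such as `co2`, which is used throughout the rest of these notes.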
Common directions/reasons in doing time series analysis
(Discrete) Time Series
Some examples of Time Series:
Cross Section Data
Some examples:
Time Series Decomposition
A time series can be decomposed into four major components:
Figure 3: Trend of Monthly Mauna Loa Atmospheric CO2 Concentration from 1959 to 1997. As observed, there is an apparent increasing trend in atmospheric CO2 concentration over the years.
Seasonal Variation St
plot(decompose(co2)$season,ylab= "CO2 Concentration (ppm)")
Figure 4: Seasonal Component of Monthly Mauna Loa Atmospheric CO2 Concentration from 1959 to 1997. As observed, there is an apparent recurrent pattern that repeats every 12 months.
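One way to check the length of the seasonal cycle, rather than reading it off the plot, is to compare the estimated seasonal component with a shifted copy of itself. This is only a sketch using the built-in `co2` series and `decompose()` from base R; the helper names `s`, `p`, and `same_after_p` are chosen here for illustration.

```r
# co2 ships with base R (package 'datasets'); decompose() is in 'stats'
parts <- decompose(co2)     # additive decomposition by default
s <- parts$seasonal         # estimated seasonal component

p <- frequency(co2)         # observations per seasonal cycle
n <- length(s)

# TRUE if the seasonal estimate repeats exactly every p observations
same_after_p <- all.equal(as.numeric(s[1:(n - p)]),
                          as.numeric(s[(p + 1):n]))
p
same_after_p
```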
Cycle Ct
plot(lynx,ylab="Numbers of Lynx Trappings")
Figure 6: Irregular Component of Monthly Mauna Loa Atmospheric CO2 Concentration from 1959 to 1997.
Exercise 1: Components of monthly sales of new one-family houses sold in the USA since 1973 (Makridakis, Wheelwright and Hyndman, 1998).
library(fpp2)
autoplot(hsales) + xlab("Year") + ylab("Monthly housing sales (millions)")
Figure 7: Monthly sales of new one-family houses sold in the USA since 1973 (Makridakis, Wheelwright and Hyndman, 1998).
Given the time series Monthly housing sales (Figure 7),
Other components of Time Series
In a classical decomposition of time series, a time series can be expressed as
$$Y_t = f(T_t, C_t, S_t, I_t) \tag{1}$$
where
Figure 9: Monthly sales for a souvenir shop at a beach resort town in Queensland, Australia, for January 1987-December 1993 (Wheelwright and Hyndman, 1998).
Figure 8 and Figure 9 show two time series with trend and seasonality. In Figure 8, the seasonal component does not vary over time. This is not the case in Figure 9, where the seasonal component seems to increase over time. For Figure 8, time series decomposition using the additive model is more appropriate, while for Figure 9, decomposition using the multiplicative model is more appropriate.
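The difference between the two cases can be sketched with two synthetic monthly series: one where the seasonal swing has constant size (additive) and one where it grows with the level of the series (multiplicative). The series below are made up for illustration and are not the data behind Figures 8 and 9; only `decompose()` and its `type` argument come from base R.

```r
# Synthetic monthly series (made-up, for illustration): 10 years of data
t_idx   <- 1:120
trend   <- 100 + 2 * t_idx
s_index <- rep(c(0.8, 0.9, 1.0, 1.1, 1.2, 1.3,
                 1.2, 1.1, 1.0, 0.9, 0.8, 0.7), 10)

# Multiplicative structure: seasonal swing grows with the level of the series
y_mult <- ts(trend * s_index, start = c(2000, 1), frequency = 12)
# Additive structure: seasonal swing has constant size
y_add  <- ts(trend + 50 * (s_index - 1), start = c(2000, 1), frequency = 12)

fit_mult <- decompose(y_mult, type = "multiplicative")
fit_add  <- decompose(y_add,  type = "additive")

round(fit_mult$figure, 2)  # seasonal indices, centred around 1
round(fit_add$figure, 2)   # seasonal effects, centred around 0
```

Plotting `y_add` and `y_mult` side by side shows the visual cue described above: a band of constant width in the additive case, and a widening band in the multiplicative case.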
Trend-cycle Model
In time series decomposition, trend and cycle are usually combined into a single term (trend-cycle). The trend-cycle describes the long-term behavior of the time series. For simplicity, the trend-cycle is referred to simply as the trend component. Thus, the model will now be
Moving Average
A moving average of order n (also called n-MA) is given by
$$MA_t = \frac{1}{n} \sum_{i=-k}^{k} y_{t+i} \tag{4}$$
where n = 2k + 1.
Example: The moving average of order 3.

t         y       3-MA (Centered)                           3-MA (Trailing)
Jan-1959  315.42  -                                         -
Feb-1959  316.31  (315.42 + 316.31 + 326.50)/3 = 319.41     -
Mar-1959  326.50  (316.31 + 326.50 + 317.56)/3 = 320.1233   (315.42 + 316.31 + 326.50)/3 = 319.41
Apr-1959  317.56  (326.50 + 317.56 + 318.13)/3 = 320.73     (316.31 + 326.50 + 317.56)/3 = 320.1233
May-1959  318.13  (317.56 + 318.13 + 318.00)/3 = 317.8967   (326.50 + 317.56 + 318.13)/3 = 320.73
Jun-1959  318.00  (318.13 + 318.00 + 316.39)/3 = 317.5067   (317.56 + 318.13 + 318.00)/3 = 317.8967
Jul-1959  316.39  -                                         (318.13 + 318.00 + 316.39)/3 = 317.5067
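The worked example above can be reproduced with `stats::filter()` from base R. This is a sketch rather than the notes' own code: the vector `y` repeats the seven values from the example, and the names `w`, `ma3_centered`, and `ma3_trailing` are chosen here for illustration.

```r
# Values from the worked example (January to July 1959)
y <- c(315.42, 316.31, 326.50, 317.56, 318.13, 318.00, 316.39)

# Equal weights 1/3; the stats:: prefix avoids masking by packages
# such as dplyr, which also export a function called filter()
w <- rep(1 / 3, 3)
ma3_centered <- stats::filter(y, filter = w, sides = 2)  # y[t-1], y[t], y[t+1]
ma3_trailing <- stats::filter(y, filter = w, sides = 1)  # y[t-2], y[t-1], y[t]

round(cbind(y, ma3_centered, ma3_trailing), 4)
```

Positions where the window runs past either end of the data are returned as `NA`, which corresponds to the empty cells in the example.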
Exercise 2 : Compute the 6-MA (centered) of the following
date = seq.Date(from = as.Date('1959-01-01'), to = as.Date('1960-12-01'), by = 'months')
data = data.frame("Date" = date, "CO2.ppm." = co2[1:24])
knitr::kable(data, caption = "Monthly Mauna Loa Atmospheric CO2 Concentration from 1959 to 1960", digits = 2)
Table 2: Monthly Mauna Loa Atmospheric CO2 Concentration from 1959 to 1960
Date        CO2.ppm.
1959-01-01  315.
1959-02-01  316.
1959-03-01  316.
1959-04-01  317.
1959-05-01  318.
1959-06-01  318.
1959-07-01  316.
1959-08-01  314.
1959-09-01  313.
1959-10-01  313.
1959-11-01  314.
1959-12-01  315.
1960-01-01  316.
1960-02-01  316.
1960-03-01  317.
1960-04-01  318.
1960-05-01  319.
1960-06-01  319.
1960-07-01  318.
1960-08-01  315.
1960-09-01  314.
plot(tc, main="Trend-Cycle (2 x 12-MA)", ylab="CO2(ppm)")
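The object `tc` plotted above is not defined in this excerpt. A minimal sketch, assuming `tc` is the centered 2 x 12-MA of the built-in `co2` series: a 12-MA followed by a 2-MA is equivalent to a single weighted average over 13 months, with half weight on the two end points. The weight vector `w` is a name introduced here.

```r
# Weights of a 2 x 12-MA: half weight on the two end points, full weight on
# the 11 months in between, all divided by 12 (13 weights that sum to 1)
w <- c(0.5, rep(1, 11), 0.5) / 12

# Assumption: 'tc' is the centered 2 x 12-MA of co2
tc <- stats::filter(co2, filter = w, sides = 2)

head(tc, 12)  # the first 6 values are NA: the window needs 6 months each side
```

Under this assumption `tc` matches the `trend` element returned by `decompose(co2)`, which applies the same weights for a series with an even frequency.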
Step 2: Compute the detrended series dt.
Example:
plot(co2,main="Monthly Mauna Loa Atmospheric CO2 Concentration",ylab="CO2(ppm)")
# using the additive model
dt = co2 - tc
plot(dt, main="detrended", ylab="CO2(ppm)")
caption = "Seasonal Factor for each Seasonal Indices (Months)", digits = 4)
Table 3: Seasonal Factor for each Seasonal Indices (Months)
Month      Seasonal.Factor
January    -0.
February   0.
March      1.
April      2.
May        3.
June       2.
July       0.
August     -1.
September  -3.
October    -3.
November   -2.
December   -0.
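The objects `sf`, `ff`, `L`, and `periods` used in the code that follows are not defined in this excerpt. The sketch below shows one plausible way to obtain them, assuming the usual classical-decomposition recipe: average the detrended values month by month, then centre the averages so that they sum to zero. The meanings given to `ff`, `L`, and `periods` are assumptions inferred from how they are used later.

```r
# Recreate the detrended series (additive model) so the sketch is self-contained
w  <- c(0.5, rep(1, 11), 0.5) / 12
tc <- stats::filter(co2, filter = w, sides = 2)
dt <- co2 - tc

ff      <- frequency(dt)  # assumed meaning: observations per year (12)
L       <- length(dt)     # assumed meaning: length of the series
periods <- L %/% ff       # assumed meaning: number of complete years

# Average the detrended values month by month, ignoring the NA ends
month_means <- tapply(dt, cycle(dt), mean, na.rm = TRUE)

# Centre the averages so the seasonal factors sum to zero
sf <- as.numeric(month_means - mean(month_means))
round(sf, 4)
```

Centring matters because any constant left in the seasonal factors would otherwise be counted twice, once in the trend-cycle and once in the seasonal component.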
## plot the monthly seasonal effects
plot.ts(sf, ylab = "Seasonal effect", xlab = "Month", cex = 1, main = "Seasonal Factor")
## seasonal component estimate
season_comp <- ts(rep(sf, periods + 1)[seq(L)], start = start(dt), frequency = ff)
plot(season_comp, main = "Seasonal Component Estimates", ylab = "CO2(ppm)")
Step 4: Compute the Irregular Component It
Example:
Irregular = dt - season_comp
plot.ts(Irregular, main = "Irregular Component", ylab = "CO2(ppm)")
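As a sanity check on the additive decomposition, the three estimated parts should add back to the original series wherever the trend-cycle is defined. The sketch below uses base R's `decompose()` instead of the step-by-step objects above, so that it runs on its own; `decompose()` calls the irregular component `random`. The names `rebuilt` and `ok` are introduced here.

```r
fit <- decompose(co2, type = "additive")

# Under the additive model: data = trend-cycle + seasonal + irregular
rebuilt <- fit$trend + fit$seasonal + fit$random

# The moving average leaves NA at both ends, so compare only defined points
ok <- !is.na(rebuilt)
max(abs(rebuilt[ok] - co2[ok]))  # should be zero up to rounding error
sum(!ok)                         # number of end points lost to the 2 x 12-MA
```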
Figure 10: Monthly Mauna Loa Atmospheric CO2 Concentration from 1959 to 1997 (Box, Jenkins and Reinsel, 1976).
Let’s say our t is February 1959, we
zt = 1959 +
since February is the 2nd month and there are 12 months in a year.
class(co2)
z_t <- time(co2)
head(z_t)
co2_trend_linear <- lm(co2~z_t)
summary(co2_trend_linear)
plot(co2, ylab = "CO2 Concentration (ppm)")
abline(co2_trend_linear, col = "red")
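Beyond drawing the fitted line, the linear model can be used to read off the estimated trend and to detrend the series. The sketch below repeats the fit so that it runs on its own; `trend_hat` and `detrended` are names introduced here. Note that base R's `time()` places January at the start of the year, so under that convention the index for month m of a year is the year plus (m - 1)/12.

```r
z_t <- time(co2)                    # time index in decimal years
co2_trend_linear <- lm(co2 ~ z_t)

coef(co2_trend_linear)              # intercept and slope (ppm per year)

# Fitted linear trend as a ts object aligned with co2
trend_hat <- ts(fitted(co2_trend_linear), start = start(co2),
                frequency = frequency(co2))

# Detrended series under the additive model
detrended <- co2 - trend_hat
plot(detrended, ylab = "CO2 minus linear trend (ppm)")
```

A different convention for the time index would shift every value of `z_t` by the same constant, which changes the intercept but not the estimated slope.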
Exercise 3: Using the AirPassengers dataset, apply the appropriate time series decomposition.
Descriptive Analysis of Time Series Data
Things to look for: