
Introduction about Time Series Analysis, Lecture notes of Statistics

These notes cover a short introduction to statistical modelling, a comparison between time series and cross-sectional data, classical time series decomposition, exploratory data analysis of time series data, and a sample implementation in R.

Typology: Lecture notes

2023/2024

Available from 06/18/2024

johniel-babiera


TSA: Introduction

Introduction

The Concepts of Process, Realization and Model

Figure 1: The interconnection of process, realization, and model.

A process is the true mechanism or system that generates a data set. The data generated by the process are called a realization. The realization is then used as the basis for constructing a (mathematical) model, which represents or approximates the system that generated the data. The cycle does not end with the model: the model can be used as a basis for intervention in the process, the updated process produces a new realization, and the new realization leads to a new model.

Statistical Modeling

Statistical modeling aims at providing more flexible tools for data analysis.

Objectives for Modeling a Structure of a System

  • Understanding and describing the generating mechanism: One goal of statistical modeling is to approximate the reality or the system in which the data are observed or generated. The derived mathematical equation (called a statistical model) is used to describe the processes involved in the system.
  • Forecasting of future values: A statistical model that approximates reality can be used to predict events (that is, to forecast future values).
  • Optimal control of a system: By understanding the system and (optionally) predicting future events, one can intervene in the system or plan an optimal control of it, either to prevent changes or to bring them about.

Areas in Statistical Modeling

  • Time Series Modeling – describes how a particular value is influenced by its past values.
  • Spatial Modeling – describes how a particular value is influenced by its “neighboring” values.
  • Space-Time Modeling – describes how a particular value is influenced by its “neighboring” values and its past values.
  • Regression Modeling – describes how a particular value is influenced by independent variables or covariates.

Time Series

A time series is a sequence of observations that are arranged according to the time of their outcome. Simply, it pertains to a series of data points ordered in time.

Common directions/reasons in doing time series analysis

  • to study the dynamic structure of a process;
  • to investigate the dynamic relationship between variables;
  • to perform seasonal adjustment of economic data; and
  • to improve regression analysis when the errors are serially correlated.

Time Series Data and Cross Section Data

(Discrete) Time Series

  • a sequence of values of some variable taken at successive, equally spaced time periods such as a day, a week, a month, a quarter, or a year.
  • the data are assumed to be measured at equally spaced, discrete time intervals.
  • a set of observations generated sequentially in time; hence, the observations are dependent on (correlated with) each other.
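The properties listed above can be illustrated with a minimal R sketch. The monthly values in `y` and the name `sales` are hypothetical and made up purely for illustration; `co2` is the built-in Mauna Loa data set that ships with R.

```r
# Hypothetical monthly values, made up purely for illustration
y <- c(12, 15, 14, 18, 21, 19, 23, 26, 24, 28, 31, 29)

# A monthly series starting in January 2020 (12 observations per year)
sales <- ts(y, start = c(2020, 1), frequency = 12)
print(sales)
frequency(sales)   # number of observations per unit of time (here, per year)
time(sales)        # the equally spaced time index

# Lag-1 autocorrelation of the built-in co2 series: a value far from 0
# indicates that neighbouring observations are correlated
r1 <- acf(co2, lag.max = 1, plot = FALSE)$acf[2]
r1
```

A lag-1 autocorrelation close to 1 is what one would expect for a strongly trending series; cross-section data carry no such time ordering.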

Some examples of Time Series:

  • Daily exchange rate of the Philippine peso to the U.S. dollar from January 1, 2000 to December 31, 2019
  • Monthly deaths due to respiratory diseases in the Philippines from 1990-
  • Monthly sales of rice from 1990-
  • Daily number of COVID-19 infected individuals from February 2020 to November 1, 2020

Cross Section Data

  • a set of values of some variable taken at a specific point or period in time across different entities.
  • observations in cross section data are usually uncorrelated.

Some examples:

  • Today's exchange rates of the national currencies of Asian countries to the U.S. dollar.
  • Grade point averages of the freshmen MSU-IIT students.
  • Sentiment of 500 customers about the service of the company.

Time Series Decomposition

Time Series components

A time series can be decomposed into four major components:

Figure 3: Trend of Monthly Mauna Loa Atmospheric CO2 Concentration from 1959 to 1997 (x-axis: Time; y-axis: CO2 Concentration (ppm)). As observed, there is an apparent increasing trend in atmospheric CO2 concentration over the years.

Seasonal Variation St

  • regular periodic fluctuations that occur within a year.
  • results from events that are periodic and recurrent (e.g. monthly changes recurring each year).
  • annual data have no seasonality.

# decompose() stores the seasonal component in the element named "seasonal"
plot(decompose(co2)$seasonal, ylab = "CO2 Concentration (ppm)")

Figure 4: Seasonal Component of Monthly Mauna Loa Atmospheric CO2 Concentration from 1959 to 1997 (x-axis: Time; y-axis: CO2 Concentration (ppm)). As observed, there is an apparent pattern that recurs every 12 months, rising for roughly half of the year and falling for the other half.

Cycle Ct

  • an undulating, wave-like change around the trend.
  • refers to up-and-down fluctuations that are observable over an extended period of time.
  • covers a longer period than the seasonal variation (clear only for series of 20 years or more).
  • possible cause: changes in economic conditions.
  • difficult to forecast because cycles are recurrent but not periodic.
  • cycles are often irregular in both the height of the peaks and the duration.

plot(lynx,ylab="Numbers of Lynx Trappings")

Figure 6: Irregular Component of Monthly Mauna Loa Atmospheric CO2 Concentration from 1959 to 1997 (x-axis: Time; y-axis: CO2 Concentration (ppm)).

Exercise 1: Components of the monthly sales of new one-family houses sold in the USA since 1973 (Makridakis, Wheelwright and Hyndman, 1998).

# fpp2 provides the hsales data set and loads ggplot2/forecast for autoplot()
library(fpp2)
autoplot(hsales) +
  xlab("Year") +
  ylab("Monthly housing sales (millions)")

Figure 7: Monthly sales of new one-family houses sold in the USA since 1973 (Makridakis, Wheelwright and Hyndman, 1998). (x-axis: Year, 1975 to 1995; y-axis: Monthly housing sales (millions), roughly 40 to 80.)

Given the time series Monthly housing sales (Figure 7),

  • Is there a trend?
  • Is there a seasonal variation?
  • Is there a cyclic pattern?

Other components of Time Series

  • Easter Effect/Moving Holidays
  • Trading Day Variation
  • Extreme Values/Outliers

Time Series Classical Decomposition

In a classical decomposition, a time series can be expressed as

$Y_t = f(T_t, C_t, S_t, I_t)$ (1)

where

  • $Y_t$ is the actual value at time $t$;
  • $f$ is a mathematical function;
  • $T_t$ is the trend component;
  • $C_t$ is the cyclic component;
  • $S_t$ is the seasonal component; and
  • $I_t$ is the irregular component.

Figure 9: Monthly sales for a souvenir shop at a beach resort town in Queensland, Australia, for January 1987-December 1993 (Wheelwright and Hyndman, 1998). (x-axis: Time; y-axis: souvenirtimeseries.)

Figure 8 and Figure 9 show two time series with trend and seasonality. In Figure 8, the size of the seasonal component does not vary over time. This is not the case in Figure 9, where the seasonal component appears to increase over time. For a series like the one in Figure 8, decomposition using the additive model is more appropriate, while for a series like the one in Figure 9, decomposition using the multiplicative model is more appropriate.
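One informal way to check whether the seasonal swing is roughly constant or grows with the level of the series is to compare the within-year range across years. The sketch below uses two synthetic series invented for illustration (they are not the series in Figure 8 and Figure 9), and the helper `yearly_range()` is a hypothetical name that assumes the series covers complete years.

```r
# Synthetic monthly series (10 complete years), made up for illustration only
tt <- 1:120
trend <- 100 + tt
s <- sin(2 * pi * tt / 12)

# Constant seasonal swing (additive structure)
y_add <- ts(trend + 10 * s, start = c(2000, 1), frequency = 12)
# Seasonal swing proportional to the level (multiplicative structure)
y_mul <- ts(trend * (1 + 0.1 * s), start = c(2000, 1), frequency = 12)

# Within-year range (max - min), one value per year;
# assumes the series contains only complete years
yearly_range <- function(x) {
  f <- frequency(x)
  year <- rep(seq_len(length(x) / f), each = f)
  as.numeric(tapply(as.numeric(x), year, function(v) diff(range(v))))
}

r_add <- yearly_range(y_add)
r_mul <- yearly_range(y_mul)
round(r_add, 2)   # stays the same from year to year
round(r_mul, 2)   # grows from year to year
```

A roughly constant range points toward the additive model, while a range that grows with the level points toward the multiplicative model. This is only a rough diagnostic and does not replace inspecting the plot.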

Trend-cycle Model. In time series decomposition, the trend and the cycle are usually combined into a single term (the trend-cycle). The trend-cycle describes the long-term behavior of the time series. For simplicity, the trend-cycle is often referred to simply as the trend component. Thus, the models become

  • additive model: $Y_t = (T_tC_t) + S_t + I_t$
  • multiplicative model: $Y_t = (T_tC_t) \times S_t \times I_t$

where $(T_tC_t)$ denotes the single combined trend-cycle term. For this class, when we discuss the decomposition model of a time series, we refer to the trend-cycle models above. Hence, when we estimate and interpret the trend, we may be describing the trend and the cycle at the same time if there is an observable cycle component.
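As a quick check of the two model forms, the following sketch uses R's built-in `decompose()` on the `co2` series and verifies that the estimated components recombine into the original data. Here `decompose()` returns a single `trend` element, which plays the role of the trend-cycle term; the object names `dec_add`, `dec_mul`, `rebuilt_add` and `rebuilt_mul` are illustrative only.

```r
# Classical decomposition of the built-in co2 series under both model forms
dec_add <- decompose(co2, type = "additive")
dec_mul <- decompose(co2, type = "multiplicative")

# Recombine the estimated components
rebuilt_add <- dec_add$trend + dec_add$seasonal + dec_add$random
rebuilt_mul <- dec_mul$trend * dec_mul$seasonal * dec_mul$random

# The trend is NA at both ends of the series (moving-average edge effect),
# so compare only where it is available
ok <- !is.na(as.numeric(dec_add$trend))
err_add <- max(abs(as.numeric(rebuilt_add)[ok] - as.numeric(co2)[ok]))
err_mul <- max(abs(as.numeric(rebuilt_mul)[ok] - as.numeric(co2)[ok]))
c(additive = err_add, multiplicative = err_mul)
```

Both errors should be zero up to floating-point rounding, since the irregular component is defined as whatever remains after removing the trend and seasonal estimates.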

Moving Average. A moving average of order $n$ (also called $n$-MA) is given by

$MA_t = \frac{1}{n} \sum_{i=-k}^{k} y_{t+i}$ (4)

where $n = 2k + 1$.

Example: The moving average of order 3.

t          y        3-MA (Centered)                           3-MA (Trailing)
Jan-1959   315.42   -                                         -
Feb-1959   316.31   (315.42 + 316.31 + 326.50)/3 = 319.41     -
Mar-1959   326.50   (316.31 + 326.50 + 317.56)/3 = 320.1233   (315.42 + 316.31 + 326.50)/3 = 319.41
Apr-1959   317.56   (326.50 + 317.56 + 318.13)/3 = 320.73     (316.31 + 326.50 + 317.56)/3 = 320.1233
May-1959   318.13   (317.56 + 318.13 + 318.00)/3 = 317.8967   (326.50 + 317.56 + 318.13)/3 = 320.73
Jun-1959   318.00   (318.13 + 318.00 + 316.39)/3 = 317.5067   (317.56 + 318.13 + 318.00)/3 = 317.8967
Jul-1959   316.39   -                                         (318.13 + 318.00 + 316.39)/3 = 317.5067
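The centered and trailing 3-MA values in the example can be reproduced with a short sketch based on `stats::filter()`. The vector `y` holds the seven values used in the worked example (the January value 315.42 is taken from the arithmetic shown there); the names `w`, `ma_centered` and `ma_trailing` are illustrative.

```r
# Values from the worked 3-MA example (Jan-1959 to Jul-1959)
y <- c(315.42, 316.31, 326.50, 317.56, 318.13, 318.00, 316.39)

n <- 3                 # order of the moving average (odd, n = 2k + 1)
w <- rep(1 / n, n)     # equal weights 1/n

# Centered MA: averages y[t-1], y[t], y[t+1]; NA at both ends
ma_centered <- stats::filter(y, w, sides = 2)

# Trailing MA: averages y[t-2], y[t-1], y[t]; NA for the first n - 1 values
ma_trailing <- stats::filter(y, w, sides = 1)

round(cbind(y, ma_centered, ma_trailing), 4)
```

`stats::filter()` is called with its namespace because packages such as dplyr also export a function named `filter`.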

Exercise 2 : Compute the 6-MA (centered) of the following

# First 24 monthly observations (1959-1960) of the built-in co2 series
date <- seq.Date(from = as.Date("1959-01-01"), to = as.Date("1960-12-01"), by = "months")
data <- data.frame("Date" = date, "CO2.ppm." = co2[1:24])

knitr::kable(data, caption = "Monthly Mauna Loa Atmospheric CO2 Concentration from 1959 to 1960", digits = 2)

Table 2: Monthly Mauna Loa Atmospheric CO2 Concentration from 1959 to 1960

Date         CO2.ppm.
1959-01-01   315.
1959-02-01   316.
1959-03-01   316.
1959-04-01   317.
1959-05-01   318.
1959-06-01   318.
1959-07-01   316.
1959-08-01   314.
1959-09-01   313.
1959-10-01   313.
1959-11-01   314.
1959-12-01   315.
1960-01-01   316.
1960-02-01   316.
1960-03-01   317.
1960-04-01   318.
1960-05-01   319.
1960-06-01   319.
1960-07-01   318.
1960-08-01   315.
1960-09-01   314.

plot(tc, main = "Trend-Cycle (2 x 12-MA)", ylab = "CO2(ppm)")

[Figure: Trend-Cycle (2 x 12-MA); x-axis: Time; y-axis: CO2(ppm)]
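The object `tc` plotted here is the estimated trend-cycle. A minimal sketch of one way a 2 x 12-MA could be computed is given below, assuming the usual centered weights (1/24 for the two end months and 1/12 for the eleven months in between); the weight vector `w` is an illustrative name, and this is not necessarily how `tc` was produced in the notes.

```r
# Weights of a 2 x 12-MA: half weight on the two end points, so that the
# 13-term window is centered on month t and the weights sum to 1
w <- c(0.5, rep(1, 11), 0.5) / 12

# Centered moving average of the built-in co2 series;
# the first and last 6 values are NA because the window is incomplete there
tc <- stats::filter(co2, w, sides = 2)

head(tc, 12)
```

Under these assumptions the result coincides with the `trend` element returned by `decompose(co2)`.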

Step 2: Compute the detrended series $d_t$.

  • For Additive Model: $d_t = y_t - \hat{T}_t$
  • For Multiplicative Model: $d_t = \frac{y_t}{\hat{T}_t}$

Example:

plot(co2,main="Monthly Mauna Loa Atmospheric CO2 Concentration",ylab="CO2(ppm)")

[Figure: Monthly Mauna Loa Atmospheric CO2 Concentration; x-axis: Time; y-axis: CO2(ppm)]

# using the additive model
dt <- co2 - tc
plot(dt, main = "detrended", ylab = "CO2(ppm)")

caption = "Seasonal Factor for each Seasonal Indices (Months)", digits = 4)

Table 3: Seasonal Factor for each Seasonal Indices (Months)

Month       Seasonal.Factor
January     -0.
February    0.
March       1.
April       2.
May         3.
June        2.
July        0.
August      -1.
September   -3.
October     -3.
November    -2.
December    -0.
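A minimal sketch of how seasonal factors like those in Table 3 could be obtained under the additive model is given below. It averages the detrended values month by month and then centers the averages so that they sum to zero. The names `tc`, `dt`, `sf`, `ff`, `L` and `periods` mirror those used in the notes' code, but how they are computed here is an assumption (a 2 x 12-MA trend-cycle), not necessarily the notes' own procedure.

```r
ff <- frequency(co2)                      # 12 observations per year

# Trend-cycle via a centered 2 x 12-MA (assumed), then detrend additively
tc <- stats::filter(co2, c(0.5, rep(1, ff - 1), 0.5) / ff, sides = 2)
dt <- co2 - tc

L <- length(dt)                           # number of observations
periods <- L %/% ff                       # number of complete years

# Average the detrended values by calendar month, ignoring the NA ends
sf <- tapply(as.numeric(dt), as.integer(cycle(dt)), mean, na.rm = TRUE)

# Center the factors so that they sum to zero over a year
sf <- as.numeric(sf - mean(sf))

data.frame(Month = month.name, Seasonal.Factor = round(sf, 4))
```

Centering keeps the overall level of the series in the trend-cycle rather than in the seasonal component.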

## plot the monthly seasonal effects
plot.ts(sf, ylab = "Seasonal effect", xlab = "Month", cex = 1, main = "Seasonal Factor")

[Figure: Seasonal Factor; x-axis: Month; y-axis: Seasonal effect]

## seasonal component estimate
## repeat the seasonal factors over all years and trim to the series length
season_comp <- ts(rep(sf, periods + 1)[seq(L)], start = start(dt), frequency = ff)
plot(season_comp, main = "Seasonal Component Estimates", ylab = "CO2(ppm)")

[Figure: Seasonal Component Estimates; x-axis: Time; y-axis: CO2(ppm)]

Step 4: Compute the Irregular Component $I_t$

  • For Additive Model: $I_t = d_t - \hat{S}_t$
  • For Multiplicative Model: $I_t = \frac{d_t}{\hat{S}_t}$

Example:

# additive model: remove the seasonal estimate from the detrended series
Irregular <- dt - season_comp
plot.ts(Irregular, main = "Irregular Component", ylab = "CO2(ppm)")

Figure 10: Monthly Mauna Loa Atmospheric CO2 Concentration from 1959 to 1997 (Box, Jenkins and Reinsel, 1976). (x-axis: Time; y-axis: CO2 Concentration (ppm).)

Suppose $t$ is February 1959. Then we have

$z_t = 1959 + \frac{2 - 1}{12} \approx 1959.083$

since February is the 2nd month and there are 12 months in a year (January 1959 corresponds to 1959.000).

class(co2)

[1] "ts"

# time index of each observation, in decimal years
z_t <- time(co2)
head(z_t)

Jan Feb Mar Apr May Jun

1959 1959.000 1959.083 1959.167 1959.250 1959.333 1959.

co2_trend_linear <- lm(co2~z_t)

summary(co2_trend_linear)

Call:

lm(formula = co2 ~ z_t)

Residuals:

Min 1Q Median 3Q Max

-6.0399 -1.9476 -0.0017 1.9113 6.

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) -2.250e+03 2.127e+01 -105.8 <2e-16 ***

z_t 1.308e+00 1.075e-02 121.6 <2e-16 ***

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.618 on 466 degrees of freedom

Multiple R-squared: 0.9695, Adjusted R-squared: 0.

F-statistic: 1.479e+04 on 1 and 466 DF, p-value: < 2.2e-

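plot(co2, ylab = "CO2 Concentration (ppm)")
# overlay the fitted linear trend
abline(co2_trend_linear, col = "red")

[Figure: co2 series with the fitted linear trend line in red; x-axis: Time; y-axis: CO2 Concentration (ppm)]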
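Once a linear trend has been fitted, it can be evaluated at time points beyond the observed data. The sketch below refits the same regression from a plain data frame (an assumption made so that `predict()` can be given new time values directly) and extrapolates the trend line over the 12 months of 1998. The names `df`, `fit`, `new_z` and `trend_1998` are illustrative, and the extrapolation describes the trend only, ignoring seasonality.

```r
# Same regression as co2 ~ z_t, but with the variables stored in a data frame
df <- data.frame(y = as.numeric(co2), z_t = as.numeric(time(co2)))
fit <- lm(y ~ z_t, data = df)
coef(fit)

# Decimal-year time index for January to December 1998
new_z <- data.frame(z_t = 1998 + (0:11) / 12)

# Value of the fitted trend line at the new time points
trend_1998 <- predict(fit, newdata = new_z)
round(trend_1998, 2)
```

The slope is the estimated increase in CO2 concentration (ppm) per year, so consecutive monthly trend values differ by one twelfth of the slope.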

Exercise 3: Using the AirPassengers dataset, apply the appropriate time series decomposition.

Descriptive Analysis of Time Series Data

Things to look for:

  • Trend - the long-term movement of the time series. Historical plots and trend-cycle plots are very useful for observing the trend.