Commit 4dcef23

Merge pull request Auquan#9 from Auquan/model-selecton
Time Series Analysis Series
2 parents 71dac22 + e36fa2d commit 4dcef23

4 files changed

Lines changed: 2117 additions & 5 deletions

Time Series Analysis - 1.ipynb

Lines changed: 5 additions & 5 deletions
@@ -86,13 +86,13 @@
8686
"\n",
8787
"Stationarity is an extremely important aspect of time series - much of the analysis carried out on financial time series data involves identifying if the series we want to predict is stationary, and if it is not, finding ways to transform it such that it is stationary. \n",
8888
"\n",
89-
"*Mean of a time series $x_t$ is $E(x_t)=\mu_(t)$*\n",
89+
"*Mean of a time series $x_t$ is $E(x_t)=\\mu(t)$*\n",
9090
"\n",
91-
"*Variance of a time series $x_t$ is $\sigma^2(t)=E[(x_t−\mu(t))^2]$*\n",
91+
"*Variance of a time series $x_t$ is $\\sigma^2(t)=E[(x_t−\\mu(t))^2]$*\n",
9292
"\n",
93-
"**A time series is stationary in the mean if $\mu(t)=μ$, i.e.mean is constant with time**\n",
93+
"**A time series is stationary in the mean if $\\mu(t)=\\mu$, i.e. mean is constant with time**\n",
9494
"\n",
95-
"**A time series is stationary in the variance if $σ^2(t)=σ^2$, i.e. variance is constant with time**\n",
95+
"**A time series is stationary in the variance if $\\sigma^2(t)=\\sigma^2$, i.e. variance is constant with time**\n",
9696
"\n",
9797
"This image from SEANABU.COM should help \n",
9898
"\n",
@@ -119,7 +119,7 @@
119119
"\n",
120120
"The random component is called the residual or error - the difference between our predicted value(s) and the observed value(s). Serial correlation is when the residuals (errors) of our TS models are correlated with each other. It tells us how sequential observations in a time series affect each other. If we can find structure in these observations then it will likely help us improve our forecasts and simulation accuracy. This will lead to greater profitability in our trading strategies or better risk management approaches.\n",
121121
"\n",
122-
"Formally, for a covariance-stationary time series (as #3 above, where covariance between sequential observations is not a function of time), autocorrelation $ρ_k$ for lag $k$ (the number of time steps separating two sequantial observations), $$ρ_k = E[(x_t−μ)(x_t+k−μ)]/σ^2$$\n",
122+
"Formally, for a covariance-stationary time series (as #3 above, where covariance between sequential observations is not a function of time), the autocorrelation $\\rho_k$ for lag $k$ (the number of time steps separating two sequential observations) is $$\\rho_k = E[(x_t−\\mu)(x_{t+k}−\\mu)]/\\sigma^2$$\n",
123123
"\n",
124124
"### Why Do We Care about Serial Correlation? \n",
125125
"\n",

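Note on the autocorrelation formula in the hunk above: a minimal sketch of how $\rho_k$ could be estimated from a finite sample, assuming the expectation is replaced by an average over the available pairs and that $\mu$ and $\sigma^2$ are replaced by the sample mean and the population variance. The name `autocorr` is hypothetical and does not appear in the notebook; library estimators typically use a slightly different normalisation, so values may differ a little at large lags.

```python
import numpy as np


def autocorr(x, k):
    """Sample estimate of rho_k = E[(x_t - mu)(x_{t+k} - mu)] / sigma^2.

    Hypothetical helper for illustration only (not from the notebook).
    """
    x = np.asarray(x, dtype=float)
    if k < 0 or k >= len(x):
        raise ValueError("lag k must satisfy 0 <= k < len(x)")
    if k == 0:
        return 1.0  # a series is perfectly correlated with itself
    mu = x.mean()
    var = x.var()  # population variance, standing in for sigma^2
    # Average the product of deviations over all pairs that are k steps apart.
    cov = np.mean((x[:-k] - mu) * (x[k:] - mu))
    return cov / var


# A strictly alternating series is perfectly negatively correlated at lag 1
# and perfectly positively correlated at lag 2.
alternating = np.array([1.0, -1.0] * 50)
rho_1 = autocorr(alternating, 1)
rho_2 = autocorr(alternating, 2)
```

For discrete white noise the same estimate should sit close to zero at every non-zero lag, which is the behaviour the residual plots in this series are checking for.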
Time Series Analysis - 2.ipynb

Lines changed: 879 additions & 0 deletions
Large diffs are not rendered by default.

Time Series Analysis - 3.ipynb

Lines changed: 1058 additions & 0 deletions
Large diffs are not rendered by default.

Time Series Analysis - 4.ipynb

Lines changed: 175 additions & 0 deletions
@@ -0,0 +1,175 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"# Time Series Analysis - Part 4 : ARCH and GARCH models\n",
8+
"\n",
9+
"In this final notebook on time series analysis, we will discuss conditional heteroskedasticity, leading us to our first conditional heteroskedastic model, known as ARCH. Then we will discuss extensions to ARCH, leading us to the famous Generalised Autoregressive Conditional Heteroskedasticity model of order p,q, also known as GARCH(p,q). GARCH is used extensively within the financial industry as many asset prices are conditionally heteroskedastic."
10+
]
11+
},
12+
{
13+
"cell_type": "markdown",
14+
"metadata": {},
15+
"source": [
16+
"### Recap\n",
17+
"\n",
18+
"We have considered the following models so far (reading the series in order is recommended if you have not done so already):\n",
19+
"\n",
20+
"* [Discrete White Noise and Random Walks]\n",
21+
"* [AR(p) and MA(q)]\n",
22+
"* [ARMA(p,q) and ARIMA(p,d,q)]\n",
23+
"\n",
24+
"The final piece to the puzzle is to examine conditional heteroskedasticity in detail and apply GARCH to some financial series that exhibit volatility clustering."
25+
]
26+
},
27+
{
28+
"cell_type": "code",
29+
"execution_count": 1,
30+
"metadata": {
31+
"collapsed": true
32+
},
33+
"outputs": [],
34+
"source": [
35+
"import os\n",
36+
"import sys\n",
37+
"\n",
38+
"import pandas as pd\n",
39+
"import numpy as np\n",
40+
"\n",
41+
"import statsmodels.formula.api as smf\n",
42+
"import statsmodels.tsa.api as smt\n",
43+
"import statsmodels.api as sm\n",
44+
"import scipy.stats as scs\n",
45+
"import statsmodels.stats as sms\n",
46+
"\n",
47+
"import matplotlib.pyplot as plt\n",
48+
"import matplotlib as mpl\n",
49+
"%matplotlib inline"
50+
]
51+
},
52+
{
53+
"cell_type": "code",
54+
"execution_count": 2,
55+
"metadata": {
56+
"collapsed": false
57+
},
58+
"outputs": [
59+
{
60+
"name": "stderr",
61+
"output_type": "stream",
62+
"text": [
63+
"C:\\Users\\Chandini\\Miniconda3\\envs\\auquan\\lib\\site-packages\\matplotlib\\__init__.py:1401: UserWarning: This call to matplotlib.use() has no effect\n",
64+
"because the backend has already been chosen;\n",
65+
"matplotlib.use() must be called *before* pylab, matplotlib.pyplot,\n",
66+
"or matplotlib.backends is imported for the first time.\n",
67+
"\n",
68+
" warnings.warn(_use_error_msg)\n"
69+
]
70+
},
71+
{
72+
"name": "stdout",
73+
"output_type": "stream",
74+
"text": [
75+
"Reading SPX\n",
76+
"Reading DOW\n",
77+
"Reading AAPL\n",
78+
"Reading MSFT\n"
79+
]
80+
}
81+
],
82+
"source": [
83+
"import auquanToolbox.dataloader as dl\n",
84+
"\n",
85+
"end = '2015-01-01'\n",
86+
"start = '2007-01-01'\n",
87+
"symbols = ['SPX','DOW','AAPL','MSFT']\n",
88+
"data = dl.load_data_nologs('nasdaq', symbols , start, end)['ADJ CLOSE']\n",
89+
"# log returns\n",
90+
"lrets = np.log(data/data.shift(1)).dropna()"
91+
]
92+
},
93+
{
94+
"cell_type": "code",
95+
"execution_count": 3,
96+
"metadata": {
97+
"collapsed": true
98+
},
99+
"outputs": [],
100+
"source": [
101+
"def tsplot(y, lags=None, figsize=(10, 8), style='bmh'):\n",
102+
" if not isinstance(y, pd.Series):\n",
103+
" y = pd.Series(y)\n",
104+
" with plt.style.context(style): \n",
105+
" fig = plt.figure(figsize=figsize)\n",
106+
" #mpl.rcParams['font.family'] = 'Ubuntu Mono'\n",
107+
" layout = (3, 2)\n",
108+
" ts_ax = plt.subplot2grid(layout, (0, 0), colspan=2)\n",
109+
" acf_ax = plt.subplot2grid(layout, (1, 0))\n",
110+
" pacf_ax = plt.subplot2grid(layout, (1, 1))\n",
111+
" qq_ax = plt.subplot2grid(layout, (2, 0))\n",
112+
" pp_ax = plt.subplot2grid(layout, (2, 1))\n",
113+
" \n",
114+
" y.plot(ax=ts_ax)\n",
115+
" ts_ax.set_title('Time Series Analysis Plots')\n",
116+
" smt.graphics.plot_acf(y, lags=lags, ax=acf_ax, alpha=0.5)\n",
117+
" smt.graphics.plot_pacf(y, lags=lags, ax=pacf_ax, alpha=0.5)\n",
118+
" sm.qqplot(y, line='s', ax=qq_ax)\n",
119+
" qq_ax.set_title('QQ Plot') \n",
120+
" scs.probplot(y, sparams=(y.mean(), y.std()), plot=pp_ax)\n",
121+
"\n",
122+
" plt.tight_layout()\n",
123+
" return"
124+
]
125+
},
126+
{
127+
"cell_type": "markdown",
128+
"metadata": {},
129+
"source": []
130+
},
131+
{
132+
"cell_type": "markdown",
133+
"metadata": {},
134+
"source": [
135+
"## Autoregressive Conditionally Heteroskedastic Models - ARCH(p)\n",
136+
"\n",
137+
"ARCH(p) models can be thought of as simply an AR(p) model applied to the variance of a time series. Another way to think about it is that the variance of our time series now, at time t, is conditional on past observations of the variance in previous periods."
138+
]
139+
},
140+
{
141+
"cell_type": "markdown",
142+
"metadata": {},
143+
"source": [
144+
"Once again, we have what looks like a realisation of a discrete white noise process, indicating that we have \"explained\" the serial correlation present in the squared residuals with an appropriate mixture of ARIMA(p,d,q) and GARCH(p,q).\n",
145+
"\n",
146+
"## Next Steps\n",
147+
"\n",
148+
"We are now at the point in our time series education where we have studied ARIMA and GARCH, allowing us to fit a combination of these models to a stock market index, and to determine if we have achieved a good fit or not.\n",
149+
"\n",
150+
"The next step is to actually produce forecasts of future daily return values from this combination and use them to create a basic trading strategy."
151+
]
152+
}
153+
],
154+
"metadata": {
155+
"kernelspec": {
156+
"display_name": "Python 2",
157+
"language": "python",
158+
"name": "python2"
159+
},
160+
"language_info": {
161+
"codemirror_mode": {
162+
"name": "ipython",
163+
"version": 2
164+
},
165+
"file_extension": ".py",
166+
"mimetype": "text/x-python",
167+
"name": "python",
168+
"nbconvert_exporter": "python",
169+
"pygments_lexer": "ipython2",
170+
"version": "2.7.13"
171+
}
172+
},
173+
"nbformat": 4,
174+
"nbformat_minor": 2
175+
}
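
Note on the ARCH(p) cell in Part 4: the notebook describes ARCH(p) as an AR(p) model applied to the variance but, in this commit, does not yet include code for it. As a rough sketch under stated assumptions, the simplest case, ARCH(1), can be simulated with the standard recursion $\sigma_t^2 = \alpha_0 + \alpha_1 y_{t-1}^2$ and $y_t = w_t \sigma_t$, where $w_t$ is discrete white noise. The function name `simulate_arch1` and the parameter values are hypothetical choices for illustration, not taken from the notebook.

```python
import numpy as np


def simulate_arch1(n, alpha0=0.5, alpha1=0.5, seed=0):
    """Simulate an ARCH(1) series (hypothetical helper, illustration only).

    sigma2[t] = alpha0 + alpha1 * y[t-1]**2
    y[t]      = w[t] * sqrt(sigma2[t]),  w[t] ~ N(0, 1)
    """
    if alpha0 <= 0 or not (0 <= alpha1 < 1):
        raise ValueError("need alpha0 > 0 and 0 <= alpha1 < 1")
    rng = np.random.RandomState(seed)  # works on both Python 2 and 3 NumPy
    w = rng.standard_normal(n)
    y = np.zeros(n)
    sigma2 = np.zeros(n)
    # Start from the unconditional variance alpha0 / (1 - alpha1).
    sigma2[0] = alpha0 / (1.0 - alpha1)
    y[0] = w[0] * np.sqrt(sigma2[0])
    for t in range(1, n):
        # Today's conditional variance depends on yesterday's squared value.
        sigma2[t] = alpha0 + alpha1 * y[t - 1] ** 2
        y[t] = w[t] * np.sqrt(sigma2[t])
    return y, sigma2


y, sigma2 = simulate_arch1(1000)
```

Passing `y` and then `y**2` to the notebook's `tsplot` helper would be one way to inspect the result: under this model the series itself should resemble white noise while its square should show serial correlation at lag 1.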
