RNN and LSTM
This notebook demonstrates the capabilities of the rnn estimator in scalecast, which covers both SimpleRNN and LSTM layers.
Download data: https://www.kaggle.com/robervalt/sunspots
Requirements for this notebook:
!pip install tensorflow
!pip install tqdm
!pip install ipython
!pip install ipywidgets
!jupyter nbextension enable --py widgetsnbextension
if using Jupyter Lab:
!jupyter labextension install @jupyter-widgets/jupyterlab-manager
[1]:
import pandas as pd
import numpy as np
import pandas_datareader as pdr
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
from tqdm.notebook import tqdm
from dateutil.relativedelta import relativedelta
from scalecast.Forecaster import Forecaster
[2]:
df = pd.read_csv('Sunspots.csv',index_col=0,names=['Date','Target'],header=0)
f = Forecaster(
    y=df['Target'],
    current_dates=df['Date'],
    test_length = 240,   # hold out the last 20 years of observations for testing
    future_dates = 240,  # forecast 20 years into the future
    cis = True,          # evaluate confidence intervals
)
f
[2]:
Forecaster(
DateStartActuals=1749-01-31T00:00:00.000000000
DateEndActuals=2021-01-31T00:00:00.000000000
Freq=M
N_actuals=3265
ForecastLength=240
Xvars=[]
TestLength=240
ValidationMetric=rmse
ForecastsEvaluated=[]
CILevel=0.95
CurrentEstimator=mlr
GridsFile=Grids
)
EDA
[3]:
f.plot()
plt.title('Original Sunspots Series',size=16)
[3]:
Text(0.5, 1.0, 'Original Sunspots Series')

Let’s view the ACF and PACF plots, allowing 240 lags to capture this series’ irregular 10-year cycle.
[4]:
figs, axs = plt.subplots(2, 1, figsize = (12,6))
f.plot_acf(ax=axs[0],lags=240)
f.plot_pacf(ax=axs[1],lags=240,method='ywm')
plt.show()

Let’s view the series’ seasonal decomposition.
[5]:
plt.rc("figure",figsize=(10,6))
f.seasonal_decompose().plot()
plt.show()

It is very difficult to make out anything useful from this graph.
Forecast RNN Model
See Darts’ GRU on the same series: https://unit8co.github.io/darts/examples/04-RNN-examples.html?highlight=sunspots
SimpleRNN
Unlayered model
1 hidden layer with 20% dropout
25 epochs
20% validation split
32 batch size
Tanh activation
Adam optimizer
MAE loss function
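The last four items do not appear in the calls below because they are default behavior. As a sketch of how they could be made explicit (assuming, as the scalecast docs suggest, that per-layer kwargs such as activation are forwarded to the underlying Keras layer and that loss, optimizer, and batch_size are forwarded to compile()/fit() — these keyword names are assumptions, not verified against the API):

# hedged sketch only; not executed in this notebook
f.set_estimator('rnn')
f.manual_forecast(
    layers_struct=[('SimpleRNN',{'units':100,'dropout':0.2,'activation':'tanh'})],  # activation goes to the Keras layer
    loss='mae',        # assumed to be forwarded to model.compile()
    optimizer='Adam',  # assumed to be forwarded to model.compile()
    batch_size=32,     # assumed to be forwarded to model.fit()
    epochs=25,
    validation_split=0.2,
    verbose=0,
)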
[6]:
f.auto_Xvar_select(
    try_trend = False,
    irr_cycles = [120,132,144],  # irregular cycles of 10, 11, and 12 years
    cross_validate = True,
    cvkwargs = {'k':3},
    dynamic_tuning = 240,
)
f.set_estimator('rnn')
f
[6]:
Forecaster(
DateStartActuals=1749-01-31T00:00:00.000000000
DateEndActuals=2021-01-31T00:00:00.000000000
Freq=M
N_actuals=3265
ForecastLength=240
Xvars=['AR1', 'AR2', 'AR3', 'AR4', 'AR5', 'AR6', 'AR7', 'AR8', 'AR9', 'AR10', 'AR11', 'AR12', 'AR13', 'AR14', 'AR15', 'AR16', 'AR17', 'AR18', 'AR19', 'AR20', 'AR21', 'AR22', 'AR23', 'AR24', 'AR25', 'AR26', 'AR27', 'AR28', 'AR29', 'AR30', 'AR31', 'AR32', 'AR33', 'AR34', 'AR35', 'AR36', 'AR37', 'AR38', 'AR39', 'AR40', 'AR41', 'AR42', 'AR43', 'AR44', 'AR45', 'AR46', 'AR47', 'AR48', 'AR49', 'AR50', 'AR51', 'AR52', 'AR53', 'AR54', 'AR55', 'AR56', 'AR57', 'AR58', 'AR59', 'AR60', 'AR61', 'AR62', 'AR63', 'AR64', 'AR65', 'AR66', 'AR67', 'AR68', 'AR69', 'AR70', 'AR71', 'AR72', 'AR73', 'AR74', 'AR75', 'AR76', 'AR77', 'AR78', 'AR79', 'AR80', 'AR81', 'AR82', 'AR83', 'AR84', 'AR85', 'AR86', 'AR87', 'AR88', 'AR89', 'AR90', 'AR91', 'AR92', 'AR93', 'AR94', 'AR95', 'AR96', 'AR97', 'AR98', 'AR99', 'AR100', 'AR101', 'AR102', 'AR103', 'AR104', 'AR105', 'AR106', 'AR107', 'AR108', 'AR109', 'AR110', 'AR111', 'AR112', 'AR113', 'AR114', 'AR115', 'AR116', 'AR117', 'AR118', 'AR119', 'AR120', 'AR121', 'AR122', 'AR123', 'AR124', 'AR125', 'AR126', 'AR127', 'AR128', 'AR129', 'AR130', 'AR131', 'AR132', 'AR133', 'AR134', 'AR135', 'AR136', 'AR137', 'AR138', 'AR139', 'AR140', 'AR141', 'AR142', 'AR143', 'AR144', 'AR145', 'AR146', 'AR147', 'AR148', 'AR149', 'AR150', 'AR151', 'AR152', 'AR153', 'AR154', 'AR155', 'AR156', 'AR157', 'AR158', 'AR159', 'AR160', 'AR161', 'AR162', 'AR163', 'AR164', 'AR165', 'AR166', 'AR167', 'AR168', 'AR169', 'AR170', 'AR171', 'AR172', 'AR173', 'AR174', 'AR175', 'AR176', 'AR177', 'AR178', 'AR179', 'AR180', 'AR181', 'AR182', 'AR183', 'AR184', 'AR185', 'AR186', 'AR187', 'AR188', 'AR189', 'AR190', 'AR191', 'AR192', 'AR193', 'AR194', 'AR195', 'AR196', 'AR197', 'AR198', 'AR199', 'AR200', 'AR201', 'AR202', 'AR203', 'AR204', 'AR205', 'AR206', 'AR207', 'AR208', 'AR209', 'AR210', 'AR211', 'AR212', 'AR213', 'AR214', 'AR215', 'AR216', 'AR217', 'AR218', 'AR219', 'AR220', 'AR221', 'AR222', 'AR223', 'AR224', 'AR225', 'AR226', 'AR227', 'AR228', 'AR229', 'AR230', 'AR231', 'AR232', 'AR233', 'AR234']
TestLength=240
ValidationMetric=rmse
ForecastsEvaluated=[]
CILevel=0.95
CurrentEstimator=rnn
GridsFile=Grids
)
[7]:
f.manual_forecast(
    layers_struct=[('SimpleRNN',{'units':100,'dropout':0.2})],
    epochs=25,
    validation_split=0.2,
    plot_loss=True,
    call_me="rnn_1layer",
    verbose=0, # so it doesn't print each epoch and saves space in the notebook
)

Layered Model
2 SimpleRNN layers, 100 units each
2 Dense layers, 10 units each
No dropout
Everything else the same
[8]:
f.manual_forecast(
    layers_struct=[
        ('SimpleRNN',{'units':100,'dropout':0})
    ] * 2 + [
        ('Dense',{'units':10})
    ] * 2,
    epochs=25,
    random_seed=42,
    plot_loss=True,
    validation_split=0.2,
    call_me='rnn_layered',
    verbose=0,
)

[9]:
f.plot_test_set(
    ci=True,
    models=['rnn_1layer','rnn_layered'],
    order_by='TestSetRMSE',
)
plt.title('SimpleRNN Models Test-Set Results',size=16)
plt.show()

LSTM
Unlayered Model
One LSTM layer with 20% dropout
15 epochs
Everything else the same
[10]:
f.manual_forecast(
    layers_struct=[('LSTM',{'units':100,'dropout':0.2})],
    epochs=15,
    validation_split=0.2,
    plot_loss=True,
    call_me="lstm_1layer",
    verbose=0, # so it doesn't print each epoch and saves space in the notebook
)

Layered Model
2 LSTM layers and 2 Dense layers
No dropout
[11]:
f.manual_forecast(
    layers_struct=[
        ('LSTM',{'units':100,'dropout':0})
    ] * 2 + [
        ('Dense',{'units':10})
    ] * 2,
    epochs=15,
    random_seed=42,
    plot_loss=True,
    validation_split=0.2,
    call_me='lstm_layered',
    verbose=0,
)

[12]:
f.plot_test_set(models=['lstm_1layer','lstm_layered'],ci=True,order_by='TestSetRMSE')
plt.title('LSTM Models Test-Set Results',size=16)
plt.show()

Prepare NNAR Model
Let’s try building a model you can read about here: https://otexts.com/fpp2/nnetar.html
This is a time-series-based dense neural network model, where the final predictions are the average of several models with randomly selected starting weights.
It is not a recurrent neural network and does not use the rnn estimator, but I thought it’d be fun to demonstrate here.
It takes three input parameters:
p (number of lags)
P (number of seasonal lags)
m (seasonal period)
It also accepts exogenous variables. We add monthly seasonality in wave (sine/cosine) functions and a 120-period cycle.
Another parameter to consider is k: the size of the hidden layer. By default, this is the total number of inputs divided by 2, rounded up; with the 20 regressors added below, that is ceil(20/2) = 10. We will keep the default.
The default number of models to average is 20, and we will keep that default as well.
See the model’s documentation in R: https://www.rdocumentation.org/packages/forecast/versions/8.16/topics/nnetar
[13]:
p = 10 # non-seasonal lags
P = 6 # seasonal lags
m = 12 # seasonal period
f.drop_all_Xvars()
f.add_ar_terms(p)
f.add_AR_terms((P,m))
# in addition to the process described in the linked book, we can add monthly seasonality in a Fourier transformation
f.add_seasonal_regressors('month',raw=False,sincos=True)
# lastly, we add an irregular 10-year cycle that is idiosyncratic to this dataset
f.add_cycle(120)
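With raw=False and sincos=True, the month regressor enters the model only through a sine/cosine pair (the monthsin and monthcos terms visible in the Forecaster summary below). A minimal sketch of the encoding, assuming the standard Fourier transformation scalecast appears to use:

import numpy as np
month = np.arange(1, 13)                   # calendar month, 1 through 12
monthsin = np.sin(2 * np.pi * month / 12)  # assumed encoding for 'monthsin'
monthcos = np.cos(2 * np.pi * month / 12)  # assumed encoding for 'monthcos'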
[14]:
f
[14]:
Forecaster(
DateStartActuals=1749-01-31T00:00:00.000000000
DateEndActuals=2021-01-31T00:00:00.000000000
Freq=M
N_actuals=3265
ForecastLength=240
Xvars=['AR1', 'AR2', 'AR3', 'AR4', 'AR5', 'AR6', 'AR7', 'AR8', 'AR9', 'AR10', 'AR12', 'AR24', 'AR36', 'AR48', 'AR60', 'AR72', 'monthsin', 'monthcos', 'cycle120sin', 'cycle120cos']
TestLength=240
ValidationMetric=rmse
ForecastsEvaluated=['rnn_1layer', 'rnn_layered', 'lstm_1layer', 'lstm_layered']
CILevel=0.95
CurrentEstimator=rnn
GridsFile=Grids
)
Forecast NNAR Model
[15]:
# the default parameter used in the book is the total number of inputs divided by 2, rounded up
k = int(np.ceil(len(f.get_regressor_names())/2))
repeats = 20 # default repeats number used in book
f.set_estimator('mlp')
for r in tqdm(range(repeats)): # repeats
    f.manual_forecast(
        hidden_layer_sizes=(k,),
        activation='relu',
        random_state=r,
        normalizer='scale',
        call_me=f'mlp_{r}',
    )
    f.save_feature_importance()
# now we take the averages of all models to create the final NNAR
f.set_estimator('combo')
f.manual_forecast(how='simple',models=[f'mlp_{r}' for r in range(repeats)],call_me='nnar')
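The 'simple' combo estimator averages the point forecasts of the listed models with equal weights. Conceptually, that amounts to the following (a standalone sketch with stand-in data, not scalecast's internals):

import numpy as np
rng = np.random.default_rng(0)
member_fcsts = rng.normal(size=(20, 240))  # stand-in for the 20 mlp forecasts, 240 steps each
nnar_fcst = member_fcsts.mean(axis=0)      # the nnar forecast: elementwise average across models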
[16]:
f.plot_test_set(ci=True,models='nnar')
plt.title('NNAR Test-Set Performance',size=16)
plt.show()

This doesn’t look as good as any of our evaluated RNN models.
One thing we notice in all these test-set graphs is that the precise deviations from the actuals are hard to see because the plotted series is so long. Let’s call the function again with all evaluated models, ordered by test-set RMSE, and zoom in to the test-set portion of the series only by setting include_train=False.
[17]:
f.plot_test_set(
    models=['rnn_1layer','rnn_layered','lstm_1layer','lstm_layered','nnar'],
    order_by='TestSetRMSE',
    include_train=False,
)
plt.title('All Models Test Performance - test set obs only',size=16)
plt.show()

With this view, it is not clear that any model did as well as we might have thought from the other graphs.
Compare All Models
Forecast Plot
best rnn model
best lstm model
nnar model
[18]:
f.plot(models=['rnn_layered','lstm_layered','nnar'],order_by='TestSetRMSE',ci=True)
plt.title('Top-3 Model Class Forecasts - 20 Years',size=16)
plt.show()

Export Model Summaries for all applied models
[19]:
pd.set_option('display.float_format', '{:.4f}'.format)
f.export(
    'model_summaries',
    models=['rnn_1layer','rnn_layered','lstm_1layer','lstm_layered','nnar'],
    determine_best_by = 'TestSetRMSE',
)[
    [
        'ModelNickname',
        'TestSetRMSE',
        'TestSetR2',
        'InSampleRMSE',
        'InSampleR2',
        'best_model',
    ]
]
[19]:
  | ModelNickname | TestSetRMSE | TestSetR2 | InSampleRMSE | InSampleR2 | best_model
--|---------------|-------------|-----------|--------------|------------|-----------
0 | lstm_layered  |     28.5508 |    0.7083 |      56.6811 |     0.3310 | True
1 | lstm_1layer   |     31.8674 |    0.6366 |      43.3984 |     0.6078 | False
2 | rnn_1layer    |     41.2311 |    0.3918 |      68.7316 |     0.0163 | False
3 | rnn_layered   |     55.6165 |   -0.1067 |      60.6494 |     0.2341 | False
4 | nnar          |     92.0142 |   -2.0293 |      28.4102 |     0.8266 | False
Backtest Best Model
The four-layer LSTM (two LSTM layers plus two Dense layers) appears to be the most accurate of the five models we ran and showed. The NNAR model, in spite of being fun to set up, was the worst on the test set even though it had the best in-sample fit, a sign that it overfit the training data. With more time spent tuning, even better results could likely be extracted. Although the layered LSTM appears accurate on the one test set we gave it, an interesting question is how well it would have performed had we used it in production over the last five years. That’s what backtesting allows us to evaluate. Note that backtesting can take a long time for more complex models.
[32]:
from scalecast.Pipeline import Pipeline
from scalecast.util import backtest_metrics

def forecaster(f):
    f.auto_Xvar_select(
        try_trend = False,
        irr_cycles = [120,132,144],
        cross_validate = True,
        cvkwargs = {'k':3},
        dynamic_tuning = 240,
        max_ar = 240,
    )
    f.set_estimator('rnn')
    f.manual_forecast(
        layers_struct=[
            ('LSTM',{'units':100,'dropout':0})
        ] * 2 + [
            ('Dense',{'units':10})
        ] * 2,
        epochs=15,
        random_seed=42,
        validation_split=0.2,
        call_me='lstm_layered',
        verbose=0,
    )

pipeline = Pipeline(steps = [('Forecast',forecaster)])
[33]:
# default args below
backtest_results = pipeline.backtest(
    f,
    n_iter=5,
    fcst_length = 240,
    test_length = 0,
    jump_back = 12, # place a year between consecutive training sets
    cis = False,
)
bm = backtest_metrics(backtest_results,mets=['r2','rmse'])
bm
[33]:
Model        | Metric | Iter0   | Iter1   | Iter2   | Iter3   | Iter4   | Average
-------------|--------|---------|---------|---------|---------|---------|--------
lstm_layered | r2     |  0.7385 |  0.3077 |  0.3695 |  0.3589 |  0.2503 |  0.4050
             | rmse   | 27.0322 | 48.1908 | 46.7773 | 45.9266 | 49.2748 | 43.4403
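Each Iter column is one backtest window. With n_iter=5, fcst_length=240, test_length=0, and jump_back=12, consecutive training sets end a year apart and each is scored on the following 240 months. A sketch of the implied windows, assuming iteration i trims 240 + 12*i observations from the end of the series before forecasting (the windowing is an assumption about backtest()'s behavior):

from dateutil.relativedelta import relativedelta
import pandas as pd

last_actual = pd.Timestamp('2021-01-31')  # DateEndActuals from the Forecaster
for i in range(5):  # n_iter
    train_end = last_actual - relativedelta(months=240 + 12 * i)  # end of iteration i's training set
    holdout_end = train_end + relativedelta(months=240)           # end of the 240-month scoring window
    print(f'Iter{i}: train through {train_end:%Y-%m}, score through {holdout_end:%Y-%m}')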
[34]:
rmse = bm.loc[('lstm_layered','rmse'),'Average']
r2 = bm.loc[('lstm_layered','r2'),'Average']
print(
    'This shows that if we had used this model on actual data in the previous 5 years,'
    f' it would have obtained an average RMSE of {rmse:.1f} and R2 of {r2:.1%}.'
)
This shows that if we had used this model on actual data in the previous 5 years, it would have obtained an average RMSE of 43.4 and R2 of 40.5%.