[TIL] Day 22 TIL (20230308) - Python Time Series Analysis 2 (Exponential Smoothing, Holt-Winters Seasonal Method)


라딘 2023. 3. 10. 17:42


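1. ETS decomposition

ETS (Error-Trend-Seasonality) models are the family of models that split a series into seasonal, residual (error), and trend components and smooth it by adding the components, multiplying them, or leaving some of them out.
An additive model is suitable when the trend is close to linear and the seasonality looks roughly constant.
A multiplicative model is suitable when the series increases or decreases non-linearly.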
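
As a rough illustration of the difference between the two forms, the sketch below builds two synthetic series and compares the size of the seasonal swing in the first and last year. The trend, the seasonal pattern, and every name here are made up for the example; none of it comes from the airline data.

```python
import math

PERIOD = 12
N = 48  # four years of hypothetical monthly data

trend = [100 + 5 * t for t in range(N)]
pattern = [math.sin(2 * math.pi * k / PERIOD) for k in range(PERIOD)]

# Additive: the seasonal effect is a fixed offset added to the trend
additive = [trend[t] + 20 * pattern[t % PERIOD] for t in range(N)]
# Multiplicative: the seasonal effect scales with the level of the series
multiplicative = [trend[t] * (1 + 0.2 * pattern[t % PERIOD]) for t in range(N)]

def swing(values):
    """Range (max - min) of one year of values."""
    return max(values) - min(values)

add_first, add_last = swing(additive[:PERIOD]), swing(additive[-PERIOD:])
mul_first, mul_last = swing(multiplicative[:PERIOD]), swing(multiplicative[-PERIOD:])

print(f"additive swing:       {add_first:.1f} -> {add_last:.1f}")
print(f"multiplicative swing: {mul_first:.1f} -> {mul_last:.1f}")
```

The additive series keeps the same swing every year, while the multiplicative one widens as the level rises, which is the visual cue for choosing between the two.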

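from statsmodels.tsa.seasonal import seasonal_decompose

# airline: DataFrame of monthly passenger counts with a DatetimeIndex
result = seasonal_decompose(airline['Thousands of Passengers'], model = 'multiplicative')
result.plot();  # plot the series split into trend, seasonal, and residual components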

Each component can be accessed individually through result.trend, result.seasonal, and result.resid.

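2. EWMA (exponential smoothing)

A model that captures the characteristics of a time series and forecasts it using an exponentially weighted moving average.
It gives higher weights to the most recent values, which reduces the lag effect of a plain moving average.
The main weight in exponential smoothing is alpha (the smoothing constant); span, center of mass (c), and halflife can each be converted into an alpha (e.g. alpha = 2/(span + 1)).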

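2.1 Weights in exponential smoothing

alpha: the smoothing constant, the basic weight in exponential smoothing
span: the parameter to use for an n-period exponentially weighted moving average
c: center of mass (com in pandas), tied to span by c = (span - 1)/2, so alpha = 1/(1 + c)
halflife: the half-life, i.e. the time it takes for the exponential weight to fall to half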
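
The conversions between these parameters can be sketched as small helper functions. The formulas follow the definitions used by pandas ewm; the function names are made up for illustration.

```python
import math

def alpha_from_span(span):
    """alpha = 2 / (span + 1), for span >= 1."""
    return 2.0 / (span + 1.0)

def alpha_from_com(com):
    """alpha = 1 / (1 + com), for com >= 0."""
    return 1.0 / (1.0 + com)

def alpha_from_halflife(halflife):
    """alpha = 1 - exp(-ln(2) / halflife), for halflife > 0."""
    return 1.0 - math.exp(-math.log(2.0) / halflife)

span = 12
com = (span - 1) / 2  # center of mass that corresponds to span = 12

print(alpha_from_span(span))    # same alpha as the next line
print(alpha_from_com(com))
print(alpha_from_halflife(3))   # weight halves every 3 periods
```

With halflife = 3, the retained weight (1 - alpha) raised to the third power equals 0.5, which is what "half-life" means here.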

# Apply exponential smoothing with span = 12
airline['EWMA-12'] = airline['Thousands of Passengers'].ewm(span = 12).mean()

airline[['Thousands of Passengers', 'EWMA-12']].plot(figsize = (10, 8))
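
For intuition, the recursion behind an exponentially weighted mean can be written out by hand. This is a minimal pure-Python sketch of the recursive form (what pandas computes with adjust=False). The ewm(span = 12) call above uses the pandas default adjust=True, which weights the earliest points differently, so the first few values will not match this sketch exactly. The function name ewma is made up for illustration.

```python
def ewma(values, alpha):
    """Recursive exponentially weighted mean: y_t = alpha*x_t + (1-alpha)*y_{t-1}."""
    smoothed = []
    for x in values:
        if not smoothed:
            smoothed.append(float(x))  # start from the first observation
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

print(ewma([1, 2, 3, 4], alpha=0.5))  # [1.0, 1.5, 2.25, 3.125]
```

A larger alpha makes the smoothed line follow the data more closely; a smaller alpha makes it smoother but slower to react.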

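from statsmodels.tsa.holtwinters import SimpleExpSmoothing

# df is assumed to be the same passenger DataFrame as airline above
span = 12
alpha = 2 / (span + 1)  # smoothing constant equivalent to span = 12

# Simple exponential smoothing via statsmodels with a fixed alpha (no optimization)
model = SimpleExpSmoothing(df['Thousands of Passengers'])
fitted_model = model.fit(smoothing_level = alpha, optimized = False)

# The fitted values come back lagged by one step, so shift them back by one
df['SES12'] = fitted_model.fittedvalues.shift(-1)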
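
The one-step lag follows from the definition of simple exponential smoothing: the fitted value for time t is the forecast made at t-1, which equals the level after the observation at t-1. Below is a small sketch of that relationship, assuming the level is initialised to the first observation (statsmodels may initialise differently depending on version and settings). ses_fitted is a hypothetical helper name.

```python
def ses_fitted(values, alpha):
    """Return (levels, fitted) for simple exponential smoothing.

    levels[t] is the smoothed level after observing values[t];
    fitted[t] is the one-step-ahead forecast of values[t], i.e. the previous level.
    """
    level = float(values[0])  # assumed initial level
    levels, fitted = [], []
    for x in values:
        fitted.append(level)  # the forecast uses the level *before* seeing x
        level = alpha * x + (1 - alpha) * level
        levels.append(level)
    return levels, fitted

levels, fitted = ses_fitted([10, 12, 11, 15], alpha=0.5)

# Dropping the first fitted value lines it up with the levels,
# which is the effect of .shift(-1) on a pandas Series.
print(fitted[1:])
print(levels[:-1])
```

Both printed lists are the same, which is why shifting the fitted values by -1 makes them comparable to the EWMA column.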

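3. Holt-Winters seasonal method

Double/triple exponential smoothing
Trend component b_t, seasonal component s_t, level component l_t
Both additive and multiplicative versions exist
Holt's method: double exponential smoothing, a model that uses only l_t and b_t
Winters' method: triple exponential smoothing, a model that also accounts for seasonality (L is the length of the repeating cycle)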
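
Holt's method can be sketched in a few lines to show how l_t and b_t are updated. This is a minimal illustration of the additive-trend form with fixed alpha and beta and a simple, assumed initialisation (first value as level, first difference as trend); libraries such as statsmodels choose initial values and parameters in their own way. The name holt_linear is made up.

```python
def holt_linear(values, alpha, beta, horizon):
    """Holt's double exponential smoothing with an additive trend.

    Level:  l_t = alpha*y_t + (1-alpha)*(l_{t-1} + b_{t-1})
    Trend:  b_t = beta*(l_t - l_{t-1}) + (1-beta)*b_{t-1}
    Forecast h steps ahead: l_t + h*b_t
    """
    level = float(values[0])              # assumed initial level
    trend = float(values[1] - values[0])  # assumed initial trend
    for y in values[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]

# A perfectly linear series (1, 3, ..., 19) is extrapolated along the same line
series = [1 + 2 * t for t in range(10)]
print(holt_linear(series, alpha=0.8, beta=0.2, horizon=3))
```

Unlike simple exponential smoothing, whose forecast is flat, the forecast here keeps rising by the estimated trend at each step.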

from statsmodels.tsa.holtwinters import ExponentialSmoothing

# The trend looks linear, so use the additive model
df['DES_add_12'] = ExponentialSmoothing(df['Thousands of Passengers'], trend = 'add').fit().fittedvalues.shift(-1)

# Double exponential smoothing tracks the actual values almost exactly
df[['Thousands of Passengers','SES12','DES_add_12']].iloc[:24].plot(figsize = (12, 5))

# Triple exponential smoothing
df['TES_mul_12'] = ExponentialSmoothing(df['Thousands of Passengers'], trend = 'mul', seasonal = 'mul', seasonal_periods = 12).fit().fittedvalues

# Plot only the last 2 years of data (DES_add_12 is the double-smoothing column created above)
df[['Thousands of Passengers', 'DES_add_12', 'TES_mul_12']][-24:].plot()
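
To show what the seasonal term adds, here is a minimal sketch of triple exponential smoothing in its additive form (the code above uses the multiplicative form, and statsmodels estimates the parameters and initial states itself). This follows one common textbook formulation; the function name, its arguments, and the test series are all made up for illustration.

```python
def holt_winters_additive(values, period, alpha, beta, gamma,
                          level, trend, seasonals, horizon):
    """Additive Holt-Winters with caller-supplied initial states.

    seasonals[i] is the seasonal offset for positions t with t % period == i.
    level and trend describe the state just before the first observation.
    """
    seasonals = list(seasonals)
    for t, y in enumerate(values):
        s_prev = seasonals[t % period]
        prev_level, prev_trend = level, trend
        level = alpha * (y - s_prev) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        seasonals[t % period] = (gamma * (y - prev_level - prev_trend)
                                 + (1 - gamma) * s_prev)
    n = len(values)
    # Forecast h steps ahead: level + h*trend + matching seasonal offset
    return [level + h * trend + seasonals[(n + h - 1) % period]
            for h in range(1, horizon + 1)]

# Hypothetical series: linear trend plus a repeating 4-step pattern
pattern = [3, -1, -4, 2]
series = [10 + 0.5 * t + pattern[t % 4] for t in range(12)]

forecast = holt_winters_additive(series, period=4, alpha=0.3, beta=0.1, gamma=0.2,
                                 level=9.5, trend=0.5, seasonals=pattern, horizon=4)
print(forecast)
```

Because the example series matches the model exactly and the initial states are given, the forecast continues both the trend and the 4-step pattern.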

4. Forecasting practice code with the Holt-Winters model

import pandas as pd

df = pd.read_csv('../Data/airline_passengers.csv', index_col = 'Month', parse_dates = True)
df.index.freq = 'MS'

# Train/test split (assumes 144 monthly rows, 1949-01 through 1960-12)
train_data = df.iloc[:108]  # through 1957-12
test_data = df.iloc[108:]   # 1958-01 onward (36 months), no overlap with train

# Build the model
from statsmodels.tsa.holtwinters import ExponentialSmoothing

fitted_model = ExponentialSmoothing(train_data['Thousands of Passengers'],
                                   trend = 'mul', seasonal = 'mul', seasonal_periods = 12).fit()
test_predictions = fitted_model.forecast(36)  # forecast as many periods as the test set covers

# Compare actual and predicted values on one plot
train_data['Thousands of Passengers'].plot(legend = True, label = 'Train', figsize = (12, 8))
test_data['Thousands of Passengers'].plot(legend = True, label = 'Test')
test_predictions.plot(legend = True, label = 'PREDICTION')
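
When splitting by position, it helps to check which calendar month a given row index corresponds to, so that the forecast horizon lines up with the test set. A small sketch, assuming the series is monthly and starts in January 1949 (an assumption about this dataset, not something stated in the code above); month_at is a hypothetical helper.

```python
from datetime import date

def month_at(position, start_year=1949, start_month=1):
    """Calendar month of the row at a 0-based position in a monthly series."""
    months = (start_month - 1) + position
    return date(start_year + months // 12, months % 12 + 1, 1)

n_rows = 144  # assumed length of the series
split = 108   # rows [0, split) go to train, rows [split, n_rows) to test

print("last train month:", month_at(split - 1))
print("first test month:", month_at(split))
print("last test month: ", month_at(n_rows - 1))
print("test length:     ", n_rows - split)
```

A forecast of length n_rows - split starting right after the last training month then covers exactly the test months.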

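Forecasting a time series is a kind of regression problem, so the regression metrics MSE, RMSE, and MAE apply directly. RMSE is the one used most often. How to read each metric depends on the scale of the data, so there is no universal threshold, but comparing the error against the mean of the series gives a rough sense of how large it is.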

import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error

mean_absolute_error(test_data, test_predictions)  # MAE
mean_squared_error(test_data, test_predictions)  # MSE
np.sqrt(mean_squared_error(test_data, test_predictions))  # RMSE
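
To make the metrics concrete, they can also be computed by hand and compared against the mean of the actual values. The numbers below are made up purely for illustration.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(mse(actual, predicted))

actual = [100, 200, 300]      # made-up values
predicted = [110, 190, 310]

mean_actual = sum(actual) / len(actual)
print(mae(actual, predicted))                 # 10.0
print(mse(actual, predicted))                 # 100.0
print(rmse(actual, predicted))                # 10.0
print(rmse(actual, predicted) / mean_actual)  # 0.05, i.e. about 5% of the mean
```

Dividing RMSE by the mean is one rough way to judge the error relative to the scale of the series.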
