AdaBoost, short for Adaptive Boosting, is a popular machine learning algorithm used for classification and regression tasks. It belongs to the family of ensemble learning algorithms, which combine multiple base models to achieve better performance than any individual model. AdaBoost is best known for turning weak classifiers into a strong one: it weights each weak classifier's prediction by that classifier's accuracy, so the more reliable classifiers have more say in the final result.
How does the AdaBoost algorithm work?
In the AdaBoost algorithm, weak classifiers are trained one after another on the same dataset. A weak classifier is one whose accuracy is only slightly better than random guessing. After each weak classifier is trained, AdaBoost increases the weights of the data points it misclassified, so the next round of training focuses on those points. The process is repeated for a set number of rounds, and the final model combines the weak classifiers in a weighted vote in which the more accurate classifiers receive larger weights.
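To make the re-weighting loop concrete, here is a minimal sketch of the classic two-class AdaBoost procedure. It is an illustration under our own assumptions, not the scikit-learn implementation: the helper names `adaboost_fit` and `adaboost_predict`, the toy dataset, and the choice of depth-1 decision trees as weak learners are all hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def adaboost_fit(X, y, n_rounds=50):
    """Sketch of two-class AdaBoost for labels in {-1, +1} (hypothetical helper)."""
    n_samples = len(y)
    weights = np.full(n_samples, 1.0 / n_samples)  # start with uniform sample weights
    learners, alphas = [], []
    for _ in range(n_rounds):
        # Weak learner: a depth-1 decision tree (a "stump")
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=weights)
        pred = stump.predict(X)
        err = weights[pred != y].sum()  # weighted error of this round
        if err >= 0.5:  # no better than chance: stop boosting
            break
        err = max(err, 1e-10)  # avoid division by zero on a perfect fit
        alpha = 0.5 * np.log((1.0 - err) / err)  # say of this learner in the final vote
        learners.append(stump)
        alphas.append(alpha)
        # Raise the weights of misclassified points, lower the others, renormalize
        weights = weights * np.exp(-alpha * y * pred)
        weights /= weights.sum()
    return learners, alphas


def adaboost_predict(learners, alphas, X):
    """Weighted vote of all weak learners."""
    score = sum(a * clf.predict(X) for clf, a in zip(learners, alphas))
    return np.sign(score)


# Toy data (an assumption for illustration): a diagonal boundary that no
# single axis-aligned stump can capture
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(400, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

learners, alphas = adaboost_fit(X, y)
stump_acc = np.mean(learners[0].predict(X) == y)
boosted_acc = np.mean(adaboost_predict(learners, alphas, X) == y)
print(f"first stump accuracy: {stump_acc:.3f}")
print(f"boosted accuracy:     {boosted_acc:.3f}")
```

On this toy problem the boosted vote should score noticeably higher on the training data than the first stump alone, which is the effect described above.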
Example of using the AdaBoost algorithm for stock price prediction
In this article, we will explore how to use the AdaBoost algorithm to predict the stock price of RELIANCE.NS, the stock of Reliance Industries Limited, an Indian multinational conglomerate. We will use the yfinance Python package to retrieve the historical stock data from Yahoo Finance. We will then use AdaBoost to predict the stock price for the final 60 days of that data, which we hold out as a test set.
We start by importing the required libraries and setting the time zone to India Standard Time. We then retrieve the historical stock data from Yahoo Finance using the yfinance package. We split the dataset into train and test sets, with the last 60 rows (trading days) used as the test set. Finally, we convert the data into arrays for training and testing: the only feature is each row's integer position in the dataset, and the target is the adjusted closing price.
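```python
import pytz
import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import AdaBoostRegressor
from datetime import datetime as dt
from sklearn.tree import DecisionTreeRegressor

# Work in India Standard Time
tz = pytz.timezone("Asia/Kolkata")
start = tz.localize(dt(2001, 8, 1))
# end = tz.localize(dt.today())
end = tz.localize(dt(2022, 8, 1))

tickers = "RELIANCE.NS".split(",")
# auto_adjust=False keeps the 'Adj Close' column, which some newer
# yfinance releases drop by default
df = yf.download(tickers, start, end, auto_adjust=False)
# Some yfinance versions return (field, ticker) MultiIndex columns;
# flatten them so that df['Adj Close'] is a plain Series
if isinstance(df.columns, pd.MultiIndex):
    df.columns = df.columns.get_level_values(0)

# Hold out the last 60 rows (trading days) as the test set
train_data = df[:len(df)-60]
test_data = df[len(df)-60:]

# Feature: integer row position; target: adjusted closing price
X_train = np.array(range(0, len(train_data))).reshape(-1, 1)
y_train = train_data['Adj Close'].values
X_test = np.array(range(len(train_data), len(df))).reshape(-1, 1)
y_test = test_data['Adj Close'].values
```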
We then use AdaBoostRegressor from the scikit-learn package to train the model with a DecisionTreeRegressor as the base estimator. We set the number of estimators to 300 and the maximum depth of the decision tree to 4.
```python
regr = AdaBoostRegressor(DecisionTreeRegressor(max_depth=4), n_estimators=300, random_state=42)
regr.fit(X_train, y_train)
```
We use the trained model to predict the stock price for the 60 held-out days and store the predictions in a pandas DataFrame. Recall that AdaBoost is an ensemble learning technique that combines multiple weak learners to create a strong learner. It assigns a weight to each sample in the training set based on how well the current learners handle it; in a regression setting such as ours, that means the size of the prediction error rather than classification accuracy. Poorly predicted samples are given higher weights so that subsequent weak learners focus on them.
In our case, we use the AdaBoostRegressor class from the scikit-learn package, which applies the AdaBoost algorithm to train a regression model, with DecisionTreeRegressor as the base estimator. The number of estimators is set to 300, which means that up to 300 decision trees are trained sequentially (boosting stops early if a perfect fit is reached). The maximum depth of each decision tree is set to 4, which keeps the individual trees simple and helps prevent overfitting of the training data.
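To see what these settings do without downloading any market data, the sketch below fits the same kind of model on a synthetic noisy sine curve and inspects the fitted ensemble. The data and the smaller `n_estimators=50` (chosen only to keep the run fast) are our assumptions; `estimators_`, `estimator_weights_`, `estimator_errors_`, and `staged_predict` are attributes and methods of scikit-learn's AdaBoostRegressor.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (an assumption for illustration): a noisy sine curve
rng = np.random.default_rng(42)
X = np.linspace(0, 6, 200).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

model = AdaBoostRegressor(
    DecisionTreeRegressor(max_depth=4),  # weak learner: a shallow tree
    n_estimators=50,                     # upper bound on the number of trees
    random_state=42,
)
model.fit(X, y)

# Boosting may stop before n_estimators, so count the trees actually fitted
n_fitted = len(model.estimators_)
tree_weights = model.estimator_weights_[:n_fitted]
tree_errors = model.estimator_errors_[:n_fitted]
max_tree_depth = max(tree.get_depth() for tree in model.estimators_)

# Training error after each boosting round
staged_mae = [np.mean(np.abs(p - y)) for p in model.staged_predict(X)]
pred = model.predict(X)

print(f"trees fitted: {n_fitted}, deepest tree: {max_tree_depth}")
print(f"MAE after first round: {staged_mae[0]:.4f}")
print(f"MAE after last round:  {staged_mae[-1]:.4f}")
```

Printing `tree_weights` and `tree_errors` shows how much say each tree has in the combined prediction and how large its weighted error was in its round.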
Once the model is trained, we use it to predict the stock prices for the 60 held-out days and store the predictions in the next_days DataFrame. We then plot the last 300 days of the training prices, the actual prices for the test period, and the predicted prices for the test period.
Finally, we print the predicted stock price for the last day of the test data by calling tail(1) on the next_days DataFrame. Comparing this value with the actual price on that day (the last entry of y_test) gives a rough idea of how well the model has performed.
```python
# Collect the predictions for the 60 held-out days
next_days = pd.DataFrame(index=test_data.index, columns=test_data.columns)
next_days['Adj Close'] = regr.predict(X_test)

# Plot the tail of the training data, the actual test prices, and the forecast
plt.figure(figsize=(10, 5))
plt.plot(train_data.index[-300:], train_data['Adj Close'].tail(300), label='Train')
plt.plot(test_data.index[-300:], test_data['Adj Close'].tail(300), label='Test')
plt.plot(next_days.index, next_days['Adj Close'], label='Forecast')
plt.legend(loc='best')
plt.title('AdaBoost Algorithm')
plt.xlabel('Date')
plt.ylabel('Price')
plt.show()

# Predicted price for the last day of the test period
print(next_days.tail(1))
```
![Ada Boost Algorithm to predict Stock Price](https://i0.wp.com/akanther.com/wp-content/uploads/adaboost.png?resize=640%2C325&ssl=1)
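A single printed value says little about forecast quality, so it can help to compute an error metric over the whole test period and compare it with a simple baseline. The sketch below does this on a synthetic random walk (not real RELIANCE.NS prices; the series, the naive "repeat the last training price" baseline, and the smaller `n_estimators=50` are our assumptions). It also checks one caveat worth keeping in mind when reading the plot: a decision tree predicts a constant within each leaf, so when the only feature is the row position, every test row falls into the same rightmost leaf of every tree and the forecast is a flat line.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.tree import DecisionTreeRegressor

# Synthetic random walk standing in for a price series (NOT real market data)
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(loc=0.1, scale=1.0, size=1000))

# Same setup as in the article: row position as the only feature,
# last 60 rows held out as the test set
X = np.arange(len(prices)).reshape(-1, 1)
X_train, X_test = X[:-60], X[-60:]
y_train, y_test = prices[:-60], prices[-60:]

model = AdaBoostRegressor(
    DecisionTreeRegressor(max_depth=4), n_estimators=50, random_state=42
)
model.fit(X_train, y_train)
forecast = model.predict(X_test)

# Naive baseline: repeat the last observed training price
naive = np.full_like(y_test, y_train[-1])

model_mae = mean_absolute_error(y_test, forecast)
naive_mae = mean_absolute_error(y_test, naive)
is_flat = np.allclose(forecast, forecast[0])

print(f"AdaBoost MAE: {model_mae:.2f}")
print(f"Naive MAE:    {naive_mae:.2f}")
print(f"Forecast is constant over the test period: {is_flat}")
```

With the variables from this article, the same comparison would be `mean_absolute_error(y_test, regr.predict(X_test))` against a baseline built from `y_train[-1]`.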