Predicting Stock Price Direction Using Support Vector Machines

In this article we are going to learn how to predict stock price direction using Support Vector Machines.

Machine learning is a branch of artificial intelligence that is changing how work is done across many disciplines. At its core, a machine learning model identifies patterns in a particular dataset and then applies those learned patterns to new data. In layman's terms, computers learn a pattern and adjust through experience to reach accurate, repeatable conclusions. Let's begin.

Installing libraries and importing them
In the first step we just need to install the libraries and import them.

!pip install pandas
!pip install numpy
!pip install scikit-learn
!pip install matplotlib
import pandas as pd
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
import warnings
Downloading and reading stock dataset
The next step is to read the dataset from a file. You can download the dataset from here; the file is kept in external storage, so adjust the path in the call to wherever you saved it. We use pandas to read it.

Example
df = pd.read_csv('/content/sample_data/RELIANCE.csv')
df.head()
Output
Date Symbol Series Prev Close Open High Low Last Close VWAP Volume Turnover Trades Deliverable Volume %Deliverble
RELIANCE EQ 233.05 237.50 251.70 237.50 251.70 251.70 249. .111319e+14 NaN NaN NaN
RELIANCE EQ 251.70 258.40 271.85 251.30 271.85 271.85 263. .500222e+14 NaN NaN NaN
RELIANCE EQ 271.85 256.65 287.90 256.65 286.75 282.50 274. .373697e+14 NaN NaN NaN
RELIANCE EQ 282.50 289.00 300.70 289.00 293.50 294.35 295. .633254e+14 NaN NaN NaN
RELIANCE EQ 294.35 295.00 317.90 293.00 314.50 314.55 308. .138388e+14 NaN NaN NaN
Data Preparation
Before analyzing the data, set the Date column as the index of the DataFrame so that every row is keyed by its trading day.

Example
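A minimal sketch of this step, assuming the Date column holds parseable date strings. The three-row frame is a hypothetical stand-in for the full dataset (the dates are made up), so the same two statements can be applied to the df loaded earlier.

```python
import pandas as pd

# Hypothetical stand-in for the loaded dataset: the dates are made up,
# the prices are the first three rows of the RELIANCE data.
df = pd.DataFrame({
    "Date": ["2021-01-04", "2021-01-05", "2021-01-06"],
    "Open": [237.50, 258.40, 256.65],
    "Close": [251.70, 271.85, 282.50],
})

# Parse the Date strings and use them as the index.
df.index = pd.to_datetime(df["Date"])

# The Date column is now redundant, so drop it.
df = df.drop(columns=["Date"])

print(df)
```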
Output
Symbol Series Prev Close Open High Low Last Close VWAP Volume Turnover Trades Deliverable Volume %Deliverble
Date
RELIANCE EQ 233.05 237.50 251.70 237.50 251.70 251.70 249. .111319e+14 NaN NaN NaN
RELIANCE EQ 251.70 258.40 271.85 251.30 271.85 271.85 263. .500222e+14 NaN NaN NaN
RELIANCE EQ 271.85 256.65 287.90 256.65 286.75 282.50 274. .373697e+14 NaN NaN NaN
RELIANCE EQ 282.50 289.00 300.70 289.00 293.50 294.35 295. .633254e+14 NaN NaN NaN
RELIANCE EQ 294.35 295.00 317.90 293.00 314.50 314.55 308. .138388e+14 NaN NaN NaN
… … … … … … … … … … … … … … …
RELIANCE EQ 1441. . . . . . . .518059e+ . .0 0.
RELIANCE EQ 1431. . . . . . . .190317e+ . .0 0.
RELIANCE EQ 1424. . . . . . . .354223e+ . .0 0.
RELIANCE EQ 1445. . . . . . . .717698e+ . .0 0.
RELIANCE EQ 1472. . . . . . . .702029e+ . .0 0.
Explanatory factors
Explanatory (independent) variables are the inputs used to predict the response variable. They are stored in the dataset X, which here holds two features: “Open-Close” (the opening price minus the closing price) and “High-Low” (the day's high minus its low). The algorithm uses these as indicators to forecast the direction of the next day's move. Feel free to add more indicators and compare the results.

Example
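A minimal sketch of how the two features can be built, assuming df has Open, High, Low and Close columns. The small frame is a hypothetical stand-in indexed by made-up dates; its prices are the first three rows of the RELIANCE data.

```python
import pandas as pd

# Hypothetical stand-in for df: three price rows, indexed by made-up dates.
df = pd.DataFrame(
    {
        "Open": [237.50, 258.40, 256.65],
        "High": [251.70, 271.85, 287.90],
        "Low": [237.50, 251.30, 256.65],
        "Close": [251.70, 271.85, 282.50],
    },
    index=pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06"]),
)

# Intraday move: negative when the stock closed above its open.
df["Open-Close"] = df["Open"] - df["Close"]
# Intraday range: never negative, since High >= Low.
df["High-Low"] = df["High"] - df["Low"]

# X keeps only the explanatory variables.
X = df[["Open-Close", "High-Low"]]
print(X.head())
```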
Output
Open-Close High-Low
Date
-14.20 14.20
-13.45 20.55
-25.85 31.25
-5.35 11.70
-19.55 24.90
Target variable
The target dataset y holds the trade signal that the machine learning algorithm will try to predict: 1 if the next day's closing price is higher than the current day's closing price, and 0 otherwise.

y = np.where(df['Close'].shift(-1) > df['Close'], 1, 0)
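To see what this labelling rule does, here is a small illustration on a hypothetical series of closing prices (the numbers are invented for the example and are not taken from the dataset).

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices, chosen only to illustrate the labelling rule.
close = pd.Series([100.0, 102.0, 101.0, 101.0, 105.0])

# shift(-1) aligns each day with the NEXT day's close; the last day gets NaN.
next_close = close.shift(-1)

# 1 if tomorrow closes higher than today, else 0.
y = np.where(next_close > close, 1, 0)

print(y)  # [1 0 0 1 0]
```

Note that the final row is always labelled 0, because it has no next day and any comparison with NaN is False. An unchanged close (the third row here) is also labelled 0.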
Splitting the data into train and test
The data is divided into separate training and test sets, so that the model is fitted on one part and evaluated on data it has not seen.
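One way to do the split is sketched below. The 80/20 ratio and the name split_percentage are assumptions made for illustration, and the ten-row X and y are hypothetical stand-ins. The rows are sliced by position rather than shuffled, so the test set lies strictly after the training set in time.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 10 rows of features and binary labels.
X = pd.DataFrame({
    "Open-Close": np.linspace(-1.0, 1.0, 10),
    "High-Low": np.linspace(0.5, 2.0, 10),
})
y = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])

# Assumed split ratio: first 80% of rows for training, the rest for testing.
split_percentage = 0.8
split = int(split_percentage * len(X))

# Slice by position so the original (chronological) order is preserved.
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

print(len(X_train), len(X_test))  # 8 2
```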

Support Vector Classifier
Now it is time to fit a support vector classifier on the training data and use it to predict the signal.

Example
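# Fit a support vector classifier with default hyperparameters on the training set
cls = SVC().fit(X_train, y_train)
# Predict a signal for every row (training and test rows alike)
df['prediction'] = cls.predict(X)
print(df['prediction'])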
Output
Date
..
Name: prediction, Length: 5075, dtype: int64
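The accuracy_score function imported at the start can be used to check how often the predicted signal matches the actual one. The sketch below is self-contained and runs on hypothetical synthetic data rather than the RELIANCE dataset; the seed, sample size and 80/20 split are assumptions for illustration only.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

# Hypothetical synthetic data standing in for X and y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

# Positional 80/20 split (assumed ratio).
split = int(0.8 * len(X))
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

cls = SVC().fit(X_train, y_train)

# Fraction of correct predictions on seen and unseen data.
train_acc = accuracy_score(y_train, cls.predict(X_train))
test_acc = accuracy_score(y_test, cls.predict(X_test))
print(f"train accuracy: {train_acc:.2f}")
print(f"test accuracy:  {test_acc:.2f}")
```

Comparing the two numbers is a quick check for overfitting: a training accuracy far above the test accuracy suggests the model has memorized the training rows.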
Conclusion
The Support Vector Machine, a popular and memory-efficient method for classification and regression, uses geometric concepts to solve prediction problems. In this article we used an SVM classifier to forecast the direction of stock price movement. Stock price forecasting is important in the financial sector, and a workflow like this one shows how the process can be automated.

Updated on 01-Dec :34:17