NSE-HDFCBANK STOCK PREDICTION & FORECASTING USING LSTM NEURAL NETWORK

Imagine the stock market as a rollercoaster. Prices go up and down, sometimes predictably, often not. Can we predict these ups and downs? That's what stock prediction using Recurrent Neural Networks (RNNs), specifically LSTMs (Long Short-Term Memory), tries to do.

WHAT IS AN LSTM?

  • LSTMs are a type of RNN that can "remember" long sequences of data, unlike basic RNNs. This is crucial for stock prices, which depend on past trends and events (the gate mechanism behind this memory is sketched after this list).
  • We feed stock data, like closing prices, volumes, and news sentiment, to the LSTM. It analyses the patterns and learns to predict future values.
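For readers curious about what those "gates" actually do, here is a minimal NumPy sketch of a single LSTM cell update, assuming a hypothetical lstm_step helper with the four gate weights stacked into W, U and b. It is purely illustrative; Keras handles all of this internally when we build the model later in this article.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U, b stack the forget, input, candidate and output gates."""
    hidden = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b              # pre-activations for all four gates
    f = sigmoid(z[0:hidden])                  # forget gate: how much old memory to keep
    i = sigmoid(z[hidden:2*hidden])           # input gate: how much new info to store
    g = np.tanh(z[2*hidden:3*hidden])         # candidate values for the cell state
    o = sigmoid(z[3*hidden:4*hidden])         # output gate: how much memory to expose
    c_t = f * c_prev + i * g                  # long-term memory (cell state)
    h_t = o * np.tanh(c_t)                    # short-term output (hidden state)
    return h_t, c_t

# Example: one step of a 4-unit cell reading a single scaled closing price
rng = np.random.default_rng(0)
h, c = np.zeros(4), np.zeros(4)
W, U, b = rng.normal(size=(16, 1)), rng.normal(size=(16, 4)), np.zeros(16)
h, c = lstm_step(np.array([0.5]), h, c, W, U, b)

It is the cell state c_t, carried forward from step to step and only gently modified by the gates, that lets an LSTM remember patterns over long windows such as the 100-day sequences used below.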

WHY?

  • Predicting stock prices helps investors make informed decisions. Buy low, sell high, right?
  • It can also benefit financial institutions and economists to understand market trends and risks.

RNN (Recurrent Neural Network):

  • Imagine you're reading a sentence. Each word depends on the previous ones to make sense. RNNs work similarly, processing data sequentially, "remembering" previous inputs to interpret the current one.
  • Think of it like a conveyor belt: data goes in one by one, the network analyzes it while considering what came before, and outputs a prediction or understanding (see the toy sketch after this list).
  • RNNs are good for tasks like language translation, speech recognition, and time series forecasting (like stock prices).
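To make the "conveyor belt" idea concrete, here is a toy NumPy sketch of a vanilla RNN reading a short sequence of scaled prices one value at a time. The rnn_forward helper and its random weights are hypothetical, chosen only for illustration:

import numpy as np

def rnn_forward(sequence, W_x, W_h, b):
    h = np.zeros(W_h.shape[0])                 # hidden state starts empty: no memory yet
    for x_t in sequence:                       # the conveyor belt: one value at a time
        h = np.tanh(W_x * x_t + W_h @ h + b)   # mix the current input with what came before
    return h                                   # a summary of everything seen so far

# Example: five scaled closing prices in, one 3-dimensional "memory" vector out
rng = np.random.default_rng(0)
W_x, W_h, b = rng.normal(size=3), rng.normal(size=(3, 3)), np.zeros(3)
print(rnn_forward([0.20, 0.40, 0.35, 0.50, 0.55], W_x, W_h, b))

Because the same hidden state h is overwritten at every step, plain RNNs tend to forget information from many steps back, which is exactly the weakness the LSTM gates address.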

RNN vs LSTM vs CNN AT A GLANCE:

Data type:

  • RNN: Sequential (text, time series)
  • LSTM: Sequential (text, time series)
  • CNN: Grid-like (images, videos)

Memory:

  • RNN: Short-term
  • LSTM: Long-term (via gates)
  • CNN: No explicit memory

Strengths:

  • RNN: Time series analysis, language processing
  • LSTM: Long-term dependencies, forecasting
  • CNN: Image recognition, classification


Import Libraries:

  • Imports pandas for loading and handling the CSV data (matplotlib and NumPy are imported later, where they are first used).

import pandas as pd

1. Data Exploration and Preprocessing:

df = pd.read_csv('HDFCBANK1.csv')
df.head()  # View first few rows        

OUTPUT:

[Top 5 rows of HDFCBANK data]

df1 = df.reset_index()['Close']  # Extract closing prices
import matplotlib.pyplot as plt
plt.plot(df1)  # Plot closing prices        

OUTPUT:

[Line plot of HDFCBANK closing prices]

Scales the closing prices to a range of 0-1 using MinMaxScaler; neural networks such as LSTMs train more reliably when the inputs are normalized to a small range.

import numpy as np
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))
df1 = scaler.fit_transform(np.array(df1).reshape(-1, 1))  # Scale values to 0-1        

2. Data Splitting:

  • Splits the scaled data into training (70%) and testing (30%) sets. The split is chronological (no shuffling), so the test set contains the most recent prices.

##splitting dataset into train and test split
training_size=int(len(df1)*0.7)
test_size=len(df1)-training_size
train_data,test_data=df1[0:training_size,:],df1[training_size:len(df1),:1]
training_size,test_size        

OUTPUT: (865, 372)

3. DATA PREPARATION FOR LSTM:

  • Defines a function create_dataset to create sequences of data for LSTM input.
  • Creates training and testing datasets with sequences of length 100.
  • Reshapes the data into the 3D format required by LSTM (samples, time steps, features).

import numpy
# convert an array of values into a dataset matrix
def create_dataset(dataset, time_step=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-time_step-1):    # create sequences of length 'time_step'
        a = dataset[i:(i+time_step), 0]          # e.g. i=0 -> rows 0..99 form the input window
        dataX.append(a)
        dataY.append(dataset[i + time_step, 0])  # row 100 is the value to predict
    return numpy.array(dataX), numpy.array(dataY)
# reshape into X = t..t+99 and Y = t+100
time_step = 100  # use a window of 100 time steps
X_train, y_train = create_dataset(train_data, time_step)
X_test, ytest = create_dataset(test_data, time_step)
# reshape input to be [samples, time steps, features], which is required for LSTM
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)

4. BUILDING LSTM MODEL:

Import Necessary Libraries:

  • Imports the Sequential model class for building a linear stack of layers.
  • Imports the Dense layer class for creating fully connected layers.
  • Imports the LSTM layer class for creating Long Short-Term Memory layers.

Create Sequential Model:

  • Instantiates a Sequential model object, serving as a container for the layers.

Add LSTM Layers:

  • Adds the first LSTM layer with 50 units. return_sequences=True makes the layer output the full sequence of hidden states rather than just the final one, and input_shape=(100, 1) specifies 100 timesteps with 1 feature each.
  • Adds the second LSTM layer with 50 units, also returning sequences.
  • Adds the third LSTM layer with 50 units, which returns only the final state.

Add Output Layer:

  • Adds a final Dense layer with a single output unit for making the prediction.

Compile the Model:

  • Configures the model for training.
  • Uses mean squared error as the loss function for optimization.
  • Employs the Adam optimizer to update model weights during training.

View Model Summary:

Prints a summary of the model's architecture, including layers, output shapes, and parameter counts.

### Create the Stacked LSTM model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
model=Sequential()
model.add(LSTM(50,return_sequences=True,input_shape=(100,1)))
model.add(LSTM(50,return_sequences=True))
model.add(LSTM(50))
model.add(Dense(1))
model.compile(loss='mean_squared_error',optimizer='adam')
model.summary()        

OUTPUT:

[Model summary: three stacked LSTM layers of 50 units each and a Dense output layer, with parameter counts]

5. TRAIN THE MODEL:

  • Trains the model using the prepared data.
  • X_train, y_train: the training features and targets.
  • validation_data=(X_test, ytest): data used to monitor performance during training (a loss-curve sketch follows the training code below).
  • epochs=100: number of passes over the entire training dataset.
  • batch_size=64: number of samples processed per weight update.
  • verbose=1: displays a progress bar during training.

model.fit(X_train,y_train,validation_data=(X_test,ytest),epochs=100,batch_size=64,verbose=1)        
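To check whether the model is overfitting, you can capture the return value of model.fit (a Keras History object) and plot the training and validation loss. This is an optional sketch rather than part of the original flow; it simply assigns the same fit call to a local variable named history:

# Optional: keep the History object returned by fit() to inspect the loss curves
history = model.fit(X_train, y_train, validation_data=(X_test, ytest),
                    epochs=100, batch_size=64, verbose=1)

plt.plot(history.history['loss'], label='training loss')        # MSE on the training set
plt.plot(history.history['val_loss'], label='validation loss')  # MSE on the test set
plt.xlabel('epoch')
plt.ylabel('mean squared error')
plt.legend()
plt.show()

If the validation curve starts rising while the training curve keeps falling, fewer epochs or a smaller model may generalise better.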

6. MODEL PREDICTION & EVALUATION

  • Imports the TensorFlow library (already a dependency of the Keras model built above).
  • Generates predictions for the training data using the trained model.
  • Generates predictions for the test data.
  • Converts the training predictions back to the original price scale using the MinMaxScaler fitted during preprocessing.
  • Applies the same inverse transformation to the test predictions.

import tensorflow as tf
### Lets Do the prediction and check performance metrics
train_predict=model.predict(X_train)
test_predict=model.predict(X_test)
##Transformback to original form
train_predict=scaler.inverse_transform(train_predict)
test_predict=scaler.inverse_transform(test_predict)        

a. Calculate RMSE Metrics

  • Imports the math library for the square root and the mean squared error function from scikit-learn.
  • Calculates the root mean squared error (RMSE) for the training and test data. The targets are inverse-transformed first so that targets and predictions are compared on the same, original price scale.

### Calculate RMSE performance metrics
import math
from sklearn.metrics import mean_squared_error
# bring the scaled targets back to the original price scale before comparing
y_train_inv = scaler.inverse_transform(y_train.reshape(-1, 1))
ytest_inv = scaler.inverse_transform(ytest.reshape(-1, 1))
### Train Data RMSE
print(math.sqrt(mean_squared_error(y_train_inv, train_predict)))
### Test Data RMSE
print(math.sqrt(mean_squared_error(ytest_inv, test_predict)))

b. Plotting:

  • Sets look_back to 100, the same time window used when building the sequences.
  • Creates an empty array with the same shape as df1 to hold the shifted train predictions.
  • Fills it with NaN values initially.
  • Inserts the train predictions at the correct offset, and does the same for the test predictions.
  • Plots the original data together with the train and test predictions, then displays the plot.

### Plotting 
# shift train predictions for plotting
look_back=100
trainPredictPlot = numpy.empty_like(df1)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[look_back:len(train_predict)+look_back, :] = train_predict
# shift test predictions for plotting
testPredictPlot = numpy.empty_like(df1)
testPredictPlot[:, :] = numpy.nan
testPredictPlot[len(train_predict)+(look_back*2)+1:len(df1)-1, :] = test_predict
# plot baseline and predictions
plt.plot(scaler.inverse_transform(df1))
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()        

c. Predict for Next 30 Days:

  • Seeds temp_input with the last 100 scaled test values, the most recent window the model has seen.
  • Initializes an empty list lst_output to hold the predictions.
  • Sets the prediction window n_steps to 100 timesteps.
  • Loops 30 times, predicting one day ahead per iteration.
  • Checks whether temp_input has grown beyond 100 elements and, if so, drops the oldest value so the window stays at 100.
  • Shapes the input for prediction using x_input = x_input.reshape((1, n_steps, 1)).
  • Makes a prediction using yhat = model.predict(x_input, verbose=0).
  • Appends the prediction to temp_input and lst_output so it feeds the next step.

# demonstrate prediction for the next 30 days
from numpy import array

# seed the forecast with the last 100 scaled test values (index 272 onwards)
x_input=test_data[272:].reshape(1,-1)
temp_input=list(x_input)
temp_input=temp_input[0].tolist()

lst_output=[]
n_steps=100
i=0
while(i<30):
   
  if(len(temp_input)>100):
    #print(temp_input)
    x_input=np.array(temp_input[1:])
    print("{} day input {}".format(i,x_input))
    #x_input=x_input.reshape(1,-1)
    x_input = x_input.reshape((1, n_steps, 1))
    #print(x_input)
    yhat = model.predict(x_input, verbose=0)
    print("{} day output {}".format(i,yhat))
    temp_input.extend(yhat[0].tolist())
    temp_input=temp_input[1:]
    #print(temp_input)
    lst_output.extend(yhat.tolist())
    i=i+1
  else:
    x_input = x_input.reshape((1, n_steps,1))
    yhat = model.predict(x_input, verbose=0)
    print(yhat[0])
    temp_input.extend(yhat[0].tolist())
    print(len(temp_input))
    lst_output.extend(yhat.tolist())
    i=i+1
   

print(lst_output)

d. Prepare the Input for Forecasting:

  • x_input = test_data[272:].reshape(1, -1): Selects the last 100 scaled test values (index 272 onwards) and reshapes them into a single row. Converted to the list temp_input, this window is the seed from which the 30-day forecast above is generated, one step at a time.

e. Reshape Predictions for Plotting:

  • lst_output = np.array(lst_output).reshape((30, 1)): Converts the list of 30 predictions into a NumPy array and reshapes it into a 2D array with 30 rows and 1 column so it can be inverse-transformed and plotted.
  • lst_output.shape: Prints the shape of the reshaped lst_output for confirmation.

f. Create Arrays for Plotting Days:

  • day_new = np.arange(1, 101): Creates an array of 1 to 100, the x-axis positions for the last 100 observed time steps.
  • day_pred = np.arange(101, 131): Creates an array of 101 to 130, the x-axis positions for the 30 forecast time steps.

g. Import Plotting Library and Plot:

  • import matplotlib.pyplot as plt: Imports Matplotlib for plotting (it was already imported earlier; the line is repeated here for completeness).
  • plt.plot(day_new, scaler.inverse_transform(df1[1137:])): Plots the last 100 actual closing prices (rows 1137 onwards of the 1237-row series) against day_new, inverse-transformed back to the original price scale.
  • plt.plot(day_pred, scaler.inverse_transform(lst_output)): Plots the 30 predicted values against day_pred, again inverse-transformed so both curves are on the original price scale.

# Reshape lst_output for plotting
lst_output = np.array(lst_output).reshape((30, 1))  # shape (30, 1)
lst_output.shape

day_new = np.arange(1, 101)
day_pred = np.arange(101, 131)

import matplotlib.pyplot as plt
plt.plot(day_new, scaler.inverse_transform(df1[1137:]))
plt.plot(day_pred, scaler.inverse_transform(lst_output))

OUTPUT:

[Plot of the last 100 days of actual closing prices followed by the 30-day forecast]

Conclusion:

Based on this forecast, the HDFCBANK price may move up by more than 5%, roughly 75 points from the current price of 1478, within 30 days. Kindly note that this is not a recommendation; it is shared for study purposes only.

You can use the NSEpy library to fetch data for any NSE India stock, or use any other API to extract data directly from the exchange's website; a short sketch follows.
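For example, a minimal sketch of pulling the same data with NSEpy instead of a CSV file might look like the following. The date range is only a placeholder, and the library must be installed first (pip install nsepy); its API may differ slightly between versions:

from datetime import date
from nsepy import get_history

# fetch daily history for HDFCBANK from NSE (dates below are placeholders)
hdfc = get_history(symbol='HDFCBANK', start=date(2019, 1, 1), end=date(2024, 1, 1))

# same starting point as the CSV-based workflow above
df1 = hdfc.reset_index()['Close']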

Thanks for your time. Feel free to ask any questions, and please share this article with your contacts.
