A Novel Multivariate Bi-LSTM model for Short-Term Equity Price Forecasting
Abstract
Prediction models are crucial in the stock market, as they aid in forecasting future prices and trends, enabling investors to make informed decisions and manage risks more effectively. In the Indian stock market, where volatility is often high, accurate predictions can provide a significant edge in capitalizing on market movements. While various models such as regression and Artificial Neural Networks (ANNs) have been explored for this purpose, studies have shown that Long Short-Term Memory networks (LSTMs) are the most effective, because they can capture the complex temporal dependencies present in financial data. This paper presents a Bidirectional Multivariate LSTM model designed to predict short-term stock prices of four Indian companies in the NIFTY 100, drawn from four major sectors: ICICI Bank, NTPC, Ambuja Cement and Wipro. The study utilizes eight years of hourly historical data, from 2015 to 2022, to perform a comprehensive analysis of the proposed methods. Both Univariate LSTM and Univariate Bidirectional LSTM models were evaluated based on R² score, RMSE, MSE, MAE, and MAPE. To improve predictive accuracy, the analysis was extended to multivariate data. Additionally, 12 features having high correlation with the close price (greater than 0.99), including the technical indicators EMA5, SMA5, TRIMA5, KAMA10 and the Bollinger Bands, were selected as variables to further optimize the prediction models. The proposed Bidirectional Multivariate LSTM model, when applied to a dataset containing these indicators, achieved an exceptionally high average R² score of 99.4779% across the four stocks, which is 3.9833 percentage points higher than that of the Unidirectional Multivariate LSTM without technical indicators. The proposed model has an average RMSE of 0.0103955, an average MAE of 0.007485 and an average MAPE of 1.1635%. This highlights the model’s exceptional forecasting accuracy and emphasizes its potential to improve short-term trading strategies.
Index Terms:
Deep Learning, LSTM, Bidirectional LSTM, RNN, Time Series, Capital Markets
I Introduction
The stock market is a critical component of the financial system, and its significant price fluctuations can have far-reaching effects on the broader economy [1]. Accurate predictions of stock prices are essential for preventing market crashes, enhancing market management, and fostering financial stability. However, forecasting stock prices presents challenges due to the market’s volatility, non-linearity, and the complex nature of financial data. While traditional methods have focused on historical data analysis to predict future trends, recent advancements in machine learning, particularly Long Short-Term Memory (LSTM) networks, have shown promise in improving prediction accuracy by capturing long-term patterns in time-series data [2, 3, 4]. This study utilizes both Univariate and Multivariate LSTM models to forecast short-term stock prices, with a focus on stocks from the National Stock Exchange of India (NSE). Given the high volatility of the stock market, effective prediction mechanisms are crucial for managing risks and making informed investment decisions, especially in India where over 190 million people actively trade stocks [8]. By analyzing historical market data, this study aims to improve resource management and trend forecasting. Hourly data is used to capture daily market fluctuations while considering computational complexity, with the goal of identifying trading opportunities and enhancing short-term price trend analysis. The research focuses on predicting short-term closing prices for stocks from four major sectors—Finance, Energy, Industrial, and Information Technology (IT)—which are significant contributors to India’s economy. The selected stocks are among the top contributors to the NIFTY 100 index, making them key indicators of market performance. Additionally, backtesting is employed to evaluate the effectiveness of trading strategies and predictive models using historical data. 
This process helps researchers and investors assess the potential success of their approaches before real-world application, thereby reducing financial risk and improving decision-making accuracy.
II Literature Review
Stock price prediction has evolved from traditional statistical methods like ARIMA (AutoRegressive Integrated Moving Average) and GARCH (Generalized Autoregressive Conditional Heteroskedasticity) to Machine Learning (ML) techniques and, more recently, deep learning approaches to enhance forecasting accuracy [9]. Statistical methods, foundational in econometrics, capture patterns in historical data but may struggle with non-linear relationships and complex patterns in modern markets. For instance, Menon et al. (2016) successfully used ARIMA and GARCH for bulk price forecasting [10], while Shakhla et al. (2018) applied multiple linear regression with success [11].
ML models, including Naive Bayes (NB), Random Forests (RF) and Support Vector Machines (SVM), can overcome these limitations because they handle large datasets and complex variable interactions [12]. Patel et al. (2015) applied these algorithms to predict stock and stock index movements, demonstrating superior performance compared to traditional statistical approaches. However, despite their advantages, these methods often require extensive hyperparameter tuning, can be computationally intensive, and are prone to overfitting, especially when applied to volatile financial markets.
ANNs learn intricate relationships through multiple hidden layers and activation functions, and were used by Guresen et al. for stock market index prediction. These models can handle large volumes of high-dimensional data and uncover deep, non-linear dependencies that classical ML techniques might miss [13]. However, a drawback of feedforward ANNs is that they do not explicitly model temporal dependencies and sequential patterns.
Recurrent Neural Networks (RNNs) address this limitation by handling sequential data through feedback connections that maintain a form of memory. This enables them to process sequences more effectively than feedforward ANNs. Though RNNs can handle temporal data efficiently [14], they face issues like vanishing gradients, which degrade their performance over long sequences.
Long Short-Term Memory (LSTM) networks are a major improvement for time series prediction, addressing the issues faced by RNNs. LSTMs include memory cells capable of retaining information over extended periods, which lessens the vanishing gradient problem and allows long-term dependencies to be captured accurately. Hochreiter and Schmidhuber (1997) first introduced LSTM networks and emphasized that they could represent intricate temporal patterns [4]. LSTM models retain short-term information while also capturing longer-term effects, which allows them to predict financial data accurately [15]. LSTMs have been reported to outperform traditional econometric models in forecasting foreign exchange rates [16] and to achieve better results than classical ML methods in predicting stock market movements [17]. Taking these factors into consideration, this study focuses on LSTM as the primary architecture.
Unlike traditional LSTMs that process data in a single direction, bidirectional LSTMs analyze sequences in both forward and backward directions [19]. This approach allows them to utilize information from both past and future contexts, offering a more comprehensive understanding of the data, and has been reported to make them effective for stock market prediction [20]. Notably, Han et al. implemented a Bidirectional LSTM model that achieved a mean squared error of 0.00020 [21]. We propose to use technical indicators to further improve model performance.
Incorporating technical indicators, such as moving averages and the relative strength index, further improves the accuracy of prediction models. In their 2019 study, Alsubaie et al. found that using at least 10 technical indicators leads to higher prediction accuracy on stock price data [22].
In this study, we present a Bidirectional Multivariate LSTM model for short-term trade prediction, which incorporates the use of technical indicators.
III Methodology
III-A Dataset
The stocks used for the study are ICICI Bank, NTPC, Ambuja Cement and Wipro. Historical prices spanning the eight calendar years from January 2015 to February 2022 were extracted from the Yahoo Finance website [23]. The data samples were at 5-minute intervals, which were converted to hourly data during preprocessing. Each final stock dataset thereby had 10,862 entries. We then used TA-Lib [24], an open-source library for technical analysis of financial data, to generate 50 technical indicators. The input and output pairs were then created using a window size of 24, giving 10,838 pairs per stock. A train, test and validation split of 70%, 15% and 15% respectively was used: the training set had 7,586 samples, and the test and validation sets had 1,626 samples each.
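The sample counts quoted above can be checked with a few lines of arithmetic. The sketch below assumes a chronological (non-shuffled) split in which the 70% share is rounded down and the remainder is halved; the function name `chronological_split` is hypothetical and the rounding rule is an assumption, not something stated in the text.

```python
def chronological_split(n_samples, train_frac=0.70):
    """Return (n_train, n_test, n_val) for an ordered, non-shuffled split.

    Assumption: the training share is floored and the remaining samples
    are divided evenly between the test and validation sets.
    """
    n_train = int(train_frac * n_samples)
    remainder = n_samples - n_train
    n_test = remainder // 2
    n_val = remainder - n_test
    return n_train, n_test, n_val


n_rows, window = 10_862, 24
n_pairs = n_rows - window  # one input-output pair per complete window
print(n_pairs, chronological_split(n_pairs))  # 10838 (7586, 1626, 1626)
```

Under these assumptions the counts match the reported 7,586 / 1,626 / 1,626 split.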

A brief explanation of Open High Low Close and Volume (OHLCV) metrics:
- Open Price: The open price represents the initial price at which a stock begins trading during a specified period.
- High Price: It is the highest price reached by a stock within a given time frame.
- Low Price: It is the lowest price recorded by the stock during a specific time window.
- Closing Price: It indicates the value of a stock at the end of a particular time frame.
- Volume: Volume is the total number of shares traded (both sold and bought) within a selected period, typically reported on a daily basis.
III-B Data Preprocessing
The data was first converted to hourly intervals. We decided to analyze the importance of the generated indicators and select only those which would improve the prediction models’ performance.
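As an illustration of the hourly conversion, the following sketch aggregates 5-minute OHLCV bars into hourly bars with NumPy. It assumes twelve contiguous bars per hour aligned to the hour boundary, which is a simplification: real exchange sessions have gaps and partial hours, and the paper does not describe how those were handled. The function name `to_hourly` is hypothetical.

```python
import numpy as np


def to_hourly(bars_5min, bars_per_hour=12):
    """Aggregate rows of [open, high, low, close, volume] into hourly bars.

    Assumes contiguous 5-minute bars aligned to the hour; trailing bars
    that do not fill a whole hour are dropped.
    """
    bars = np.asarray(bars_5min, dtype=float)
    n_hours = len(bars) // bars_per_hour
    grouped = bars[: n_hours * bars_per_hour].reshape(n_hours, bars_per_hour, 5)
    return np.column_stack([
        grouped[:, 0, 0],               # open: first bar's open
        grouped[:, :, 1].max(axis=1),   # high: maximum of the highs
        grouped[:, :, 2].min(axis=1),   # low: minimum of the lows
        grouped[:, -1, 3],              # close: last bar's close
        grouped[:, :, 4].sum(axis=1),   # volume: total traded volume
    ])
```

Open and close are taken from the first and last bar of each hour, while high, low and volume are aggregated over all twelve bars.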
III-B1 Selection of technical indicators
We calculated the average correlation of each candidate feature with the close price over all four stocks, and selected the 12 features with the highest correlation (greater than 0.99), including open, high, low, close and volume.
The Pearson correlation coefficient was used:
$$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2 \, \sum_{i}(y_i - \bar{y})^2}}$$
where $x_i$ and $y_i$ represent individual data points, and $\bar{x}$ and $\bar{y}$ are the variables' means. The selected indicators were SMA5, EMA5, TRIMA5, KAMA10, Lowerband, Middleband and Upperband.
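The selection step can be sketched as follows, assuming the Pearson coefficient and a simple average over stocks. The function names (`pearson_r`, `select_features`) and the dictionary-based data layout are hypothetical choices made for illustration.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation between two equally long series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


def select_features(per_stock_features, per_stock_close, threshold=0.99):
    """Keep features whose correlation with close, averaged over stocks,
    exceeds the threshold.

    per_stock_features: list of dicts (feature name -> series), one per stock.
    per_stock_close: list of close-price series, in the same stock order.
    """
    selected = []
    for name in per_stock_features[0]:
        avg_r = np.mean([
            pearson_r(features[name], close)
            for features, close in zip(per_stock_features, per_stock_close)
        ])
        if avg_r > threshold:
            selected.append(name)
    return selected
```

Averaging across stocks before thresholding means a feature is kept only if it tracks the close price consistently, not just for a single company.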
III-B2 Normalization
After selecting technical indicators, we performed an exploratory data analysis of each stock dataset. For example, Figure 2 gives the distribution for ICICIBANK.

Given the large range of the data and its volatility (extreme values, negative values and so on), we normalized each column so that all values lie in the range [0, 1], which helps the neural networks train more effectively.
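One common way to map each column to [0, 1] is min-max scaling, sketched below. This is an assumption about the normalization used, since the text only states the target range; the function name `min_max_scale` is hypothetical. The sketch also returns the per-column minimum and range so that predictions can be mapped back to price units.

```python
import numpy as np


def min_max_scale(columns):
    """Scale each column of a 2-D array to [0, 1].

    Returns the scaled data plus the per-column minimum and range,
    which are needed to invert the transform.
    """
    data = np.asarray(columns, dtype=float)
    col_min = data.min(axis=0)
    col_range = data.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # constant columns map to 0 instead of NaN
    return (data - col_min) / col_range, col_min, col_range


scaled, col_min, col_range = min_max_scale([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
restored = scaled * col_range + col_min  # inverse transform
```

In a forecasting setting it is usually safer to compute the minimum and range on the training portion only and reuse them for the validation and test portions; whether the study did so is not stated.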
III-C Technical Indicators
The Simple Moving Average (SMA) is used to calculate average price over a given period, smoothing out the data to find trends. The Exponential Moving Average (EMA) gives greater weightage to recent prices, thus being more sensitive to any new information. The Triangular Moving Average (TRIMA) smooths price data with greater emphasis on the middle period of the calculation range [18]. Bollinger Bands are made up of one middle band, SMA, and two outer bands which reflect price volatility. The lower and upper bands are set at a distance of typically two standard deviations with respect to the middle band. They help identify overbought or oversold conditions and volatility.
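Indicator | Formula |
---|---|
SMA5 | $\mathrm{SMA5}_t = \frac{1}{5}\sum_{i=0}^{4} C_{t-i}$ |
EMA5 | $\mathrm{EMA5}_t = \alpha\, C_t + (1-\alpha)\,\mathrm{EMA5}_{t-1}$, with $\alpha = \frac{2}{5+1}$ |
KAMA10 | $\mathrm{KAMA}_t = \mathrm{KAMA}_{t-1} + SC_t\,(C_t - \mathrm{KAMA}_{t-1})$ |
TRIMA5 | $\mathrm{TRIMA5}_t = \frac{1}{9}\left(C_{t-4} + 2C_{t-3} + 3C_{t-2} + 2C_{t-1} + C_t\right)$ |
Lower Band | $\mathrm{MB}_t - k\,\sigma_t$ |
Middle Band | $\mathrm{MB}_t = \mathrm{SMA}_n(C)_t$ |
Upper Band | $\mathrm{MB}_t + k\,\sigma_t$ |

Kaufman’s Adaptive Moving Average (KAMA) adjusts based on market volatility, filtering out noise by adapting its sensitivity. Table I contains formulae for the selected indicators, where $C_t$ stands for the close price, $SC_t$ is the smoothing constant in KAMA, and $k$ is the deviation multiplier in Bollinger Bands; $\sigma_t$ denotes the rolling standard deviation of the close price over the band period $n$.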
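For readers who want to see how such indicators are computed, here is a minimal NumPy sketch of SMA, EMA and Bollinger Bands using their textbook definitions. It is not the TA-Lib implementation used in the study: the EMA seeding, the population standard deviation, and the default band period of 5 are assumptions, and outputs may differ from TA-Lib in the warm-up region. Inputs are assumed to contain at least `period` values.

```python
import numpy as np


def sma(close, period):
    """Simple moving average; the first period-1 outputs are NaN."""
    close = np.asarray(close, dtype=float)
    out = np.full(close.shape, np.nan)
    out[period - 1:] = np.convolve(close, np.ones(period), mode="valid") / period
    return out


def ema(close, period):
    """Exponential moving average, seeded with the first SMA (assumption)."""
    close = np.asarray(close, dtype=float)
    alpha = 2.0 / (period + 1)
    out = np.full(close.shape, np.nan)
    out[period - 1] = close[:period].mean()
    for t in range(period, len(close)):
        out[t] = alpha * close[t] + (1 - alpha) * out[t - 1]
    return out


def bollinger(close, period=5, k=2.0):
    """Return (lower, middle, upper) bands using the population std."""
    close = np.asarray(close, dtype=float)
    middle = sma(close, period)
    std = np.full(close.shape, np.nan)
    for t in range(period - 1, len(close)):
        std[t] = close[t - period + 1: t + 1].std()
    return middle - k * std, middle, middle + k * std
```

Each function returns an array aligned with the input, so the indicator columns can be stacked next to the OHLCV columns after the NaN warm-up rows are dropped.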
III-D Prediction models
III-D1 LSTM Cell Architecture
The cell state (C) in an LSTM acts as long-term memory, preserving information across time steps for learning long sequences, while the hidden state (h) represents the immediate output, summarizing the cell state for predictions. LSTM cells have three gates: the input gate, which controls how much new input is added to the cell state; the forget gate, which decides what to discard from the cell state; and the output gate, which filters the cell state’s information to influence the hidden state. The cell state is updated by combining the old state with a new memory candidate, guided by the input and forget gates.
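The gate description above corresponds to the standard LSTM update equations, written here in common textbook notation. The weight matrices $W$, the biases $b$, and the concatenation $[h_{t-1}, x_t]$ are notational assumptions rather than symbols taken from this paper; $\sigma$ is the logistic sigmoid and $\odot$ denotes element-wise multiplication.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)}\\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)}\\
\tilde{C}_t &= \tanh\left(W_C\,[h_{t-1}, x_t] + b_C\right) && \text{(memory candidate)}\\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(cell state update)}\\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)}\\
h_t &= o_t \odot \tanh\left(C_t\right) && \text{(hidden state)}
\end{aligned}
```

The additive form of the cell state update is what lets information and gradients persist across many time steps.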
III-D2 Unidirectional LSTM
Unidirectional LSTM models process data in one direction, typically from past to future, relying solely on historical information. Univariate LSTM focuses on forecasting a single time series, such as stock prices, making it suitable for simpler tasks where only past data is necessary.

In contrast, Multivariate LSTM handles multiple time series or features, like stock prices and trading volumes, simultaneously. This approach captures interactions between variables, leading to improved forecasting accuracy and a deeper understanding of how various factors impact the target variable. By processing data sequentially, unidirectional LSTMs are effective for scenarios where future context is not required for accurate predictions.
III-D3 Bidirectional LSTM
Introduced by Graves et al. in 2005 [19], Bidirectional LSTM (BiLSTM) models enhance classical LSTM architectures by processing data in the forward direction as well as the backward direction. This architecture enables the capture of dependencies from past contexts and from future contexts relative to each time step within the input sequence, offering a more comprehensive grasp of temporal patterns.

Our proposal focuses on the Bidirectional Multivariate LSTM, which extends this concept to multiple time series or features, such as stock prices and trading volumes. By processing these features in both directions, the model can capture complex interactions and correlations between variables from both past and future perspectives. This processing enables the model to integrate a richer set of information, providing a more nuanced understanding of how various factors influence each other and the target variable. The enhanced ability to model intricate relationships and dependencies improves prediction accuracy and offers deeper insights into the dynamics of the data.
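To make the bidirectional mechanism concrete, the sketch below runs one LSTM forward and a second LSTM backward over the same input window and concatenates their hidden states. It is a NumPy illustration with random, untrained weights; the hidden size of 8, the weight layout, and all function names are assumptions and do not reflect the architecture trained in this study.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_pass(x_seq, params, hidden):
    """Run one LSTM over x_seq (steps, features); return all hidden states."""
    W, b = params  # W: (4*hidden, hidden+features), b: (4*hidden,)
    h, c = np.zeros(hidden), np.zeros(hidden)
    states = []
    for x_t in x_seq:
        z = W @ np.concatenate([h, x_t]) + b
        f, i, o = (sigmoid(z[k * hidden:(k + 1) * hidden]) for k in range(3))
        g = np.tanh(z[3 * hidden:])       # memory candidate
        c = f * c + i * g                 # cell state update
        h = o * np.tanh(c)                # hidden state
        states.append(h)
    return np.stack(states)


def bilstm(x_seq, fwd_params, bwd_params, hidden):
    """Concatenate forward and time-realigned backward hidden states."""
    forward = lstm_pass(x_seq, fwd_params, hidden)
    backward = lstm_pass(x_seq[::-1], bwd_params, hidden)[::-1]
    return np.concatenate([forward, backward], axis=1)


rng = np.random.default_rng(0)
steps, features, hidden = 24, 12, 8  # 24-hour window, 12 input features


def init_params():
    return rng.normal(scale=0.1, size=(4 * hidden, hidden + features)), np.zeros(4 * hidden)


fwd, bwd = init_params(), init_params()
window = rng.random((steps, features))
out = bilstm(window, fwd, bwd, hidden)
print(out.shape)  # (24, 16)
```

Note that the backward pass only sees time steps inside the input window, so "future context" here means later steps of the same 24-hour window, not prices after the prediction target.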
III-E Prediction Approaches
(a) Univariate – Close Value approach: Here, we used only the closing price data to predict future close prices.
(b) Multivariate – OHLCV Approach: Here, we used the open, high, low, close and volume features to predict close prices.
(c) Multivariate – Technical Indicators Approach: Here, in addition to OHLCV, we used technical indicators as well.
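The three approaches differ only in which input columns are fed to the model, which can be expressed as a small lookup table. The column names below are hypothetical labels for the features named in this paper, and `select_columns` is an illustrative helper.

```python
OHLCV = ["open", "high", "low", "close", "volume"]
INDICATORS = ["SMA5", "EMA5", "TRIMA5", "KAMA10",
              "lowerband", "middleband", "upperband"]

FEATURE_SETS = {
    "univariate_close": ["close"],                   # approach (a)
    "multivariate_ohlcv": OHLCV,                     # approach (b)
    "multivariate_technical": OHLCV + INDICATORS,    # approach (c)
}


def select_columns(table, approach):
    """Pick the input columns for an approach.

    table: dict mapping column name -> list of values.
    """
    return [table[name] for name in FEATURE_SETS[approach]]


for name, columns in FEATURE_SETS.items():
    print(name, len(columns))
```

The three sets contain 1, 5 and 12 input features respectively, and in every case the prediction target is the close price.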


IV Implementation and Results
IV-A Evaluation Measures
The following standard measures were used to derive a robust performance evaluation for the considered models, taking into consideration the nature of the data and the intended task:
1. R² Score: Also called the coefficient of determination, it is a statistical measure used in ML for evaluating regression models. It measures the proportion of the dependent variable’s variance that the independent variables account for, and thus how well a model fits the data. Mathematically, it compares the Sum of Squared Residuals (SSR) with the Total Sum of Squares (SST):
$$R^2 = 1 - \frac{SSR}{SST} = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}$$
where $y_i$ is the actual value, $\hat{y}_i$ the predicted value, $\bar{y}$ the mean of the actual values, and $n$ the number of samples.
2. Mean Absolute Error (MAE): MAE calculates the average magnitude of errors between the actual and predicted values without considering whether they are positive or negative. It is a straightforward measure of model accuracy, and reflects the average error in the same units as those of the data.
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
3. Mean Absolute Percentage Error (MAPE): MAPE measures the average percentage error between the predicted and actual values. It is often used to express forecast accuracy as a percentage, and is easy to interpret. However, it is sensitive to extremely small actual values, which lead to very high percentage errors.
$$\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$
4. Root Mean Squared Error (RMSE): RMSE is calculated as the square root of the Mean Squared Error (MSE), bringing the error measure back to the original units of the data. It is widely used because it penalizes large errors more than MAE, while still being interpretable with respect to units.
$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$$
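The four measures can be computed directly from their definitions. The sketch below uses NumPy and a hypothetical helper name, `regression_metrics`; it assumes no actual value is zero, since MAPE is undefined in that case.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, MAPE (in percent), MSE and RMSE as a dict."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ssr = np.sum(err ** 2)                          # sum of squared residuals
    sst = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
    mse = np.mean(err ** 2)
    return {
        "r2": 1.0 - ssr / sst,
        "mae": np.mean(np.abs(err)),
        "mape": 100.0 * np.mean(np.abs(err / y_true)),
        "mse": mse,
        "rmse": np.sqrt(mse),
    }


metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

For this toy example the residuals are ±0.1 and ±0.2, giving SSR = 0.10, SST = 5.0 and therefore R² = 0.98, with MAE = 0.15.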
IV-B Sliding Window Approach
In this study, we employ a sliding window approach to forecast the next time step’s equity price using historical price data. The sliding window technique is widely used in time series analysis, especially when dealing with sequential data using RNNs such as LSTM networks.
IV-B1 Window Size Selection
Selecting a window size, denoted as $w$, is critical for capturing temporal dependencies in the data. For our experiments, we selected a window size of $w = 24$ hours, corresponding to 24 sequential hourly stock prices. This window size was determined based on preliminary experiments and domain-specific considerations, ensuring that the model captures daily price trends while avoiding overfitting to shorter-term fluctuations.
IV-B2 Formulation
Given a time series of stock prices $\{p_1, p_2, \dots, p_T\}$, where $p_t$ is the stock price at time $t$, the sliding window approach creates overlapping sequences of length $w$ to be used as input features. For each prediction, the model utilizes the past $w$ prices to forecast the price at the next time step $t+1$.
Mathematically, for a given time $t$, the input-output pair can be defined as:
$$X_t = [p_{t-w+1}, p_{t-w+2}, \dots, p_t], \qquad y_t = p_{t+1}$$
where $X_t$ represents the input sequence consisting of the stock prices from $t-w+1$ to $t$, and $y_t$ represents the predicted stock price at time $t+1$.
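The window construction can be sketched with NumPy as follows. The function name `make_windows` and the array layout (samples, window length, features) are assumptions chosen to match the usual input shape of recurrent layers; the default window length of 24 follows the value stated in the dataset description.

```python
import numpy as np


def make_windows(features, target, window=24):
    """Build overlapping input windows and next-step targets.

    features: array of shape (T,) or (T, n_features)
    target:   array of shape (T,), e.g. the close price
    Returns X of shape (T - window, window, n_features) and y of shape
    (T - window,), where y[i] is the target value right after X[i].
    """
    features = np.asarray(features, dtype=float)
    if features.ndim == 1:
        features = features[:, None]
    target = np.asarray(target, dtype=float)
    n_pairs = len(features) - window
    X = np.stack([features[i:i + window] for i in range(n_pairs)])
    y = target[window:]
    return X, y


prices = np.arange(30.0)
X, y = make_windows(prices, prices, window=24)
print(X.shape, y.shape)  # (6, 24, 1) (6,)
```

A series of 10,862 hourly rows yields 10,862 − 24 = 10,838 pairs under this construction, which is consistent with the split sizes reported for the dataset.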
IV-C Model Training and Results
All models were run on each of the 4 stocks, in identical hardware and software environments. Using Google Colab’s T4 GPU, the models were trained for 10 epochs, using an initial learning rate of 0.001. The Adam optimizer was used, and the loss function was mean squared error. Each of the models had early stopping with a patience factor of 5.
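The interaction between the 10-epoch budget and a patience of 5 can be illustrated with a small framework-independent sketch. It assumes the monitored quantity is the validation loss, which the text does not state, and the function name `epochs_run` is hypothetical.

```python
def epochs_run(val_losses, max_epochs=10, patience=5):
    """Return (epochs actually run, best epoch), both 1-indexed.

    Training stops once the monitored loss has failed to improve for
    `patience` consecutive epochs, or when `max_epochs` is reached.
    """
    best_loss, best_epoch, waited = float("inf"), 0, 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return min(len(val_losses), max_epochs), best_epoch


# Loss improves until epoch 3, then stalls: training halts at epoch 8.
print(epochs_run([0.5, 0.4, 0.3, 0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37]))  # (8, 3)
```

With these settings early stopping can only trigger when the best epoch is one of the first five; otherwise all 10 epochs are run.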
Performance of the prediction models on ICICI Bank:

Model | R² | MAE | RMSE | MAPE (%) |
---|---|---|---|---|
Univariate LSTM | 0.9826 | 0.010906 | 0.014003 | 1.4012 |
Univariate bi-LSTM | 0.9893 | 0.007936 | 0.010972 | 1.0712 |
Multivariate OHLCV LSTM | 0.9241 | 0.023189 | 0.029181 | 2.8648 |
Multivariate OHLCV bi-LSTM | 0.9922 | 0.006668 | 0.009364 | 0.8863 |
Multivariate LSTM | 0.9879 | 0.008702 | 0.001163 | 1.1295 |
Multivariate bi-LSTM | 0.9926 | 0.006532 | 0.009096 | 0.8639 |

Performance of the prediction models on Wipro:

Model | R² | MAE | RMSE | MAPE (%) |
---|---|---|---|---|
Univariate LSTM | 0.9935 | 0.009571 | 0.012951 | 1.3396 |
Univariate bi-LSTM | 0.9885 | 0.013873 | 0.017319 | 1.8516 |
Multivariate OHLCV LSTM | 0.9216 | 0.037438 | 0.045227 | 4.7221 |
Multivariate OHLCV bi-LSTM | 0.9870 | 0.015423 | 0.018392 | 2.1552 |
Multivariate LSTM | 0.9732 | 0.021153 | 0.026422 | 2.6542 |
Multivariate bi-LSTM | 0.9961 | 0.007396 | 0.010097 | 1.0236 |
As can be seen from the results in Tables II, III, IV and V, the proposed Bidirectional LSTM with technical indicators shows the best performance on all stocks, achieving an R² score of 99.67% on Ambuja Cement and 99.61% on Wipro, with MAE values of 0.006584 and 0.007396, respectively. This can also be seen from the model prediction plots on the Ambuja Cement dataset in Figure 7.
Performance of the prediction models on NTPC:

Model | R² | MAE | RMSE | MAPE (%) |
---|---|---|---|---|
Univariate LSTM | 0.9492 | 0.031262 | 0.037446 | 5.4748 |
Univariate bi-LSTM | 0.9891 | 0.012964 | 0.017321 | 2.5136 |
Multivariate OHLCV LSTM | 0.9803 | 0.017054 | 0.023464 | 3.1943 |
Multivariate OHLCV bi-LSTM | 0.9899 | 0.013085 | 0.016812 | 2.4457 |
Multivariate LSTM | 0.9910 | 0.012214 | 0.015862 | 2.2621 |
Multivariate bi-LSTM | 0.9937 | 0.009430 | 0.013310 | 1.8296 |

Performance of the prediction models on Ambuja Cement:

Model | R² | MAE | RMSE | MAPE (%) |
---|---|---|---|---|
Univariate LSTM | 0.9929 | 0.009896 | 0.013360 | 1.4271 |
Univariate bi-LSTM | 0.9770 | 0.019744 | 0.024054 | 2.6028 |
Multivariate OHLCV LSTM | 0.9938 | 0.009361 | 0.012525 | 1.3354 |
Multivariate OHLCV bi-LSTM | 0.9899 | 0.012644 | 0.015975 | 1.6891 |
Multivariate LSTM | 0.9942 | 0.009257 | 0.012055 | 1.2908 |
Multivariate bi-LSTM | 0.9967 | 0.006584 | 0.009079 | 0.9369 |
In contrast, the Univariate LSTM performs worse than the proposed model, with higher MAE and RMSE values (for example, 0.010906 and 0.014003 for ICICI) and lower R² scores. Adding open, high, low and volume features improved the prediction performance over univariate prediction in most cases.

The results indicate that incorporating technical indicators as input variables enhances model performance by providing additional features that capture market dynamics like momentum, volatility, and trends. These indicators help the model learn more complex patterns and reduce overfitting. The use of Bidirectional LSTMs further improves performance by processing the sequence data in backward as well as forward directions, allowing the model to capture dependencies from both future and past time steps. This bidirectional approach improves the model’s ability to learn robust temporal patterns and address long-term dependencies with greater accuracy than traditional LSTMs.
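The averages quoted in the abstract can be recomputed from the "Multivariate bi-LSTM" rows of the four result tables. In the sketch below, the table-to-stock mapping is inferred from the values quoted in the text (ICICI, Wipro and Ambuja Cement are matched by their reported numbers, and NTPC is assigned by elimination), so the dictionary keys should be read as an assumption.

```python
# Rows "Multivariate bi-LSTM" from the four result tables: (R2, MAE, RMSE, MAPE %).
proposed = {
    "ICICI Bank": (0.9926, 0.006532, 0.009096, 0.8639),
    "Wipro": (0.9961, 0.007396, 0.010097, 1.0236),
    "NTPC": (0.9937, 0.009430, 0.013310, 1.8296),  # stock inferred by elimination
    "Ambuja Cement": (0.9967, 0.006584, 0.009079, 0.9369),
}
# R2 of the "Multivariate OHLCV LSTM" rows, in the same table order.
baseline_r2 = (0.9241, 0.9216, 0.9803, 0.9938)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


avg_r2, avg_mae, avg_rmse, avg_mape = (mean(col) for col in zip(*proposed.values()))
gap_points = 100 * (avg_r2 - mean(baseline_r2))

print(f"R2 {100 * avg_r2:.2f}%  RMSE {avg_rmse:.7f}  MAPE {avg_mape:.4f}%")
print(f"R2 gap over OHLCV LSTM: {gap_points:.2f} percentage points")
```

The recomputed values agree closely with the reported averages; small differences in the last digits are expected because the tables list rounded figures.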
V Conclusion and Future Scope
This paper aims to develop a novel technique for forecasting stock price trends, specifically tailored for short-term trades using hourly data from ICICI, NTPC, Ambuja Cement, and Wipro over an eight-year period (2015–2022). We employed Unidirectional and Bidirectional LSTM models for both Univariate and Multivariate data. Three research approaches were explored: the first used a basic univariate forecast model with stock close price data, the second included additional features such as open, high, low, and volume, and the third incorporated 12 features, including technical indicators that closely follow the trends of the close price.
The analysis revealed that the proposed Bidirectional LSTM model outperformed Unidirectional LSTMs, demonstrating a comprehensive understanding of price movements. Furthermore, the inclusion of technical indicators significantly improved the proposed model’s performance across all sectors of the Indian National Stock Exchange, demonstrating its robustness and adaptability to diverse market conditions.
The proposed model achieved its highest R² score of 99.6736% on Ambuja Cement, underscoring the value of integrating technical indicators with advanced LSTM architectures. This makes it well suited for predicting stock movements in real time, leading to more profitable and secure investments across all major sectors of the Indian National Stock Exchange.
The proposed model offers potential for future improvements and wider applications. One approach is to extend it for long-term stock price predictions by using longer window sizes to capture broader market trends. Expanding the model to other major stock exchanges, like NYSE, Nikkei, and LSE, could validate its effectiveness across different market conditions. Additionally, incorporating macroeconomic indicators, social media sentiment, and global news could enhance its predictive power, leading to a more comprehensive approach to stock price forecasting.
References
- [1] Rajesh, D., Suresh, M., & Sankar, E. S. THE IMPACT OF STOCK MARKET ON INDIAN ECONOMY.
- [2] Chaudhary, R., Bakhshi, P., & Gupta, H. (2020). Volatility in international stock markets: An empirical study during COVID-19. Journal of Risk and Financial Management, 13(9), 208.
- [3] Haase, F., & Neuenkirch, M. (2023). Predictability of bull and bear markets: A new look at forecasting stock market regimes (and returns) in the US. International Journal of Forecasting, 39(2), 587-605.
- [4] Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
- [5] Sunny, M. A. I., Maswood, M. M. S., & Alharbi, A. G. (2020, October). Deep learning-based stock price prediction using LSTM and bi-directional LSTM model. In 2020 2nd novel intelligent and leading emerging sciences conference (NILES) (pp. 87-92). IEEE.
- [6] Pawar, K., Jalem, R. S., & Tiwari, V. (2019). Stock market price prediction using LSTM RNN. In Emerging Trends in Expert Applications and Security: Proceedings of ICETEAS 2018 (pp. 493-503). Springer Singapore.
- [7] Ahire, P., Lad, H., Parekh, S., & Kabrawala, S. (2021). LSTM based stock price prediction. International Journal of Creative Research Thoughts, 9(2), 5118-5122.
- [8] NSE India - Number of registered investors as of 22-02-2024. Retrieved from https://www.nseindia.com/registered-investors
- [9] Mintarya, L. N., Halim, J. N., Angie, C., Achmad, S., & Kurniawan, A. (2023). Machine learning approaches in stock market prediction: A systematic literature review. Procedia Computer Science, 216, 96-102.
- [10] Menon, V. K., Chekravarthi Vasireddy, N., Jami, S. A., Pedamallu, V. T. N., Sureshkumar, V., & Soman, K. P. (2016). Bulk price forecasting using spark over NSE data set. In Data Mining and Big Data: First International Conference, DMBD 2016, Bali, Indonesia, June 25-30, 2016. Proceedings 1 (pp. 137-146). Springer International Publishing.
- [11] Shakhla, S., Shah, B., Shah, N., Unadkat, V., & Kanani, P. (2018). Stock price trend prediction using multiple linear regression. Int. J. Eng. Sci. Invent, 7(10), 29-33.
- [12] Patel, J., Shah, S., Thakkar, P., & Kotecha, K. (2015). Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques. Expert systems with applications, 42(1), 259-268.
- [13] Guresen, E., Kayakutlu, G., & Daim, T. U. (2011). Using artificial neural network models in stock market index prediction. Expert systems with Applications, 38(8), 10389-10397.
- [14] Jahan, I., & Sajal, S. (2018). Stock price prediction using recurrent neural network (RNN) algorithm on time-series data. In 2018 Midwest instruction and computing symposium. Duluth, Minnesota, USA: MSRP.
- [15] Zhang, X., Liang, X., Zhiyuli, A., Zhang, S., Xu, R., & Wu, B. (2019, July). At-lstm: An attention-based lstm model for financial time series prediction. In IOP Conference Series: Materials Science and Engineering (Vol. 569, No. 5, p. 052037). IOP Publishing.
- [16] Kaushik, M., & Giri, A. K. (2020). Forecasting foreign exchange rate: A multivariate comparative analysis between traditional econometric, contemporary machine learning & deep learning techniques. arXiv preprint arXiv:2002.10247.
- [17] Nelson, D. M., Pereira, A. C., & De Oliveira, R. A. (2017, May). Stock market’s price movement prediction with LSTM neural networks. In 2017 International joint conference on neural networks (IJCNN) (pp. 1419-1426). IEEE.
- [18] Pring, M. J. (2014). Technical analysis explained: The successful investor’s guide to spotting investment trends and turning points. McGraw-Hill Education.
- [19] Graves, A., & Schmidhuber, J. (2005, July). Framewise phoneme classification with bidirectional LSTM networks. In Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005. (Vol. 4, pp. 2047-2052). IEEE.
- [20] Althelaya, K. A., El-Alfy, E. S. M., & Mohammed, S. (2018, April). Evaluation of bidirectional LSTM for short-and long-term stock market prediction. In 2018 9th international conference on information and communication systems (ICICS) (pp. 151-156). IEEE.
- [21] Han, C., & Fu, X. (2023). Challenge and opportunity: deep learning-based stock price prediction by using Bi-directional LSTM model. Frontiers in Business, Economics and Management, 8(2), 51-54.
- [22] Alsubaie, Y., El Hindi, K., & Alsalman, H. (2019). Cost-sensitive prediction of stock price direction: Selection of technical indicators. IEEE Access, 7, 146876-146892.
- [23] Yahoo Finance - Historical data for CNX100. Retrieved from https://finance.yahoo.com/
- [24] TA-Lib - Technical analysis library functions. Retrieved from https://ta-lib.org/functions/