LSTM-DDPG for Trading with Variable Positions

Sensors (Basel). 2021 Sep 30;21(19):6571. doi: 10.3390/s21196571.

Abstract

In recent years, machine learning for trading has been widely studied. Trading decisions must determine both the direction and the size of a position based on market conditions. However, to date, no research has considered variable position sizes in models developed for trading. In this paper, we propose a deep reinforcement learning model, LSTM-DDPG, that makes trading decisions with variable positions. Specifically, we model the trading process as a partially observable Markov decision process, in which a long short-term memory (LSTM) network extracts market state features and the deep deterministic policy gradient (DDPG) framework decides both the direction and the variable size of the position. We test LSTM-DDPG on IF300 (index futures of the China stock market) data, and the results show that LSTM-DDPG with variable positions outperforms models with fixed or few-level positions in terms of both return and risk. In addition, the model's investment potential is better exploited with a differential Sharpe ratio reward function than with a profit-based reward function.
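The differential Sharpe ratio mentioned above is a standard incremental reward for trading agents (due to Moody and Saffell): it maintains exponential moving estimates of the first and second moments of returns and rewards the agent for the marginal change in the Sharpe ratio at each step. The sketch below is illustrative only, not the paper's implementation; the adaptation rate `eta` and the class name are assumptions.

```python
class DifferentialSharpe:
    """Incremental differential Sharpe ratio reward (illustrative sketch).

    A and B are exponential moving estimates of the first and second
    moments of the per-period return; eta is the adaptation rate.
    """

    def __init__(self, eta=0.01):
        self.eta = eta
        self.A = 0.0  # running estimate of E[R]
        self.B = 0.0  # running estimate of E[R^2]

    def step(self, r):
        """Return the differential Sharpe ratio reward for period return r."""
        dA = r - self.A
        dB = r * r - self.B
        denom = (self.B - self.A ** 2) ** 1.5
        # Guard the first steps, where the variance estimate is still zero.
        d = 0.0 if denom <= 0 else (self.B * dA - 0.5 * self.A * dB) / denom
        # Update the moving moment estimates after computing the reward.
        self.A += self.eta * dA
        self.B += self.eta * dB
        return d
```

Using this as the reward lets the agent optimize risk-adjusted performance online, step by step, instead of waiting for an episode-level Sharpe ratio.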

Keywords: deep reinforcement learning; reward function; trading strategy; variable positions.

MeSH terms

  • Forecasting
  • Investments*
  • Machine Learning
  • Memory, Long-Term*
  • Policy