Deep Reinforcement Learning for Adaptive Portfolio Optimization in Dynamic Financial Environments


Fenna Trowbridge

Abstract

Financial portfolio optimization has long been a central challenge in quantitative finance, aiming to balance maximizing returns against minimizing risk. Traditional portfolio management strategies, such as the mean-variance model, rely heavily on predefined assumptions about market distributions and are limited by static parameter configurations. In contrast, deep reinforcement learning (DRL) provides a flexible and adaptive framework capable of learning optimal policies directly from data. This paper proposes a Deep Reinforcement Learning-Based Adaptive Portfolio Optimization (DRL-APO) framework that integrates temporal feature extraction, policy gradient learning, and reward shaping mechanisms to address the dynamic and stochastic nature of financial markets. The proposed approach combines a convolutional feature encoder with a long short-term memory (LSTM) network to capture multi-scale temporal dependencies from historical price data, while a proximal policy optimization (PPO) agent dynamically adjusts asset weights to optimize the Sharpe ratio and cumulative return. Experimental evaluations conducted on benchmark financial datasets, including the S&P 500, NASDAQ, and cryptocurrency indices, demonstrate that DRL-APO consistently outperforms traditional baselines such as mean-variance, deep Q-learning, and actor-critic models. The proposed method achieves superior adaptability to volatility shifts and robust performance under varying market regimes.
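
To make the described architecture concrete, the following is a minimal PyTorch sketch of a policy network of the kind the abstract outlines: a temporal convolutional encoder followed by an LSTM, with a softmax head that emits long-only portfolio weights, plus a Sharpe-ratio-style reward. All layer sizes, the lookback window, and the function names are illustrative assumptions, not the authors' actual implementation, and the PPO training loop itself is omitted.

import torch
import torch.nn as nn

class PolicyNetwork(nn.Module):
    """Sketch of a DRL-APO-style policy: a Conv1d encoder extracts local
    price patterns, an LSTM captures longer temporal dependencies, and a
    softmax head outputs portfolio weights (the PPO agent's action)."""

    def __init__(self, n_assets: int, n_features: int,
                 conv_channels: int = 32, lstm_hidden: int = 64):
        super().__init__()
        # Temporal convolution over the lookback window, per feature channel.
        self.encoder = nn.Conv1d(n_features, conv_channels,
                                 kernel_size=3, padding=1)
        self.lstm = nn.LSTM(conv_channels, lstm_hidden, batch_first=True)
        # Map the final LSTM state to one logit per asset.
        self.head = nn.Linear(lstm_hidden, n_assets)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features, window) of normalized historical prices.
        h = torch.relu(self.encoder(x))       # (batch, channels, window)
        h, _ = self.lstm(h.transpose(1, 2))   # (batch, window, hidden)
        logits = self.head(h[:, -1])          # use the last time step
        return torch.softmax(logits, dim=-1)  # weights sum to 1, long-only


def sharpe_reward(returns: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # Sharpe-style reward over a window of realized portfolio returns.
    return returns.mean() / (returns.std() + eps)


if __name__ == "__main__":
    net = PolicyNetwork(n_assets=10, n_features=4)
    weights = net(torch.randn(2, 4, 30))  # batch of 2, 30-step lookback
    print(weights.sum(dim=-1))            # each row sums to ~1.0

In a full PPO setup, these weights would parameterize the action distribution, and the Sharpe-style reward would be computed over the returns realized after each rebalancing step.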
