I'm training a Deep Q-Network (DQN) to trade crypto using historical data. My model keeps outputting NaN values for the Q-values during prediction. I'm using a custom function getState2() to generate the state, and the state is then passed into model.predict(state).
The Q-values are:

    Q-values: [[[nan nan nan] [nan nan nan] ... [nan nan nan]]]

Model Architecture:

    from keras.models import Sequential
    from keras.layers import Dense, Flatten, Input
    from keras.optimizers import Adam

    model = Sequential()
    model.add(Input(shape=(window_size, num_features)))
    model.add(Flatten())
    model.add(Dense(64, activation='relu'))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(3, activation='linear'))  # 3 actions: Buy, Sell, Hold
    model.compile(loss='mse', optimizer=Adam(learning_rate=0.001))

getState2 Function:

    def getState2(data, step, window_size):
        import pandas as pd
        import numpy as np

        start = step - window_size + 1
        if start < 0:
            pad_count = abs(start)
            pad_df = pd.DataFrame([data.iloc[0]] * pad_count, columns=data.columns)
            block = pd.concat([pad_df, data.iloc[0:step + 1]], ignore_index=True)
        else:
            block = data.iloc[start:step + 1].copy()

        # Normalize each column
        for col in block.columns:
            col_std = block[col].std()
            if col_std != 0:
                block[col] = (block[col] - block[col].mean()) / col_std
            else:
                block[col] = 0  # or keep original

        state = block.values
        return np.expand_dims(state, axis=0)  # Shape: (1, window_size, num_features)

What I’ve Tried:
Ensured all values in the input state are finite (np.isfinite(state).all() is True).
Tried removing normalization; the output is still all NaNs.
Added Flatten() layer to ensure shape is compatible.
Re-initialized the model.
Verified my training data doesn’t have NaNs.
Printed the input to model.predict() and confirmed it looks fine.
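
The finiteness checks listed above could be written roughly as follows. This is a minimal sketch, not the exact script used: `describe_array` and `non_finite_reports` are hypothetical helper names, and `state` and `weights` are stand-in arrays so the snippet runs on its own. In the real script they would presumably come from `getState2(...)` and `model.get_weights()`. Note that it also extends the check to the model weights, which the list above does not cover.

```python
import numpy as np


def describe_array(name, arr):
    """Summarize an array: its shape and how many entries are NaN or inf."""
    arr = np.asarray(arr, dtype=np.float64)
    return {
        "name": name,
        "shape": arr.shape,
        "n_nan": int(np.isnan(arr).sum()),
        "n_inf": int(np.isinf(arr).sum()),
    }


def non_finite_reports(named_arrays):
    """Return summaries only for the arrays that contain NaN or inf."""
    reports = [describe_array(name, arr) for name, arr in named_arrays]
    return [r for r in reports if r["n_nan"] or r["n_inf"]]


# Stand-in data (assumption): replace with getState2(...) and model.get_weights().
state = np.random.default_rng(0).normal(size=(1, 10, 5))
weights = [np.ones((50, 64)), np.array([0.0, np.nan, 1.0])]  # second array is deliberately bad

named = [("state", state)] + [(f"weights[{i}]", w) for i, w in enumerate(weights)]
bad = non_finite_reports(named)
for report in bad:
    print(report)
```

Reporting the shape alongside the NaN and inf counts makes it easy to compare each array against the expected `(1, window_size, num_features)` input and to see whether the non-finite values sit in the input or in the weights.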
Why would model.predict() be returning only NaN values for Q-values?