Q-values output is NaN in DQN model - input state is normalized and padded

Question

I'm training a Deep Q-Network (DQN) to trade crypto using historical data. My model keeps outputting NaN values for the Q-values during prediction. I'm using a custom function getState2() to generate the state, and the state is then passed into model.predict(state).

The Q-values are:

Q-values: [[[nan nan nan] [nan nan nan] ... [nan nan nan]]]

Model Architecture:

from keras.models import Sequential
from keras.layers import Dense, Flatten, Input
from keras.optimizers import Adam

model = Sequential()
model.add(Input(shape=(window_size, num_features)))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(3, activation='linear'))  # 3 actions: Buy, Sell, Hold
model.compile(loss='mse', optimizer=Adam(learning_rate=0.001))
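For reference, the output shapes this architecture should produce can be sketched in plain NumPy. This is only shape arithmetic with random weights, not Keras itself. The sizes `window_size = 10` and `num_features = 5` and the helpers `dense` and `forward` are assumptions made for illustration. With the flatten step the result has one row of three values. Without it, a dense layer acts on the last axis only, so the result has one row of three values per time step, which is the layout of the printed Q-values above.

```python
import numpy as np

# Assumed sizes for illustration only; the post does not state them.
window_size, num_features = 10, 5

rng = np.random.default_rng(0)
state = rng.normal(size=(1, window_size, num_features))

def dense(x, units, rng):
    """Stand-in for a Dense layer: matmul over the last axis plus a bias."""
    w = rng.normal(scale=0.1, size=(x.shape[-1], units))
    b = np.zeros(units)
    return x @ w + b

def forward(x, flatten):
    """Mimic the 64 -> 32 -> 3 stack, optionally flattening the input first."""
    if flatten:
        x = x.reshape(x.shape[0], -1)       # same effect as Flatten()
    x = np.maximum(dense(x, 64, rng), 0.0)  # relu
    x = np.maximum(dense(x, 32, rng), 0.0)  # relu
    return dense(x, 3, rng)                 # linear output, one value per action

with_flatten = forward(state, flatten=True)
without_flatten = forward(state, flatten=False)
print(with_flatten.shape)     # (1, 3)
print(without_flatten.shape)  # (1, 10, 3)
```

Comparing `model.predict(state).shape` against these two cases is a quick way to confirm that the model being called is the one defined above.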

getState2 Function:

def getState2(data, step, window_size):
    import pandas as pd
    import numpy as np

    start = step - window_size + 1
    if start < 0:
        pad_count = abs(start)
        pad_df = pd.DataFrame([data.iloc[0]] * pad_count, columns=data.columns)
        block = pd.concat([pad_df, data.iloc[0:step + 1]], ignore_index=True)
    else:
        block = data.iloc[start:step + 1].copy()

    # Normalize each column
    for col in block.columns:
        col_std = block[col].std()
        if col_std != 0:
            block[col] = (block[col] - block[col].mean()) / col_std
        else:
            block[col] = 0  # or keep original

    state = block.values
    return np.expand_dims(state, axis=0)  # Shape: (1, window_size, num_features)
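One edge case in the normalization step: pandas computes the sample standard deviation (`ddof=1`) by default, which is NaN for a single-row window or a column containing NaN. The test `col_std != 0` is true for NaN, so it does not catch that case. Below is a minimal sketch of a stricter guard. The helper name `zscore_window` and the toy `close`/`volume` data are hypothetical, and this is not claimed to be the cause of the NaN outputs described here.

```python
import numpy as np
import pandas as pd

def zscore_window(block: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: per-column z-score; columns whose std is 0 or not finite become 0."""
    out = block.astype(float).copy()
    for col in out.columns:
        col_std = out[col].std()  # pandas default is the sample std (ddof=1)
        if np.isfinite(col_std) and col_std != 0:
            out[col] = (out[col] - out[col].mean()) / col_std
        else:
            out[col] = 0.0
    return out

# Hypothetical toy window: one varying column, one constant column.
window = pd.DataFrame({"close": [100.0, 101.0, 103.0], "volume": [5.0, 5.0, 5.0]})
state = np.expand_dims(zscore_window(window).values, axis=0)
print(state.shape)               # (1, 3, 2)
print(np.isfinite(state).all())  # True

# A single-row window has an undefined sample std (NaN); the guard maps it to zeros.
single = zscore_window(window.iloc[:1])
print(single.values)             # [[0. 0.]]
```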

What I’ve Tried:

Ensured all values in the input state are finite (np.isfinite(state).all() is True).

Tried removing normalization; still got NaNs.

Added Flatten() layer to ensure shape is compatible.

Re-initialized the model.

Verified my training data doesn’t have NaNs.

Printed the input to model.predict() and confirmed it looks fine.
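The checks above all look at the input. The same finiteness test can also be applied to the model's weights, because a network whose weights contain NaN returns NaN for every input, however clean the state is. A minimal sketch follows, assuming only that the model exposes Keras's `get_weights()` method. The helper names `non_finite_report` and `check_weights` and the `FakeModel` stand-in are hypothetical and exist so the example runs without Keras.

```python
import numpy as np

def non_finite_report(name, array):
    """Describe NaN/inf counts in `array`, or return None if every value is finite."""
    arr = np.asarray(array, dtype=float)
    n_nan = int(np.isnan(arr).sum())
    n_inf = int(np.isinf(arr).sum())
    if n_nan == 0 and n_inf == 0:
        return None
    return f"{name}: {n_nan} NaN, {n_inf} inf, shape {arr.shape}"

def check_weights(model):
    """Run the report over every array returned by model.get_weights()."""
    reports = []
    for i, w in enumerate(model.get_weights()):
        report = non_finite_report(f"weights[{i}]", w)
        if report is not None:
            reports.append(report)
    return reports

class FakeModel:
    """Stand-in exposing get_weights() like a Keras model (illustration only)."""
    def get_weights(self):
        return [np.ones((4, 2)), np.array([0.0, np.nan])]

print(non_finite_report("state", np.zeros((1, 10, 5))))  # None
print(check_weights(FakeModel()))  # ['weights[1]: 1 NaN, 0 inf, shape (2,)']
```

Running `check_weights(model)` once right after building the model and again after each training step would show whether the weights start out finite and when they stop being so.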

Why would model.predict() be returning only NaN values for Q-values?
