  • $\begingroup$ The first thing I'd check is your activation function: with a sigmoid output layer your predictions are bounded between 0 and 1 (tanh is bounded between -1 and 1), and with ReLU they are ≥ 0, so that could explain why your NN does worse than simply using your input as the prediction. Another thing to check is how you framed your problem. Usually, normalizing your (input, output) relative to the naive guess gives better results: in your case I would subtract the last value from your input and predict the difference between your true target and the last input value. To give a proper answer, though, I think it would be best if you provided some code, if you can. $\endgroup$ Commented Sep 18, 2020 at 11:45
  • $\begingroup$ Thank you for your help! I try 4 different activation functions in the grid search, including identity (see edit). The code is quite long and complicated. Are you interested in a specific section? I will gladly post it. $\endgroup$ Commented Sep 18, 2020 at 12:13
  • $\begingroup$ Your network definition/GridSearch would be great, as well as one sample of your dataset (like 1 row of your X_train). I think that would help me and others contribute better to your issue. $\endgroup$ Commented Sep 18, 2020 at 12:28
  • $\begingroup$ Note about the scaling of the feature matrix: my second suggestion is not about scaling features but about predicting the difference rather than a raw value (if the NN is random, that should at least give your baseline's results). I think that would be equivalent to Y = Y - X[:, -1] and X = X - X[:, -1:] (the slice keeps the column axis so the subtraction broadcasts row by row). $\endgroup$ Commented Sep 18, 2020 at 12:30
  • $\begingroup$ I have added the code and an example of a feature set with the corresponding target value. I am trying to predict residuals which were generated by removing a priori models. Do you still recommend working with the difference? $\endgroup$ Commented Sep 18, 2020 at 13:06
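
For reference, the "predict the difference" idea from the comments could be sketched roughly as below. This is only an illustration, not the asker's code: it assumes that the last column of `X` is the most recent value of the same quantity as the target, and the helper names `to_differences` and `from_differences` as well as the toy numbers are made up for the example.

```python
import numpy as np

def to_differences(X, y):
    """Re-express features and target relative to each row's last input value.

    Assumption (hypothetical setup): the last column of X is the most recent
    observation of the same quantity as y, i.e. the naive guess.
    """
    last = X[:, -1:]          # shape (n_samples, 1); the slice keeps the axis
    X_rel = X - last          # each row minus its own last value
    y_rel = y - last[:, 0]    # target expressed as a change from the naive guess
    return X_rel, y_rel, last[:, 0]

def from_differences(y_rel_pred, last):
    """Map predictions on the difference scale back to raw values."""
    return y_rel_pred + last

# Toy data, invented purely for illustration.
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 6.0, 5.0]])
y = np.array([3.5, 4.0])

X_rel, y_rel, last = to_differences(X, y)

# A model that learns nothing and predicts 0 on the difference scale
# reproduces the naive "last value" baseline after the inverse transform.
baseline_pred = from_differences(np.zeros_like(y_rel), last)
```

The point of the transform is that a network which outputs zero everywhere already matches the naive guess, so any signal it picks up can only improve on that baseline. Whether this helps when the target is already a residual, as in the last comment, is something one would have to check empirically.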