Questions tagged [prediction]
Prediction is the use of data, statistical algorithms and machine learning techniques to estimate the likelihood of future outcomes based on historical data. The goal is to go beyond knowing what has happened and provide a best assessment of what will happen in the future.
440 questions
3 votes
0 answers
66 views
Finding clusters in sales data and predicting future sales based on those
I have monthly sales data from a set of online merchants that sell on an online shop using a cloud-based software solution. The data look something like this: month merchant_id shop_id shop_country ...
1 vote
0 answers
127 views
ML models that train on graphs but infer without any edges (edge prediction task)
I'm exploring a machine learning research direction and I'm looking for ideas or pointers to existing models/projects that fit the following setup: The model is trained on graphs with edge information ...
1 vote
0 answers
51 views
Modelling Payment Default Probability
Suppose you work as a data scientist for a bank and want to create a model to predict the probability that a given client pays you back within, say, 12 months. If at any time this client stays a period of, ...
7 votes
1 answer
91 views
Wind Power Data Analysis - Python
I am seeking some help and/or perspectives in solving a problem. I have a dataset (accessible here) with the following columns: DATE: this is the date in dd/mm/yyyy format; HH: this is the "half-...
0 votes
0 answers
38 views
Using random forest to predict shipping rejection in various buckets (Final Disposition Activity Description): getting the error below
ValueError Traceback (most recent call last) ~\AppData\Local\Temp/ipykernel_34600/2721349307.py in ----> 1 model.fit(X_train,y_train) ~\Anaconda4\lib\site-packages\...
3 votes
0 answers
32 views
Predict the next status given previous sequence
I have a sequence dataset like the following. These sequences are statuses that were approved by clients, ordered by date/time. A client can get multiple statuses and jump back to the same status ...
0 votes
0 answers
68 views
Need Help Understanding AUC-ROC Curve
I am a student working on building a predictive model. While evaluating different models, I noticed that in some cases the AUC is around 0.75, but the ROC curve appears below the random guess line. ...
2 votes
0 answers
91 views
Prediction interval vs Confidence interval in a (Poisson) GLM (in Python)
In short, I get a prediction interval narrower than the confidence interval, while it should be wider. Any help understanding why is certainly welcome :) Let me start by stating the problem at hand. I use a ...
0 votes
1 answer
72 views
Scaling and PCA for test data before prediction
I'm fairly new to the world of ML & Data Science. I've completed a certification course in Coursera/IBM and I'm trying to hone my skills using some exercises from Kaggle. The course did not ...
0 votes
0 answers
22 views
What evaluation method is suitable if the detected data size is different from the actual (expected) one?
I want to evaluate a sequential tone detection system. Although the results are similar to what is expected, the problem is that the data size differs between the predicted data and the actual ...
0 votes
0 answers
42 views
Model Predicts Narrow Range of Values but with Promising MSE and RMSE Values; Issues with Normalization and Error Metrics in Regression Task
I'm working on a spectrum sensing-based project, where I need to predict the SNR values from spectrogram images. To train and evaluate the model, I normalized the SNR ground truths, and I got decent ...
0 votes
1 answer
79 views
Can Polynomial Features Be Used in Logistic Regression and Random Forest Models?
I am working in Python to predict the treatment response of 43 patients using 10 predictors as input. I noticed that adding polynomial features to my models produces nearly perfect results. I am ...
7 votes
1 answer
339 views
When do regression models outperform the naive method?
This follows on from this question. Case 1: I have the following task: train on 3 consecutive days to predict each 4th day. Each day's data is one CSV file of dimension 24x25. ...
1 vote
1 answer
62 views
Multivariate linear regression via scikit and statsmodels
I want to preface this with terminology: multivariate regression deals with the case where there is more than one dependent variable, while multiple regression deals with the case where there is ...
1 vote
0 answers
25 views
Need advice on feature engineering on Longitudinal Data
I'm trying to predict the rated capacity of a wind turbine given factors such as wind speed and direction. Since this is high-resolution weather data, I don't want to just average things ...