Questions tagged [class-imbalance]

Question 1

I am working on a network security-related project, in which I have to build a deep learning model to detect a specific attack. It's about detecting whether a network system of an organisation is a ...

Question 2

I’m working on a MarTech use case (predicting customer conversions to a certain product). I’m not really used to working in this domain, so I’m looking for critical questions about my setup. Context: ...

Question 3

I am working on my bachelor thesis; the topic is diabetes prediction using machine learning. The dataset I am working with is from Kaggle and is called Pima Indians Diabetes. Since my dataset ...

Question 4

Following on from my recent post on the topic, my goal here is to synthesise the excellent community wisdom on it over at Cross Validated into a "canonical" Q&A for the data science SE :)...

Question 5

I'm working with a custom YOLO-like architecture implemented in TensorFlow/Keras. While pretraining on the COCO dataset works, I plan to fine-tune the model on a highly imbalanced dataset. ...

Question 6

I am working on a churn prediction project, but my data is very imbalanced. I have tried many things, and this is the best model I can get. My main problem is that I want recall and precision ...

Question 7

I have a table in a database; let's call it TABLE1. It contains several columns: one for a unique customer ID, several feature columns, and one for the class I want to predict. There are ~280k rows where ...

Question 8

I need to calculate class weights to train my deep learning model. To simulate a real-world production scenario as closely as I can, I have excluded the testing/inference dataset from which ...
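
A minimal sketch of one common way to derive class weights from the training labels alone, assuming the "balanced" heuristic `n_samples / (n_classes * class_count)`. The function name and the toy labels are hypothetical and are not taken from the question.

```python
from collections import Counter


def balanced_class_weights(train_labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(train_labels)
    n_samples = len(train_labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical toy training labels: 90 negatives, 10 positives.
# Held-out test or inference data is deliberately not included here.
train_labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(train_labels)
print(weights)
```

With this heuristic the rarer class receives the larger weight, so each class contributes roughly equally to a weighted loss.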

Question 9

I'm working on predicting two genetic mutations simultaneously using an XGBoost Multioutput Classifier. My dataset is severely imbalanced, particularly for cases where both genetic mutations are ...

Question 10

We use SMOTE to balance an imbalanced dataset, but why are we manipulating the data instead of using it as it naturally is? What is the need for balancing, and what exact impact does it have on the model?

Question 11

I'm trying to build a predictive model, but I haven't found a method that consistently delivers high performance. Is it acceptable to use an optimized classification threshold of 0.996?

Question 12

This is the accuracy and loss plot for a CNN model. Is it possible for train and test accuracy to start at 80% from the very first epoch with 5-fold cross-validation?

Question 13

I am working on a highly imbalanced fraud detection dataset (class 0: 284315 instances, class 1: 492 instances) and trying to implement random undersampling correctly during cross-validation in Orange. ...
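
A minimal sketch of the general idea of undersampling inside cross-validation, written in plain Python rather than Orange: the majority class is downsampled in each training fold only, while each test fold keeps the original class ratio. All function names and the toy label counts are hypothetical; this is an illustration under those assumptions, not Orange's workflow.

```python
import random


def stratified_folds(labels, k, rng):
    """Split indices into k folds that each keep roughly the original class ratio."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def undersample(indices, labels, rng):
    """Randomly drop samples so every class matches the rarest class count."""
    by_class = {}
    for idx in indices:
        by_class.setdefault(labels[idx], []).append(idx)
    n_min = min(len(v) for v in by_class.values())
    kept = []
    for v in by_class.values():
        kept.extend(rng.sample(v, n_min))
    return kept


rng = random.Random(0)
labels = [0] * 950 + [1] * 50  # hypothetical toy data, not the fraud dataset
folds = stratified_folds(labels, k=5, rng=rng)

splits = []
for i, test_idx in enumerate(folds):
    # Training indices are everything outside the current test fold.
    train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
    # Undersample the training fold only; the test fold stays untouched.
    train_idx = undersample(train_idx, labels, rng)
    splits.append((train_idx, test_idx))
```

Keeping the test folds at the original class ratio means the evaluation reflects the imbalance the model would face in practice.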

Question 14

I see everywhere that, when the dataset is imbalanced, PR-AUC is a better performance indicator than ROC. From my experience, if the positive class is the most important, and there is higher ...

Question 15

I am generally trying to take costs into account in learning. The setup is as follows: a statistical learning problem with the usual X and y, where y is imbalanced (roughly 1% of ones). Scikit-learn ...

When should we avoid balancing an imbalanced dataset?

Imbalanced classes and ML set up

Balancing dataset question

Is class imbalance really a problem in machine learning?

What loss functions are suitable for a YOLO-like architecture in TensorFlow/Keras, especially for fine-tuning on an imbalanced dataset?

Churn prediction machine learning low precision

Why do these undersampling methods return such different results?

Should class-weights take validation-set into account?

How to Properly Use scale_pos_weight in an XGBoost MultiOutput Classifier to Address Severe Class Imbalance?

Why do we need SMOTE?

Question on Optimized Threshold in Predictive Modeling

What do these train and test accuracy and loss graphs suggest? Can train and test accuracy reach 80% after one epoch?

How to properly implement Random Undersampling during Cross-Validation in Orange

ROC vs PR-score and imbalanced datasets

Taking into account instance cost in learning?
