Questions tagged [feature-engineering]

The process of using domain knowledge of the data to create features that improve the performance of machine learning algorithms.

0 votes
0 answers
9 views

I am working on a natural language project for fun, basically a rhyming dictionary. Next, I am trying to figure out how to properly/decently capture the basic consonants + vowels ...
Lance Pollard
5 votes
1 answer
46 views

What I'm trying to figure out: I'm working on a machine learning project and would love to hear your thoughts on two things: A. How to prioritize feature exploration B. Whether to fix hyperparameters (...
Ten
  • 51
7 votes
1 answer
125 views

Beginner ML practitioner here. I'm trying to do some time series forecasting on a fairly high resolution dataset that stretches over a long period of time. The values vary pretty widely over time: to ...
Seth
  • 251
4 votes
2 answers
462 views

Question: Are there better approaches than regex for extracting event dates (including relative) from noisy text? Are there NLP tools that can help disambiguate multiple date mentions in various ...
ja_him
  • 143
9 votes
2 answers
216 views

I am building a random forest regression model. The goal is to predict the maximum each customer will spend in a single transaction during the next 90 days. I have transaction data for 7m customers, ...
SRJCoding
  • 191
3 votes
1 answer
367 views

I'm working on a regression problem where I predict the molar compositions of some chemical species. I'm using this kind of network: ...
Naivahash80
3 votes
1 answer
105 views

I am struggling to select the key features that contribute to PC1. I will use the public breast cancer dataset to illustrate the issue. Please feel free to point me to a previous post if this question has ...
WhiskerFeatures
0 votes
1 answer
95 views

I am working in a team developing a time series forecasting model using xgboost (or similar). We have a draft workflow for optimising model hyperparameters, incorporating an initial train-test split ...
Wannabe_PhD
1 vote
1 answer
67 views

I need to make time series predictions on a large data set. There are both static features, like store location ID (10k+), and dynamic features, like daily sales and daily ...
new world
2 votes
0 answers
49 views

I have built a knowledge graph with every customer having the same attributes, for example, gradeA, gradeB, gradeC. With this graph I want to attempt to find patterns between customers with shared ...
Ocean
  • 427
8 votes
4 answers
940 views

Let's assume I have a column with float values (e.g., 3.12334354454, 5.75434331354, and so on). If I round these values to two decimal places (e.g., 3.12, 5.75), I think the advantages and ...
Guna
  • 897
1 vote
0 answers
39 views

When plotting a SHAP beeswarm plot for my binary classification model (predicting subscription renewal probability), one of the columns indicates that high feature values correlate with low SHAP values ...
fendrbud
0 votes
0 answers
20 views

I'm building a neural network model to predict which student in a class will achieve the highest score on an upcoming exam (this is not the actual task; I modified the task to maintain ...
Saffy
  • 11
2 votes
1 answer
71 views

In my regression-based machine learning project, I have features like coordinates (latitude and longitude) that I prefer not to scale or transform. The main reason is that reversing the transformation ...
ml.freak
  • 113
0 votes
1 answer
32 views

Given hourly updates of precipitation amount (for the preceding hour) and temperature, how would you calculate whether it is slippery or not?
tsorn
  • 173
