Questions tagged [python]
Use for data science questions related to the programming language Python. Not intended for general coding questions (which should be asked on Stack Overflow).
6,626 questions
0 votes
0 answers
23 views
Unable to run pandas/modin[ray] code on SageMaker Unified Studio
I am working on a movie recommendation problem where I get multiple files from the source, and the total data size is around 900 MB. I am using the ...
2 votes
1 answer
39 views
Python data science on Windows behind a strict proxy that cannot be bypassed: which solution do you use?
In an organization, IT has deployed a strict proxy policy (no special rights for anyone). Windows 11 is the OS installed for everyone, with a locked-down configuration. To run data science tasks using Python, in ...
0 votes
0 answers
13 views
Unexpected Feature Importance Pattern in Random Forest Classification of MNIST Digits 0 and 1
I performed Random Forest–based feature importance analysis on the MNIST dataset, focusing only on digits 0 and 1. When I visualize the importance map (see image below), it doesn’t resemble the ...
0 votes
0 answers
12 views
How can I group transcribed phrases into meaningful chunks without using complex models?
I have a large set of phrases obtained via Azure Fast Transcription, and I need to group them into coherent semantic chunks (to use later in a RAG pipeline). Initially, I tried grouping phrases based ...
0 votes
0 answers
20 views
How to extract my fingerprint from my laptop's finger sensor
I have a set of fingerprints as a dataset (provided by my college). I want to use these fingerprints to train a model to understand the different things. That is beside the point....
1 vote
0 answers
36 views
How to identify and quantify main tendencies across participants from cluster membership heatmaps?
I'd appreciate your thoughts on the following problem. I've created a heatmap plot (attached) showing the cluster membership ratio for each participant (in separate subplots) and condition (η). Now, I'...
1 vote
0 answers
13 views
How to interpret an unstable learning curve on a model tuned with Hyperband Tuning?
I have used Hyperband automatic tuning for an ANN model to predict price. After running the model with the automatic tuning, I am obtaining an R² score of 1.00, which suggests overfitting; however, I am ...
4 votes
0 answers
30 views
Time-efficient parallelization of masks for pre-processing a dataset
I have a large dataset (~10M points) in Python and I want to filter it using a large number of different custom masks, as part of calculations to create a new but related dataset. Because the dataset ...
5 votes
1 answer
68 views
Jupyter notebooks compiled from different building blocks
I use Jupyter notebooks to teach programming, using markdown in text cells, and I want to separate the concepts by level-1 headings (starting with # Heading), for ...
4 votes
1 answer
80 views
RAG Chatbot does not keep track of chat session history
I built a RAG chatbot with Python, LangChain, an OpenAI LLM, and FAISS for the vector store. The data is stored as JSON. The chatbot does not always keep track of the inputs and outputs. Here is an ...
3 votes
1 answer
42 views
Is it possible to make the Python Script widget in Orange both give output and receive input (in the same widget)?
I'm working on a project that involves loop control. When I try to implement it in the Orange platform, I'm unable to connect one widget (Python Script) to another in a loop, as the connection is ...
0 votes
1 answer
60 views
NLP: How to clean the data of a conversation correctly?
Say we have the data as follows Input ...
2 votes
0 answers
66 views
RAG Chatbot does not answer paraphrased questions
I built a RAG chatbot with Python, LangChain, and FAISS for the vector store. The data is stored as JSON. The chatbot sometimes refuses to answer when a question is rephrased. Here are two ...
0 votes
0 answers
45 views
Qiskit Problem: this solution is a bit slow, is there a way to make it faster and increase the accuracy a little bit?
I'm currently making a small binary classification program using Quantum Machine Learning (EstimatorQNN to be more specific). My program classifies data inside the Wisconsin Breast Cancer database and ...
5 votes
1 answer
134 views
How to get an MLflow-built container to listen on 0.0.0.0?
I'm following this tutorial and am stuck on step 8: https://mlflow.org/docs/latest/ml/getting-started/hyperparameter-tuning/#test-your-container The inference server is listening on ...