Questions tagged [pytorch]

Question 1

My model takes in an image of a handwritten equation and converts it into its LaTeX representation. In order to do this, it uses a ResNet50 pre-trained model for feature extraction and a transformer ...

Question 2

I have a Mask2Former model fine-tuned on my own custom dataset and it is working nicely. I want to play around with knowledge distillation and use my pretrained ...

Question 3

Beginner ML practitioner here. I'm trying to do some time series forecasting on a fairly high resolution dataset that stretches over a long period of time. The values vary pretty widely over time: to ...

Question 4

In the work I am doing right now, I have multiple pieces of text (say 5, for purposes of illustration) that are somewhat close in meaning. My objective is to combine these 5 ...

Question 5

I’m planning to fine-tune a YOLO model for a custom object detection task. There seem to be two main approaches: Clone the official YOLO GitHub repository (e.g., YOLOv5 or YOLOv8), adjust the codebase ...

Question 6

A few days ago I installed my new NVIDIA GeForce RTX 5090 and I can't get PyTorch to work on my Win11 desktop (just background info, the question is not directly ...

Question 7

I am trying to determine if using multiple instances of nn.Embedding() has any value over using a single instance in training a model. As an example, let's say I ...

Question 8

I have set up a DQN with TorchRL to solve a problem where the agent can move in a square grid and pick some rewards scattered randomly on it. Right now, I am using a 5x5 grid and have 3 rewards on it. ...

Question 9

I am trying to apply the idea from Embedding Deep Networks into Visual Explanations and see if it works on Transformers. The performance is terrible: the accuracy hasn't passed 10%. Can someone ...

Question 10

I'm trying to distill a YOLO11x detection model into a YOLO11n for inference speed improvements without sacrificing too much detection performance. For this, I just overloaded some functions in the ...

Question 11

I am using a CNN-transformer hybrid architecture to detect handwritten equations and convert them to LaTeX strings. All target sequences (the actual LaTeX representation of a handwritten equation) are ...

Question 12

I am trying to train a bert-base model using LoRA with HF transformers to experiment with how different datasets influence the model's output. This is just a simple project, and I am not trying to ...

Question 13

To be clear, I shuffled my data when I trained the model. It is only the testing data that I modified to be unshuffled, and I found that accuracy tanks. (I also used the same data for training and for testing.)

Question 14

I am applying the N-BEATS model from the pytorch-forecasting package to a traffic dataset. I am doing single-step prediction with a context length of 5. Now the prediction is unfortunately slightly ...

Question 15

I have been trying to code an upscaling GAN, but while the code runs, I almost always end up with terrible results when the GAN doesn't collapse, and collapse happens often. I previously tried to ...

Sequence generation model produces incorrect, but coherent outputs

How to correctly implement the loss function for my distillation of Mask2Former?

LSTM feature scaling with windowing?

How do I combine multiple texts with mathematical accuracy using specific weights?

Fine-tuning YOLO: Directly cloning and modifying the GitHub repo vs. using Transformers library and Hugging Face — pros and cons?

Is CUDA 13 a thing (or am I misinterpreting something)?

Single nn.Embedding instance vs multiple nn.Embedding instances

What should a typical reward curve look like while training an RL model?

terrible performance on CIFAR10 using SWIN model

YOLO knowledge distillation (11x to 11n) yields poorer performance than native training

Model seems to peek into target sequence and cheat during training despite using masking

What are the correct steps to successfully train a simple bert seq2seq model on scraped data?

when testing with shuffled data, accuracy is high, but when testing with unshuffled data, accuracy is low

N-Beats, Pytorch forecasting: predictions are slightly shifted

Why is my upscaling GAN not working?
