Questions tagged [computer-vision]

Question 1

Is object aspect ratio truly important for resize robustness, or is this suggestion based on a misunderstanding — e.g., treating a very wide object as if it is “bigger” or “pixel-richer” than a square ...

Question 2

I'm trying to distill a YOLO11x detection model into a YOLO11n for inference speed improvements without sacrificing too much detection performance. For this, I just overloaded some functions in the ...

Question 3

I have ~1,000 pictures like this. I really want the long, thin rock core in the middle, but they all differ (slightly) in angle, have different lengths and end shapes, and vary in rock colour. I tried ...

Question 4

I am training a DenseNet model on a medical dataset that has gold-standard annotations. After training, I noticed the accuracy is just 60%. I later made the following changes, but still no luck. ...

Question 5

I'm working on a Raspberry Pi 4–based project involving the MLX90640 thermal camera breakout. The camera outputs a thermal heat map (a low-resolution infrared image of 32x24 pixels). My goal is to ...

Question 6

I’m working on an object detection system and I'm new to this field. Here I'm talking with respect to the camera's point of view. When an object that is far from the camera is detected, it appears small and ...

Question 7

My requirement: I need to extract license plates without duplicates and store the images in a folder, then apply OCR to extract text from the images. What I have achieved: I am able to detect license plates ...

Question 8

I am working on my first object detection project and need to implement multi-object detection using ResNet-18 (I am restricted to using this architecture). My dataset follows the COCO format and ...

Question 9

I am working on 6D pose tracking, where the goal is to estimate how the 3D position and orientation of an object change from frame t-1 to t. Train/validation datasets are synthetic and come from a single ...

Question 10

I have an image of a one-line substation schema diagram that includes various components (like transformers, circuit breakers, etc.) and the connections between them. I’m looking for a way to convert ...

Question 11

I am using a 6D continuous rotation representation (e.g., two orthogonal vectors from a 3×3 rotation matrix) to predict camera rotations in panoramic video sequences. Since panoramic videos involve ...

Question 12

I am currently building my first CNN on my own for a regression task in which the network must predict the coordinates I am looking at on my screen based on an input image taken through my ...

Question 13

I trained my first model using yolov8m.pt. Now I want to use best.pt from the first model to train the second model with new class data (not including first model data). This second class needs to ...

Question 14

I'm looking for an image dataset that has multiple images per instance. For example, a healthcare dataset where each person is classified with a diagnosis and has several images describing them.

Question 15

I'm trying to automate a process where someone has to tag when an animal jumps from one platform to another platform. Currently, a manual review of the video is done to note at which frame the animal ...

Does object aspect ratio affect our resize policy?

YOLO knowledge distillation (11x to 11n) yields poorer performance than native training

Can I train what background is in a picture?

DenseNet169 model accuracy not increasing on medical classification dataset

Human detection using thermal imaging camera and machine learning on Raspberry Pi

How to normalize bounding box sizes in perspective transform for objects at different distances from the camera

Need support to straighten and crop image properly for requirement in computer vision

How to properly implement and debug RPN anchors in ResNet-18 for multi-object detection?

Validation metrics plateau from the first few epochs at relatively good values and don't improve

How can I convert a one-line substation schema image into XML/JSON with all components and connections preserved?

Persistent 6D Rotation Representation Collapse to near-zero magnitudes in sequential camera rotation estimation

CNN for gaze regression predicts near the mean

Is it possible for YOLO to remember previous weights data if I only train new class data using previous weights?

Looking for images dataset with multiple images per instance

How do you create a model to track where a specific event starts and ends within a dataset?
