Questions tagged [computer-vision]
Computer vision is a subfield of computer science that deals with analyzing and understanding images. This includes detecting objects, such as faces, in images and segmenting images.
610 questions
0 votes
0 answers
12 views
Does object aspect ratio affect our resize policy?
Is object aspect ratio truly important for resize robustness, or is this suggestion based on a misunderstanding — e.g., treating a very wide object as if it is “bigger” or “pixel-richer” than a square ...
0 votes
1 answer
144 views
YOLO knowledge distillation (11x to 11n) yields poorer performance than native training
I'm trying to distill a YOLO11x detection model into a YOLO11n for inference speed improvements without sacrificing too much detection performance. For this, I just overloaded some functions in the ...
2 votes
1 answer
54 views
Can I train a model to learn what the background is in a picture?
I have ~1,000 pictures like this. I really want the long, thin rock core in the middle, but they all differ (slightly) in angle, have different lengths, shaped ends and rock colours vary. I tried ...
2 votes
0 answers
62 views
DenseNet169 model accuracy not increasing on medical classification dataset
I am training a DenseNet model on a medical dataset with gold-standard annotations. After training, I noticed the accuracy is just 60%. I then made the following changes, but still no luck. ...
2 votes
0 answers
39 views
Human detection using a thermal imaging camera and machine learning on Raspberry Pi
I'm working on a Raspberry Pi 4–based project involving the MLX90640 thermal camera breakout. The camera outputs a thermal heat map (a low-resolution infrared image of 32x24 pixels). My goal is to ...
5 votes
1 answer
99 views
How to normalize bounding box sizes in perspective transform for objects at different distances from the camera
I’m working on an object detection system and I’m new to this field. Here I’m speaking from the camera's point of view. When an object far from the camera is detected, it appears small and ...
1 vote
1 answer
66 views
Need help straightening and cropping images properly for a computer vision requirement
My requirement: extract license plates without duplicates and store the images in a folder, then apply OCR to extract text from the images. What I have achieved: I am able to detect license plates ...
0 votes
0 answers
38 views
How to properly implement and debug RPN anchors in ResNet-18 for multi-object detection?
I am working on my first object detection project and need to implement multi-object detection using ResNet-18 (I am restricted to using this architecture). My dataset follows the COCO format and ...
0 votes
0 answers
36 views
Validation metrics plateau from the first few epochs at relatively good values and don't improve
I am working on 6D pose tracking, where the goal is to estimate how the 3D position and orientation of an object change from frame t-1 to t. Train/validation datasets are synthetic and come from a single ...
0 votes
0 answers
51 views
How can I convert a one-line substation schema image into XML/JSON with all components and connections preserved?
I have an image of a one-line substation schema diagram that includes various components (like transformers, circuit breakers, etc.) and the connections between them. I’m looking for a way to convert ...
1 vote
0 answers
24 views
Persistent collapse of 6D rotation representation to near-zero magnitudes in sequential camera rotation estimation
I am using a 6D continuous rotation representation (e.g., two orthogonal vectors from a 3×3 rotation matrix) to predict camera rotations in panoramic video sequences. Since panoramic videos involve ...
1 vote
0 answers
42 views
CNN for gaze regression predicts near the mean
I am currently building my first CNN on my own for a regression task in which the network must predict the screen coordinates I am looking at, based on an input image taken through my ...
0 votes
0 answers
30 views
Is it possible for YOLO to remember previous weights if I train only on new class data using the previous weights?
I trained my first model using yolov8m.pt. Now I want to use best.pt from the first model to train the second model with new class data (not including the first model's data). This second class needs to ...
0 votes
1 answer
32 views
Looking for an image dataset with multiple images per instance
I'm looking for an image dataset that has multiple images per instance, for example a healthcare dataset in which each person is labeled with a diagnosis and has several images describing them.
0 votes
0 answers
39 views
How do you create a model to track where a specific event starts and ends within a dataset?
I'm trying to automate a process where someone has to tag when an animal jumps from one platform to another platform. Currently, a manual review of the video is done to note at which frame the animal ...