How to normalize bounding box sizes in perspective transform for objects at different distances from the camera

Question

I'm working on an object detection system and I'm new to this field. Everything here is described from the camera's point of view. When an object is detected far from the camera, it appears small and its bounding box is small. When the same object comes closer to the camera, it appears bigger and its bounding box is bigger. I want to normalise the bounding boxes so that objects of the same physical size have similar dimensions, regardless of their distance from the camera.

Below is a diagram that might explain it better:

In the above diagram, PQRS is the camera view, and the camera is at a certain height above the ground. Obj 1 is an object that starts at position Pos A, where it appears small and its bounding box is small. When the same object travels to position Pos B after some frames (sorry for mentioning sec/mins in the diagram), it appears bigger than it did at Pos A.

Here is my real question:

How should I normalise the bounding box sizes? Are there standard techniques such as homography, perspective warping, or others to handle this?

Are there pre-built libraries that can be used to normalise the bounding boxes?

Thank you for your guidance and feedback in advance.

Either you have to know the distance from the camera, or you can use a plausible default size per object.
– Nikos M., Commented May 14 at 13:56
@NikosM. I don't know the distance of the object from the camera, and a plausible default size won't work because the objects are not the same every time. It's a real-time system.
– Basavaraj Kittali, Commented May 15 at 6:14
@NikosM. I'll consider your points below. I want an accurate, real-time solution rather than one based on assumptions; that would help me a lot. I have searched for white papers and conference papers but had no luck, so if you have any reference links for this problem that could give me some idea, that would help me a lot.
– Basavaraj Kittali, Commented May 16 at 6:35

Nikos M. · Accepted Answer · 2025-05-15 13:53:38Z

Determining the actual size of an object from an image, without extra information or some assumptions, does not have a unique solution.

Here are some ways to tackle this problem:

If the distance from the camera is known or can be estimated, it can be used (with the projection formula) to determine the actual size.
If the object belongs to a class of known physical objects (e.g. a car), a common/default size can be used as a reference.
Otherwise, an arbitrarily chosen reference size can be used, and all detected objects can be projected relative to it.
Combinations of the above.
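
As a rough illustration of the first three options, here is a minimal sketch that assumes a simple pinhole camera model with the focal length expressed in pixels. The `Box` type, the function names, and all numbers are hypothetical and chosen for illustration only; they do not come from any particular library.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box in pixels (hypothetical helper type)."""
    x: float
    y: float
    w: float
    h: float


def real_size(box: Box, distance_m: float, focal_px: float) -> tuple[float, float]:
    """Pinhole projection: pixel_size = focal_px * real_size / distance,
    so real_size = pixel_size * distance / focal_px."""
    scale = distance_m / focal_px
    return box.w * scale, box.h * scale


def distance_from_known_height(box: Box, known_height_m: float, focal_px: float) -> float:
    """Invert the same formula when the object's class gives a default height."""
    return focal_px * known_height_m / box.h


def normalize_to_reference(box: Box, distance_m: float, ref_distance_m: float) -> tuple[float, float]:
    """Rescale the box to the pixel size it would have at a chosen reference distance."""
    s = distance_m / ref_distance_m
    return box.w * s, box.h * s


# Assumed values: focal length of 1000 px, same object seen at 20 m and at 5 m.
FOCAL_PX = 1000.0
far_box = Box(x=600, y=300, w=50, h=100)    # object far away (Pos A)
near_box = Box(x=400, y=200, w=200, h=400)  # same object close up (Pos B)

far_size = real_size(far_box, distance_m=20.0, focal_px=FOCAL_PX)
near_size = real_size(near_box, distance_m=5.0, focal_px=FOCAL_PX)
est_distance = distance_from_known_height(far_box, known_height_m=2.0, focal_px=FOCAL_PX)
far_at_ref = normalize_to_reference(far_box, distance_m=20.0, ref_distance_m=10.0)
near_at_ref = normalize_to_reference(near_box, distance_m=5.0, ref_distance_m=10.0)
```

With these assumed numbers, both boxes map to the same physical size and to the same pixel size at the reference distance, which is the normalisation the question asks for. In practice the distance has to come from somewhere (a depth sensor, camera calibration against the ground plane, or a known object size), which is why the problem has no unique solution without extra information.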
