mlapi image resizing curiosity

ibrewster
Posts: 31
Joined: Sat Aug 31, 2019 4:18 pm

mlapi image resizing curiosity

Post by ibrewster »

Sorry if this is the wrong place, but I can't find any other support info for mlapi/zmeventserver questions.

At any rate, in the latest version of mlapi (and, presumably, zmeventserver), I noticed that in the config it notes that the image will be internally resized to 416x416. This made me wonder about a couple of things:

1) Since that is square, but my input images are not, is the input image distorted on this resize? If so, wouldn't that affect recognition accuracy? Or is it simply resized to be no more than 416x416, while keeping the aspect ratio?

2) Since the image is going to be resized internally, would it be better to enable the resize_image option in the objectconfig.ini? Or is said option no longer relevant since it is resizing internally?

3) Why so small? Does image recognition work better on small images?
asker
Posts: 1553
Joined: Sun Mar 01, 2015 12:12 pm

Re: mlapi image resizing curiosity

Post by asker »

The reason for 416x416 is that the default Yolo weights are trained on 416x416 images. I believe there are also weights trained at 608x608 available somewhere.

Note that training on much larger images takes a huge amount of time and, as I understand it, gives diminishing returns past some point. A lot of weights are actually trained on even smaller images, at 256x256. Don't think of these as purely visual (i.e. the bigger the better).

Yolo maintains aspect ratio: the default weights use 416x416, so if the resized image doesn't fill that square, the rest is padded.
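To make the padding idea concrete, here is a rough sketch of the arithmetic behind an aspect-preserving resize ("letterboxing") into a 416x416 input. The function name `letterbox_geometry` is made up for illustration and is not part of mlapi, pyzm or darknet. Whether a given pipeline actually pads or just stretches depends on how it feeds the network, so treat this as an illustration of the behaviour described above, not as what the code definitely does.

```python
def letterbox_geometry(src_w, src_h, dst_w=416, dst_h=416):
    """Hypothetical helper: fit a src_w x src_h image inside dst_w x dst_h
    without distortion and report the padding needed on each side.

    Returns (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom).
    """
    if src_w <= 0 or src_h <= 0:
        raise ValueError("source dimensions must be positive")

    # Pick the limiting side with integer math to avoid float rounding issues.
    if dst_w * src_h <= dst_h * src_w:
        # Width is the limiting side (e.g. a landscape camera frame).
        new_w = dst_w
        new_h = max(1, round(src_h * dst_w / src_w))
    else:
        # Height is the limiting side (e.g. a portrait frame).
        new_h = dst_h
        new_w = max(1, round(src_w * dst_h / src_h))

    # Whatever the scaled image does not cover is filled with padding,
    # split as evenly as possible between the two sides.
    pad_x = dst_w - new_w
    pad_y = dst_h - new_h
    pad_left = pad_x // 2
    pad_top = pad_y // 2
    return new_w, new_h, pad_left, pad_top, pad_x - pad_left, pad_y - pad_top


if __name__ == "__main__":
    # A 1920x1080 frame scaled to fit inside 416x416.
    print(letterbox_geometry(1920, 1080))
```

For a 1920x1080 frame the scaled image is 416x234, so only a bit more than half of the square holds picture data and the remainder is padding above and below.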

There's a lot of black magic (to me) in these models and how they learn/get accuracy. I understand some of it, most of it I don't and this is an area of continuous research. If you want to dive into details, you're going to have to read the research papers and understand the math. I don't dive in that deep. There are also many online discussions on stackoverflow/darknet/tensorflow repos you can follow.

This brings me to: I really don't think the resize option in object config is useful, since darknet already resizes it to 416x416 no matter what we resize it to.
I no longer work on zmNinja, zmeventnotification, pyzm or mlapi. I may respond on occasion based on my available time/interest.
