Buffer overrun causes continuous warnings

JackG
Posts: 5
Joined: Sat Oct 18, 2008 10:42 pm
Location: California

Post by JackG »

I'm running 1.23.3 on Gentoo. This is my first crack at ZM, so I'm using a cheesy IBM USB 1.1 webcam with the ibm-cam driver. It works well, with very low load and all the features I want, for a few hours. Then something bad happens: the system locks up with a busy disk. After a reboot (even logging in on the console times out), I see an endless stream of these in my syslog:

Oct 19 03:15:40 statest zma_m1[9587]: WAR [Approaching buffer overrun, consider slowing capture, simplifying analysis or increasing ring buffer size]
Oct 19 03:15:40 statest zma_m1[9587]: WAR [Approaching buffer overrun, consider slowing capture, simplifying analysis or increasing ring buffer size]
Oct 19 03:15:40 statest zma_m1[9587]: WAR [Approaching buffer overrun, consider slowing capture, simplifying analysis or increasing ring buffer size]
Oct 19 03:15:40 statest zma_m1[9587]: WAR [Approaching buffer overrun, consider slowing capture, simplifying analysis or increasing ring buffer size]
Oct 19 03:15:40 statest zma_m1[9587]: WAR [Approaching buffer overrun, consider slowing capture, simplifying analysis or increasing ring buffer size]
Oct 19 03:15:41 statest zma_m1[9587]: WAR [Approaching buffer overrun, consider slowing capture, simplifying analysis or increasing ring buffer size]
Oct 19 03:15:41 statest zma_m1[9587]: WAR [Approaching buffer overrun, consider slowing capture, simplifying analysis or increasing ring buffer size]
Oct 19 03:15:41 statest zma_m1[9587]: WAR [Approaching buffer overrun, consider slowing capture, simplifying analysis or increasing ring buffer size]

As you can see, they are coming in at a rate of at least five per second, and that is the problem. The system is so busy that I can't even get a console prompt to deal with it; I have to reboot. I'm sure I can increase the buffer size or whatever to avoid the underlying cause, but I can't, because zma is reporting it so quickly.
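
For anyone wanting to measure the rate rather than eyeball it, here is a minimal sketch that counts overrun warnings per second. It assumes the classic syslog line format shown in the excerpt above; the function name `warnings_per_second` is made up for illustration and is not part of ZoneMinder.

```python
import re
from collections import Counter

# Matches a classic syslog timestamp, a host name, and a zma overrun warning.
# The pattern is an assumption based on the excerpt quoted in this post.
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+zma_m\d+\[\d+\]:\s+"
    r"WAR\s+\[Approaching buffer overrun"
)

def warnings_per_second(lines):
    """Map each one-second timestamp to the number of overrun warnings seen."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

MESSAGE = ("statest zma_m1[9587]: WAR [Approaching buffer overrun, consider "
           "slowing capture, simplifying analysis or increasing ring buffer size]")
# The eight lines from the excerpt: five at 03:15:40 and three at 03:15:41.
SAMPLE = ["Oct 19 03:15:40 " + MESSAGE] * 5 + ["Oct 19 03:15:41 " + MESSAGE] * 3

for stamp, count in warnings_per_second(SAMPLE).items():
    print(stamp, count)
```

Fed a whole log file instead of the sample, the same function would show whether the rate is steady or comes in bursts.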

It would seem that a good fix would be to rate-limit warnings of performance lag. But where should that happen? In the warning subsystem, or at the point of warning? I don't know ZM's architecture well enough.
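
To make the idea concrete, here is a rough sketch of rate limiting at the point of warning. It is written in Python purely for illustration and is not ZoneMinder code; the class name `RateLimitedWarner`, the `emit` callback and the 10 second interval are all assumptions.

```python
import time

class RateLimitedWarner:
    """Emit a warning at most once per min_interval seconds (hypothetical helper)."""

    def __init__(self, emit, min_interval=10.0, clock=time.monotonic):
        self.emit = emit                  # callable that actually logs the message
        self.min_interval = min_interval  # minimum seconds between emitted warnings
        self.clock = clock                # injectable clock, handy for testing
        self._last = None                 # time of the last emitted warning
        self._suppressed = 0              # warnings dropped since the last emit

    def warn(self, message):
        """Return True if the message was emitted, False if it was suppressed."""
        now = self.clock()
        if self._last is not None and now - self._last < self.min_interval:
            self._suppressed += 1
            return False
        if self._suppressed:
            message = f"{message} ({self._suppressed} similar warnings suppressed)"
        self.emit(message)
        self._last = now
        self._suppressed = 0
        return True
```

Doing the same thing inside the warning subsystem would cover every caller at once, at the cost of having to key the limiter on the message text or call site.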

Is there a bug tracking database? I would expect this problem exists in other parts of the system.
zoneminder
Site Admin
Posts: 5215
Joined: Wed Jul 09, 2003 2:07 pm
Location: Bristol, UK

Post by zoneminder »

There was a bug in zma which meant that buffer overruns were incorrectly reported continuously when the index happened to be 0, I think. I don't know if that is what you are seeing, though.

You can easily change the number of buffers in the web GUI, in the buffers section, and I would suggest trying that, especially if you have it at less than 20.

There is something odd about the messages arriving at such a rate, though; that is highly unusual. I wonder if the USB cam driver is doing something odd. However, 5 of those messages should not affect your system to the point of it locking up; that also indicates something else is going on.
Phil
JackG
Posts: 5
Joined: Sat Oct 18, 2008 10:42 pm
Location: California

Post by JackG »

zoneminder wrote: There was a bug in zma which meant that buffer overruns get incorrectly reported continuously when the index happened to be 0 I think. I don't know if that is what you are seeing though.

Hard to test for, but that sounds like the failure mode. Is there a simple patch? What should I look for in the devel code to diff against?

zoneminder wrote: There is something odd in the messages arriving at such a rate though, that is highly unusual. I wonder if the USB cam driver is doing something odd. However 5 of those messages should not affect your system to the point of it locking up, that also indicates something else is going on.

It was five A SECOND until I reset the box. It was doing nothing but filling up /var/log, couldn't even get a prompt. Happened several times.
galphanet
Posts: 1
Joined: Sat Jan 03, 2009 7:11 pm
Location: Switzerland

Post by galphanet »

Hello,
I have exactly the same issue: I have to hard-reboot the server every two hours. No SSH, no login (timeout).

ZoneMinder version 1.23.3 installed from a .deb
Debian Lenny (5)
Dell PowerEdge SC1425
with 4 GB RAM and 2x 146 GB SCSI
My load average is around 2

5 IP cams at 640x480, colour, 4 in Modect mode and one in Monitor mode.
Buffer size: 150
In sysctl.conf:
kernel.shmall = 201326592
kernel.shmmax = 201326592
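
As a rough sanity check on those numbers, the ring buffer for one monitor can be estimated as width x height x bytes per pixel x buffer count. The sketch below assumes 24-bit colour (3 bytes per pixel) and ignores any per-monitor overhead ZoneMinder adds on top of the raw frames, so treat the result as a lower bound and not an exact figure.

```python
def ring_buffer_bytes(width, height, bytes_per_pixel, frames):
    """Raw image bytes held in one monitor's ring buffer (overhead ignored)."""
    return width * height * bytes_per_pixel * frames

SHMMAX = 201326592  # kernel.shmmax from sysctl.conf above, in bytes

# 640x480 at an assumed 3 bytes per pixel, with a buffer size of 150 frames.
per_monitor = ring_buffer_bytes(640, 480, 3, 150)
all_monitors = 5 * per_monitor

print(per_monitor)            # 138240000 bytes per monitor
print(per_monitor <= SHMMAX)  # True: one segment fits under shmmax
print(all_monitors)           # 691200000 bytes across the five cameras
```

Note that on Linux kernel.shmall is counted in pages rather than bytes, so the shmall value above is far larger than the five buffers need. On this estimate the shared memory limits themselves do not look like the bottleneck.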

The first application to lock up is bind9.
I've changed the buffer size (originally 20), so it now locks up after two hours instead of after five minutes.
After this change, there are no more "WAR [Approaching buffer overrun, consider slowing capture, simplifying analysis or increasing ring buffer size]" or "'zmc -m 3' exited abnormally, exit status 255" messages in the logs, but the server still freezes.

I've tried deleting all zones, but without success.

I also see these messages from the kernel in the syslog:
kernel: [163199.094983] INFO: task cron:25317 blocked for more than 120 seconds.
kernel: [163199.095000] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Various tasks are blocked, such as cron, mount, squeezecenter, postfix and so on, but only when zm is running.
I don't understand where the problem is, because I had been running zm for a year without problems on this server. I had to re-install it last week, and the problem began then.

So what can I do to fix this big problem?

Many thanks for your help!