
Posted: Mon Mar 08, 2010 8:21 pm
by mitch
Install the tool lsof, then run it with the -p option and the PID of the leaking zmwatch (e.g. lsof -p 19213) and pastebin the results. My zmwatch doesn't leak like this, but this should at least list which file handles are leaking.
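For long lsof listings it can help to group the lines before pasting them. The following is only a sketch: the function name summarize_lsof_fd is made up for illustration, and it assumes the default lsof column layout, where the fourth column is FD. Entries whose FD starts with a digit (0u, 3r, ...) are real descriptors; everything else (cwd, txt, mem, ...) is reported under its own label.

```shell
# Summarise `lsof -p PID` output (read from stdin) by the FD column.
# Assumption: default lsof layout, FD is the 4th whitespace-separated field.
summarize_lsof_fd() {
    awk 'NR > 1 {
             # numeric FD entries are descriptors; keep other labels as they are
             key = ($4 ~ /^[0-9]/) ? "descriptor" : $4
             count[key]++
         }
         END { for (k in count) print count[k], k }' | sort -k2
}

# Usage (19213 is the example PID from the post above):
#   lsof -p 19213 | summarize_lsof_fd
```

A large "mem" count next to a small "descriptor" count would point at memory mappings rather than ordinary open files.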

Posted: Mon Mar 08, 2010 8:30 pm
by mastertheknife
There is an easier way:

Code: Select all

ls /proc/PID/fd | wc -w
mastertheknife.

Posted: Mon Mar 08, 2010 8:32 pm
by mitch
wc -w prints the number of words and is absolutely useless. But ls -lah /proc/PID/fd will work just fine also.
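Both suggestions can be combined in a small helper. This is a rough sketch, not something from ZoneMinder: count_fds, fd_targets and the PROC_ROOT variable are names invented here. Since the entries in /proc/PID/fd are plain numbers, counting lines (or words) gives the number of open descriptors, and readlink shows what each one points at.

```shell
# Number of open descriptors of a process.
# PROC_ROOT is a hypothetical override, only there so the functions can be
# tried against a copy of a /proc tree; it defaults to /proc.
count_fds() {
    proc=${PROC_ROOT:-/proc}
    ls "$proc/$1/fd" | wc -l
}

# Targets of all open descriptors, most frequent first.
fd_targets() {
    proc=${PROC_ROOT:-/proc}
    for fd in "$proc/$1"/fd/*; do
        # descriptors can disappear while we iterate, so ignore errors
        readlink "$fd" 2>/dev/null
    done | sort | uniq -c | sort -rn
}

# Usage:
#   count_fds 19213
#   fd_targets 19213
```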

Posted: Mon Mar 08, 2010 8:47 pm
by maufacc
mastertheknife wrote:There is an easier way:

Code: Select all

ls /proc/PID/fd | wc -w
mastertheknife.

ps -ef | grep zmwatch => 16150


ls /proc/16150/fd | wc -w => 7
lsof -p 16150 | wc -l => 40

cat /proc/sys/fs/file-nr
7264 0 42438

then kill -9 16150

cat /proc/sys/fs/file-nr
6400 0 42438

As I read, but do not fully understand, there is a difference between file handles and file descriptors, and between open files (lsof) and allocated file descriptors

http://linux.derkeiler.com/Newsgroups/c ... /0186.html
http://linux.derkeiler.com/Newsgroups/c ... /0397.html
http://www.netadmintools.com/part295.html

Thank you for your prompt response.
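For reference, the three numbers in /proc/sys/fs/file-nr are the allocated file handles, the allocated but unused handles, and the system-wide maximum. As far as I understand it, a handle stays allocated while a memory mapping still refers to the file, even after the descriptor itself has been closed, which would explain why /proc/PID/fd looks small while file-nr keeps growing. A minimal sketch for reading the values (file_nr_usage is a made-up name, and the optional path argument exists only so the function can be tried on a saved copy):

```shell
# Print file handle usage from a file-nr style file.
# Fields: allocated handles, allocated-but-unused handles, maximum.
file_nr_usage() {
    file=${1:-/proc/sys/fs/file-nr}
    read -r allocated unused max < "$file"
    echo "allocated=$allocated unused=$unused max=$max used_pct=$((allocated * 100 / max))"
}

# Usage:
#   file_nr_usage
```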

Posted: Fri Apr 23, 2010 12:47 am
by christiangraves
Hello, I am experiencing identical errors. I followed maufacc's steps and killing zmwatch brought the system back to life.

Code: Select all

root@camera:/etc# ls /proc/1334/fd | wc -w
8
root@camera:/etc# lsof -p 1334 | wc -l
42
root@camera:/etc# cat /proc/sys/fs/file-nr
65664   0       65535
root@camera:/etc# kill -9 1334
root@camera:/etc# cat /proc/sys/fs/file-nr
992     0       65535
Has anyone found a fix for this problem? If not, would a good workaround be to restart ZoneMinder daily with cron?

Thank you for the help!
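Instead of a fixed daily restart, a cron job could restart only when the handle count gets close to the limit. This is just a sketch of that idea under assumptions: handles_above is a hypothetical helper, the 80 percent threshold is arbitrary, and the init script path is the one used elsewhere in this thread and may differ on your distribution.

```shell
# Succeeds when the allocated handles reach the given percentage of the maximum.
# $1: threshold in percent, $2: optional path to a file-nr style file.
handles_above() {
    threshold=$1
    file=${2:-/proc/sys/fs/file-nr}
    read -r allocated _unused max < "$file"
    [ $((allocated * 100 / max)) -ge "$threshold" ]
}

# Example use from a root cron job (path and threshold are assumptions):
#   handles_above 80 && { /etc/init.d/zm stop; /etc/init.d/zm start; }
```

With the numbers posted above, the check would trigger before the kill (65664 of 65535) and not after it (992 of 65535). It only works around the leak and does not fix it.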

error

Posted: Fri Apr 23, 2010 5:57 am
by johnnytolengo
Hi all, same error here. The only way around it is to delete the cookies from the browser and log in again.

JT.

Posted: Fri Apr 23, 2010 6:22 am
by MarcoP
The following patch helped me a lot

http://www.zoneminder.com/forums/viewto ... 1462#61462

If you use this patch, could you please give feedback too?

Posted: Fri Apr 23, 2010 9:56 pm
by jfkastner
saw your post about apache hangs - this is the same issue i guess?!

gotta be some apache problem though:

i can still SSH into my ZM machine, and use webmin (as root and also with the same account as ZM), as well as xterm etc -> the machine is NOT locked up, still recording etc, it is ONLY apache that stops responding (and therefore there have to be 'free' file handles/sockets etc)

also if you try ZM from a third machine you got the same problem (no apache response) as from the ZM server itself or your other client -> can't be a FF issue then (or sockets) on the client

guess it's difficult since there are so many apache versions/patch levels on so many OSs, but this thread (and the other) had some 16k views so this seems problematic to many others, hope we get this solved!

Posted: Tue Apr 27, 2010 6:36 pm
by fleed
Another me too on this one. As soon as I kill zmwatch.pl the number of files used goes back to normal.

Code: Select all

eletromidia:~# cat /proc/sys/fs/file-nr 
1632	0	51140
eletromidia:~# kill 11613
eletromidia:~# cat /proc/sys/fs/file-nr 
1280	0	51140
eletromidia:~# cat /proc/sys/fs/file-nr 
1280	0	51140
eletromidia:~# /etc/init.d/zm stop
Stopping ZoneMinder: success

eletromidia:~# cat /proc/sys/fs/file-nr 
1056	0	51140
eletromidia:~# /etc/init.d/zm start
Starting ZoneMinder: success

eletromidia:~# cat /proc/sys/fs/file-nr 
1280	0	51140
It seems to me to be more related to zoneminder itself than apache. If it's relevant, I have mmap on... just a hunch, maybe it leaves a file handle open each time it shows a stream?

Other relevant details:
* I can ssh into the machine as root.
* Machine feels normal, no high cpu usage.
* Cannot change from root to a normal user. Get too many open files error message.

Posted: Tue Apr 27, 2010 9:54 pm
by fleed
I think the problem is really related to mmap.

Have a look at /proc/$PID/maps and see a lot of references to mmap files:

Code: Select all

eletromidia:/proc/12407# ps -ef | grep zmwatch
www-data 12407 12358  0 12:25 pts/0    00:00:17 /usr/bin/perl -wT /usr/local/bin/zmwatch.pl
root     17920 11144  0 15:50 pts/0    00:00:00 grep zmwatch
eletromidia:/proc/12407# grep mmap /proc/12407/maps | head -10
b58e8000-b58e9000 rw-s 00000000 00:10 154456     /dev/shm/zm.mmap.5
b58e9000-b58ea000 rw-s 00000000 00:10 154339     /dev/shm/zm.mmap.4
b58ea000-b58eb000 rw-s 00000000 00:10 154330     /dev/shm/zm.mmap.3
b58eb000-b58ec000 rw-s 00000000 00:10 154329     /dev/shm/zm.mmap.2
b58ec000-b58ed000 rw-s 00000000 00:10 154328     /dev/shm/zm.mmap.1
b58ed000-b58ee000 rw-s 00000000 00:10 154456     /dev/shm/zm.mmap.5
b58ee000-b58ef000 rw-s 00000000 00:10 154339     /dev/shm/zm.mmap.4
b58ef000-b58f0000 rw-s 00000000 00:10 154330     /dev/shm/zm.mmap.3
b58f0000-b58f1000 rw-s 00000000 00:10 154329     /dev/shm/zm.mmap.2
b58f1000-b58f2000 rw-s 00000000 00:10 154328     /dev/shm/zm.mmap.1
eletromidia:/proc/12407# grep mmap /proc/12407/maps | wc -l
6125
so, 6125 references to mmap in zmwatch.

And it just keeps going up...
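To see how often each file is mapped, the maps output can be grouped per path. A small sketch (mmap_refs is a name invented here; it assumes the usual maps layout where the sixth field is the path):

```shell
# Count how many times each zm.mmap file appears in a maps file.
# $1: path to a maps file, e.g. /proc/12407/maps
mmap_refs() {
    maps=${1:-/proc/self/maps}
    awk '$6 ~ /zm\.mmap/ { count[$6]++ }
         END { for (f in count) print count[f], f }' "$maps" | sort -k2
}

# Usage:
#   mmap_refs /proc/12407/maps
```

Running it twice a few minutes apart should show whether the per-file counts keep growing.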

Posted: Thu Apr 29, 2010 12:51 pm
by jfkastner
today's tests on a ubu 8.04 client with FF 3.0.19 ran for >3 hours WITHOUT problems (in montage view - that's where all my problems came from!)

ZM server went up to 2000 open files and stayed there and never stalled (which is ubu 9.04 with zm 1.24.2)

BTW i do not use mapped memory but the 'old style' shared mem ...

Posted: Fri Apr 30, 2010 3:33 pm
by jfkastner
on other clients than the above i found this helpful:

/etc/php5/apache2/php.ini settings (on the ZM server) changed to

max_execution_time = 120
max_input_time = 120
memory_limit = 19M
post_max_size = 12M

just to give PHP a bit more room to handle the picture/event data - maybe it's not apache freezing after all ?

sure this is trial and error but i'm getting somewhere ... BTW this is the PHP that came with ubu 9.04

NEWS EDIT: my FF37 client ran for more than 8 hours on montage w/o a stall - before the PHP change it ran only 5 minutes max (with 7 cams)
maybe PHP could not allocate any more memory and then it stalled apache

could someone else please try this out and let us know!
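To check which of these values are currently active in the file before and after editing, something like the following may do. It is only a sketch: show_php_limits is a made-up name and the default path is the one from the post above.

```shell
# Print the active (uncommented) settings discussed above from a php.ini file.
show_php_limits() {
    ini=${1:-/etc/php5/apache2/php.ini}
    grep -E '^[[:space:]]*(max_execution_time|max_input_time|memory_limit|post_max_size)[[:space:]]*=' "$ini"
}

# Usage:
#   show_php_limits
```

With mod_php, Apache has to be restarted before changed values take effect.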

Re: Maxopenfile reached & machine hangs zm problem ?

Posted: Sun Jun 26, 2011 9:35 am
by Maklaut

Re:

Posted: Sun Jun 26, 2011 9:38 am
by mastertheknife
fleed wrote:I think the problem is really related to mmap.

Have a look at /proc/$PID/maps and see a lot of references to mmap files:

Code: Select all

eletromidia:/proc/12407# ps -ef | grep zmwatch
www-data 12407 12358  0 12:25 pts/0    00:00:17 /usr/bin/perl -wT /usr/local/bin/zmwatch.pl
root     17920 11144  0 15:50 pts/0    00:00:00 grep zmwatch
eletromidia:/proc/12407# grep mmap /proc/12407/maps | head -10
b58e8000-b58e9000 rw-s 00000000 00:10 154456     /dev/shm/zm.mmap.5
b58e9000-b58ea000 rw-s 00000000 00:10 154339     /dev/shm/zm.mmap.4
b58ea000-b58eb000 rw-s 00000000 00:10 154330     /dev/shm/zm.mmap.3
b58eb000-b58ec000 rw-s 00000000 00:10 154329     /dev/shm/zm.mmap.2
b58ec000-b58ed000 rw-s 00000000 00:10 154328     /dev/shm/zm.mmap.1
b58ed000-b58ee000 rw-s 00000000 00:10 154456     /dev/shm/zm.mmap.5
b58ee000-b58ef000 rw-s 00000000 00:10 154339     /dev/shm/zm.mmap.4
b58ef000-b58f0000 rw-s 00000000 00:10 154330     /dev/shm/zm.mmap.3
b58f0000-b58f1000 rw-s 00000000 00:10 154329     /dev/shm/zm.mmap.2
b58f1000-b58f2000 rw-s 00000000 00:10 154328     /dev/shm/zm.mmap.1
eletromidia:/proc/12407# grep mmap /proc/12407/maps | wc -l
6125
so, 6125 references to mmap in zmwatch.

And it just keeps going up...

This issue (zmwatch.pl doesn't free file handles) was fixed in ZM 1.24.4.

mastertheknife