Signal 11 - segmentation fault errors - backtrace included!

Forum for questions and support relating to the 1.26.x releases only.
McFuzz
Posts: 181
Joined: Tue Aug 28, 2012 7:03 am

Post by McFuzz »

Hi all,

I have a brand new install using Debian 7, MySQL 5.5, ffmpeg 1.0.8 (though I doubt any of this matters). For all intents and purposes, everything seems to work fine, but randomly I get the following errors:

[Image: screenshot of the errors, not preserved]

I ran the suggested backtrace command, addr2line -e /usr/local/bin/zmc 0x4b5789 0x7ffdccfd3030 0x7ffdcc076465 0x44b97e 0x4e8c8c 0x4e6edf 0x4e5ab2 0x4e5e63 0x44187a 0x4805e9 0x46591a 0x408b84 0x7ffdcbf72ead 0x409201, and here is the result:

/usr/src/ZoneMinder-1.26.4/src/zm_signal.cpp:89
??:0
??:0
/usr/src/ZoneMinder-1.26.4/src/zm_jpeg.cpp:277
jdmarker.c:0
jdinput.c:0
??:0
??:0
/usr/src/ZoneMinder-1.26.4/src/zm_image.cpp:911
/usr/src/ZoneMinder-1.26.4/src/zm_remote_camera_http.cpp:1105
/usr/src/ZoneMinder-1.26.4/src/zm_monitor.cpp:2675
/usr/src/ZoneMinder-1.26.4/src/zmc.cpp:265
??:0
??:0


At first glance it does not seem to be affecting anything: the cameras appear to work just fine. Still, it is a bit disconcerting, considering I have two deployments that behave the same way.

Any ideas?
McFuzz
Posts: 181
Joined: Tue Aug 28, 2012 7:03 am

Post by McFuzz »

Ahhh! It looks like one of the triggers of the segmentation fault is stopping the ZoneMinder service via the CLI...
mikb
Posts: 604
Joined: Mon Mar 25, 2013 12:34 pm

Post by mikb »

From a programming point of view, extracting -1 bytes from a buffer is probably not going to work!

It looks like the "interrupted system call" is returning a result of -1, which conventionally means "error", rather than 0 (nothing read, but okay) or, e.g., 157 (157 bytes read).

Then something else is trying to get that -1 bytes. Walking backwards off an array. Bang!

It is definitely affecting *something* as zmc (capture process) is crashing, but you probably won't notice if you were in the process of shutting down ZM. Even if you weren't shutting it down, the capture processes are tenacious and keep restarting, which would hide the problem somewhat.
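
To illustrate the point about -1 being an error code rather than a byte count, here is a minimal sketch. ByteBuffer, extract and consume are hypothetical names made up for this example; this is not ZoneMinder's buffer code, just the general idea of refusing a negative count before touching the array.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical buffer type for illustration; not ZoneMinder's Buffer class.
struct ByteBuffer {
    std::vector<unsigned char> data;

    // Copies `count` bytes from the front into `out` and removes them.
    // Returns the number of bytes extracted, or -1 if `count` is invalid.
    long extract(unsigned char* out, long count) {
        if (count < 0 || static_cast<std::size_t>(count) > data.size()) {
            return -1;  // refuse instead of walking off the array
        }
        std::copy_n(data.begin(), count, out);
        data.erase(data.begin(), data.begin() + count);
        return count;
    }
};

// Hypothetical helper: treats a read-style result as an error code first,
// and only as a byte count once it is known to be non-negative.
long consume(ByteBuffer& buf, unsigned char* out, long read_result) {
    if (read_result < 0) {
        return -1;  // error (for example an interrupted call): leave the buffer alone
    }
    return buf.extract(out, read_result);
}
```

With this shape, passing the -1 from an interrupted call to consume leaves the buffer untouched and reports the error to the caller, so there is nothing to crash on.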
McFuzz
Posts: 181
Joined: Tue Aug 28, 2012 7:03 am

Post by McFuzz »

So it looks like I can trigger it every time if I change running states via a scheduled cron job:

Nov 11 17:27:02 BigBrother zmpkg[27474]: INF [Command: state]
Nov 11 17:27:02 BigBrother zmpkg[27474]: INF [Updating DB: nightime]
Nov 11 17:27:02 BigBrother zmdc[26919]: INF ['zmaudit.pl -c' stopping at 13/11/11 17:27:02]
Nov 11 17:27:02 BigBrother zmdc[26919]: INF [Can't find child with pid of '26920']
Nov 11 17:27:02 BigBrother zmdc[26919]: INF ['zmaudit.pl -c' exited, signal 14]
Nov 11 17:27:02 BigBrother zmdc[26919]: INF ['zmfilter.pl ' stopping at 13/11/11 17:27:02]
Nov 11 17:27:02 BigBrother zmdc[26919]: INF ['zmfilter.pl ' exited, signal 14]
Nov 11 17:27:02 BigBrother zmdc[26919]: INF ['zmc -m 1' stopping at 13/11/11 17:27:02]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: INF [Got signal 15 (Terminated), exiting]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Select error: Interrupted system call]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Unable to read content]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: WAR [Attempt to extract -1 bytes of buffer, size is only 14417 bytes]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Got signal 11 (Segmentation fault), crashing]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Signal address is 0x7fa372f15ffd, from 0x7fa36f12a522]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Backtrace 0: /usr/local/bin/zmc() [0x4b5789]]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Backtrace 1: /lib/x86_64-linux-gnu/libpthread.so.0(+0xf030) [0x7fa370087030]]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Backtrace 2: /lib/x86_64-linux-gnu/libc.so.6(+0x122522) [0x7fa36f12a522]]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Backtrace 3: /usr/local/bin/zmc() [0x44b97e]]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Backtrace 4: /usr/local/bin/zmc() [0x4e8c8c]]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Backtrace 5: /usr/local/bin/zmc() [0x4e6edf]]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Backtrace 6: /usr/local/bin/zmc() [0x4e5ab2]]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Backtrace 7: /usr/local/bin/zmc() [0x4e5e63]]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Backtrace 8: /usr/local/bin/zmc() [0x44187a]]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Backtrace 9: /usr/local/bin/zmc() [0x4805e9]]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Backtrace 10: /usr/local/bin/zmc() [0x46591a]]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Backtrace 11: /usr/local/bin/zmc() [0x408b84]]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Backtrace 12: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd) [0x7fa36f026ead]]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: ERR [Backtrace 13: /usr/local/bin/zmc() [0x409201]]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: INF [Backtrace complete, please execute the following command for more information]
Nov 11 17:27:02 BigBrother zmc_m1[26945]: INF [addr2line -e /usr/local/bin/zmc 0x4b5789 0x7fa370087030 0x7fa36f12a522 0x44b97e 0x4e8c8c 0x4e6edf 0x4e5ab2 0x4e5e63 0x44187a 0x4805e9 0x46591a 0x408b84 0x7fa36f026ead 0x409201]
Nov 11 17:27:02 BigBrother zmdc[26919]: ERR ['zmc -m 1' exited abnormally, exit status 11]

Running the addr2line command yields the same output as in the original post. Any devs want to chime in?

Thanks :)
mastertheknife
Posts: 678
Joined: Wed Dec 16, 2009 4:32 pm
Location: Israel

Post by mastertheknife »

I haven't investigated yet, but I think the problem is that the read() syscall returns -1 because it was interrupted, and then zm tries to read -1 bytes and boom.
Kfir Itzhak.
McFuzz
Posts: 181
Joined: Tue Aug 28, 2012 7:03 am

Post by McFuzz »

mastertheknife wrote: I haven't investigated yet, but I think the problem is that the read() syscall returns -1 because it was interrupted, and then zm tries to read -1 bytes and boom.

Any headway on this?

Thanks :)
fastolfe
Posts: 9
Joined: Sun Oct 27, 2013 4:39 pm

Post by fastolfe »

I suspect this was due to the use of an unsigned int rather than an int at zm_remote_camera_http.cpp:1148, but looking at HEAD, this appears to be fixed (commit 5a9364703c983bc7177a306f5446b91854c5dd09 by mastertheknife). I'm not a regular contributor to the project, but I would expect this problem to occur only when the connection is closed or the process is terminated, so a crash at that point shouldn't be particularly problematic.
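
For anyone wondering why the choice between unsigned int and int matters here, a small sketch of the general pitfall. The function names are invented for this example and none of it is taken from zm_remote_camera_http.cpp; failing_read simply stands in for any read-style call that reports failure as -1.

```cpp
#include <climits>

// Converting -1 to unsigned int is well defined and yields the largest value.
static_assert(static_cast<unsigned int>(-1) == UINT_MAX,
              "-1 wraps to UINT_MAX when stored as unsigned");

// Hypothetical stand-in for a read-style call that fails with -1.
int failing_read() { return -1; }

// Result stored in an unsigned variable: the error becomes a huge count.
unsigned int bytes_as_unsigned() {
    unsigned int n = static_cast<unsigned int>(failing_read());
    return n;
}

// Result stored in a signed variable: the error stays visible as -1.
int bytes_as_signed() {
    int n = failing_read();
    return n;
}

// A typical "did we get data?" check behaves differently for the two types.
bool has_data_unsigned() { return bytes_as_unsigned() > 0; }  // true: error mistaken for data
bool has_data_signed() { return bytes_as_signed() > 0; }      // false: error is not data
```

Once the error looks like a very large positive count, any later code that trusts it will read far past the end of the buffer.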
mastertheknife
Posts: 678
Joined: Wed Dec 16, 2009 4:32 pm
Location: Israel

Post by mastertheknife »

Hi,

It's because of the sign mess we had lately. We were attempting to remove all sign-comparison warnings during compilation, so we changed the signedness of many variables, and it seems we messed up a few things.
I fixed it today:
https://github.com/ZoneMinder/ZoneMinde ... 9f78c30abb

However, the shutdown is still not as clean as we would like, because we are not checking whether zm_reload or zm_terminate has been set to true. We need to tackle this someday.

2013-11-24 15:29:58.775598	zmc_m10	26984	ERR	Unable to get response	/home/kfir/zmwork/ZoneMinder/src/zm_remote_camera_http.cpp	1110
2013-11-24 15:29:58.742188	zmc_m10	26984	ERR	Unable to read content	/home/kfir/zmwork/ZoneMinder/src/zm_remote_camera_http.cpp	987
2013-11-24 15:29:58.708980	zmc_m10	26984	ERR	Select error: Interrupted system call	/home/kfir/zmwork/ZoneMinder/src/zm_remote_camera_http.cpp
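
A check of the kind described above might look roughly like the following. This is only a sketch under assumptions: terminate_requested, on_sigterm, WaitResult and classify_wait are made-up names (the flag is merely modelled on the zm_terminate mentioned above), and the real capture loop is not shown. The idea is to treat EINTR from a select()-style call as a stop request when the flag is set, rather than as a read failure.

```cpp
#include <cerrno>
#include <csignal>

// Hypothetical flag set from a signal handler; not ZoneMinder's definition.
volatile std::sig_atomic_t terminate_requested = 0;

// Hypothetical handler that only records the request to stop.
extern "C" void on_sigterm(int) { terminate_requested = 1; }

enum class WaitResult { Ready, Timeout, Retry, Stop, Error };

// Classifies the outcome of a select()-style call from its return value
// `rc` and the errno value `err` captured right after the call.
WaitResult classify_wait(int rc, int err) {
    if (rc > 0) return WaitResult::Ready;     // descriptors are ready
    if (rc == 0) return WaitResult::Timeout;  // nothing happened in time
    if (err == EINTR) {
        // Interrupted by a signal: stop quietly if shutdown was requested,
        // otherwise the call can simply be retried.
        return terminate_requested ? WaitResult::Stop : WaitResult::Retry;
    }
    return WaitResult::Error;                 // a genuine failure
}
```

A capture loop would install the handler with std::signal(SIGTERM, on_sigterm) and leave the loop without logging errors when classify_wait returns WaitResult::Stop.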
Kfir Itzhak.
fastolfe
Posts: 9
Joined: Sun Oct 27, 2013 4:39 pm

Post by fastolfe »

If I may make a suggestion: use signed ints for everything. If you need that one extra bit that you've lost to the sign, it's rare that one additional bit will solve your problem permanently, so just upgrade to a 64-bit signed int. The safety, readability, and maintainability benefits of using consistent types and avoiding the rat's nest of casts almost always justify the loss of a sign bit on a type you never expect to go negative. But that's just my experience.
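
As a small illustration of that suggestion (the function names are hypothetical, chosen only for this example): with a 64-bit signed type, an out-of-range result shows up as a negative number that a simple check can catch, whereas the unsigned version silently wraps around.

```cpp
#include <cstdint>

// Signed 64-bit arithmetic: using more than is available gives a
// negative result that is easy to test for.
std::int64_t remaining_signed(std::int64_t size, std::int64_t used) {
    return size - used;
}

// Unsigned 32-bit arithmetic: the same mistake wraps to a huge value
// that looks like a valid size.
std::uint32_t remaining_unsigned(std::uint32_t size, std::uint32_t used) {
    return size - used;
}

// With signed values, one comparison is enough to reject bad input.
bool has_room(std::int64_t size, std::int64_t used) {
    return remaining_signed(size, used) >= 0;
}
```

For example, remaining_signed(10, 12) is -2, while remaining_unsigned(10, 12) is 4294967294.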