ZM Server Optimization Comments
Posted: Sat Jul 28, 2018 11:18 pm
I seem to remember reading somewhere an online article, most likely written by a ZM programmer, regarding various ways to optimize your ZM server.
Some of the comments that stick in my mind are "more CPU cores than CPU speed" and "don't run a monitoring display screen on the server". I could be wrong.
I originally started my ZM project (7 cameras at 720p and 6 to 10 fps per camera with "motion detection", an inbound aggregate of 12 megabits per second) on a board with a 4 core AMD CPU (actually an APU, which is a CPU + GPU in the same package) with 3+ GHz clock speed. It worked, but the CPU was constantly loaded at something like 40 to 50 percent utilization, maybe more. The board had 16GB of RAM installed. The MySQL DB & ZM "capture" files were broken out to a RAID-1 disk array separate from the OS. My ZM setup is the 1.30.x version as used with Debian "Stretch".
I watched system performance using a tool/application called "Monitorix" that is written in Perl and presents a broad assortment of performance graphs via a web interface. I highly recommend this tool/application (freely downloadable from the web, search for it), but expect it to have a "configuration learning curve". I use its graphs to monitor system and network utilization on all of my Linux systems.
Then I found that article on ZM server optimization while I was trying to improve my "education" on ZoneMinder.
So I bought a Supermicro A1SRM-2758F board; "more CPU cores than raw CPU speed". I like Supermicro boards because they are solid 24x7 "workhorse" boards, and you get what you pay for. That board has an Intel Atom C2758 CPU with 8 cores of "out of order" CPU architecture (most CPUs are OOO architecture, think "branch prediction") in an SoC with 4 Gigabit Ethernet ports and IPMI on its own LAN port. I could easily reuse my current memory since that board does not require ECC memory, although it can use it. Since I am not a "purist" when it comes to ECC versus non-ECC memory on servers, I accepted the risk that memory errors or even database corruption could happen. I would later upgrade that memory to ECC "because I could". Since my system OS was Debian "Stretch" it was easy to move all of the hard disks to the new hardware by adjusting parts of the configuration. It took some adjustments to the "Monitorix" configuration to get my performance monitoring operational again on the new board.
Like the previous AMD APU-based server, this Intel C2758-based server "drove" a monitoring screen that cycled between some of my 7 cameras. That cycling uses a web browser, and running a web browser on a Linux desktop requires a "desktop interface". I use LXDE because it is supposedly "lightweight" compared to desktops like GNOME and KDE. Some might argue "weight" differently, but I am only relating the "marketing" here. Having a desktop meant having a video card in the PC in addition to the normal VGA port that connects to the IPMI/BMC chip; you don't get a graphical desktop on the IPMI/BMC video port. I even set up the desktop to "autologin" and start Firefox at my desired "camera cycle" URL. It all worked fine, even after many updates and reboots, so the overall system configuration was solid.
Over many months of performance measurements I noticed that my average system load was around "3". I could drive it up to "4" or almost "5" if I deleted lots of events (like 100s at a time) via the ZM web interface; makes sense to me. The system easily handled that sort of stress, but I wondered about the load placed on the system by the graphical desktop interface and web browser cycling through camera displays. Here's where "Monitorix" performance data pays off: "before & after" comparisons.
I installed a Raspberry Pi 3B (1GB of onboard RAM) to drive the monitoring display. If you carefully configure your RPi, tune your RPi GPU memory (gpu_mem=128 in my case) and remove all the extraneous stuff from the RPi Linux setup that is not necessary to display camera images on a desktop interface, you actually get decent performance when cycling between 3 cameras. Since the RPi is talking to the ZM server over the network (Gigabit in my case and connected to the same switch as the ZM server), the ZM Apache server is doing the "delivery" work while the RPi is doing the display work. Since I was cycling through 3 cameras, "Monitorix" told me that my network load to the RPi was about 8 Megabits per second. The Supermicro board is commonly used in virtualization & hosting work so this small a load is nothing to it.
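To put the network numbers in perspective, here is a minimal sketch of the arithmetic using only the figures quoted in this post (12 Mbit/s inbound from 7 cameras, about 8 Mbit/s outbound to the RPi for 3 cameras, Gigabit link). The function names are my own for illustration, and the per-camera values are simple averages; individual streams will differ.

```python
def per_camera_mbps(total_mbps, cameras):
    """Average bandwidth per camera stream, in megabits per second."""
    if cameras <= 0:
        raise ValueError("cameras must be positive")
    return total_mbps / cameras


def link_utilization(total_mbps, link_mbps=1000.0):
    """Fraction of the link consumed (default assumes a Gigabit link)."""
    return total_mbps / link_mbps


if __name__ == "__main__":
    # Figures taken from the post; nothing here is measured by this script.
    inbound = per_camera_mbps(12.0, 7)   # cameras -> ZM server
    display = per_camera_mbps(8.0, 3)    # ZM server -> RPi display
    print(f"inbound per camera:  {inbound:.2f} Mbit/s")
    print(f"display per camera:  {display:.2f} Mbit/s")
    print(f"display link share:  {link_utilization(8.0):.1%}")
```

In other words, the display traffic to the RPi is under 1 percent of a Gigabit link, which is why it barely registers on the server.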
Then I started to go through the Debian configuration of the ZM server. I started to strip out all of the packages that make the graphical desktop work. There are lots of packages and the process is tedious. If you clean out packages too aggressively you can break your ZM installation, so be very careful if you do this. After cleaning out as many packages as I could, I rechecked the "Monitorix" stats. For a system whose average load hovered around "3", the cleaning reduced the average system load to roughly "2". I call that a measurable improvement!!
Think about my ZM server setup for a minute. 8 CPU cores, 16 GB of RAM, 7 cameras set for between 6 and 10 fps at 720p with motion detection in use. That can be a fair amount of CPU processing due to the image analysis (thinking back to that online article again). My performance data from "Monitorix" clearly documented that a graphical desktop interface on the ZM server could account for a value of "1" in average system load. So what I accomplished by moving my cycling camera display function to an offboard device is a ZM server that is quite capable of handling even more cameras in the future or handling the existing camera count at higher frame rates and higher resolutions. Said another way, I gained valuable "reserve" ZM server processing capacity even while having 2 or more offboard devices pound on the ZM server over the network for "camera cycling" displays.
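For anyone who wants to relate those load numbers to core count, here is a rough sketch. It assumes a Linux system where /proc/loadavg is available; the function names are hypothetical, and this is not how "Monitorix" collects its data. Keep in mind that the Linux load average also counts processes waiting on disk I/O, so it is only an approximate measure of CPU demand.

```python
def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def load_per_core(load_avg, cores):
    """Load average divided by the number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_avg / cores


if __name__ == "__main__":
    # "Before & after" figures from this post on the 8 core C2758.
    for label, load in (("with desktop", 3.0), ("without desktop", 2.0)):
        print(f"{label}: {load_per_core(load, 8):.3f} load per core")

    # Live reading; only works where /proc/loadavg exists (Linux).
    try:
        with open("/proc/loadavg") as fh:
            print("current 1/5/15 min:", parse_loadavg(fh.read()))
    except OSError:
        print("/proc/loadavg not available on this system")
```

On 8 cores a load of "3" works out to 0.375 per core and a load of "2" to 0.25 per core, which is the "reserve" capacity I was talking about.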
I also pull CPU clock speed info into "Monitorix" so I have long term graphs of that data. My average clock speeds range from 1.4GHz to 2.2GHz. Peak CPU speed for the C2758 SoC is 2.4GHz and all 8 cores can achieve that speed at the same time, but the SoC will get warm, like 50 degrees C or better. The C2758 SoC is designed for that sort of loading, but make sure you have good airflow over the CPU heatsink. One of the CPU cores tends to run at 2.2GHz most of the time so I wonder what process is running on it, possibly "Monitorix" itself, so that's another experiment. Offboarding "Monitorix" is not an option given the way that application is designed since collecting much of the data displayed by "Monitorix" requires access to data that is not likely instrumented into SNMP, and SNMP applications are much more painful to configure (if you have ever done that, and I have).
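As a starting point for that experiment, here is a small sketch that reads the per-core clock speeds. It assumes an x86 Linux system whose /proc/cpuinfo has "processor" and "cpu MHz" lines; the function names are made up for this example, and it is not the mechanism "Monitorix" uses.

```python
def parse_cpu_mhz(cpuinfo_text):
    """Map core number -> current clock in MHz from /proc/cpuinfo text."""
    speeds = {}
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between cores
        key = key.strip()
        value = value.strip()
        if key == "processor":
            current = int(value)
        elif key == "cpu MHz" and current is not None:
            speeds[current] = float(value)
    return speeds


def fastest_core(speeds):
    """Return the core number currently reporting the highest clock."""
    return max(speeds, key=speeds.get)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            speeds = parse_cpu_mhz(fh.read())
    except OSError:
        speeds = {}
    if speeds:
        for core, mhz in sorted(speeds.items()):
            print(f"core {core}: {mhz:.0f} MHz")
        print("fastest core right now:", fastest_core(speeds))
    else:
        print("no 'cpu MHz' data found on this system")
```

Once you know which core is running fast, something like "ps -eo psr,pid,comm" lists the core each process last ran on, which should help narrow down the culprit. A single reading is only a snapshot, so sample it several times before drawing conclusions.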
Do I have more cameras planned? Yes, 2 more.
Where do I use this setup? Not important.
Is it nice to see ZM "stand up and dance" like this? You bet!
I wish I could find that article because I thought it was quite informative and thought-provoking.