Published: May 11, 2021 by luxagen
After 4-5 power failures over the last year, one of which corrupted a development repository, nuked my Git Extensions configuration, and interrupted external services, I decided I needed a UPS. I was recommended the BX500CI by a friend, but an extended period of Amazon stocklessness gave me time to realise that it wasn’t really right for my needs. Since I’m not always in the house, it wouldn’t help much for long outages because the equipment would merely suffer a delayed power loss if I wasn’t around to notice.
A search yielded the more expensive but connected BX950UI, with full support on Windows, GNU/Linux, and macOS via the excellent APC UPS Daemon, so I went for it.
On its arrival, I charged it and spent a couple of hours doing physical hookup, which involved replacing the plugs on a couple of multiway power strips with IEC C14s. This article and the
apcupsd manpage helped me get the basics going in another couple of hours, and I ran a battery calibration.
My theory is that this step doesn’t directly use power-consumption information, but records the drain curve of the battery with respect to total energy delivered to loads, allowing the UPS to correctly estimate remaining runtime for any load. I suspect that it’s worth recalibrating every 3 months or so, both to keep this profile updated and to give the battery itself enough exercise to avoid degrading too fast; I gather lead-acid batteries don’t like being left charged for too many months on end.
At first I began editing the
/etc/apcupsd/apccontrol script directly, but this file needs to be replaceable on package upgrade, so the correct approach is to edit the scripts specific to each event, e.g.
/etc/apcupsd/doshutdown. The next gotcha was that, not having read the above article properly, I missed the important step of setting
/etc/default/apcupsd, without which
apcaccess would work — but not the daemon itself. I was alerted to this problem by the smoking gun that neither the
/etc/apcupsd/powerfail files were being created during testing.
As well as the Ubuntu server (an Intel NUC), I have a Windows 10 desktop attached to the UPS. Since the latter sucks enough juice to reduce battery runtime to about ten minutes, it was crucial to arrange for it to shut down quickly on power loss in order to maximise the server’s uptime. To this end I put the following command in the
/etc/apcupsd/onbattery script, just before the
exit 0 line (to conceal any failure owing to e.g. the machine already being off), and using its IP address to dodge any transient name-resolution problems:
net rpc shutdown -t 45 -f -C "UPS shutdown" -I "$IP_ADDRESS" -U "$USERNAME%$PASSWORD"
This gives 45 seconds’ warning, just enough to run
shutdown /a if there’s a pressing need to use the machine. Testing revealed a few gotchas: the first was that the remote-shutdown feature apparently (see later) requires the Remote Registry service to be running, which got me from one error message to another.
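As a rough sketch of how that command can sit inside an event script, here is one way to wrap it in a function. The names WIN_IP, WIN_USER, WIN_PASS and NET_BIN are my own placeholders for illustration (they are not apcupsd or Samba settings), and the default values are dummies. Setting NET_BIN=echo gives a dry run that prints the arguments instead of contacting the Windows machine.

```shell
#!/bin/sh
# Sketch only: WIN_IP, WIN_USER, WIN_PASS and NET_BIN are hypothetical
# placeholders, and the defaults below are dummy values.
WIN_IP="${WIN_IP:-192.0.2.10}"
WIN_USER="${WIN_USER:-upsuser}"
WIN_PASS="${WIN_PASS:-changeme}"
NET_BIN="${NET_BIN:-net}"   # set NET_BIN=echo for a dry run

shutdown_windows() {
    # -t 45: seconds of warning; -f: force applications to close;
    # -C: message shown on the Windows desktop;
    # -I: target by IP address to avoid name resolution.
    "$NET_BIN" rpc shutdown -t 45 -f -C "UPS shutdown" \
        -I "$WIN_IP" -U "$WIN_USER%$WIN_PASS" \
        || echo "remote shutdown failed (machine may already be off)" >&2
}

# In the event script, call shutdown_windows just before the exit 0 line.
```

Because the failure branch only logs to stderr, the function itself returns success either way, which matches the intent of hiding a failed remote shutdown from the rest of the script.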
The second problem was the need for UAC auto-elevation on RPC connections; I fixed that by using
regedit to create the DWORD value
LocalAccountTokenFilterPolicy=1 inside the
I later discovered that the Remote Registry service was set to Automatic but not actually started — and thus unnecessary. I left it on Automatic anyway and started it, because there’s a nasty impasse one can trip over during cloning/migration of Windows builds where logon is blocked owing to a faulty
Another detail worth noting is that I don’t see the point of blocking remote logins on a GNU/Linux system that’s in the process of shutting down; it seems like a nannying feature to save admins from having their session nuked by the shutdown (not much of a benefit) and it takes away options in a time-critical situation. I therefore set this line in
Here’s how it looks, along with my comments.
Since I run a Git repository in the Ubuntu server’s
/etc folder, the following gets rid of
Since I don’t check
root’s mailbox on the server, I changed the
export SYSADMIN line to refer to my external e-mail address.
Here I altered the following:
- POLLTIME is the maximum time (in seconds) that apcupsd will take to notice a UPS event.
- I believe ONBATTERYDELAY is the time in seconds between one of apcupsd’s polls noticing power loss and the onbattery state being triggered; this is how I prevent very short outages from shutting down the Windows machine.
- MINUTES is the estimated remaining runtime, in minutes, at or below which a shutdown is triggered; 2 minutes is generous for my server, which usually takes 10-20 seconds to shut down.
- I considered BATTERYLEVEL unnecessary, so I commented it out.
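To make the interplay of these settings concrete, here is a small illustration in shell. It is not apcupsd's code: the function names are made up, and it simply encodes two things as I understand them. One is the rule that whichever threshold is reached first triggers the shutdown. The other is the description above, under which the worst-case gap between power loss and the onbattery event is roughly POLLTIME plus ONBATTERYDELAY. An empty BATTERYLEVEL argument just means this sketch skips that check; it makes no claim about what apcupsd does when the directive is commented out.

```shell
#!/bin/sh
# Illustration only; function names and arguments are hypothetical.

# Prints "shutdown" or "wait".
# $1: remaining runtime in whole minutes   $2: battery charge in percent
# $3: MINUTES threshold                    $4: BATTERYLEVEL threshold (may be empty)
should_shutdown() {
    time_left=$1
    charge=$2
    minutes=$3
    batterylevel=$4

    if [ "$time_left" -le "$minutes" ]; then
        echo shutdown
        return
    fi
    if [ -n "$batterylevel" ] && [ "$charge" -le "$batterylevel" ]; then
        echo shutdown
        return
    fi
    echo wait
}

# Worst-case seconds between power loss and the onbattery event,
# assuming the poll happens as late as possible.
# $1: POLLTIME   $2: ONBATTERYDELAY
onbattery_worst_case() {
    echo $(( $1 + $2 ))
}
```

For example, `should_shutdown 10 80 2 ""` reports that the server can keep running, while `should_shutdown 2 80 2 ""` reports that it is time to shut down.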
Here I added the remote-shutdown command for the Windows machine (see above).
I replicated the Windows remote-shutdown command here (again before the
exit 0 line to conceal failure) so that, even if I choose to keep the machine on during power loss via
shutdown /a, it will retry on battery exhaustion.
Flaws and future work (or Things I Didn’t Have Time For)
One thing I must address is what I consider the biggest (only?) flaw in
apcupsd: its lack of handling for communications failures, which might sabotage the whole setup by hiding a power loss. In theory, this could be worked around with a bunch of custom scripting, but I think the real solution would lie in changing apcupsd itself to:
- introduce a COMMFAILUREDELAY setting that allows comms to recover within some timespan without generating an event;
- change the comms-failure behaviour to (optionally?) trigger a shutdown.
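For what it's worth, the custom-scripting workaround might look something like the sketch below. It leans on the commfailure and commok event scripts in /etc/apcupsd. Everything else is an assumption of mine: the variable names COMM_FLAG, COMM_DELAY and COMM_ACTION, the marker-file path, and the 120-second grace period are all hypothetical, and the default action only prints a message rather than shutting anything down.

```shell
#!/bin/sh
# Hypothetical workaround sketch; none of these names are apcupsd settings.
COMM_FLAG="${COMM_FLAG:-/tmp/apcupsd-commfailure.flag}"  # assumed marker file
COMM_DELAY="${COMM_DELAY:-120}"                          # assumed grace period (seconds)
COMM_ACTION="${COMM_ACTION:-echo would shut down now}"   # replace with a real command

# Call from the commfailure event script.
on_commfailure() {
    # Tag this particular failure so that a later one is not confused with it.
    token="$$-$(date +%s)"
    printf '%s\n' "$token" > "$COMM_FLAG"
    (
        sleep "$COMM_DELAY"
        # Act only if comms have not recovered since this failure began.
        if [ -f "$COMM_FLAG" ] && [ "$(cat "$COMM_FLAG")" = "$token" ]; then
            $COMM_ACTION
        fi
    ) &
}

# Call from the commok event script.
on_commok() {
    rm -f "$COMM_FLAG"
}
```

The background subshell plays the role of the wished-for delay setting: a recovery within the grace period removes the marker and nothing happens, while a persistent failure runs the configured action.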
The other major improvement to my setup would be to hibernate the machines in question instead of shutting them down. I looked into this for a while, but there are two problems:
- GNU/Linux needs swap space (typically a swap partition) large enough to hold the memory image in order to hibernate;
- The Samba net rpc shutdown command provides only shutdown and restart, not sleep or hibernate.
I’ll probably deal with the former via an already-planned server rebuild, and the latter could be fixed by remotely invoking the native Windows
SHUTDOWN command instead of using the Samba one, but since Windows 10 no longer has an inbuilt Telnet server this would require extra software and time.
Thanks to this modest amount of work, and some testing — sometimes using the
TIMEOUT setting in
/etc/apcupsd/apcupsd.conf to avoid running the battery all the way down — I no longer have to worry about power loss in the middle of an intense work session, fiddling around with manual Git-repository repairs, or touring the local construction sites in a blame-seeking rage. Yay!