Potential bug with USB driver and HID reports

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Potential bug with USB driver and HID reports

Paul Godard
Hello,

I have an APC Back-UPS 650VA Battery Backup device (BN650M1-CA) connected to my physical pfSense box (2.3.3_1 - FreeBSD 10.3 p17). When running apcupsd in debug mode on the command line (-b -d 300), I can see that all attempts to get a HID report fail. So all calls to usb_control_msg attempting to gather a report will fail but calls using usb_control_msg requesting other things may work (such as getting the UPS's HID interface's report descriptor - this work). Also, oddly enough, USB interrupts events work just fine; apcupsd is able to populate some information (including the UPS status, estimated runtime left, battery charge percentage, etc.) and is thus able to trigger alerts when a power outage occurs.

Back to the HID report issue, I enabled kernel tracing and was able to isolate the failure to ioctl calls (that usb_control_msg utilizes); the HID report calls return -1 values and sets the "errno" value to 5 (Input/Output Error). You can see the trace here:

 32271 apcupsd  CALL  write(0x1,0x80201d000,0x4b)
 32271 apcupsd  GIO   fd 1 wrote 75 bytes
       "63.611 apcupsd: HidUps.cpp:438 get_report: id=0x0c, kind=2, length=4 pos=0
       "
 32271 apcupsd  RET   write 75/0x4b
 32271 apcupsd  CALL  ioctl(0x8,0xc018556f,0x7fffffffe760)
 32271 apcupsd  RET   ioctl -1 errno 5 Input/output error
 32271 apcupsd  CALL  write(0x1,0x80201d000,0x62)
 32271 apcupsd  GIO   fd 1 wrote 98 bytes
       "63.620 apcupsd: HidUps.cpp:446 Error getting report: (-1) Operation not permitted [Unknown error]
       "

I've tried connecting that UPS to another box running a pfSense VM with USB-passthru (identical version to my physical box) - this is a different physical computer - and all works fine. So this leads me to believe it's some sort of issue with the physical USB controller on my physical pfSense box? If that's the case, I don't believe this is something that should (or can) be fixed in apcupsd. However, apcupsd is still somewhat usable thanks to the USB interrupts events but has another logical bug caused by the unexpected HID reports failures...

So the actual bug that I want to report... When a power outage event is triggered, apcupsd reports it correctly, however, apcupsd also instantly reports an all clear ("Power is back. UPS running on mains.") when the outage is in-fact still occurring. When power is legitimately restored, apcupsd doesn't report anything different (since it already "recovered" immediately after the failure was initially detected). I tracked down the bug to the UsbUpsDriver::read_volatile_data() method (which is called periodically, and/or after an "important" event - which is determined in UsbUpsDriver::usb_report_event) in the src/drivers/usb/usb.c file. In this method, right before you loop through updated CI values by requesting HID reports (and by eventually calling usb_update_value), you reset the "Status" field bits to 0x00 (thus getting rid of the "power outage" state) anticipating that the updated CI values will correctly repopulate the "Status" field. In my particular case, apcupsd is unable to fetch any HID reports, therefore, after you clear the "Status" field, no CI values will ever set it back to a failed "power outage" state. This is why immediately after detecting an outage, apcupsd thinks power was restored.

The fix I'm proposing (which worked in my particular case) is to keep a copy of the "Status" field's value before clearing it. Then, in the CI update loop, I keep track of whether a single HID report came back successfully (in my particular case, none ever come back so the code now knows this). Right after the loop, if I detected that no HID reports were successfully queried, I reset the "Status" field back to its original value (before it gets cleared before the loop) thus keeping my failed "power outage" status. The moment power returns to the UPS, apcupsd gets an updated CI_STATUS through a USB interrupt event and correctly reports the event and everyone is happy.

Anyone else experienced something similar to this? What are your thoughts on this? I created a simple patch to fix the bug of the immediate power restore issue that can be found here: https://pastebin.com/AnJ1VYTn

Thanks,
Paul

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Apcupsd-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/apcupsd-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Potential bug with USB driver and HID reports

Ted Mittelstaedt-5
Do not use pastebin since this may be picked up months or years in the
future and the patebin link may be gone.  Here is your patch:

--- src/drivers/usb/usb.c 2015-03-21 18:43:19.000000000 -0400
+++ src/drivers/usb/usb.c.new 2017-03-29 07:53:59.702173000 -0400
@@ -815,14 +815,20 @@
     _ups->poll_time = now;           /* save time stamp */

     /* Clear APC status bits; let the various CIs set them again */
+   int status = _ups->Status;
     _ups->Status &= ~0xFF;

     /* Loop through all known data, polling those marked volatile */
+   bool valid_read = false;
     for (int i=0; _known_info[i].usage_code; i++) {
        if (_known_info[i].isvolatile && _known_info[i].ci != CI_NONE)
-         usb_update_value(_known_info[i].ci);
+         valid_read |= usb_update_value(_known_info[i].ci);
     }

+   /* If we weren't able to read any of the CIs, reset the status back
to what it was before we cleared it (thinking the CIs would update it) */
+   if (!valid_read)
+      _ups->Status = status;
+
     write_unlock(_ups);
     return 1;
  }


I have to think that it is a mistake to call what you have found a "bug"
since your UPS works fine when moved to a different machine.  Call this
a "local config" for your particular hardware, rather.

You should also be aware that there's never been a guarantee that the
BackUPS model you are referring to is going to work with apcupsd.  For a
number of years APC released versions of the backUPS that did support
the full UPS protocol via USB - but they didn't advertise this.
Eventually they modified it to not talk out that port and the most
modern ones of these they ship today lack the monitoring port completely.

The lowest level backups that supports monitoring today is the
750 ES and yes, you do pay a premium for it.  The cheap sub-750 VA
BackUPS don't have the capability any longer.

Ted


On 3/29/2017 5:39 PM, Paul Godard wrote:

> Hello,
>
> I have an APC Back-UPS 650VA Battery Backup device (BN650M1-CA)
> connected to my physical pfSense box (2.3.3_1 - FreeBSD 10.3 p17). When
> running apcupsd in debug mode on the command line (-b -d 300), I can see
> that all attempts to get a HID report fail. So all calls to
> usb_control_msg attempting to gather a report will fail but calls using
> usb_control_msg requesting other things may work (such as getting the
> UPS's HID interface's report descriptor - this work). Also, oddly
> enough, USB interrupts events work just fine; apcupsd is able to
> populate some information (including the UPS status, estimated runtime
> left, battery charge percentage, etc.) and is thus able to trigger
> alerts when a power outage occurs.
>
> Back to the HID report issue, I enabled kernel tracing and was able to
> isolate the failure to ioctl calls (that usb_control_msg utilizes); the
> HID report calls return -1 values and sets the "errno" value to 5
> (Input/Output Error). You can see the trace here:
>
>  32271 apcupsd  CALL  write(0x1,0x80201d000,0x4b)
>  32271 apcupsd  GIO   fd 1 wrote 75 bytes
>        "63.611 apcupsd: HidUps.cpp:438 get_report: id=0x0c, kind=2,
> length=4 pos=0
>        "
>  32271 apcupsd  RET   write 75/0x4b
>  32271 apcupsd  CALL  ioctl(0x8,0xc018556f,0x7fffffffe760)
>  32271 apcupsd  RET   ioctl -1 errno 5 Input/output error
>  32271 apcupsd  CALL  write(0x1,0x80201d000,0x62)
>  32271 apcupsd  GIO   fd 1 wrote 98 bytes
>        "63.620 apcupsd: HidUps.cpp:446 Error getting report: (-1)
> Operation not permitted [Unknown error]
>        "
>
> I've tried connecting that UPS to another box running a pfSense VM with
> USB-passthru (identical version to my physical box) - this is a
> different physical computer - and all works fine. So this leads me to
> believe it's some sort of issue with the physical USB controller on my
> physical pfSense box? If that's the case, I don't believe this is
> something that should (or can) be fixed in apcupsd. However, apcupsd is
> still somewhat usable thanks to the USB interrupts events but has
> another logical bug caused by the unexpected HID reports failures...
>
> So the actual bug that I want to report... When a power outage event is
> triggered, apcupsd reports it correctly, however, apcupsd also instantly
> reports an all clear ("Power is back. UPS running on mains.") when the
> outage is in-fact still occurring. When power is legitimately restored,
> apcupsd doesn't report anything different (since it already "recovered"
> immediately after the failure was initially detected). I tracked down
> the bug to the UsbUpsDriver::read_volatile_data() method (which is
> called periodically, and/or after an "important" event - which is
> determined in UsbUpsDriver::usb_report_event) in the
> src/drivers/usb/usb.c file. In this method, right before you loop
> through updated CI values by requesting HID reports (and by eventually
> calling usb_update_value), you reset the "Status" field bits to 0x00
> (thus getting rid of the "power outage" state) anticipating that the
> updated CI values will correctly repopulate the "Status" field. In my
> particular case, apcupsd is unable to fetch any HID reports, therefore,
> after you clear the "Status" field, no CI values will ever set it back
> to a failed "power outage" state. This is why immediately after
> detecting an outage, apcupsd thinks power was restored.
>
> The fix I'm proposing (which worked in my particular case) is to keep a
> copy of the "Status" field's value before clearing it. Then, in the CI
> update loop, I keep track of whether a single HID report came back
> successfully (in my particular case, none ever come back so the code now
> knows this). Right after the loop, if I detected that no HID reports
> were successfully queried, I reset the "Status" field back to its
> original value (before it gets cleared before the loop) thus keeping my
> failed "power outage" status. The moment power returns to the UPS,
> apcupsd gets an updated CI_STATUS through a USB interrupt event and
> correctly reports the event and everyone is happy.
>
> Anyone else experienced something similar to this? What are your
> thoughts on this? I created a simple patch to fix the bug of the
> immediate power restore issue that can be found
> here: https://pastebin.com/AnJ1VYTn
>
> Thanks,
> Paul
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>
>
>
> _______________________________________________
> Apcupsd-users mailing list
> [hidden email]
> https://lists.sourceforge.net/lists/listinfo/apcupsd-users
>

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Apcupsd-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/apcupsd-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Potential bug with USB driver and HID reports

Paul Godard
Hello Ted,

I apologize for the pastebin reference; I thought it would a cleaner way to read code and never thought about that reference potentially not being available in the future.

I spent a bit more time tracking down the primary issue I've been having and finally figure out the bug that was preventing apcupsd from getting any reports. The issue lies in HidUps::FetchReportDescr where the length of the report descriptor buffer is set to MAX_SANE_DESCRIPTOR_LEN. The fetching of the interface descriptor ("usb_control_msg" in HidUps::GetIntfDescr) with a buffer size of MAX_SANE_DESCRIPTOR_LEN actually works; apcupsd reports reading a descriptor of 1049 bytes (for my UPS device) and parses it successfully. However, it's at this moment that an unreported bug occurs... If I were to request a report prior to fetching the interface descriptor, it will work, however, if I were to request a report after fetching the interface descriptor, I get the "Input/Output error". So calling HidUps::GetIntfDescr triggers the issue.

I'm not sure why the "usb_control_msg" call in HidUps::GetIntfDescr works but all subsequent calls to "usb_control_msg" will fail unexpectedly (a bug in LIBUSB perhaps?).However, I'm able to prevent this bug from happening by passing a proper length to HidUps::GetIntfDescr (instead of MAX_SANE_DESCRIPTOR_LEN). I do this by reading the "wDescriptorLength" (page 22 here: http://www.usb.org/developers/hidpage/HID1_11.pdf) of the HID Descriptor and passing that length over to HidUps::GetIntfDescr. This solves all of my issues...

So I propose the following fix:
   - define a new method that reads the descriptor length of interface 0
   - use the actual length of the interface descriptor to define the buffer and request the data from the usb device (as opposed to the arbitrary MAX_SANE_DESCRIPTOR_LEN value)

What are you thoughts on this? Would you considered including this patch within apcupsd?

Here are my two patches (one for HidUps.h to declare the new method and one for HidUps.cpp to define the new method and use it GetIntfDescr):

--- HidUps.cpp 2017-05-16 19:36:56.000000000 -0400
+++ HidUps_new.cpp 2017-05-17 09:57:25.000000000 -0400
@@ -244,6 +244,26 @@
                          (type << 8) + index, 0, (char*)buf, size, 1000);
 }
 
+/* Get the interface descriptor's length */
+int HidUps::GetIntfDescrLength(unsigned char index)
+{
+   /* usb_hid_descriptor  */
+   char buf[9];
+   memset(buf, 0, sizeof(buf));
+   int ret = usb_control_msg(_fd, USB_ENDPOINT_IN | USB_RECIP_INTERFACE,
+                            USB_REQ_GET_DESCRIPTOR,
+                            (USB_DT_HID << 8) + index, 0, (char*)buf, sizeof(buf), 1000);
+   
+   int len = MAX_SANE_DESCRIPTOR_LEN;
+   if (ret >= 0) {
+      /* wDescriptorLength */
+      int desclen = buf[7] | (buf[8] << 8);
+      if (desclen > 0)
+         len = desclen;
+   }
+   return len;
+}
+
 /*
  * Fetch the report descriptor from the device given an _fd for the
  * device's control endpoint. Descriptor length is written to the
@@ -255,8 +275,9 @@
    unsigned char *ptr;
    int rdesclen;
 
-   ptr = (unsigned char*)malloc(MAX_SANE_DESCRIPTOR_LEN);
-   rdesclen = GetIntfDescr(USB_DT_REPORT, 0, ptr, MAX_SANE_DESCRIPTOR_LEN);
+   int desclen = GetIntfDescrLength(0);
+   ptr = (unsigned char*)malloc(desclen);
+   rdesclen = GetIntfDescr(USB_DT_REPORT, 0, ptr, desclen);
    if (rdesclen <= 0) {
       Dmsg(100, "Unable to get REPORT descriptor (%d).\n", rdesclen);
       free(ptr);

--- HidUps.h 2017-05-16 19:36:55.000000000 -0400
+++ HidUps_new.h 2017-05-17 09:57:48.000000000 -0400
@@ -107,6 +107,7 @@
    /* Fetch a descriptor from an interface (as opposed to from the device) */
    int GetIntfDescr(
       unsigned char type, unsigned char index, void *buf, int size);
+   int GetIntfDescrLength(unsigned char index);
 
    bool init_device(struct usb_device *dev, const char *serno);
 

Thanks,
Paul

On Fri, Mar 31, 2017 at 3:14 AM, Ted Mittelstaedt <[hidden email]> wrote:
Do not use pastebin since this may be picked up months or years in the
future and the patebin link may be gone.  Here is your patch:

--- src/drivers/usb/usb.c       2015-03-21 18:43:19.000000000 -0400
+++ src/drivers/usb/usb.c.new   2017-03-29 07:53:59.702173000 -0400
@@ -815,14 +815,20 @@
     _ups->poll_time = now;           /* save time stamp */

     /* Clear APC status bits; let the various CIs set them again */
+   int status = _ups->Status;
     _ups->Status &= ~0xFF;

     /* Loop through all known data, polling those marked volatile */
+   bool valid_read = false;
     for (int i=0; _known_info[i].usage_code; i++) {
        if (_known_info[i].isvolatile && _known_info[i].ci != CI_NONE)
-         usb_update_value(_known_info[i].ci);
+         valid_read |= usb_update_value(_known_info[i].ci);
     }

+   /* If we weren't able to read any of the CIs, reset the status back
to what it was before we cleared it (thinking the CIs would update it) */
+   if (!valid_read)
+      _ups->Status = status;
+
     write_unlock(_ups);
     return 1;
  }


I have to think that it is a mistake to call what you have found a "bug"
since your UPS works fine when moved to a different machine.  Call this
a "local config" for your particular hardware, rather.

You should also be aware that there's never been a guarantee that the
BackUPS model you are referring to is going to work with apcupsd.  For a
number of years APC released versions of the backUPS that did support
the full UPS protocol via USB - but they didn't advertise this.
Eventually they modified it to not talk out that port and the most
modern ones of these they ship today lack the monitoring port completely.

The lowest level backups that supports monitoring today is the
750 ES and yes, you do pay a premium for it.  The cheap sub-750 VA
BackUPS don't have the capability any longer.

Ted


On 3/29/2017 5:39 PM, Paul Godard wrote:
> Hello,
>
> I have an APC Back-UPS 650VA Battery Backup device (BN650M1-CA)
> connected to my physical pfSense box (2.3.3_1 - FreeBSD 10.3 p17). When
> running apcupsd in debug mode on the command line (-b -d 300), I can see
> that all attempts to get a HID report fail. So all calls to
> usb_control_msg attempting to gather a report will fail but calls using
> usb_control_msg requesting other things may work (such as getting the
> UPS's HID interface's report descriptor - this work). Also, oddly
> enough, USB interrupts events work just fine; apcupsd is able to
> populate some information (including the UPS status, estimated runtime
> left, battery charge percentage, etc.) and is thus able to trigger
> alerts when a power outage occurs.
>
> Back to the HID report issue, I enabled kernel tracing and was able to
> isolate the failure to ioctl calls (that usb_control_msg utilizes); the
> HID report calls return -1 values and sets the "errno" value to 5
> (Input/Output Error). You can see the trace here:
>
>  32271 apcupsd  CALL  write(0x1,0x80201d000,0x4b)
>  32271 apcupsd  GIO   fd 1 wrote 75 bytes
>        "63.611 apcupsd: HidUps.cpp:438 get_report: id=0x0c, kind=2,
> length=4 pos=0
>        "
>  32271 apcupsd  RET   write 75/0x4b
>  32271 apcupsd  CALL  ioctl(0x8,0xc018556f,0x7fffffffe760)
>  32271 apcupsd  RET   ioctl -1 errno 5 Input/output error
>  32271 apcupsd  CALL  write(0x1,0x80201d000,0x62)
>  32271 apcupsd  GIO   fd 1 wrote 98 bytes
>        "63.620 apcupsd: HidUps.cpp:446 Error getting report: (-1)
> Operation not permitted [Unknown error]
>        "
>
> I've tried connecting that UPS to another box running a pfSense VM with
> USB-passthru (identical version to my physical box) - this is a
> different physical computer - and all works fine. So this leads me to
> believe it's some sort of issue with the physical USB controller on my
> physical pfSense box? If that's the case, I don't believe this is
> something that should (or can) be fixed in apcupsd. However, apcupsd is
> still somewhat usable thanks to the USB interrupts events but has
> another logical bug caused by the unexpected HID reports failures...
>
> So the actual bug that I want to report... When a power outage event is
> triggered, apcupsd reports it correctly, however, apcupsd also instantly
> reports an all clear ("Power is back. UPS running on mains.") when the
> outage is in-fact still occurring. When power is legitimately restored,
> apcupsd doesn't report anything different (since it already "recovered"
> immediately after the failure was initially detected). I tracked down
> the bug to the UsbUpsDriver::read_volatile_data() method (which is
> called periodically, and/or after an "important" event - which is
> determined in UsbUpsDriver::usb_report_event) in the
> src/drivers/usb/usb.c file. In this method, right before you loop
> through updated CI values by requesting HID reports (and by eventually
> calling usb_update_value), you reset the "Status" field bits to 0x00
> (thus getting rid of the "power outage" state) anticipating that the
> updated CI values will correctly repopulate the "Status" field. In my
> particular case, apcupsd is unable to fetch any HID reports, therefore,
> after you clear the "Status" field, no CI values will ever set it back
> to a failed "power outage" state. This is why immediately after
> detecting an outage, apcupsd thinks power was restored.
>
> The fix I'm proposing (which worked in my particular case) is to keep a
> copy of the "Status" field's value before clearing it. Then, in the CI
> update loop, I keep track of whether a single HID report came back
> successfully (in my particular case, none ever come back so the code now
> knows this). Right after the loop, if I detected that no HID reports
> were successfully queried, I reset the "Status" field back to its
> original value (before it gets cleared before the loop) thus keeping my
> failed "power outage" status. The moment power returns to the UPS,
> apcupsd gets an updated CI_STATUS through a USB interrupt event and
> correctly reports the event and everyone is happy.
>
> Anyone else experienced something similar to this? What are your
> thoughts on this? I created a simple patch to fix the bug of the
> immediate power restore issue that can be found
> here: https://pastebin.com/AnJ1VYTn
>
> Thanks,
> Paul
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>
>
>
> _______________________________________________
> Apcupsd-users mailing list
> [hidden email]
> https://lists.sourceforge.net/lists/listinfo/apcupsd-users
>

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Apcupsd-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/apcupsd-users


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Apcupsd-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/apcupsd-users
Loading...