Replace the Receive_Packet_Flag condition variable with a semaphore and update the related library functions accordingly.
Analysis of the problem determined that the issue lay in the transfer of APDU packets between the FSM and the APDU packet handler thread.
The mechanism previously used by the FSM to notify the APDU packet handler thread that a packet was available for processing was a pthread condition variable, which the packet handler thread was supposed to wait on before being signalled by the FSM.
However, the packet handler thread has other tasks to perform and was sometimes not waiting on the condition variable when it was signalled.
Unlike other synchronisation mechanisms such as semaphores, a condition variable does not remember a signal: if the waiting task (the consumer) is not blocked on the condition variable when the producer signals, that signal is lost and the consumer is never signalled again, leading to a continual sequence of timeouts on the condition variable.
This in turn led to the packet handler thread never being notified of a packet waiting to be processed, thus causing the interface to hang.
The main problem is that a condition variable is supposed to be used with a mutex to prevent this behaviour, but this mutex was not present (and in fact had been removed from the code, most likely because it was causing other synchronisation issues).
Further inspection revealed that this code was copied from another file but modified to remove the mutex, which is an essential part of using a condition variable for synchronisation. Without the mutex, the producer task is not blocked until the consumer task is waiting on the condition variable, which leads to the race condition causing the issues seen.
The fix is to replace the condition variable with a semaphore, as this provides the required mechanism in this case.
Thank you Ian Smith at Abelon Systems Ltd. for the patch!
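The difference described above can be illustrated with a minimal sketch using POSIX semaphores. It is not the patch itself: the names Packet_Ready, packet_signal_init, packet_signal_post and packet_signal_wait are hypothetical stand-ins, and the timeout handling is an assumption. The point it shows is that a post made while the consumer is busy elsewhere is counted and is still seen by the next wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <semaphore.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for Receive_Packet_Flag; not the stack's declaration. */
static sem_t Packet_Ready;

/* Call once before the producer or consumer thread starts. Returns 0 on success. */
static int packet_signal_init(void)
{
    return sem_init(&Packet_Ready, 0, 0);
}

/* Producer (FSM) side: the post increments the count, so it is not lost
   if the consumer is not waiting at this moment. */
static void packet_signal_post(void)
{
    sem_post(&Packet_Ready);
}

/* Consumer (packet handler) side: wait up to timeout_ms milliseconds.
   Returns true if a packet was signalled, false on timeout or error. */
static bool packet_signal_wait(unsigned timeout_ms)
{
    struct timespec deadline;
    int rc;

    /* sem_timedwait takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    do {
        rc = sem_timedwait(&Packet_Ready, &deadline);
    } while (rc == -1 && errno == EINTR); /* retry if interrupted by a signal */
    return rc == 0;
}
```

In this sketch the consumer can service its other tasks between calls to packet_signal_wait without missing a notification, which is the property the condition variable without its mutex and predicate did not provide.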
........
When a packet is received which expects a reply, a copy is stored in the PDU ring buffer so it can be matched with the reply. Unfortunately, when the reply is received it is only checked against the first entry in the ring buffer. This can cause a failure if a second packet expecting a reply has been received while waiting for the first reply to arrive.
This is a known issue in the bacnet-stack open source stack, and there is an outstanding FIXME in the latest version of the source code:
/* The ANSWER_DATA_REQUEST state is entered when a */
/* BACnet Data Expecting Reply, a Test_Request, or */
/* a proprietary frame that expects a reply is received. */
/* FIXME: MSTP_Get_Reply waits for a matching reply, but
if the next queued message doesn't match, then we
sit here for Treply_delay doing nothing */
The fix for this is to check all the messages in the PDU queue to see if any of them match, and if one does then handle it in the normal way. Thank you to Ian Smith of Abelon Systems Ltd. for the patch!
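As a rough illustration of checking every queued message rather than only the first, the sketch below scans a ring buffer from its oldest entry. The types, the PDU_QUEUE_SIZE value, the matching rule (destination address plus invoke ID) and the function names are all assumptions made for this example; the real stack stores complete PDUs and applies its own comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PDU_QUEUE_SIZE 8 /* hypothetical capacity */

/* Hypothetical, simplified queue entry. */
struct pdu_entry {
    uint8_t destination; /* MAC address the queued message is addressed to */
    uint8_t invoke_id;   /* ties a reply to its request */
};

/* Hypothetical ring buffer: count entries starting at index head. */
struct pdu_ring {
    struct pdu_entry slots[PDU_QUEUE_SIZE];
    size_t head;
    size_t count;
};

/* Simplified matching rule assumed for illustration only. */
static bool reply_matches(const struct pdu_entry *entry,
                          uint8_t request_source,
                          uint8_t request_invoke_id)
{
    return entry->destination == request_source &&
           entry->invoke_id == request_invoke_id;
}

/* Scan every queued entry, oldest first, instead of only the head.
   Returns the slot index of the first match, or -1 if none matches. */
static int find_matching_reply(const struct pdu_ring *ring,
                               uint8_t request_source,
                               uint8_t request_invoke_id)
{
    for (size_t i = 0; i < ring->count; i++) {
        size_t slot = (ring->head + i) % PDU_QUEUE_SIZE;
        if (reply_matches(&ring->slots[slot], request_source,
                          request_invoke_id)) {
            return (int)slot;
        }
    }
    return -1;
}
```

A caller would then remove and send the entry at the returned index, and fall back to the existing no-reply handling when the result is -1.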
Added BACnet/IPv6 datalink layer and example BACnet/IPv4 to BACnet/IPv6 router.
The BVLC6 layer works on the Linux port, but without BBMD features yet. The Win32 port is implemented but untested.
Tested during BACnet North American Plugfest 2016.
........
Implemented the majority of the functionality presented in the standard, but there are several features that this patch currently lacks:
- Set-Master-Key message has a specific order of key adding and decoding which is not covered
- There is no general secure-apdu-handler function
- Checks for the type of keys used for signing/encryption of specific messages are not implemented
- The status of encrypted flag during the calculation of the signature is ambiguous
There is a Linux implementation using the OpenSSL library, with function prototypes broad enough to allow for different implementations.
Thank you, Nikola Jelić!
Tested with Wireshark on Windows (mostly working).
To use extcap, run Wireshark and open the About dialog. Select the "Folders" tab and locate the extcap search path. Copy mstpcap.exe to that folder; the folder may not exist yet, in which case create it first.
Restart Wireshark, and look for "BACnet MS/TP on COMx" interfaces.
Configure the interface to change baud rate.
Capture directly from the interface.
The function was modified to calculate the broadcast address from the IP address and netmask instead of using SIOCGIFBRDADDR. In some cases the ioctl can succeed but return an address of 0 (e.g. search for Bcast: 0.0.0.0). For some reason, on Linux the local loopback device answers from the 0.0.0.0 address, so messages broadcast to that address are received from 127.0.0.1, which can create a broadcast loop. This has nothing to do with NAT, but makes the stack more robust. Thank you, Sami Pietikäinen, for the contribution!
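The calculation itself is a single bitwise expression: keep the network bits of the address and set every host bit. A minimal sketch follows; the function name is hypothetical and it is not the stack's actual code. Both arguments must be in the same byte order, and the result is in that byte order too.

```c
#include <stdint.h>

/* Directed broadcast address: network bits from the address, all host
   bits set. Hypothetical helper for illustration only. */
static uint32_t broadcast_from_netmask(uint32_t address, uint32_t netmask)
{
    return address | ~netmask;
}
```

For example, 192.168.1.10 with netmask 255.255.255.0 gives 192.168.1.255, and this never yields 0.0.0.0 for a non-zero address.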