Overly aggressive sending of ACKNACKs eats bandwidth and causes
unnecessary retransmits and lowers performance; but overly timid sending
of them also reduces performance. This commit reduces the
aggressiveness.
* It keeps more careful track of what ACKNACK (or NACKFRAG) was last
sent and when, suppressing ACKs that don't provide new information for
a few milliseconds and suppressing NACKs for the NackDelay
setting. (The setting was there all long, but it didn't honor it when
the writer asked for a response.)
* It ignores the NackDelay when all that was requested has arrived, or
when it receives a directed heartbeat from a Cyclone peer. The latter
is taken as an indication that no more is following, and allows the
recipient to ask far arbitrary amounts of data and rely on the sender
to limit the retransmit to what seems reasonable. (For NACKFRAG one
can do it in the recipient, but for ACKNACK one cannot, and so one
might as well do it at the sender always.)
* Sufficient state is maintained in the match object for the ACKNACK
generator to decide whether or not to send an ACKNACK following the
rules, and it may decide to send just an ACK even though there is data
missing, or nothing at all.
* If HEARTBEAT processing requires an immediate response, the response
message is generated by the receive thread, but still queued for
transmission. If a delayed response is required, it schedules the
ACKNACK event.
Signed-off-by: Erik Boasson <eb@ilities.com>
This adds tracking of whether a heartbeat should be generated until
processing of the message is complete or an ACKNACK or NACKFRAG from
another reader requires a response. This way, an ACKNACK + NACKFRAG
pair does not trigger multiple heartbeat messages.
Signed-off-by: Erik Boasson <eb@ilities.com>
The DDSI spec version 2.3 allows empty bit sets, so malformed GAPs
caused by a bug in the code for avoiding those is most easily fixed by
generating a GAP with an empty bit set.
Signed-off-by: Erik Boasson <eb@ilities.com>
This changes a few intertwined things at the same time:
* It allows configuring sending a partial message for large messages,
with a maximum derived from the discovered receive buffer sizes;
* It uses a different message size limit for datagrams that include
retransmits than for those that don't. The argument here is that,
having seen flaky networks where large datagrams cause trouble, it
makes sense to default to sending retransmits as datagrams that fit in
individual packets.
* The best performance is generally obtained using the maximum data gram
size, but the benefits do fall off quite quickly once they are
largish. For flaky networks, it doesn't make sense to go for 64kB
datagrams. This tries to find a reasonable compromise.
* It now packs mutiple fragments into a single DATAFRAG message to
eliminate the cost of using small fragment sizes.
The changes in buffer sizes cause the ddsperf sanity check to fail:
* The larger amounts of unacknowledged data cause the used memory to be
higher, failing the RSS check. Raising the limit seems
reasonable (the alternative would be to configure it back to the old
values, but it is all empirically determined anyway).
* The same also causes the publisher thread to get to run more and the
ping/pong bit gets less of a chance. Using fixed-frequency bursts
helps with this.
This therefore also adjust the test configuration and the thresholds a
bit.
Signed-off-by: Erik Boasson <eb@ilities.com>
An asymmetrical disconnect where the reader undiscovers and rediscovers
the writer, but the reader remains alive all the time for the writer
results in the "count" field of NACKFRAGs restarting. According to the
spec these must be ignored to protect against multi-pathing, but in this
scenario, ignoring them results in ignoring valid retransmit requests
until the "count" value catches up, which can take a very long time.
For ACKNACKs and HEARTBEATs the same problem exists, there it was
already handled by accepting backward jumps after some time has passed.
This reuses the same logic for NACKFRAGs.
This also changes the "count" fields to uint32_t throughout: the spec
defines them as int32_t, requires them to be strictly monotonically
increasing and omits any mention of a valid range or at what value the
counter should start. Thus, everything in [-2^31,2^31-1] is allowed,
switching to an uint32_t merely shifts the range. It also appears that
all implementations start at 0 or 1. The "strictly monotonically" part
was impossible to do without disconnecting anyway.
Signed-off-by: Erik Boasson <eb@ilities.com>
It is done by "do_locator" after it has decided that the locator is
well-formed and, crucially, not to be ignored. Setting it when there
are only ignored locators (of the unicast/multicast, data/metadata
variety) causes further processing to rely on uninitialized memory.
Signed-off-by: Erik Boasson <eb@ilities.com>
Reuse unicast data socket in MSM_NO_UNICAST, just like it did in all
modes before the extra socket was introduced in
d1ed8df9f3. This restores support for the
"raw ethernet" transport on Linux by no longer requiring the transport
to create a socket with an arbitrary "port".
Signed-off-by: Erik Boasson <eb@ilities.com>
The deinitialize would happen on most errors, but in all those cases it
would not have been initialized yet.
Signed-off-by: Erik Boasson <eb@ilities.com>
This removes the special handling of IP addresses in adding peer
locators from the configuration, instead relying on the general
string-to-locator conversion routines.
* This extends the common IP handling to code to handle the optional
presence of a port and the use of brackets, allowing them always for
IPv6 addresses, but requiring them only when needed for disambiguating
numerical IPv6 addresses when a port is present.
* The "multicast generator" format is now handled in UDPv4 code.
Signed-off-by: Erik Boasson <eb@ilities.com>
The src/core/ddsi/tests/locators.c test directly includes the header
files related to DDSI support for TCP and this pulled in openssl/ssl.h,
which in turn results in a build error in some environments because the
file can't be found.
There was no good reason why this dependency existed, the definitions
that relied on it were used only in the implementation of the TCP and
TLS support.
Signed-off-by: Erik Boasson <eb@ilities.com>
OpenSSL doesn't support using BIOs of the "fd" or "file" type when it is
built as a DLL and the executable didn't provide it with access to the
executable's CRT. Requiring all applications that wish to use security
to worry about this "applink.c" thing is too onerous a requirement.
* Check for the existence of "applink.c" in the OpenSSL include
directory, adding it to the security tests if it exists. This way,
all of OpenSSL can be used by the tests.
* Include it in the security core and built-in plugin tests. This way,
the test code can use the entirety of OpenSSL.
* In the authentication and access-control plugins, load X509 and
private keys from files by first reading them into a "mem" type BIO,
then reading them from that BIO.
* Take care not to call ddsrt_free on OpenSSL-allocated memory, either
by calling OPENSSL_free, or by allocating the memory using
ddsrt_malloc and letting OpenSSL fill that buffer.
Signed-off-by: Erik Boasson <eb@ilities.com>
CID 304509 - it does not affect behaviour because the called function
uses it as an out parameter and the result is never inspected.
Signed-off-by: Erik Boasson <eb@ilities.com>
Failure to generate a signature for in handshake tests attempted to free
the address of the pointer, instead of the pointed-to memory (CID
304462).
Signed-off-by: Erik Boasson <eb@ilities.com>
Triggered by CID 304462, 304471, 304517: dereference before null check.
Note that it is a second-order problem because it would require the
plug-in functions to be called with a null pointer for the plug-in
instance.
Signed-off-by: Erik Boasson <eb@ilities.com>
The changes in d92d491b83 to deal with
local readers and writers with the same topic and type name but
different underlying `struct ddsi_sertopic`s did not include the
provisioning of historical data from a (local) transient-local writer to
a (local) transient-local reader.
Signed-off-by: Erik Boasson <eb@ilities.com>
All incoming samples end up in ddsi_plist_fini, usually one with nothing
present, sometimes one containing status info or a keyhash. The
"present" flags allow this to be a very quick operation in these simple
cases, and this should be made use of.
Signed-off-by: Erik Boasson <eb@ilities.com>
The macOS 10.12 build was put in because of ROS2 "Dashing" specified
10.12 as the supported version, but Eloquent and later specify
10.14. The relevance of this is no longer there because of Foxy. The
build itself took an inordinate amount of time with lots of warnings
about the platform being deprecated.
Signed-off-by: Erik Boasson <eb@ilities.com>
Fix using the variable relay_only uninitialized in the function
connect_proxy_writer_with_reader when security is disabled in the
build configuration.
Signed-off-by: Dennis Potman <dennis.potman@adlinktech.com>
This addresses a number of issues with building Cyclone DDS including
DDS Security while using OpenSSL 1.0.2. Compatibility with 1.0.2 is a
courtesy towards those who are unable to move to 1.1.x or later because
of other libraries.
* On Windows, one must include Winsock2.h prior to including the OpenSSL
header files, or it'll pull in incompatible definitions from Winsock.h
and that breaks some of the files.
* OpenSSL 1.0.2 requires initializing the library (or more particular,
loading all the required algorithms) but this is no longer needed in
OpenSSL 1.1.x. It ends up being needed in a few places and having tons
of essentially dead initialization code lying around is unpleasant.
Hence this has been consolidated in a single function and protected
with ddsrt_once().
* One ought to undo the above initialization on 1.0.2g and older, but it
is impossible to know whether that can safely be done from a library.
This is also the reason OpenSSL deprecated all the initialization and
cleanup interfaces. So if one insists on trying it with such an old
version, let there be some leaks.
* Thread state cleanup is sort-of required prior 1.1.0, but that suffers
from the same problems; we'd have to do per-thread cleanup code for
OpenSSL for any thread that could call into it (which is pretty much
any thread). So once again, people should just use 1.1.0 or newer.
* There are some interfaces added in 1.1.0 that we use, but a few small
workarounds those can be made to work on 1.0.2 as well. These also
were replicated in a number of places and consolidated by this commit.
Signed-off-by: Erik Boasson <eb@ilities.com>