This changes a few intertwined things at the same time:
* It allows configuring sending a partial message for large messages,
with a maximum derived from the discovered receive buffer sizes;
* It uses a different message size limit for datagrams that include
retransmits than for those that don't. The argument here is that,
having seen flaky networks where large datagrams cause trouble, it
makes sense to default to sending retransmits as datagrams that fit in
individual packets.
* The best performance is generally obtained using the maximum data gram
size, but the benefits do fall off quite quickly once they are
largish. For flaky networks, it doesn't make sense to go for 64kB
datagrams. This tries to find a reasonable compromise.
* It now packs mutiple fragments into a single DATAFRAG message to
eliminate the cost of using small fragment sizes.
The changes in buffer sizes cause the ddsperf sanity check to fail:
* The larger amounts of unacknowledged data cause the used memory to be
higher, failing the RSS check. Raising the limit seems
reasonable (the alternative would be to configure it back to the old
values, but it is all empirically determined anyway).
* The same also causes the publisher thread to get to run more and the
ping/pong bit gets less of a chance. Using fixed-frequency bursts
helps with this.
This therefore also adjust the test configuration and the thresholds a
bit.
Signed-off-by: Erik Boasson <eb@ilities.com>
An asymmetrical disconnect where the reader undiscovers and rediscovers
the writer, but the reader remains alive all the time for the writer
results in the "count" field of NACKFRAGs restarting. According to the
spec these must be ignored to protect against multi-pathing, but in this
scenario, ignoring them results in ignoring valid retransmit requests
until the "count" value catches up, which can take a very long time.
For ACKNACKs and HEARTBEATs the same problem exists, there it was
already handled by accepting backward jumps after some time has passed.
This reuses the same logic for NACKFRAGs.
This also changes the "count" fields to uint32_t throughout: the spec
defines them as int32_t, requires them to be strictly monotonically
increasing and omits any mention of a valid range or at what value the
counter should start. Thus, everything in [-2^31,2^31-1] is allowed,
switching to an uint32_t merely shifts the range. It also appears that
all implementations start at 0 or 1. The "strictly monotonically" part
was impossible to do without disconnecting anyway.
Signed-off-by: Erik Boasson <eb@ilities.com>
It is done by "do_locator" after it has decided that the locator is
well-formed and, crucially, not to be ignored. Setting it when there
are only ignored locators (of the unicast/multicast, data/metadata
variety) causes further processing to rely on uninitialized memory.
Signed-off-by: Erik Boasson <eb@ilities.com>
Reuse unicast data socket in MSM_NO_UNICAST, just like it did in all
modes before the extra socket was introduced in
d1ed8df9f3. This restores support for the
"raw ethernet" transport on Linux by no longer requiring the transport
to create a socket with an arbitrary "port".
Signed-off-by: Erik Boasson <eb@ilities.com>
The deinitialize would happen on most errors, but in all those cases it
would not have been initialized yet.
Signed-off-by: Erik Boasson <eb@ilities.com>
This removes the special handling of IP addresses in adding peer
locators from the configuration, instead relying on the general
string-to-locator conversion routines.
* This extends the common IP handling to code to handle the optional
presence of a port and the use of brackets, allowing them always for
IPv6 addresses, but requiring them only when needed for disambiguating
numerical IPv6 addresses when a port is present.
* The "multicast generator" format is now handled in UDPv4 code.
Signed-off-by: Erik Boasson <eb@ilities.com>
The src/core/ddsi/tests/locators.c test directly includes the header
files related to DDSI support for TCP and this pulled in openssl/ssl.h,
which in turn results in a build error in some environments because the
file can't be found.
There was no good reason why this dependency existed, the definitions
that relied on it were used only in the implementation of the TCP and
TLS support.
Signed-off-by: Erik Boasson <eb@ilities.com>
OpenSSL doesn't support using BIOs of the "fd" or "file" type when it is
built as a DLL and the executable didn't provide it with access to the
executable's CRT. Requiring all applications that wish to use security
to worry about this "applink.c" thing is too onerous a requirement.
* Check for the existence of "applink.c" in the OpenSSL include
directory, adding it to the security tests if it exists. This way,
all of OpenSSL can be used by the tests.
* Include it in the security core and built-in plugin tests. This way,
the test code can use the entirety of OpenSSL.
* In the authentication and access-control plugins, load X509 and
private keys from files by first reading them into a "mem" type BIO,
then reading them from that BIO.
* Take care not to call ddsrt_free on OpenSSL-allocated memory, either
by calling OPENSSL_free, or by allocating the memory using
ddsrt_malloc and letting OpenSSL fill that buffer.
Signed-off-by: Erik Boasson <eb@ilities.com>
CID 304509 - it does not affect behaviour because the called function
uses it as an out parameter and the result is never inspected.
Signed-off-by: Erik Boasson <eb@ilities.com>
Failure to generate a signature for in handshake tests attempted to free
the address of the pointer, instead of the pointed-to memory (CID
304462).
Signed-off-by: Erik Boasson <eb@ilities.com>
Triggered by CID 304462, 304471, 304517: dereference before null check.
Note that it is a second-order problem because it would require the
plug-in functions to be called with a null pointer for the plug-in
instance.
Signed-off-by: Erik Boasson <eb@ilities.com>
The changes in d92d491b83 to deal with
local readers and writers with the same topic and type name but
different underlying `struct ddsi_sertopic`s did not include the
provisioning of historical data from a (local) transient-local writer to
a (local) transient-local reader.
Signed-off-by: Erik Boasson <eb@ilities.com>
All incoming samples end up in ddsi_plist_fini, usually one with nothing
present, sometimes one containing status info or a keyhash. The
"present" flags allow this to be a very quick operation in these simple
cases, and this should be made use of.
Signed-off-by: Erik Boasson <eb@ilities.com>
The macOS 10.12 build was put in because of ROS2 "Dashing" specified
10.12 as the supported version, but Eloquent and later specify
10.14. The relevance of this is no longer there because of Foxy. The
build itself took an inordinate amount of time with lots of warnings
about the platform being deprecated.
Signed-off-by: Erik Boasson <eb@ilities.com>
Fix using the variable relay_only uninitialized in the function
connect_proxy_writer_with_reader when security is disabled in the
build configuration.
Signed-off-by: Dennis Potman <dennis.potman@adlinktech.com>
This addresses a number of issues with building Cyclone DDS including
DDS Security while using OpenSSL 1.0.2. Compatibility with 1.0.2 is a
courtesy towards those who are unable to move to 1.1.x or later because
of other libraries.
* On Windows, one must include Winsock2.h prior to including the OpenSSL
header files, or it'll pull in incompatible definitions from Winsock.h
and that breaks some of the files.
* OpenSSL 1.0.2 requires initializing the library (or more particular,
loading all the required algorithms) but this is no longer needed in
OpenSSL 1.1.x. It ends up being needed in a few places and having tons
of essentially dead initialization code lying around is unpleasant.
Hence this has been consolidated in a single function and protected
with ddsrt_once().
* One ought to undo the above initialization on 1.0.2g and older, but it
is impossible to know whether that can safely be done from a library.
This is also the reason OpenSSL deprecated all the initialization and
cleanup interfaces. So if one insists on trying it with such an old
version, let there be some leaks.
* Thread state cleanup is sort-of required prior 1.1.0, but that suffers
from the same problems; we'd have to do per-thread cleanup code for
OpenSSL for any thread that could call into it (which is pretty much
any thread). So once again, people should just use 1.1.0 or newer.
* There are some interfaces added in 1.1.0 that we use, but a few small
workarounds those can be made to work on 1.0.2 as well. These also
were replicated in a number of places and consolidated by this commit.
Signed-off-by: Erik Boasson <eb@ilities.com>
The various network partition-related structs were left mostly
uninitialized by the various init functions (e.g., if_partition_mapping)
and these were moreover interpreted on parse errors. This initializes
them to null pointers and skips all configuration post-processing in
case of an error.
Signed-off-by: Erik Boasson <eb@ilities.com>
This changes the handling of the removal of the lease of a manual
liveliness (proxy) writer from the (proxy) participant, such that the
invariant maintained for the "min lease" objects in the (proxy)
participant changes from: a clone of some lease with the minimum
duration, to: a clone of the lease that is returned by the
ddsrt_fibheap_min operation on the lease heap.
This fixes a use-after-free of the entity pointed to by the cloned lease
object in a scenario where the shortest lease duration is used by
multiple writers and the removal of a lease from the heap shuffles the
remaining entries around. For example (before this change):
1. initial situation: three writers w1, w2 and w3 with equal lease
durations:
- pp.heap = w1.lease : w2.lease w3.lease
- pp.minl = clone of w1.lease
2. delete w2:
- assuming deleting w2.lease from the heap moves w3.lease to the
front (only guarantee is that there are no smaller keys in the heap
than that of the entry returned by minimum operation)
- min(pp.heap) = w1.lease != w2.lease
thus: pp.minl unchanged, pp.minl.entity = w1
- pp.heap = w3.lease : w1.lease
3. delete w1:
- min(pp.heap) = w3.lease != w1.lease,
thus: pp.minl unchanged, pp.minl.entity = w1
- pp.heap = w3
- free w1
- now pp.minl.entity has a dangling pointer, touched on deleting
the (proxy) particpant or on lease expiry.
With this chamge, pp.minl is updated in step 2 to be a clone of w3.lease
because the lease returned by min(pp.heap) changes. This ensures that
in step 3 there is no dangling pointer and no use-after-free.
Signed-off-by: Erik Boasson <eb@ilities.com>