Deleting a writer with unacknowledged data present in its WHC causes it
to linger for a configurable duration. Once it is lingering, there are
two routes to actually deleting the writer: because the samples get
acknowledged, or because the linger duration elapses.
When these two happen roughly concurrently, there was a possibility of
both succeeding in looking up the writer by its GUID, in which case one
of them then asserts on removing it from the entity index (if assertions
are enabled, if not, things are worse).
This fixes that by ensuring only one of the two actually does something,
as was always the intent.
Signed-off-by: Erik Boasson <eb@ilities.com>
* read/take failed to restore the null pointer in the first entry of the
sample pointer array it gets passed, in the case no "loan" had been
allocated yet and it returned an empty set. The consequence is that
on a subsequence read it will reuse the address without marking at as
in use, so that a *second* read using with a null pointer in that
first entry will overwrite the first result. (Introduced by
d16264fd82.)
* return_loan failed to free all memory if its argument wasn't actually
a loan. There are many good arguments why the read/take/return_loan
interface is messed up, but in the context of the existing interface
this is a perfectly reasonable case: there is at most one "loan" for
each reader, but one can keep calling read/take and return_loan as if
there's an infinite number of "loans". It's just that the first gets
cached and the others don't.
Signed-off-by: Erik Boasson <eb@ilities.com>
The moving around and cleaning up of network code broke the IPv6
multicast support by memcpy'ing a sockaddr_in6 instead of an in6_addr in
a multicast join record.
Signed-off-by: Erik Boasson <eb@ilities.com>
The participant listener creates a pong writer, setting a publication
matched listener on it. That listener can be invoked immediately and as
it queries the subscriptions reader, it must not be enabled before the
latter reader has been created.
Signed-off-by: Erik Boasson <eb@ilities.com>
* Move wctime, mtime, etime types to ddsrt
* Add ddsrt_time_wallclock
* Change ddsrt_time_monontic, elapsed to use mtime, etime types
* Remove now, now_mt, now_et
* Rename X_to_sec_usec to ddsrt_X_to_sec_usec
* add_duration_to_X to ddsrt_X_add_duration (to be in line with the
existing ddsrt_time_add_duration)
* elimination of ddsrt/timeconv.h, it added more in the way of
complications than it did in making things more elegant
* rename of q_time.[ch] to ddsi_time.[ch]: that now only deals with DDSI
timestamps and durations on the wire
Signed-off-by: Erik Boasson <eb@ilities.com>
wsock32.lib is only needed for the legacy version of Winsock and is not needed with Winsock2 (the current version).
This appears to be a root cause of the multicast issue on Win10 and may allow us to reverse #404
Signed-off-by: Dan Rose <dan@digilabs.io>
This is a workaround for interoperability issues, ultimately driven by a
Windows quirk that makes multicast delivery within a machine utterly
unreliable if the transmitting socket is bound to 0.0.0.0 (despite all
sockets having multicast interfaces set correctly) when there are also
sockets transmitting to the same multicast group that have been bound to
non-0.0.0.0. (Note: there may be other factors at play, but this is
what it looks like after experimentation.)
At least Fast-RTPS in some versions binds the socket it uses for
transmitting multicasts to non-0.0.0.0, so interoperability with
Fast-RTPS on Windows requires us to bind the socket we use for
transmitting multicasts (which was the same as the one we use for
receiving unicast data) also to non-0.0.0.0 or our multicasts get
dropped often.
This would work fine if other implementations honoured the set of
advertised addresses. However, at least Fast-RTPS and Connext (in some
versions) fail to do this and happily substitute 127.0.0.1 for the
advertised IP address. If we bind to, e.g., 192.168.1.1, then suddenly
those packets won't arrive anymore, breaking interoperability.
The only work around is to use a separate socket for sending.
Signed-off-by: Erik Boasson <eb@ilities.com>
* IPv6 extensions to patterns
* use full GUID prefix for Cyclone
* pattern fixes to deal with small changes in the formatting of QoS
* suppressinof local built-in topic publications
* asymmetrical disconnect detection improvements (better chance of
detecting it, plus better suppression of spurious notifications)
Signed-off-by: Erik Boasson <eb@ilities.com>
* Fix issue in dds_create_topic_arbitrary
Changed the behaviour of dds_create_topic_arbitrary with respect to the
sertopic parameter: the existing function dds_create_topic_arbitrary is
marked deprecated and replaced by dds_create_topic_generic, which returns
the sertopic that is actually used in as an out parameter. This can be eiter
the provided sertopic (if this sertopic was not yet known in the domain) or an
existing sertopic if the sertopic was registered earlier.
Signed-off-by: Dennis Potman <dennis.potman@adlinktech.com>
* Fix memory leaks in case topic creation fails.
Signed-off-by: Dennis Potman <dennis.potman@adlinktech.com>
[test_subscriber-12] /opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/ddsrt/src/mh3.c:28:53: runtime error: applying zero offset to null pointer
[test_subscriber-12] SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/ddsrt/src/mh3.c:28:53 in
Signed-off-by: Dan Rose <dan@digilabs.io>
* Don't pass null to memcmp
```
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /opt/ros/master/src/ros2/rmw_cyclonedds/rmw_cyclonedds_cpp/include/rmw_cyclonedds_cpp/serdes.hpp:135:3 in
/opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:41:15: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/include/string.h:64:33: note: nonnull attribute specified here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:41:15 in
/opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:41:31: runtime error: null pointer passed as argument 2, which is declared to never be null
/usr/include/string.h:64:33: note: nonnull attribute specified here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:41:31 in
/opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:45:15: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/include/string.h:64:33: note: nonnull attribute specified here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:45:15 in
/opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:45:30: runtime error: null pointer passed as argument 2, which is declared to never be null
/usr/include/string.h:64:33: note: nonnull attribute specified here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:45:30 in
```
Signed-off-by: Dan Rose <dan@digilabs.io>
* clearer non-null check
Signed-off-by: Dan Rose <dan@digilabs.io>
The thread_states array resets the "state" to ZERO on thread termination
to indicate that the slot was unused, but it leaves the thread id
unchanged because some platforms don't have a defined value that will
never be used as a thread id. A consequence is that a newly created
thread may result in multiple slots containing their own thread id, but
generally there will only be one that is not in state ZERO.
However, the code for create_thread used to set the state to ALIVE prior
to creating the thread, and so if the events get scheduled like:
1. thread A: X.state = ALIVE
2. create new thread B, storing tid in X.tid
3. thread A: Y.state = ALIVE
4. new thread B: lookup self (and cache pointer)
5. create new thread C, storing tid in Y.tid
6. new thread C: lookup self (and cache pointer)
Then B will observe two slots in the ALIVE state, with X.tid certain to
match and Y.tid undefined (and hence possibly matching). It may
therefore pick Y. C will (in this schedule) of course always choose Y.
They cache the pointer and never look at X and Y again, except for
updating their virtual clocks.
These virtual clocks are updated non-atomically (by design it is private
to the thread) and so if both B & C use Y they can end up racing each
other in updating the virtual clock and cause the nesting level of the
"awake" state controlling garbage collection to get stuck (or wrap
around, or do other horrible things). The consequence can be anything,
from a somewhat benign variant where GC effectively stops and some
operations (deleting readers and writers and shutting down) block
forever, to use-after-free and the undefined behaviour that implies.
This commit avoids looking up the slot in the newly created threads,
instead passing the correct address in the argument. It also adds an
intermediate state INIT that serves to reserve the slot until the new
thread is actually running. It does make the look-up safe (if one were
to do it), and as it is essentially free and gives more insight in the
state of the system when viewed from a debugger, it appears a useful
addition.
Signed-off-by: Erik Boasson <eb@ilities.com>
```
/opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/tools/pubsub/common.c:586:28: warning: implicit conversion from 'long' to 'double' changes value from 9223372036854775807 to 9223372036854775808 [-Wimplicit-int-float-conversion]
if(nanosec > nextafter(INT64_MAX, 0)) {
~~~~~~~~~ ^~~~~~~~~
/usr/include/stdint.h:134:22: note: expanded from macro 'INT64_MAX'
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/stdint.h:116:24: note: expanded from macro '__INT64_C'
^~~~~~
<scratch space>:345:1: note: expanded from here
9223372036854775807L
^~~~~~~~~~~~~~~~~~~~
1 warning generated.
```
Signed-off-by: Dan Rose <dan@digilabs.io>
gcc 5.4 correctly warned that a null pointer was being passed into the
entity-specific "set_qos" function when changing a topic QoS, where that
parameter was tagged as "non-null". As it was never dereferenced in
this case the resulting behaviour was still correct.
It turns out that the entire function was overly complicated and that
simply passing the entity pointer round allows eliminating a few
arguments as well.
(Oddly none of the more modern toolchains used pick this up.)
Signed-off-by: Erik Boasson <eb@ilities.com>
the destination cache of the network stack is in a certain state. The issue
is resolved by binding unicast sockets (incoming unicast and all outgoing
traffic) to the address of the interface instead of inaddr_any (0.0.0.0).
Set the new configuration option internal/BindUnicastToInterfaceAddr to
false to get the old behavior.
Co-authored-by: Erik Boasson <eb@ilities.com>
Signed-off-by: Dennis Potman <dennis.potman@adlinktech.com>
* Fix some typos.
Signed-off-by: ChenYing Kuo <evshary@gmail.com>
* Also update q_config.c, cyclonedds.rnc, cyclonedds.xsd for correct
build.
Signed-off-by: ChenYing Kuo <evshary@gmail.com>
* Remove cdds.md.
Signed-off-by: ChenYing Kuo <evshary@gmail.com>
ddsrt_asprintf did not copy non-English interface names. To fix this memory is
allocated with ddsrt_malloc and UTF-16 encoded interface names are converted to
UTF-8.
Signed-off-by: Jeroen Koekkoek <jeroen@koekkoek.nl>
This adds options to check for "unreasonable" RSS growth, receipt of a
minimum number of samples and having run a minimum number of roundtrips.
Signed-off-by: Erik Boasson <eb@ilities.com>
A keep-last volatile WHC retained data already overwritten by the writer
in the absence of ACKs, introduced by 231cb8c9.
Signed-off-by: Erik Boasson <eb@ilities.com>
The status mask on some readers got reduced to just "data available"
when used in conjunction with a waitset, but the consequence is that the
"subscription matched" listener would be suppressed.
Signed-off-by: Erik Boasson <eb@ilities.com>
This already was leaking out in the interface, so this name change was
needed too. The relationship between plist and xqos being so intimate,
doing the one but not the other made no sense.
Signed-off-by: Erik Boasson <eb@ilities.com>
The name (not its definition) now leaks out in ddsi_sertopic, and the
messy old names really shouldn't pollute the interface any more than
necessary.
Signed-off-by: Erik Boasson <eb@ilities.com>
This also removes the code duplication for the handling delivery from
local vs remote writers. (And it adds a test.)
Signed-off-by: Erik Boasson <eb@ilities.com>
This commit changes the implementation of topics so that multiple topic
entities can exist in a single participant for the same topic.
Different entities may refer to different topic implementations
(sertopics, akin to a type support in the DDS specification). All
entities (for the same participant) always have the same QoS, via the
new "ktopic" table in the participant.
Readers and writers are bound to a topic entity and inherit its
properties. If a topic comes in two definitions, say one for C and one
for C++, one can have a single participant with a reader delivering the
data in C representation and another reader delivering it in C++
representation.
This changes the behaviour of create_topic and find_topic: these now (on
successful return) always return a new entity (and thus with a unique
handle), where previously these would simply return a existing one when
possible.
This also requires some small additions to the sertopic/serdata
interface.
Signed-off-by: Erik Boasson <eb@ilities.com>
* Use the parameter tables to pretty-print QoS and plist, rather than a
hard-coded function supporting only the QoS.
* Support diffing two plists: a single table-driven function can handle
both nn_plist_t and ddsi_qos_t, and it removes the discrepancy between
the two types.
* Log content of discovery samples in trace rather than merely printing
"(plist)"
Signed-off-by: Erik Boasson <eb@ilities.com>