This fixes some issues with the new discovery data ("plist" topics)
discovered on interoperating with some other DDS implementations:
* The interpretation of a keyhash as if it were a valid sample was wrong
in various ways: inconsistent endianness, incorrect encoding
identifier and a missing sentinel. As Cyclone follows the spec and
always provides a well-formed payload, the problem only surfaces when
interoperating with implementations that expect the recipient to make
do with a keyhash.
* Various paths failed to check for failure, causing potential null
pointer dereferences.
Signed-off-by: Erik Boasson <eb@ilities.com>
* Remove the "plist" and "rawcdr" abuse of the "serdata_default" sample
representation.
* Introduce a new "plist" topic type and a new "pserop" topic type. The
former represents parameter lists as used in discovery, the latter
arbitrary samples using the serialiser in ddsi_plist.c.
* Introduce sertopics for each of the built-in "topics" used by the DDSI
discovery protocol using the two new topic types, and reference these
in the readers/writers used in discovery.
* Construct and deconstruct the discovery message by using the
conversion routines for these sample types, rather than fiddling with,
e.g., the baroque interface for adding parameter lists to messages.
* As a consequence, it introduces standardized logging of received and
transmitted discovery data and eliminates the annoying "(null)/(null)"
and "(blob)" descriptions in the trace.
* Limits the dumping of octet sequences in discovery data to the first
100 bytes to make the embedded certificates and permissions
documents (somewhat) manageable.
* Eliminates the (many) null pointer checks on reader/writer topics.
* Fixes the printing of nested sequences in discovery data (not used
before) and the formatting of GUIDs.
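The 100-byte limit on octet sequence dumps can be sketched roughly as
follows. This is an illustration only: the helper name dump_octets, the
hex formatting and the "..." marker are assumptions and not the actual
trace format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DUMP_LIMIT 100 /* from the text: only the first 100 bytes are dumped */

/* Hypothetical helper: hex-dumps at most DUMP_LIMIT bytes of data into buf
   and appends "..." when input bytes were left out. Returns the number of
   input bytes actually dumped. */
static size_t dump_octets(char *buf, size_t bufsize, const unsigned char *data, size_t len)
{
  assert(buf != NULL && bufsize > 0);
  size_t n = len < DUMP_LIMIT ? len : DUMP_LIMIT;
  size_t pos = 0;
  size_t i;
  buf[0] = '\0';
  /* each byte needs two hex digits plus room for the terminating nul */
  for (i = 0; i < n && pos + 3 <= bufsize; i++)
    pos += (size_t) snprintf(buf + pos, bufsize - pos, "%02x", (unsigned) data[i]);
  if (len > i && pos + 4 <= bufsize)
    snprintf(buf + pos, bufsize - pos, "...");
  return i;
}
```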
Various interfaces remain unchanged and so while this removes cruft from
the core code, it moves some of it into the conversion routines for the
new topic types.
It also now allocates some memory when processing incoming discovery
data, whereas before it had no need to do so. Allowing for aliasing of
data in the new sertopics and adding a way to initialize these specific
types on the stack (both minor changes) suffices for eliminating those
allocations.
Signed-off-by: Erik Boasson <eb@ilities.com>
Check actual topic type before "downcasting"
Signed-off-by: Erik Boasson <eb@ilities.com>
Free the memory we own and that is actually allocated
Signed-off-by: Erik Boasson <eb@ilities.com>
Ignore logging newlines if nothing is buffered
Signed-off-by: Erik Boasson <eb@ilities.com>
Suffix data with "(trunc)" one byte earlier
The sample printing code changed over time and now stops as soon as it
can once it has filled up the buffer. As the return value is simply the
number of bytes written, if that number is equal to buffer size less
one (because of the terminating nul) it may or may not have been
truncated, but the likelihood is that it has been. So add the "(trunc)"
suffix once that point has been reached.
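A minimal sketch of that rule, assuming a printer that returns the
number of bytes it wrote (print_sample and format_with_suffix are
hypothetical stand-ins, not the actual printing code):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the sample printer: writes at most size-1 bytes
   plus a terminating nul and returns the number of bytes written, not the
   number it would have liked to write. */
static size_t print_sample(char *buf, size_t size, const char *text)
{
  assert(buf != NULL && size > 0);
  size_t n = strlen(text);
  if (n > size - 1)
    n = size - 1;
  memcpy(buf, text, n);
  buf[n] = '\0';
  return n;
}

/* A full buffer may or may not mean truncation; treat it as truncated. */
static int probably_truncated(size_t written, size_t size)
{
  return written >= size - 1;
}

/* Formats "<sample>" or "<sample>(trunc)" into out, printing the sample
   through an intermediate buffer of bufsize bytes. */
static void format_with_suffix(char *out, size_t outsize, const char *text, size_t bufsize)
{
  char tmp[64];
  assert(bufsize > 0 && bufsize <= sizeof(tmp));
  size_t n = print_sample(tmp, bufsize, text);
  snprintf(out, outsize, "%s%s", tmp, probably_truncated(n, bufsize) ? "(trunc)" : "");
}
```

With an 8-byte buffer, a 7-character sample is flagged as truncated even
though it fits exactly. That is the "may or may not" case from the text.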
Signed-off-by: Erik Boasson <eb@ilities.com>
* Do not rewrite secure messages in retransmit queue
Messages to be retransmitted spend some time on a transmit queue, and
are subject to the rewriting of the destination information to reduce
the number of outgoing copies. Self-evidently, altering the message
header does not sit well with encryption and/or authentication of
messages.
The way the rewriting works is that the offset of the "reader entity id"
in the DATA submessage is saved on message construction (the GUID prefix
is at a fixed location), so that it can be read and possibly zero'd out
later. The crypto transformations move the message around and it so
happens that it can end up pointing to the key id in the encoded
message. Zeroing that one out leads to uninterpretable messages.
This commit adds a message/event kind to distinguish between retransmit
that may and retransmit that may not be merged (and thus rewritten) and
gets used when the crypto plugin is invoked to transform a message.
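The idea can be sketched as follows. All names here (msg_kind,
MSG_REXMIT, MSG_REXMIT_NOMERGE, mark_transformed, try_merge) are
hypothetical and the message layout is simplified; this is not the
actual xmsg code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; the text only says that a second retransmit kind exists. */
enum msg_kind { MSG_REXMIT, MSG_REXMIT_NOMERGE };

struct msg {
  enum msg_kind kind;
  size_t readerid_off;      /* saved offset of the 4-byte reader entity id */
  unsigned char bytes[32];
};

/* Called once the crypto plugin has transformed the message: the saved
   offset no longer points at the reader entity id, so forbid rewriting. */
static void mark_transformed(struct msg *m)
{
  if (m->kind == MSG_REXMIT)
    m->kind = MSG_REXMIT_NOMERGE;
}

/* Merging retransmits zeroes the reader entity id so that one copy serves
   all readers; returns false if the message must be left untouched. */
static bool try_merge(struct msg *m)
{
  assert(m->readerid_off + 4 <= sizeof(m->bytes));
  if (m->kind != MSG_REXMIT)
    return false;
  memset(m->bytes + m->readerid_off, 0, 4);
  return true;
}
```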
Signed-off-by: Erik Boasson <eb@ilities.com>
* Update comment on changing REXMIT to REXMIT_NOMERGE
Signed-off-by: Erik Boasson <eb@ilities.com>
Deleting a writer with unacknowledged data present in its WHC causes it
to linger for a configurable duration. Once it is lingering, there are
two routes to actually deleting the writer: because the samples get
acknowledged, or because the linger duration elapses.
When these two happen roughly concurrently, there was a possibility of
both succeeding in looking up the writer by its GUID, in which case one
of them then asserts on removing it from the entity index (if assertions
are enabled; if not, things are worse).
This fixes that by ensuring only one of the two actually does something,
as was always the intent.
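One way to guarantee that only one of two concurrent paths performs the
deletion is an atomic state transition, sketched below. This illustrates
the principle under assumed names (writer_state, claim_delete and so
on); the actual fix may well rely on the existing locking instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum writer_state { WR_LINGERING, WR_DELETING };

struct writer {
  _Atomic int state;
  int delete_count; /* how often the delete actually ran, for the example */
};

/* Only the caller that moves LINGERING -> DELETING may delete the writer. */
static bool claim_delete(struct writer *wr)
{
  int expected = WR_LINGERING;
  return atomic_compare_exchange_strong(&wr->state, &expected, WR_DELETING);
}

/* Route 1: the last unacknowledged sample got acknowledged. */
static void on_all_acked(struct writer *wr)
{
  if (claim_delete(wr))
    wr->delete_count++;
}

/* Route 2: the linger duration elapsed. */
static void on_linger_timeout(struct writer *wr)
{
  if (claim_delete(wr))
    wr->delete_count++;
}
```

Whichever route arrives second finds the state already changed and does
nothing.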
Signed-off-by: Erik Boasson <eb@ilities.com>
As opposed to NOT_ALLOWED_BY_SECURITY. There is a meaningful
difference between something being disallowed and something being
impossible.
Co-Authored-By: Kyle Fazzari <github@status.e4ward.com>
Signed-off-by: Erik Boasson <eb@ilities.com>
Currently:
* DDS_HAS_SECURITY for DDS Security support
* DDS_HAS_LIFESPAN for lifespan QoS support
* DDS_HAS_DEADLINE_MISSED for "deadline missed" event support
These are defined to 1 if support for the feature is included in the
build and left undefined if it isn't.
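Because the macros are either defined to 1 or left undefined, both
"#ifdef" and "#if" work for testing them. A minimal sketch of
application code using them (the helper functions are hypothetical; in a
real build the macros come from the Cyclone DDS headers, here they are
simply undefined):

```c
#include <assert.h>
#include <stdbool.h>

/* Reports whether this translation unit was compiled against a build that
   includes DDS Security support. */
static bool built_with_security(void)
{
#ifdef DDS_HAS_SECURITY
  return true;
#else
  return false;
#endif
}

/* Same check for the lifespan QoS, using "#if": an undefined macro
   evaluates to 0 in a preprocessor expression. */
static bool built_with_lifespan(void)
{
#if DDS_HAS_LIFESPAN
  return true;
#else
  return false;
#endif
}
```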
Signed-off-by: Erik Boasson <eb@ilities.com>
When built without support for DDS Security, any attempt to create a
participant with QoS settings in the security name space (those prefixed
by "dds.sec.") must fail.
Signed-off-by: Erik Boasson <eb@ilities.com>
* read/take failed to restore the null pointer in the first entry of the
sample pointer array it gets passed, in the case no "loan" had been
allocated yet and it returned an empty set. The consequence is that
on a subsequent read it will reuse the address without marking it as
in use, so that a *second* read with a null pointer in that first
entry will overwrite the first result. (Introduced by d16264fd82.)
* return_loan failed to free all memory if its argument wasn't actually
a loan. There are many good arguments why the read/take/return_loan
interface is messed up, but in the context of the existing interface
this is a perfectly reasonable case: there is at most one "loan" for
each reader, but one can keep calling read/take and return_loan as if
there's an infinite number of "loans". It's just that the first gets
cached and the others don't.
Signed-off-by: Erik Boasson <eb@ilities.com>
The moving around and cleaning up of network code broke the IPv6
multicast support by memcpy'ing a sockaddr_in6 instead of an in6_addr in
a multicast join record.
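For reference, a sketch of filling a POSIX IPv6 multicast join record
from a socket address. The helper name fill_mreq6 is hypothetical; the
point is that only the sin6_addr member is copied, since a sockaddr_in6
starts with the address family and port rather than the address.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: builds the record one would pass to
   setsockopt(..., IPPROTO_IPV6, IPV6_JOIN_GROUP, ...). */
static void fill_mreq6(struct ipv6_mreq *mreq, const struct sockaddr_in6 *group, unsigned ifindex)
{
  assert(mreq != NULL && group != NULL);
  memset(mreq, 0, sizeof(*mreq));
  /* copy the in6_addr only, not the enclosing sockaddr_in6 */
  memcpy(&mreq->ipv6mr_multiaddr, &group->sin6_addr, sizeof(mreq->ipv6mr_multiaddr));
  mreq->ipv6mr_interface = ifindex;
}
```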
Signed-off-by: Erik Boasson <eb@ilities.com>
* Move wctime, mtime, etime types to ddsrt
* Add ddsrt_time_wallclock
* Change ddsrt_time_monotonic, elapsed to use mtime, etime types
* Remove now, now_mt, now_et
* Rename X_to_sec_usec to ddsrt_X_to_sec_usec
* Rename add_duration_to_X to ddsrt_X_add_duration (to be in line with
the existing ddsrt_time_add_duration)
* Eliminate ddsrt/timeconv.h: it added more in the way of complications
than it did in making things more elegant
* Rename q_time.[ch] to ddsi_time.[ch]: that now only deals with DDSI
timestamps and durations on the wire
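The benefit of separate wall-clock, monotonic and elapsed time types is
that the compiler rejects mixing them up. A sketch with stand-in types
(the sk_ names are hypothetical, and the nanosecond representation and
the saturating addition are assumptions of this example):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the three time types: same representation (nanoseconds),
   deliberately distinct types so they cannot be assigned to each other. */
typedef struct { int64_t v; } sk_wctime_t; /* wall clock */
typedef struct { int64_t v; } sk_mtime_t;  /* monotonic */
typedef struct { int64_t v; } sk_etime_t;  /* elapsed */

/* Adds a non-negative duration to a monotonic time, saturating instead of
   overflowing. */
static sk_mtime_t sk_mtime_add_duration(sk_mtime_t t, int64_t d)
{
  assert(t.v >= 0 && d >= 0);
  sk_mtime_t r;
  r.v = (t.v > INT64_MAX - d) ? INT64_MAX : t.v + d;
  return r;
}

/* Splits a monotonic time into seconds and microseconds. */
static void sk_mtime_to_sec_usec(int32_t *sec, int32_t *usec, sk_mtime_t t)
{
  *sec = (int32_t) (t.v / 1000000000);
  *usec = (int32_t) ((t.v % 1000000000) / 1000);
}
```

Passing an sk_wctime_t to sk_mtime_add_duration is a compile-time error,
which a plain int64_t typedef would not give.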
Signed-off-by: Erik Boasson <eb@ilities.com>
The test gates access-control plugin invocation and with the inverted
condition all remote readers/writers requiring access control are
blocked either because of the permissions handle, or because a NIL handle
is passed to the access control plugin.
Signed-off-by: Erik Boasson <eb@ilities.com>
* access-control check_remote_datareader has "relay_only" as an out
parameter, so should pass in an address instead of "false";
* value of "relay_only" returned by check_remote_datareader must be
passed to crypto register_matched_remote_datareader
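The intended data flow, sketched with simplified stubs. The stub
functions and their reduced parameter lists are hypothetical and only
mimic the shape of the plugin calls named above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stub for the access-control check: relay_only is an OUT parameter, so
   the caller must pass the address of a variable, not a value. */
static bool check_remote_datareader_stub(int permissions_handle, bool *relay_only)
{
  assert(relay_only != NULL);
  *relay_only = (permissions_handle == 2); /* arbitrary rule for the example */
  return permissions_handle != 0;
}

/* Stub for the crypto registration: records the relay_only it was given. */
static bool register_matched_remote_datareader_stub(bool relay_only, bool *recorded)
{
  *recorded = relay_only;
  return true;
}

/* Forward the value produced by the check to the registration. */
static bool match_remote_reader(int permissions_handle, bool *recorded)
{
  bool relay_only = false;
  if (!check_remote_datareader_stub(permissions_handle, &relay_only))
    return false;
  return register_matched_remote_datareader_stub(relay_only, recorded);
}
```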
Signed-off-by: Erik Boasson <eb@ilities.com>
This is a workaround for interoperability issues, ultimately driven by a
Windows quirk that makes multicast delivery within a machine utterly
unreliable if the transmitting socket is bound to 0.0.0.0 (despite all
sockets having multicast interfaces set correctly) when there are also
sockets transmitting to the same multicast group that have been bound to
non-0.0.0.0. (Note: there may be other factors at play, but this is
what it looks like after experimentation.)
At least Fast-RTPS in some versions binds the socket it uses for
transmitting multicasts to non-0.0.0.0, so interoperability with
Fast-RTPS on Windows requires us to bind the socket we use for
transmitting multicasts (which was the same as the one we use for
receiving unicast data) also to non-0.0.0.0 or our multicasts get
dropped often.
This would work fine if other implementations honoured the set of
advertised addresses. However, at least Fast-RTPS and Connext (in some
versions) fail to do this and happily substitute 127.0.0.1 for the
advertised IP address. If we bind to, e.g., 192.168.1.1, then suddenly
those packets won't arrive anymore, breaking interoperability.
The only workaround is to use a separate socket for sending.
Signed-off-by: Erik Boasson <eb@ilities.com>
updated plugin loading tests to use these instead of specific wrappers per test. Added
test for securing communication and handshake failure (using different identity CAs)
Signed-off-by: Dennis Potman <dennis.potman@adlinktech.com>
* Fix issue in dds_create_topic_arbitrary
Changed the behaviour of dds_create_topic_arbitrary with respect to the
sertopic parameter: the existing function dds_create_topic_arbitrary is
marked deprecated and replaced by dds_create_topic_generic, which returns
the sertopic that is actually used as an out parameter. This can be either
the provided sertopic (if this sertopic was not yet known in the domain) or an
existing sertopic if the sertopic was registered earlier.
Signed-off-by: Dennis Potman <dennis.potman@adlinktech.com>
* Fix memory leaks in case topic creation fails.
Signed-off-by: Dennis Potman <dennis.potman@adlinktech.com>
[test_subscriber-12] /opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/ddsrt/src/mh3.c:28:53: runtime error: applying zero offset to null pointer
[test_subscriber-12] SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/ddsrt/src/mh3.c:28:53 in
Signed-off-by: Dan Rose <dan@digilabs.io>
* Don't pass null to memcmp
```
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /opt/ros/master/src/ros2/rmw_cyclonedds/rmw_cyclonedds_cpp/include/rmw_cyclonedds_cpp/serdes.hpp:135:3 in
/opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:41:15: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/include/string.h:64:33: note: nonnull attribute specified here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:41:15 in
/opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:41:31: runtime error: null pointer passed as argument 2, which is declared to never be null
/usr/include/string.h:64:33: note: nonnull attribute specified here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:41:31 in
/opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:45:15: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/include/string.h:64:33: note: nonnull attribute specified here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:45:15 in
/opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:45:30: runtime error: null pointer passed as argument 2, which is declared to never be null
/usr/include/string.h:64:33: note: nonnull attribute specified here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /opt/ros/master/src/eclipse-cyclonedds/cyclonedds/src/core/ddsi/src/ddsi_sertopic_default.c:45:30 in
```
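Passing a null pointer to memcmp is undefined behaviour even when the
length is zero, which is what the sanitizer reports above. A guarded
comparison could look like this (bytes_equal is a hypothetical helper,
not the code in ddsi_sertopic_default.c):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the two byte ranges are equal, 0 otherwise, without ever
   handing a null pointer to memcmp. */
static int bytes_equal(const void *a, const void *b, size_t n)
{
  if (n == 0)
    return 1;       /* nothing to compare, pointers may be null */
  if (a == NULL || b == NULL)
    return a == b;  /* a null range only equals another null range */
  return memcmp(a, b, n) == 0;
}
```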
Signed-off-by: Dan Rose <dan@digilabs.io>
* clearer non-null check
Signed-off-by: Dan Rose <dan@digilabs.io>
The thread_states array resets the "state" to ZERO on thread termination
to indicate that the slot is unused, but it leaves the thread id
unchanged because some platforms don't have a defined value that will
never be used as a thread id. A consequence is that the thread id of a
newly created thread may appear in multiple slots, but generally only
one of those will not be in state ZERO.
However, the code for create_thread used to set the state to ALIVE prior
to creating the thread, and so if the events get scheduled like:
1. thread A: X.state = ALIVE
2. create new thread B, storing tid in X.tid
3. thread A: Y.state = ALIVE
4. new thread B: lookup self (and cache pointer)
5. create new thread C, storing tid in Y.tid
6. new thread C: lookup self (and cache pointer)
Then B will observe two slots in the ALIVE state, with X.tid certain to
match and Y.tid undefined (and hence possibly matching). It may
therefore pick Y. C will (in this schedule) of course always choose Y.
They cache the pointer and never look at X and Y again, except for
updating their virtual clocks.
These virtual clocks are updated non-atomically (by design each is
private to its thread) and so if both B & C use Y they can end up racing each
other in updating the virtual clock and cause the nesting level of the
"awake" state controlling garbage collection to get stuck (or wrap
around, or do other horrible things). The consequence can be anything,
from a somewhat benign variant where GC effectively stops and some
operations (deleting readers and writers and shutting down) block
forever, to use-after-free and the undefined behaviour that implies.
This commit avoids looking up the slot in the newly created threads,
instead passing the correct address in the argument. It also adds an
intermediate state INIT that serves to reserve the slot until the new
thread is actually running. It does make the look-up safe (if one were
to do it), and as it is essentially free and gives more insight into the
state of the system when viewed from a debugger, it appears a useful
addition.
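A sketch of the approach using POSIX threads. The names (slot,
SLOT_INIT, create_thread_in_slot and so on) are hypothetical, and having
the new thread itself switch the slot from INIT to ALIVE is an
assumption of this example rather than a description of the actual code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

enum slot_state { SLOT_ZERO, SLOT_INIT, SLOT_ALIVE };

struct slot {
  _Atomic int state;
  pthread_t tid;
  int ran; /* set by the thread, for the example only */
};

#define NSLOTS 4
static struct slot slots[NSLOTS];
static pthread_mutex_t slots_lock = PTHREAD_MUTEX_INITIALIZER;

static void *thread_main(void *arg)
{
  /* the slot address arrives as the argument: no lookup by thread id */
  struct slot *self = arg;
  int expected = SLOT_INIT;
  if (atomic_compare_exchange_strong(&self->state, &expected, SLOT_ALIVE))
    self->ran = 1;
  return NULL;
}

static struct slot *create_thread_in_slot(void)
{
  struct slot *s = NULL;
  pthread_mutex_lock(&slots_lock);
  for (size_t i = 0; i < NSLOTS; i++) {
    if (atomic_load(&slots[i].state) == SLOT_ZERO) {
      s = &slots[i];
      atomic_store(&s->state, SLOT_INIT); /* reserved, not yet running */
      break;
    }
  }
  pthread_mutex_unlock(&slots_lock);
  if (s == NULL)
    return NULL;
  if (pthread_create(&s->tid, NULL, thread_main, s) != 0) {
    atomic_store(&s->state, SLOT_ZERO); /* release the reservation */
    return NULL;
  }
  return s;
}

static void join_thread_in_slot(struct slot *s)
{
  pthread_join(s->tid, NULL);
  atomic_store(&s->state, SLOT_ZERO); /* tid is left as-is, as in the text */
}
```

A slot in state INIT is never handed out twice, and no thread ever has
to guess its slot from a possibly stale thread id.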
Signed-off-by: Erik Boasson <eb@ilities.com>
gcc 5.4 correctly warned that a null pointer was being passed into the
entity-specific "set_qos" function when changing a topic QoS, where that
parameter was tagged as "non-null". As it was never dereferenced in
this case the resulting behaviour was still correct.
It turns out that the entire function was overly complicated and that
simply passing the entity pointer round allows eliminating a few
arguments as well.
(Oddly none of the more modern toolchains used pick this up.)
Signed-off-by: Erik Boasson <eb@ilities.com>
the destination cache of the network stack is in a certain state. The issue
is resolved by binding unicast sockets (incoming unicast and all outgoing
traffic) to the address of the interface instead of inaddr_any (0.0.0.0).
Set the new configuration option internal/BindUnicastToInterfaceAddr to
false to get the old behavior.
Co-authored-by: Erik Boasson <eb@ilities.com>
Signed-off-by: Dennis Potman <dennis.potman@adlinktech.com>
* Fix some typos.
Signed-off-by: ChenYing Kuo <evshary@gmail.com>
* Also update q_config.c, cyclonedds.rnc, cyclonedds.xsd for correct
build.
Signed-off-by: ChenYing Kuo <evshary@gmail.com>
* Remove cdds.md.
Signed-off-by: ChenYing Kuo <evshary@gmail.com>