* Move the project top-level CMakeLists.txt to the root of the project;
this allows building Cyclone as part of ROS2 without any special
tricks;
* Clean up the build options:
  ENABLE_SSL: whether to check for and include OpenSSL support if a
      library can be found (default = ON); this used to be called
      DDSC_ENABLE_OPENSSL; the old name is deprecated but still works
  BUILD_DOCS: whether to build docs (default = OFF)
  BUILD_TESTING: whether to build the tests (default = OFF)
* Collect all documentation into top-level "docs" directory;
* Move the examples to the top-level directory;
* Remove the unused and somewhat misleading pseudo-default
cyclonedds.xml;
* Remove unused cmake files
Signed-off-by: Erik Boasson <eb@ilities.com>
* use multicast only for participant discovery if using a WiFi network
* default to using unicast for retransmits
Signed-off-by: Erik Boasson <eb@ilities.com>
It is an excellent platform for catching bugs: big-endian, slow enough
that a context switch in the middle of an operation becomes a regular
occurrence, and all that on an SMP box. Or: I just wanted to see if it
would work.
Signed-off-by: Erik Boasson <eb@ilities.com>
Creating a reader/writer in a listener for a built-in topic (as ddsperf
does) recursively ends up in reader/writer matching, recursively
read-locking qoslock. As nobody ever takes a write-lock, it is a
non-issue provided the rwlock really is an rwlock. On Solaris 2.6 those
don't exist, and mapping it onto a mutex deadlocks.
This commit removes the lock in its entirety; the fact that it is
currently only ever locked for reading is a hint that it is perhaps not
that valuable. The way it was used in the code would in any case not
have helped with re-matching on QoS changes (save for duplicating all
the matching code), and it is doubtful that serializing the matching in
that case would be necessary in the first place.
Signed-off-by: Erik Boasson <eb@ilities.com>
The two do essentially the same thing, and ddsrt_strtok_r was only used
in one place. (Triggered by Solaris 2.6 not providing strtok_r.)
Signed-off-by: Erik Boasson <eb@ilities.com>
On 32-bit machines uintmax_t is likely to be slower than uintptr_t, and
it seems highly unlikely that a thread id would ever need more bits than
a uintptr_t can hold. For the current platforms, uintptr_t works.
Signed-off-by: Erik Boasson <eb@ilities.com>
The parameter list table indices ought to be as small as possible to
avoid wasting space, and that means the index size depends on whether
or not DDSI_INCLUDE_SSM is set.
Signed-off-by: Erik Boasson <eb@ilities.com>
The payload in a struct serdata_default is assumed to be at a 64-bit
offset for conversion to/from a dds_{i,o}stream_t and getting padding
calculations in the serialised representation correct. The definition
did not guarantee this and got it wrong on a 32-bit release build.
This commit computes the required padding at compile time and verifies
that the assumption holds where it matters.
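A minimal sketch of the approach (the structs are hypothetical, not the
actual serdata_default layout): the padding in front of the payload is
computed from the header size at compile time, and a static assertion
checks that the payload really lands at a 64-bit offset.
    #include <assert.h>    /* static_assert (C11) */
    #include <stddef.h>    /* offsetof */
    #include <stdint.h>

    struct serdata_hdr_sketch {
      uint32_t size;
      uint16_t kind;
      uint16_t flags;
      uint32_t hash;   /* 12 bytes total, so 4 bytes of padding are needed */
    };

    /* padding needed so the payload starts at a multiple of 8 bytes */
    #define SERDATA_PAD ((8 - (sizeof (struct serdata_hdr_sketch) % 8)) % 8)

    struct serdata_sketch {
      struct serdata_hdr_sketch hdr;
      char pad[SERDATA_PAD];   /* non-zero here; a zero-pad layout would need special handling */
      unsigned char data[8];   /* stand-in for the serialised payload */
    };

    static_assert (offsetof (struct serdata_sketch, data) % 8 == 0,
                   "payload must start at a 64-bit offset");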
Signed-off-by: Erik Boasson <eb@ilities.com>
Signed-off-by: Thijs Sassen <thijs.sassen@adlinktech.com>
Adjusted the close method so that it is not expanded by the lwIP close
macro, and added a check for DDSI_INCLUDE_SSM to match the correct pid
table size.
Signed-off-by: Thijs Sassen <thijs.sassen@adlinktech.com>
Currently each DDSC (not DDSI) writer has its own "xpack" for packing
submessages into larger messages, but that is a bit wasteful, especially
when a lot of samples are being generated that never need to go onto the
wire. Lazily allocating them and only pushing messages into them when
they have a destination address saves memory and improves speed for
local communications.
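A hedged sketch of the lazy-allocation pattern (all names are made up,
not the actual xpack API): the packing buffer is only created the first
time a message actually has somewhere to go.
    #include <stdbool.h>
    #include <stdlib.h>

    struct xpack_sketch { int dummy; /* stand-in for the real packing state */ };

    struct writer_sketch {
      struct xpack_sketch *xp;   /* NULL until the first networked message */
    };

    static bool writer_get_xpack (struct writer_sketch *wr, struct xpack_sketch **xp)
    {
      if (wr->xp == NULL)
        wr->xp = calloc (1, sizeof (*wr->xp));
      *xp = wr->xp;
      return (*xp != NULL);
    }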
Signed-off-by: Erik Boasson <eb@ilities.com>
Multiple writers for a single instance is pretty rare, so it makes sense
to lazily allocate the tables for keeping track of them. The more
elegant solution would be to have a single lock-free table.
Signed-off-by: Erik Boasson <eb@ilities.com>
Rather than allocate a HH_HOP_RANGE large array of buckets, allocate
just 1 if the initial size is 1, then jump to HH_HOP_RANGE as soon as a
second element is added to the table. There are quite a few cases where
hash tables are created that will never hold more than 1 (or even 0)
elements (e.g., a writer without readers, a reader for a keyless
topic).
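A sketch of just that sizing decision (hypothetical names, initial
sizes larger than 1 are ignored; the real hopscotch code obviously does
much more):
    #include <stdint.h>

    #define HH_HOP_RANGE 32u   /* minimum bucket count a hopscotch table otherwise needs */

    static uint32_t hh_bucket_count_sketch (uint32_t init_size, uint32_t elements)
    {
      if (init_size <= 1 && elements <= 1)
        return 1;            /* a single bucket until a second element shows up */
      return HH_HOP_RANGE;   /* from the second element on, jump to the normal minimum */
    }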
Signed-off-by: Erik Boasson <eb@ilities.com>
There were inconsistencies in the order in which entity locks were taken
when multiple entities needed to be locked at the same time. In most
cases, the order was first locking entity X, then locking the parent
entity of X. However, in some cases the order was reversed, a likely
cause of deadlocks.
This commit sorts out these problems, in particular in propagating
operations into children. The entity refcount is now part of the handle
administration so that it is no longer necessary to lock an entity to
determine whether it may still be used (previously one had to check the
CLOSED flag afterwards). This allows recursing into the children while
keeping handles and the underlying objects alive, but without violating
the lock order.
Attendant changes that would warrant their own commits but are too hard
to split off:
* Children are now no longer in a singly linked list, but in an AVL
tree; this was necessary at some intermediate stage to allow unlocking
an entity and restarting iteration over all children at the "next"
child (all thanks to the eternally unique instance handle);
* Waitsets shifted to using arrays of attached entities instead of
linked lists; this was a consequence of dealing with some locking
issues in reading triggers and considering which operations on the
"triggered" and "observed" sets are actually needed.
* Entity status flags and waitset/condition trigger counts are now
handled using atomic operations. Entities are now classified as
having a "status" with a corresponding mask, or as having a "trigger
count" (conditions). As there are fewer than 16 status bits, the
status and its mask can squeeze into the same 32-bits as the trigger
count. These atomic updates avoid the need for a separate lock just
for the trigger/status values and result in a significant speedup with
waitsets (see the sketch following this list).
* Create topic now has a more rational behaviour when multiple
participants attempt to create the same topic: each participant now
gets its own topic definition, but the underlying type representation
is shared.
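A sketch of the status/trigger-count packing mentioned above
(illustrative only, using C11 atomics rather than the ddsrt wrappers;
the exact field layout is an assumption): for entities with a status
the low 16 bits hold the status flags and the high 16 bits the enabled
mask, while for conditions the whole word is a trigger count.
    #include <stdatomic.h>
    #include <stdint.h>

    #define SW_MASK_SHIFT 16u

    /* set a status bit; returns non-zero if the bit is enabled in the mask,
       i.e. if attached waitsets should be woken up */
    static uint32_t status_set_sketch (_Atomic uint32_t *sw, uint32_t bit)
    {
      uint32_t old = atomic_fetch_or (sw, bit);
      return bit & (old >> SW_MASK_SHIFT);
    }

    /* for condition entities the same word simply counts triggers */
    static void trigger_inc_sketch (_Atomic uint32_t *sw)
    {
      atomic_fetch_add (sw, 1u);
    }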
Signed-off-by: Erik Boasson <eb@ilities.com>
Add the instance handle to the DDSC entity type, initialize it properly
for all types, and remove the per-type handling of
dds_get_instance_handle. Those entities that have a DDSI variant take
the instance handle from DDSI (which plays tricks to get the instance
handles of the entities matching the built-in topics). For those that
do not have a DDSI variant, just generate a unique identifier using the
same generator that DDSI uses.
Signed-off-by: Erik Boasson <eb@ilities.com>
Lease_renew is the key one, and that one only ever shifts the lease
expiry to the future, provided the lease hasn't expired already. All
other operations work within leaseheap_lock.
Other updates to lease end time are in set_expiry (which is used in some
special cases). So ... the number of ways this can go wrong is rather
limited.
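An illustrative analogue (C11 atomics, not the actual lease code, and
the expired-lease check is left out): renewal is a compare-and-swap
loop that only ever moves the expiry forward, which is what makes it
safe to do outside leaseheap_lock.
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>

    static bool lease_renew_sketch (_Atomic int64_t *t_expire, int64_t t_new)
    {
      int64_t t_old = atomic_load (t_expire);
      do {
        if (t_new <= t_old)
          return false;      /* never move the expiry backwards */
      } while (!atomic_compare_exchange_weak (t_expire, &t_old, t_new));
      return true;
    }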
Tracking pings and the expected number of pongs was done without
holding the correct locks. The terminate flag was also not one of the
ddsrt_atomic types and was hence flagged by the thread sanitizer as a
race condition.
Signed-off-by: Erik Boasson <eb@ilities.com>
Thread sanitizer warns about reads and writes of variables that are
meant to be read without holding a lock:
* Global "keep_going" is now a ddsrt_atomic_uint32_t
* Thread "vtime" is now a ddsrt_atomic_uint32_t
Previously the code relied on the assumption that a 32-bit int would be
treated as atomic, now that is all wrapped in ddsrt_atomic_{ld,st}32.
These being inline functions doing exactly the same thing, there is no
functional change, but it does allow annotating the loads and stores
via function attributes on ddsrt_atomic_{ld,st}X.
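The pattern, roughly (a sketch built on Cyclone's ddsrt atomics; the
surrounding functions are made up):
    #include "dds/ddsrt/atomics.h"

    static ddsrt_atomic_uint32_t keep_going_sketch;

    static void start_sketch (void)
    {
      ddsrt_atomic_st32 (&keep_going_sketch, 1);
    }

    static void worker_loop_sketch (void)
    {
      while (ddsrt_atomic_ld32 (&keep_going_sketch))
      {
        /* do some work ... */
      }
    }

    static void request_stop_sketch (void)
    {
      ddsrt_atomic_st32 (&keep_going_sketch, 0);
    }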
The concurrent hash table implementation is replaced by a locked
version of the non-concurrent implementation if the thread sanitizer is
used. This change eliminates the scores of problems signalled by the
thread sanitizer in the GUID-to-entity translation and the
key-to-instance id lookups.
Other than that, this changes a flag used in a waitset test case to a
ddsrt_atomic_uint32_t.
Signed-off-by: Erik Boasson <eb@ilities.com>
* Fix calls to ddsrt_memdup and ddsrt_strdup with a null pointer (they
handle it gracefully, but the interface forbids it ...)
* Replace all pre-C99 flexible arrays (i.e., declaring as array[1],
then mallocing and using as if array[N]) by C99 flexible array members
(see the sketch following this list).
* Also add a missing null-pointer test in dds_dispose_ts, and fix the
test cases that pass a null pointer and a non-writer handle to it to
instead pass an invalid address.
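The flexible-array change in a nutshell (a generic sketch, not code
from the tree):
    #include <stdlib.h>
    #include <string.h>

    /* old style: a one-element array plus over-allocation */
    struct msg_old {
      size_t len;
      char payload[1];
    };

    /* C99 flexible array member */
    struct msg_new {
      size_t len;
      char payload[];
    };

    static struct msg_new *msg_new_alloc (const char *data, size_t len)
    {
      /* sizeof (struct msg_new) does not include the payload */
      struct msg_new *m = malloc (sizeof (*m) + len);
      if (m == NULL)
        return NULL;
      m->len = len;
      memcpy (m->payload, data, len);
      return m;
    }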
Signed-off-by: Erik Boasson <eb@ilities.com>
* dds_set_allocator
* dds_set_aligned_allocator
The intent behind them is good, but the approach is too primitive ...
there
is far more work to be done for managing dynamic allocation in a
meaningful way.
Signed-off-by: Erik Boasson <eb@ilities.com>
It appears the Chocolatey maven package now installs in a different
location (or some shims that used to be installed no longer are).
Because the Travis build uses bash instead of cmd/powershell it doesn't
properly pick up M2_HOME. This commit adds the new location.
Signed-off-by: Erik Boasson <eb@ilities.com>
* Add PacketCapture for receiving messages in UDP
Signed-off-by: Juan Oxoby <joxoby@irobot.com>
* Explicit cast ret to size_t in write_pcap_received()
Signed-off-by: Juan Oxoby <joxoby@irobot.com>
The CMake files now add "-Werror"/"/WX" if the "WERROR" CMake variable
is true. By default it is not; the CI builds set this.
Signed-off-by: Erik Boasson <eb@ilities.com>
Missing prototypes for exported functions cause a really huge issue on
Windows. Enabling the "missing prototypes" warning makes it much easier
to catch this problem. Naturally, any warnings caused by this have been
fixed.
Signed-off-by: Erik Boasson <eb@ilities.com>
This commit adds support for changing all mutable QoS except those that
affect reader/writer matching (i.e., deadline, latency budget and
partition). This is simply because the recalculation of the matches
hasn't been implemented yet, it is not a fundamental limitation.
Implementing this basically forced fixing up a bunch of inconsistencies
in handling QoS in entity creation. A silly multi-process ping-pong
test built on changing the value of user data has been added.
Signed-off-by: Erik Boasson <eb@ilities.com>
These topics are generated internally and never sent over the wire.
Performing full discovery for these is therefore a significant waste of
effort.
Signed-off-by: Erik Boasson <eb@ilities.com>
The DDSI spec version 2.1 forbade the use of ACKNACK/GAP messages with
an empty bitset, but most vendors used these forms anyway. The DDSI
stack of Cyclone had code to avoid generating these (though with a bug
where it could generate an invalid GAP), but for no real benefit.
Because the other vendors used them anyway, the stack has always been
perfectly capable of handling them.
DDSI spec version 2.3 allows these forms, and so there's no value in
maintaining the old complications. This also eliminates the invalid GAP
messages that could be generated at times.
Signed-off-by: Erik Boasson <eb@ilities.com>
The Compatibility/ArrivalOfDataAssertsPpAndEpLiveliness option was a
rather strange option: receipt of a message is proof of the existence of
the sender, so having an option to not treat it as such only adds
complexity without any benefit.
Signed-off-by: Erik Boasson <eb@ilities.com>
Padding used to not be cleared in this code base, but that has the
downside not only of valgrind reporting nuisance warnings (which could
be suppressed using valgrind's programmatic interface) but also of
potentially leaking information. The cost of clearing the padding
appears to be
insignificant compared to the cost of doing the real work, and so it is
probably best to just clear it.
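For instance (a generic illustration, not code from the tree), zeroing
the whole struct before filling it in makes the padding bytes
well-defined:
    #include <string.h>

    struct sample_sketch {
      char flag;   /* followed by padding bytes on most ABIs */
      int value;
    };

    static void fill_sample_sketch (struct sample_sketch *s, char flag, int value)
    {
      memset (s, 0, sizeof (*s));   /* padding is now zeroed, not stale memory */
      s->flag = flag;
      s->value = value;
    }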
Signed-off-by: Erik Boasson <eb@ilities.com>
The old parameter list parsing was a mess of custom code with tons of
duplicated checks, even though parameter list parsing really is a fairly
straightforward affair. This commit changes it to a mostly table-driven
implementation, where the vast majority of the settings are handled by a
generic deserializer and the irregular ones (like reliability, locators)
are handled by custom functions. The crazy ones (IPv4 address and port)
rely on additional state and are completely special-cased.
Given these tables, serialisation, finalisation, validation, merging
and unaliasing can all be handled by a very small amount of custom code
and an appropriately defined generic function for the common cases.
This also makes it possible to have all QoS validation in place, and so
removes the need for the specialized implementations for the various
entity kinds in the upper layer.
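A hedged sketch of what such a table-driven setup can look like (all
names, fields and the example pid value are made up; this is not the
actual plist implementation):
    #include <stddef.h>
    #include <stdint.h>

    enum pser_type_sketch { PSER_UINT32, PSER_DURATION, PSER_STRING, PSER_CUSTOM };

    struct plist_sketch {
      uint64_t present;             /* one bit per parameter seen on the wire */
      uint32_t ownership_strength;
      /* ... more fields ... */
    };

    struct param_desc_sketch {
      uint16_t pid;                 /* parameter id on the wire */
      uint64_t present_flag;        /* bit to set in "present" */
      size_t offset;                /* field offset inside struct plist_sketch */
      enum pser_type_sketch type;   /* how the generic deserializer handles it */
      int (*deser_custom) (struct plist_sketch *dst, const void *src, size_t sz);
    };

    static const struct param_desc_sketch param_table_sketch[] = {
      /* regular parameter: handled entirely by the generic deserializer */
      { 0x1234 /* hypothetical pid */, UINT64_C (1) << 0,
        offsetof (struct plist_sketch, ownership_strength), PSER_UINT32, NULL },
      /* irregular parameters would set deser_custom instead */
    };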
QoS settings inapplicable to an entity were previously ignored,
allowing one to have invalid values set in a QoS object when creating
an entity, provided that the invalid values were irrelevant to that
entity. Whether
this is a good thing or not is debatable, but certainly it is a good
thing to avoid copying in inapplicable QoS settings. That in turn means
the behaviour of the API can remain the same.
It does turn out that the code used to return "inconsistent QoS" also
for invalid values. That has now been rectified, and it returns
"inconsistent QoS" or "bad parameter" as appropriate. Tests have been
updated accordingly.
Signed-off-by: Erik Boasson <eb@ilities.com>
There are some cases where "int" or "unsigned" actually makes sense,
but in a large number of cases it is really supposed to be either
exactly a 32-bit integer or at least a 32-bit integer. It is much
better to be explicit about this.
Another reason is that at least some embedded platforms define, e.g.,
int32_t as "long" instead of "int". For the ones I am aware of, "int"
and "long" are both 32-bit integers, but the distinction can still
cause trouble with printf format specifications. So that is another
good reason to consistently avoid the implementation-defined types.
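The format-specifier point in miniature (plain illustration): with
fixed-width types and the <inttypes.h> macros the code is correct
whether int32_t maps to "int" or to "long" on a given platform.
    #include <inttypes.h>
    #include <stdio.h>

    int main (void)
    {
      int32_t seq = 123456789;
      printf ("seq = %"PRId32"\n", seq);   /* no guessing between "%d" and "%ld" */
      return 0;
    }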
Signed-off-by: Erik Boasson <eb@ilities.com>
The functions did not touch the callback pointer if a null pointer was
passed in for the listener. That means one would have to initialize the
out parameter before the call or manually check the listener pointer to
know whether the callback pointer has a defined value following the
call. That's asking for trouble.
Hence the decision to return a callback pointer of 0 when no listener
object is passed in.
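The new behaviour, sketched generically (not the actual dds_lget_*
code):
    #include <stddef.h>

    typedef void (*on_data_available_fn_sketch) (void *entity, void *arg);

    struct listener_sketch {
      on_data_available_fn_sketch on_data_available;
    };

    static void lget_data_available_sketch (const struct listener_sketch *listener,
                                            on_data_available_fn_sketch *callback)
    {
      if (callback == NULL)
        return;
      /* always write the out parameter: 0 when no listener object is given */
      *callback = (listener != NULL) ? listener->on_data_available : 0;
    }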
Signed-off-by: Erik Boasson <eb@ilities.com>
All this duplication was rather useless: the values are standardized
anyway and the conversion was a simple type cast without any check.
This commit unifies the definitions.
* DDSI now uses the definitions of the various QoS "kind" values from
the header file;
* The durations in the QoS objects are no longer in wire-format
representation, the conversions now happen only in conversion to/from
wire format;
* The core DDSI stack no longer uses DDSI time representations for time
stamps, instead using the "native" one;
* The duplication of QoS policy ids has been eliminated, again using
the IDs visible in the API -- the actual values are meaningless to the
DDSI stack anyway.
Signed-off-by: Erik Boasson <eb@ilities.com>
Code formatting was quite a mess (different indentation, completely
different ideas on where opening braces should go, spacing in various
places, early out versus single return or goto-based error handling,
&c.). This commit cleans it up.
A few doxygen comment fixes allowed turning on Clang's warnings for
doxygen comments, so those are now enabled by default, at least on
Xcode-based builds.
Signed-off-by: Erik Boasson <eb@ilities.com>
* Remove dds_return_t / dds_retcode_t distinction (now there is only
dds_return_t and all error codes are always negative)
* Remove Q_ERR_... error codes and replace them by DDS_RETCODE_...
ones so that there is only one set of error codes
* Replace a whole bunch of "int" return types that were used to return
Q_ERR_... codes with "dds_return_t" return types
Signed-off-by: Erik Boasson <eb@ilities.com>
A long-standing bug of Cyclone is that a sample written immediately
after a publication-matched event may never arrive at the reader that
was just matched. This happened because the reader need not have
completed discovery of the writer by the time the writer discovers the
reader, at which point the reader ignores the sample because it either
doesn't know the writer at all, or it hasn't yet seen a Heartbeat from
it.
That Heartbeat arrives shortly after, but by then it is too late: the
reader decides to accept only data starting from the next sample to be
written by the writer. (It has no choice, really: either you risk
losing some data, or you will be requesting all historical data, which
is emphatically not what a volatile reader is about ...)
A related issue is the handling of historical data for transient-local
readers: it used to deliver this out-of-order, but that is firstly
against the specification, and secondly, against reasonable expectations
of those who use DDS as a mere publish-subscribe messaging system. To
add insult to injury, it didn't completely handle some reordering issues
with disposes ...
This commit changes the way writers respond to a request for
retransmission from volatile proxy readers and the way the
in-sync/out-of-sync setting of a reader with respect to a proxy-writer
is used. The first makes it safe for a Cyclone reader to ask a Cyclone
writer for all data (these details are not covered in the specs, so it
errs on the reasonable side for other vendors, which may cause the data
loss mentioned above): the writer simply sends a Gap message to the
reader for all the sequence numbers prior to the matching.
The second changes the rule for switching from out-of-sync to in-sync:
that transition is now simply once the next sequence number to be
delivered to the reader equals the next sequence number that will be
delivered directly from the proxy writer object to all readers. (I.e.,
a much more intuitive notion than reaching some seemingly arbitrary
sequence number.)
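In sketch form (simplified names, not the actual structs), the new
transition rule amounts to:
    #include <stdbool.h>
    #include <stdint.h>

    struct proxy_writer_sketch {
      uint64_t next_deliv_seq;   /* next seq delivered to all in-sync readers */
    };

    struct rd_match_sketch {
      bool in_sync;
      uint64_t next_seq;         /* next seq this reader still needs */
    };

    static void maybe_switch_to_in_sync (struct rd_match_sketch *rm,
                                         const struct proxy_writer_sketch *pwr)
    {
      if (!rm->in_sync && rm->next_seq == pwr->next_deliv_seq)
        rm->in_sync = true;   /* from now on, deliver straight from the proxy writer */
    }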
To avoid duplicates the rule for delivery straight from a proxy writer
has changed: where samples were delivered from the proxy writer to all
matching readers, they are now delivered only to the matching readers
that are in-sync. To avoid ordering problems, the idea that historical
data can be delivered through the asynchronous delivery path even when
the regular data goes through the synchronous delivery path has been
abandoned. All data now always follows the same path.
As these same mechanisms are used for getting historical data into
transient-local readers, the ordering problem for the historical data
also disappeared.
The tests in src/core/xtests/initsampledeliv cover a lot of the
interesting cases: data published before the existence of a reader,
after it, and mixes of volatile and transient-local. Running them takes
quite a bit of time, and they are not yet integrated in the CI builds
(if ever, because of that time).
Note: the "conservative built-in startup" option has been removed,
because it really makes no sense to keep a vague compatibility option
added a decade ago "just in case" that has never been used ...
Note: the workaround in the src/mpt/tests/basic/procs/hello.c (use
transient-local to ensure delivery of data) has been removed, as has
been its workaround for the already-fixed #146.
Signed-off-by: Erik Boasson <eb@ilities.com>
If a firewall blocks traffic over one network but not another, and the
one happens to be the default pick of Cyclone, then the MPT basic tests
won't work properly.
With the recent changes to the configuration handling for supporting
multiple sources of configuration, it makes far more sense to remove the
hardcoded network interface selection in the test configurations and to
append the test configuration file to the environment list rather than
to replace it.
Signed-off-by: Erik Boasson <eb@ilities.com>