Some of the former are required to implement alternative serialisation
methods; the latter is just generally useful. For the time being these
are not part of the formal API and not subject to backwards
compatibility. Still, they have value for quickly building tools on that
use Cyclone and happen to need any of these functions.
Signed-off-by: Erik Boasson <eb@ilities.com>
The introduction of properly functioning query conditions adds some
overhead, this commit removes some of that cost by avoiding some calls
to update_conditions when there are no query conditions.
It also removes the has_changed field from the instance, instead using a
local boolean to track whether DATA_AVAILABLE should be raised or not.
Signed-off-by: Erik Boasson <eb@ilities.com>
Adding and removing reader/writer matches can be done by multiple
threads, and this can result in two threads simultaneously trying to do
this on a single reader/writer pair. The code therefore always checks
first whether the pair is (not) matched before proceeding.
However, removing a reader from a proxy writer had part of the code
outside this check. Therefore, if both entities are being deleted
simultanously, there is a risk that local_reader_ary_remove is called
twice for the same argument, and in that case, it asserts in one of them
because the reader can no longer be found. The counting of the number
of matched reliable readers suffers from the same race condition.
This commit eliminates these race conditions by moving these operations
into the block guarded by the aforementioned check.
Signed-off-by: Erik Boasson <eb@ilities.com>
Deleting a writer causes unregisters (and possibly disposes) in the rest
of the network, and these updates to the instances should trigger
DATA_AVAILABLE.
Signed-off-by: Erik Boasson <eb@ilities.com>
When data arrives before a heartbeat has been received, it is impossible
to know whether this is a new "live" sample or a retransmit, and for
this reason the requesting of historical data is delayed until a
heartbeat arrives that informs the readers of the range of sequence
numbers to request as historical data.
However, by this time, and without this new condition in place, the
reader may have already received some data directly, and may
consequently request some data twice. That's not right.
Requiring a heartbeat to have been received before delivering the data
avoids this problem, but potentially delays receiving data after a new
writer/reader pair has been matched. The delay caused by a full
handshake at that point seems less bad that the odd case of stuttering
where that isn't expected. There are almost certainly some tricks
possible to avoid that delay in the common cases, but there are more
important things to do ...
Best-effort readers on a reliable proxy writer are a bit special: if
there are only best-effort readers, there is no guarantee that a
heartbeat will be received, and so the condition does not apply. This
commit attempts to deal with that by only requiring a heartbeat if some
reliable readers exist, but that doesn't allow a smooth transition from
"only best-effort readers" to "some reliable readers".
One could moreover argue that this condition should not be imposed on
volatile readers (at worst you get a little bit of data from before the
match), but equally well that it should (there's no guarantee that no
sample would be skipped in the case of a keep-all writer, if the first
sample happened to be a retransmit).
Signed-off-by: Erik Boasson <eb@ilities.com>
For compatibility with TwinOaks CoreDX, ignore an all-zero durability
service QoS received over SEDP for volatile and transient-local
endpoints.
Signed-off-by: Erik Boasson <eb@ilities.com>
The introduction of multiple receive threads could trigger the assertion
because a set of samples ready for delivery may have been received by
multiple threads (the problem manifests itself most easily with
fragmented samples). This is actually a non-issue:
* while synchronously processing a packet, there is a bias of 2**31
added to the refcount, to prevent any thread under any circumstance
from ever freeing the data;
* while data lives in the defragment buffers or reorder buffer of the
proxy writer, a bias of 2**20 is added to it until this particular
function is called, after delivery of the data to the readers, and
(if needed) after inserting the samples in the reorder buffer of
any readers that are out-of-sync with the proxy writer;
* the relevant refcount is updated atomically in such a manner that this
particular operation atomically removes the bias and performs the
delayed increment of the refcount to account for the data being stored
in any of the defragmenting or reorder buffers;
* the only ordinary decrementing of the refcount happens either
synchronously (if synchronous delivery is chosen), or asynchronously
in a delivery queue thread, and so the entire mechanism exists to
avoid premature freeing of the underlying data because the data is
delivered very quickly (possibly synchronously);
* as the biases are removed after all the delayed refcount increments
are taken into account and there are no increments following the call
to rmbias_and_adjust, the "ordinary" decrements can do no harm.
* the case of data from multiple writers being combined in a single
packet is dealt with by the 2**20 bias, and so there is potentially a
problem if there are more than 2**20 out-of-sync readers attached to
a single proxy writer, or data submessages from more than 2**11
writers in a single packet. The minimum possible data message is 32
bytes (headers, encoding, data, padding), so packets up to 64kB are
safe.
None of this is in any way related to which threads originally accepted
the packets, and therefore I see no argument for the existence of the
assertion.
That said, it is a rather complicated mechanism of unknown benefit, and
a major simplification is definitely something to be considered. In UDP
mode I see no chance of abuse, but there may be network protocols (TCP,
for sure) where there might be packets larger than 64kB and those could,
under worst-case assumptions, cause trouble. That, too, is a reason to
rethink it.
The call to rmbias_and_adjust was sometimes called with the proxy writer
locked, and sometimes after unlocking it. This commit changes it to
consistently call it with the lock held.
Signed-off-by: Erik Boasson <eb@ilities.com>
Sizing/ReceiveBufferSize must be >= Sizing/ReceiveBufferChunkSize + N
for some small N, and if it is not, Cyclone will crash reading beyond
allocated memory in a nasty way. Ordinarily this should be handled by
the configuration validation, but that would put the burden of knowing
the details of computing N upon the user, an unreasonable requirement.
The old state of an assertion presupposes a check, and brings us back
that same requirement.
Thus, a change to ensure that ReceiveBufferSize will be taken as the
minimum of the configured value and the actual minimal value as
determined by ChunkSize and whatever N happens to be.
Signed-off-by: Erik Boasson <eb@ilities.com>
During shutdown, the optional "debmon" thread for getting some
information about internal state of the DDSI stack had a tendency to run
into errors from calling write on a connection that had already been
closed immediately after connecting successfully to wake the thread.
Instead of blindly writing into the connection, it now checks whether it
is supposed to shutdown before doing anything, avoiding this particular
problem.
Signed-off-by: Erik Boasson <eb@ilities.com>
The rewrite of the abstraction layer changed some details in thread ids
used in tracing and functions to get those ids, with a result of always
printing the parent thread's id in create_thread rather than the newly
create thread's id. As all supported platforms use thread names in the
trace, it is a rather insignificant matter, and so this provides the
trivial fix by letting the new thread log the message.
Signed-off-by: Erik Boasson <eb@ilities.com>
The rtps_init function used to initialize all data structures and start
all threads used by the protocol stack, allowing discovery of remote
entities before the built-in topic data structures were initialized.
(Very) early discovery of a remote participant thus led to a crash.
This commit splits the initialisation, providing a separate function for
starting, in particular, the threads receiving data from the network.
In terms of threads created, it matches exactly with the rtps_stop /
rtps_fini split that already existed to address the exact same problem
on termination.
Signed-off-by: Erik Boasson <eb@ilities.com>
This adds an Internal/EnableExpensiveChecks setting for enabling some or all expensive run-time checks to avoid a massive slowdown when assertions are enabled at compile-time. Currently these cover only the writer and reader-history cache checking.
Signed-off-by: Erik Boasson <eb@ilities.com>
The configuration handling already allowed specifying multiple files in CYCLONEDDS_URI to be read in-order, this extends the behaviour to also allow the contents of these files to be embedded. This makes it possible to set a configuration without requiring a file system, or to add some ad-hoc options.
Signed-off-by: Erik Boasson <eb@ilities.com>
Reorganize and clean up abstraction layer
The include files have moved from the (somewhat illogical) ``include/ddsc`` to the more logical ``include/dds``. To avoid breaking existing code, a ``include/ddsc/dds.h`` is added that simply includes the one in the new location.
- Replace os_result by dds_retcode_t and move DDS return code defines down.
Eliminates the need to convert between different return code types.
- Move dds_time_t down and remove os_time.
Eliminates the need to convert between different time representations and
reduces code duplication.
- Remove use of Microsoft source-code annotation language (SAL).
SAL annotations are Microsoft specific and not very well documented. This
makes it very difficult for contributers to write.
- Rearrange the abstraction layer to be feature-based. The previous layout
falsely assumed that the operating system dictates which implementation is
best suited. For general purpose operating systems this is mostly true, but
embedded targets require a slightly different approach and may not even offer
all features. The new layout makes it possible to mix-and-match feature
implementations and allows for features to not be implemented at all.
- Replace the os prefix by ddsrt to avoid name collisions.
- Remove various portions of unused and unwanted code.
- Export thread names on all supported platforms.
- Return native thread identifier on POSIX compatible platforms.
- Add timed wait for condition variables that takes an absolute time.
- Remove system abstraction for errno. The os_getErrno and os_setErrno were
incorrect. Functions that might fail now simply return a DDS return code
instead.
- Remove thread-specific memory abstraction. os_threadMemGet and accompanying
functions were a mess and their use has been eliminated by other changes in
this commit.
- Replace attribute (re)defines by ddsrt_ prefixed equivalents to avoid name
collisions and problems with faulty __nonnull__ attributes.
Signed-off-by: Jeroen Koekkoek <jeroen@koekkoek.nl>
Internally time stamps and durations are all in nanoseconds, but the
platform abstraction uses {sec,nsec} (essentially a struct timespec) and
Windows uses milliseconds. The conversion to milliseconds with upwards
rounding was broken, adding ~1s to each timeout. In most of the handful
of uses the effect is minor in practice, but it does matter a lot in the
scheduling of Heartbeat and AckNack messages, e.g., by causing a simple
throughput test to exhibit periodic drops in throughput.
Signed-off-by: Erik Boasson <eb@ilities.com>
A writer blocking on a full WHC will still send out whatever it has
buffered but not sent yet. For this, the writer lock must be released,
but that means an ACK can sneak in between sending out the packet and
relocking the writer (not likely if there's a real network in between,
but over a loopback interface it is definitely possible).
Therefore, the amount of unacknowledged data that controls the blocking
and triggering of it must be refreshed before deciding to block,
otherwise it may hang indefinitely.
Signed-off-by: Erik Boasson <eb@ilities.com>
The writer tracks whether it is throttled because of a full WHC, but
does so by treating it as a simple flag. This is fine if there is at
most one thread blocked on any single writer at any time, but if there
are multiple threads using the same writer it would be possible for one
thread to be woken up, clear the flag, and so affect the wakeup of other
threads.
Turning it from a flag to a counter avoids that problem.
Signed-off-by: Erik Boasson <eb@ilities.com>
The build information in the README was very limited (now it is less
limited) and, the list of prerequisites was incomplete, documentation
link was out of date ... This cleans it up a bit.
Signed-off-by: Erik Boasson <eb@ilities.com>