During shutdown, the optional "debmon" thread for getting some
information about internal state of the DDSI stack had a tendency to run
into errors from calling write on a connection that had already been
closed immediately after connecting successfully to wake the thread.
Instead of blindly writing into the connection, it now checks whether it
is supposed to shutdown before doing anything, avoiding this particular
problem.
Signed-off-by: Erik Boasson <eb@ilities.com>
The rewrite of the abstraction layer changed some details in thread ids
used in tracing and functions to get those ids, with a result of always
printing the parent thread's id in create_thread rather than the newly
create thread's id. As all supported platforms use thread names in the
trace, it is a rather insignificant matter, and so this provides the
trivial fix by letting the new thread log the message.
Signed-off-by: Erik Boasson <eb@ilities.com>
The rtps_init function used to initialize all data structures and start
all threads used by the protocol stack, allowing discovery of remote
entities before the built-in topic data structures were initialized.
(Very) early discovery of a remote participant thus led to a crash.
This commit splits the initialisation, providing a separate function for
starting, in particular, the threads receiving data from the network.
In terms of threads created, it matches exactly with the rtps_stop /
rtps_fini split that already existed to address the exact same problem
on termination.
Signed-off-by: Erik Boasson <eb@ilities.com>
This adds an Internal/EnableExpensiveChecks setting for enabling some or all expensive run-time checks to avoid a massive slowdown when assertions are enabled at compile-time. Currently these cover only the writer and reader-history cache checking.
Signed-off-by: Erik Boasson <eb@ilities.com>
The configuration handling already allowed specifying multiple files in CYCLONEDDS_URI to be read in-order, this extends the behaviour to also allow the contents of these files to be embedded. This makes it possible to set a configuration without requiring a file system, or to add some ad-hoc options.
Signed-off-by: Erik Boasson <eb@ilities.com>
- Replace os_result by dds_retcode_t and move DDS return code defines down.
Eliminates the need to convert between different return code types.
- Move dds_time_t down and remove os_time.
Eliminates the need to convert between different time representations and
reduces code duplication.
- Remove use of Microsoft source-code annotation language (SAL).
SAL annotations are Microsoft specific and not very well documented. This
makes it very difficult for contributers to write.
- Rearrange the abstraction layer to be feature-based. The previous layout
falsely assumed that the operating system dictates which implementation is
best suited. For general purpose operating systems this is mostly true, but
embedded targets require a slightly different approach and may not even offer
all features. The new layout makes it possible to mix-and-match feature
implementations and allows for features to not be implemented at all.
- Replace the os prefix by ddsrt to avoid name collisions.
- Remove various portions of unused and unwanted code.
- Export thread names on all supported platforms.
- Return native thread identifier on POSIX compatible platforms.
- Add timed wait for condition variables that takes an absolute time.
- Remove system abstraction for errno. The os_getErrno and os_setErrno were
incorrect. Functions that might fail now simply return a DDS return code
instead.
- Remove thread-specific memory abstraction. os_threadMemGet and accompanying
functions were a mess and their use has been eliminated by other changes in
this commit.
- Replace attribute (re)defines by ddsrt_ prefixed equivalents to avoid name
collisions and problems with faulty __nonnull__ attributes.
Signed-off-by: Jeroen Koekkoek <jeroen@koekkoek.nl>
A writer blocking on a full WHC will still send out whatever it has
buffered but not sent yet. For this, the writer lock must be released,
but that means an ACK can sneak in between sending out the packet and
relocking the writer (not likely if there's a real network in between,
but over a loopback interface it is definitely possible).
Therefore, the amount of unacknowledged data that controls the blocking
and triggering of it must be refreshed before deciding to block,
otherwise it may hang indefinitely.
Signed-off-by: Erik Boasson <eb@ilities.com>
The writer tracks whether it is throttled because of a full WHC, but
does so by treating it as a simple flag. This is fine if there is at
most one thread blocked on any single writer at any time, but if there
are multiple threads using the same writer it would be possible for one
thread to be woken up, clear the flag, and so affect the wakeup of other
threads.
Turning it from a flag to a counter avoids that problem.
Signed-off-by: Erik Boasson <eb@ilities.com>
The "rhc" test runs a random sequence of operations (writes, reads, &c.)
through an RHC with conditions attached to it. All possible state masks
are used, and query conditions are tried with a condition that only
tests the key value, and one that tests attribute values. It depends on
the internal checking logic of the RHC, which is currently enabled only
in Debug builds because of the associated run-time overhead.
Signed-off-by: Erik Boasson <eb@ilities.com>
Various details:
- index replaced throughout by bitmask
- caching a single sample in RHC for deserialising samples when query conditions are used
- short-circuiting the trivial case of an instance matched neither before nor after a change to its state
- combining inc/dec counters into a single delta in condition evaluation
- major speed-up of rhc_check_counts by not checking the condition bit masks every single time
Signed-off-by: Erik Boasson <eb@ilities.com>
The specification says the data in an invalid sample must not be inspected, but that means retrieving the key values from a disposed/unregistered instance after a take is impossible. So, the elegant thing to do is to always provide the key values in the data. The remaining question is then whether the non-key fields should be left in whatever state they happened to be in, or be set to zero. The latter is more consistent in the interface, and that is almost invariably a good thing.
Signed-off-by: Erik Boasson <eb@ilities.com>
Whenever a sample or an instance is added, check it against the attached query conditions and indicate which ones match
Signed-off-by: Erik Boasson <eb@ilities.com>
This index can then be used as an index into a bitmap to keep track
which query conditions are matched by a sample or an instance.
Signed-off-by: Erik Boasson <eb@ilities.com>
The read condition and the query condition are represented by the same data type internally, and a read condition therefore has a "m_filter" attribute. It makes more sense to initialise this properly as part of the read condition, instead of initialisation-by-memset in the dds_create_readcond, then overwriting it in dds_create_querycond.
Signed-off-by: Erik Boasson <eb@ilities.com>
The RHC tracing produces so much junk that is hardly ever useful that a
normal trace should definitely not include it.
Signed-off-by: Erik Boasson <eb@ilities.com>
A read restricted to samples in "read" state would not enter the condition update code on the false assumption that no read conditions could become triggered if the number of read samples remained the same, but it is nonetheless possible that the instance was transitions from "new" to "old" as a consequence, at least in my interpretation of the spec and the current implementation of read() in Cyclone. This commit brings consistency to the implementation without the intention of confirming the current behaviour as being desirable.
Signed-off-by: Erik Boasson <eb@ilities.com>
The name change missed the uses of the macro, with the result that
datagram truncation on reception does not result in warning (but in
the default configuration, truncation cannot occur); and that the
message flags are undefined on sending datagrams, but judging by the
man page, the likelihood of this causing problems is also small in
practice.
Signed-off-by: Erik Boasson <eb@ilities.com>
Fallback to unicast should set options for unicast discovery (#104)
A very simple change that addresses a real usability issue, does not rely on any platform-specific changes and moreover builds cleanly on the source branch. So I am not going to wait until the AppVeyor build completes.
The default behaviour is to allow multicast, with a fallback of
disabling multicast altogether if the selected interface doesn't support
it. The trouble is that the default discovery configuration assumes that
multicast is available, avoiding the "well-known" port numbers and
avoiding sending any participant discovery messages via unicast. The
result is that the process will run in isolation, which is typically not
the desired result. (It is a quite annoying problem because it happens
on Linux when only a loopback interface is available. It appears that
multicast over loopback works fine, if only you try it, but the kernel
doesn't advertise it and so it doesn't get used.)
This commit changes two things: firstly, it this case it forces the
allocation of "well-known" unicast ports, and secondly, it automatically
adds the local interface's address to the discovery address set. This
way, at least communication inside the machine works.
(Note: AssumeMulticastCapable can be still be used to force it to treat
a Linux loopback interface has multicast capable. It is only the default
behaviour that has changed.)
With thanks to @jwcesign who did all the work except writing the commit
message.
Signed-off-by: Erik Boasson <eb@ilities.com>
One is reasonable (a difference between size_t and the type used for a blob in an iovec), the others are SOCKET-to-int conversions that are caused by the openssl API. Since I'm not going to fix openssl and every indication is that the conversion is safe in practice, silencing the compiler is a sensible thing to do.
Signed-off-by: Erik Boasson <eb@ilities.com>
A configuration setting to allow negotiating TLS1.2 is included. If I
got things done correctly, OpenSSL pre-1.1 can be used but requires
explicitly lowering the minimum allowed TLS version in the
configuration file.
Signed-off-by: Erik Boasson <eb@ilities.com>
With TLS1.3 the socket can indicate data is available even when there is no application data. This commit ignores a 0-byte read when no data is required. Long-term, handling short reads/writes in TCP mode need to be handled completely differently, but that is a story for another day.
Signed-off-by: Erik Boasson <eb@ilities.com>
OpenSSL support remains optional, but instead of including it or not solely depending on whether cmake manages to find it, there is now a DDSC_ENABLE_OPENSSL option that defaults to "ON". Setting to OFF removes the dependency. The Conan configuration has been updated to automatically provide openssl 1.1.1a.
Signed-off-by: Erik Boasson <eb@ilities.com>
Fixing that produces a lot of noise on stderr because of inappropriate use of the "info" category in various place and, on macOS, because of a rather stupid way of messing with thread scheduling priorities even when none have been specified explicitly in the configuration.
Signed-off-by: Erik Boasson <eb@ilities.com>
Getting a QoS from an entity is akin to reading, and all read/take operations reuse or free/reallocate memory to avoid memory leaks, and so it is a reasonable assumption that calling dds_get_qos repeatedly without intervening calls to dds_reset_qos would not leak any memory either. (This was actually an assumption in the builtin topics test.) Therefore, it is reasonable to first call dds_reset_qos in dds_get_qos. All operations in the API that yield or modify a QoS object result in a properly initialised one, therefore the input to dds_get_qos is necessarily initialised, and so this is safe.
Signed-off-by: Erik Boasson <eb@ilities.com>
Stopping and restarting the DDSI stack in a single process would not re-initialise the TCP support code properly
Signed-off-by: Erik Boasson <eb@ilities.com>
Listener/status management invocation was rather expensive, and
especially the cost of checking listeners, then setting status flags and
triggering waitsets ran into severe lock contention.
A major cost was the repeated use of dds_entity_lock and
dds_entity_unlock, these have been eliminated. Another cost was that
each time an event occurred (with DATA_AVAILABLE the most problematic
one) it would walk the chain of ancestors to see if any had a relevant
listener, and only if none of them had any, it would set the status
flags.
The locking/unlocking of the entity has been eliminated by moving the
listener/status flag manipulation from the general entity lock to its
m_observers_lock. That lock has a much smaller scope, and consequently
contention has been significantly reduced.
Instead of walking the entity hierarchy looking for listeners, an entity
now inherits the ancestors' listeners. The set_listener operation has
been made a little more complicated by the need to not only set the
listeners for the specified entity, but to also update any inherited
listeners its descendants.
The commit is a bit larger than strictly needed ... I've started
reformatting the code to reduce the variety of styles ... as there I
haven't been able to find a single tool that does what I want, it may
well end up as manual work.
Signed-off-by: Erik Boasson <eb@ilities.com>
dds_init generates an entity name for the participant derived from the
process name and process id, but the name is currently uninitialised
(temporarily so, caused by a recent update to the OS abstraction layer).
Signed-off-by: Erik Boasson <eb@ilities.com>