Commit graph

29 commits

Author SHA1 Message Date
Erik Boasson
59459b9b8b Change PrismTech references to Adlink
Signed-off-by: Erik Boasson <eb@ilities.com>
2020-03-18 17:31:20 +01:00
Erik Boasson
ad19f571ae Rename nn_plist, xqos to ddsi_plist, xqos
This was already leaking out in the interface, so this name change was
needed too.  The relationship between plist and xqos is so intimate that
renaming the one but not the other made no sense.

Signed-off-by: Erik Boasson <eb@ilities.com>
2020-02-11 23:26:01 +01:00
Erik Boasson
08c9db0934 Rework plist/qos printing, diffing and logging
* Use the parameter tables to pretty-print QoS and plist, rather than a
  hard-coded function supporting only the QoS.

* Support diffing two plists: a single table-driven function can handle
  both nn_plist_t and ddsi_qos_t, and it removes the discrepancy between
  the two types.

* Log the content of discovery samples in the trace rather than merely
  printing "(plist)".

Signed-off-by: Erik Boasson <eb@ilities.com>
2020-02-11 23:26:01 +01:00
Dennis Potman
231cb8c9f7 Deadline Missed QoS implementation
This commit contains the implementation of the deadline QoS
for readers and writers. The DDS specification (section 2.2.3.7)
describes this QoS as follows:

"This policy is useful for cases where a Topic is expected to
have each instance updated periodically. On the publishing side this
setting establishes a contract that the application must meet.
On the subscribing side the setting establishes a minimum
requirement for the remote publishers that are expected to supply
the data values."

On the writer side, the deadline missed event also needs to trigger when
only local readers exist. To achieve this, the implementation temporarily
inserts the sample into the writer history cache (whc) so that an instance
is created there; immediately after insertion, the sample is removed again.
Creating the instance also creates the deadline missed event, which takes
care of triggering the deadline missed callback when required. If the
instance already existed, the timer of the event is renewed.

To verify the changes to the writer history cache, an additional test is
added that checks the whc state. This test checks the state of the whc
after a writer with specific combinations of QoS settings has written
samples. The whc state is checked for the stored samples (min/max
sequence number) and for the absence of unacked data, after writing
samples and waiting for acks from the local and/or remote readers (which
is also a parameter of this test). The test is introduced as part of the
deadline implementation, but its scope is wider than just the changes
made in the whc implementation for the deadline QoS.

This test showed that, even before deadline support was added,
whc_default_remove_acked_messages_full did not mark data as acked in the
transient-local keep-all case, so data in the whc never reached the acked
state. This has been fixed as well.

Signed-off-by: Dennis Potman <dennis.potman@adlinktech.com>
2020-01-17 14:35:07 +01:00
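
As a hedged illustration of the deadline QoS this commit implements, the
sketch below sets a 1-second deadline on a writer and installs an
offered-deadline-missed listener via the public Cyclone DDS C API
(dds_qset_deadline, dds_lset_offered_deadline_missed); the Msg type and
Msg_desc descriptor are placeholders assumed to come from IDL.

  #include <stdint.h>
  #include <stdio.h>
  #include "dds/dds.h"

  /* Hypothetical IDL-generated type and descriptor, for illustration only. */
  struct Msg { int32_t id; };
  extern const dds_topic_descriptor_t Msg_desc;

  static void on_deadline_missed (dds_entity_t writer,
                                  const dds_offered_deadline_missed_status_t status,
                                  void *arg)
  {
    (void) writer; (void) arg;
    printf ("offered deadline missed (total %u)\n", (unsigned) status.total_count);
  }

  int main (void)
  {
    dds_entity_t pp = dds_create_participant (DDS_DOMAIN_DEFAULT, NULL, NULL);
    dds_entity_t tp = dds_create_topic (pp, &Msg_desc, "DeadlineExample", NULL, NULL);

    /* Promise to update every instance at least once per second. */
    dds_qos_t *qos = dds_create_qos ();
    dds_qset_deadline (qos, DDS_SECS (1));

    /* Get notified when that promise is broken, also with only local readers. */
    dds_listener_t *listener = dds_create_listener (NULL);
    dds_lset_offered_deadline_missed (listener, on_deadline_missed);
    dds_entity_t wr = dds_create_writer (pp, tp, qos, listener);

    struct Msg sample = { 1 };
    (void) dds_write (wr, &sample);   /* creates the instance ... */
    dds_sleepfor (DDS_SECS (3));      /* ... then let its deadline expire */

    dds_delete_listener (listener);
    dds_delete_qos (qos);
    dds_delete (pp);
    return 0;
  }
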
Dennis Potman
ef0f4c2ae7 Lifespan QoS implementation
This commit makes it possible to specify, when writing samples, a duration
for which the data remains valid. After this duration, samples are dropped
from the reader and writer history caches. See section 2.2.3.16 of the DDS
specification for more details on this QoS.

The expiration of samples in the reader history cache is calculated
based on the reception timestamp of the sample and uses the monotonic
clock. As a result, the current implementation does not rely on clock
synchronisation between reader and writer. There may be reasons to
change this behavior in future and use the source timestamp instead.

Signed-off-by: Dennis Potman <dennis.potman@adlinktech.com>
2019-12-17 13:02:28 +01:00
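
A minimal sketch of using the lifespan QoS through the public C API
(dds_qset_lifespan); the Msg type and descriptor are placeholders assumed
to come from IDL.

  #include <stdint.h>
  #include "dds/dds.h"

  struct Msg { int32_t id; };                    /* hypothetical IDL-generated type */
  extern const dds_topic_descriptor_t Msg_desc;  /* hypothetical generated descriptor */

  int main (void)
  {
    dds_entity_t pp = dds_create_participant (DDS_DOMAIN_DEFAULT, NULL, NULL);
    dds_entity_t tp = dds_create_topic (pp, &Msg_desc, "LifespanExample", NULL, NULL);

    /* Samples written by this writer expire 500 ms after they are written;
       expired samples are dropped from the writer and reader history caches. */
    dds_qos_t *qos = dds_create_qos ();
    dds_qset_lifespan (qos, DDS_MSECS (500));
    dds_entity_t wr = dds_create_writer (pp, tp, qos, NULL);

    struct Msg sample = { 1 };
    (void) dds_write (wr, &sample);

    dds_delete_qos (qos);
    dds_delete (pp);
    return 0;
  }
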
Dan Rose
c5b22bf629 Fix most of the validation problems
Signed-off-by: Dan Rose <dan@digilabs.io>
2019-11-06 20:39:20 +01:00
Dan Rose
ae1a8130c7 Namespace the schema and add references in xml files
Signed-off-by: Dan Rose <dan@digilabs.io>
2019-11-06 20:39:20 +01:00
Erik Boasson
389d6b1789 Remove repetitive debug print from "mpt_qosmatch"
Signed-off-by: Erik Boasson <eb@ilities.com>
2019-11-06 14:17:40 +01:00
Thijs Sassen
36b1b9da3d Adjusted mpt qos test includes to be in line with other tests
Signed-off-by: Thijs Sassen <thijs.sassen@adlinktech.com>
2019-10-22 16:27:15 +02:00
Erik Boasson
94483e3371 Address Coverity, Clang static analyzer warnings
* Fix type of num reliable readers (int to int32_t)

* Conversion codes in debug monitor printf formats

* Dead code elimination

* Skipping a test case where SIZE_MAX is assumed > INT32_MAX if
  assumption is false on target platform

* Error handling in os_sockWaitsetNew

* Stick to unsigned in fragment size calculations

  This check is actually guarded by valid_DataFrag and was safe for
  datagrams up to 2GB, but the unintended and implicit conversion is
  still best eliminated.

* A "server" connection never has an invalid socket in TCP wrapper

* Handle error return from gethostname in SPDP write (CID 248183)

* Handle extended retcodes in dds_strretcode

  CID 248131, introduced by 19aec98b8a

* Remove dead code in ddsrt logging test (CID 248195)

* Validate command-line argument in process test (CID 248117)

* Allow for extremely delayed store in test

  Test is constructed to have the events trigger only at the appropriate
  times, but it does assume that the store to cb_called becomes visible
  prior to the listener callback.  I'm pretty sure that will always be
  the case in practice, but I'm also pretty sure there is no formal
  guarantee without a memory barrier, which mutex_unlock provides.

  CID 248088, 248136, 248177, 253590, 253591, 253593

* Check unsetenv return value in test (CID 248099)

Signed-off-by: Erik Boasson <eb@ilities.com>
2019-09-25 10:46:40 +02:00
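
The reasoning in the "extremely delayed store" item can be sketched as
follows; the variable and function names below are placeholders rather
than the actual test code, and the point is simply that the
release/acquire semantics of the ddsrt mutex make the store formally
visible to the listener callback.

  #include <stdint.h>
  #include "dds/ddsrt/sync.h"

  static ddsrt_mutex_t lock;     /* assumed initialised with ddsrt_mutex_init */
  static uint32_t cb_called;

  /* Test thread: reset the counter, then trigger the event that will fire
     the listener.  The unlock acts as a release barrier for the store. */
  static void arm_expectation (void)
  {
    ddsrt_mutex_lock (&lock);
    cb_called = 0;
    ddsrt_mutex_unlock (&lock);
    /* ... trigger the event here ... */
  }

  /* Listener callback (other thread): the lock pairs with the unlock above,
     so the store to cb_called is guaranteed to be visible here. */
  static void on_event (void)
  {
    ddsrt_mutex_lock (&lock);
    cb_called++;
    ddsrt_mutex_unlock (&lock);
  }
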
Erik Boasson
9b9a07a8e5 Test delivery under discovery stress
This commit adds a test that checks that all data published by (reliable)
volatile writers is delivered to readers that existed at the time of
writer creation, while continuously creating and deleting these writers
and relying on the writer lingering long enough for the data to actually
be delivered.  It also verifies that the match counts end up at 0.

By also semi-continuously creating and deleting an additional reader in
the subscribing process, all three delivery paths (steady state, proxy
writer deletion or matching in progress, and delivery of data to
out-of-sync readers) are hit, as verified by a coverage test.

Signed-off-by: Erik Boasson <eb@ilities.com>
2019-09-03 12:23:50 +02:00
Erik Boasson
bf095a32d4 Parenthesize macro arguments in MPT_ASSERT
Signed-off-by: Erik Boasson <eb@ilities.com>
2019-09-03 12:23:50 +02:00
Erik Boasson
711026114b Add multi-domain version of topic_data test
This is the same test as the already existing one for verifying that
set_qos on topic_data is reflected in the DCPSPublication and
DCPSSubscription topics, but it uses two threads in two domains.  A barrier is
used to ensure the threads in the two processes indeed execute
concurrently, and different topic data sequences are used to increase
the likelihood of detecting leakage from one domain into the other.

Signed-off-by: Erik Boasson <eb@ilities.com>
2019-08-21 14:16:51 +02:00
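
A small sketch of the operation this test exercises, assuming an existing
topic entity; dds_qset_topicdata and dds_set_qos are public C API calls,
while the helper name is made up.

  #include <string.h>
  #include "dds/dds.h"

  /* Update the (mutable) topic_data QoS on an existing topic; the new value
     should subsequently show up in the DCPSPublication/DCPSSubscription
     samples for the endpoints of this topic. */
  dds_return_t update_topic_data (dds_entity_t topic, const char *blob)
  {
    dds_qos_t *qos = dds_create_qos ();
    dds_qset_topicdata (qos, blob, strlen (blob));
    dds_return_t rc = dds_set_qos (topic, qos);
    dds_delete_qos (qos);
    return rc;
  }
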
Erik Boasson
966ec0dda7 Make logging config per-domain
Signed-off-by: Erik Boasson <eb@ilities.com>
2019-08-21 14:16:51 +02:00
Erik Boasson
d494ba4eda Make qos MPTs more robust
Signed-off-by: Erik Boasson <eb@ilities.com>
2019-08-06 08:32:14 +02:00
Erik Boasson
9cf4b97f1a Reorganize repository
* Move the project top-level CMakeLists.txt to the root of the project;
  this allows building Cyclone as part of ROS2 without any special
  tricks;

* Clean up the build options:

  ENABLE_SSL:    whether to check for and include OpenSSL support if a
                 library can be found (default = ON); this used to be
                 called DDSC_ENABLE_OPENSSL, the old name is deprecated
                 but still works
  BUILD_DOCS:    whether to build docs (default = OFF)
  BUILD_TESTING: whether to build the tests (default = OFF)

* Collect all documentation into top-level "docs" directory;

* Move the examples to the top-level directory;

* Remove the unused and somewhat misleading pseudo-default
  cyclonedds.xml;

* Remove unused cmake files

Signed-off-by: Erik Boasson <eb@ilities.com>
2019-07-30 10:52:44 +02:00
Erik Boasson
4cdcff8be7 Combine {user,group,topic}_data set_qos tests
Signed-off-by: Erik Boasson <eb@ilities.com>
2019-06-28 12:47:27 +02:00
Erik Boasson
c6c5a872eb Trivial changes for thread sanitizer
Thread sanitizer warns about reads and writes of variables that are
meant to be read without holding a lock:

* Global "keep_going" is now a ddsrt_atomic_uint32_t

* Thread "vtime" is now a ddsrt_atomic_uint32_t

Previously the code relied on the assumption that a 32-bit int would be
treated as atomic; now that is all wrapped in ddsrt_atomic_{ld,st}32.
These being inline functions doing exactly the same thing, there is no
functional change, but it does allow annotating the loads and stores via
function attributes on the ddsrt_atomic_{ld,st}X.

The concurrent hashtable implementation is replaced by a locked version
of the non-concurrent implementation if thread sanitizer is used.  This
change eliminates the scores of problems signalled by thread sanitizer
in the GUID-to-entity translation and the key-to-instance-id lookups.

Other than that, this replaces a flag used in a waitset test case to be
a ddsrt_atomic_uint32_t.

Signed-off-by: Erik Boasson <eb@ilities.com>
2019-06-28 12:47:27 +02:00
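
A sketch of the pattern described above, assuming the ddsrt atomics API
(ddsrt_atomic_uint32_t, ddsrt_atomic_ld32/st32); the keep_going name
mirrors the commit message, the rest is illustrative.

  #include "dds/ddsrt/atomics.h"

  /* Previously a plain 32-bit int read without holding a lock; now an
     atomic, so thread sanitizer sees annotated loads and stores. */
  static ddsrt_atomic_uint32_t keep_going = DDSRT_ATOMIC_UINT32_INIT (1);

  void worker_loop (void)
  {
    while (ddsrt_atomic_ld32 (&keep_going))
    {
      /* ... do some work ... */
    }
  }

  void request_stop (void)
  {
    ddsrt_atomic_st32 (&keep_going, 0);
  }
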
Erik Boasson
32b683bf37 Enable "missing prototypes" warning for gcc, clang
Missing prototypes for exported functions cause a really huge issue on
Windows.  Enabling the "missing prototypes" warning makes it much easier
to catch this problem.  Naturally, any warnings caused by this have been
fixed.

Signed-off-by: Erik Boasson <eb@ilities.com>
2019-06-13 12:54:35 +02:00
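
As a hedged illustration of the kind of problem the warning catches (the
names and the DDS_EXPORT usage below are illustrative): an exported
function whose defining translation unit never sees its prototype compiles
silently without -Wmissing-prototypes, and on Windows that mismatch tends
to surface as a missing or inconsistent export.

  /* In the public header (illustrative): */
  #include "dds/export.h"
  DDS_EXPORT int foo_frobnicate (int x);

  /* In the .c file: if the header above is not included there, the
     definition has no visible prototype; -Wmissing-prototypes makes
     gcc/clang flag it before it becomes a link/export problem. */
  int foo_frobnicate (int x)
  {
    return x + 1;
  }
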
Erik Boasson
a4d8aba4f9 Add limited support for QoS changes
This commit adds support for changing all mutable QoS except those that
affect reader/writer matching (i.e., deadline, latency budget and
partition).  This is simply because the recalculation of the matches
hasn't been implemented yet, it is not a fundamental limitation.

Implementing this basically forced fixing up a bunch of inconsistencies
in handling QoS in entity creation.  A silly multi-process ping-pong
test built on changing the value of user data has been added.

Signed-off-by: Erik Boasson <eb@ilities.com>
2019-06-13 12:54:35 +02:00
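
A sketch of a change this commit supports, using the public C API
(dds_qset_userdata, dds_set_qos); the helper name is made up.  Mutable
policies such as user_data can now be changed on a live entity, while
deadline, latency budget and partition are still refused because match
recalculation is not implemented yet.

  #include <string.h>
  #include "dds/dds.h"

  /* Change the user_data of an existing writer at runtime. */
  dds_return_t set_user_data (dds_entity_t writer, const char *value)
  {
    dds_qos_t *qos = dds_create_qos ();
    dds_qset_userdata (qos, value, strlen (value));
    dds_return_t rc = dds_set_qos (writer, qos);
    dds_delete_qos (qos);
    return rc;
  }
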
Erik Boasson
3322fc086d Table-driven parameter list handling
The old parameter list parsing was a mess of custom code with tons of
duplicated checks, even though parameter list parsing really is a fairly
straightforward affair.  This commit changes it to a mostly table-driven
implementation, where the vast majority of the settings are handled by a
generic deserializer and the irregular ones (like reliability, locators)
are handled by custom functions.  The crazy ones (IPv4 address and port)
rely on additional state and are completely special-cased.

Given these tables, the serialization, finalisation, validation,
merging, unalias'ing can all be handled by a very small amount of custom
code and an appropriately defined generic function for the common cases.
This also makes it possible to have all QoS validation in place, and so
removes the need for the specialized implementations for the various
entity kinds in the upper layer.

QoS settings inapplicable to an entity were previously ignored, allowing one to
have invalid values set in a QoS object when creating an entity,
provided that the invalid values are irrelevant to that entity.  Whether
this is a good thing or not is debatable, but certainly it is a good
thing to avoid copying in inapplicable QoS settings.  That in turn means
the behaviour of the API can remain the same.

It does turn out that the code used to return "inconsistent QoS" also
for invalid values.  That has now been rectified, and it returns
"inconsistent QoS" or "bad parameter" as appropriate.  Tests have been
updated accordingly.

Signed-off-by: Erik Boasson <eb@ilities.com>
2019-06-10 10:45:53 +02:00
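
A much-simplified sketch of what such a table-driven approach can look
like; the names and fields below are illustrative only and do not match
the actual ddsi types.

  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>

  /* One entry per parameter: where it lives in the plist/qos struct and how
     to handle it.  A NULL handler means the generic deserializer for the
     field's type is used; reliability, locators, etc. get custom ones. */
  struct param_desc {
    uint16_t pid;            /* parameter id on the wire */
    size_t   offset;         /* offset of the field in the destination struct */
    bool   (*deser) (void *dst, const void *src, size_t srcsize);
    bool   (*validate) (const void *field);
  };

  /* Serialization, validation, merging and unaliasing can then all walk the
     same table instead of duplicating per-parameter code. */
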
Erik Boasson
19aec98b8a Clean up return code types
* Remove dds_return_t / dds_retcode_t distinction (now there is only
  dds_return_t and all error codes are always negative)

* Remove Q_ERR_... error codes and replace them by DDS_RETCODE_...
  ones so that there is only one set of error codes

* Replace a whole bunch "int" return types that were used to return
  Q_ERR_... codes by "dds_return_t" return types

Signed-off-by: Erik Boasson <eb@ilities.com>
2019-06-10 10:42:52 +02:00
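
With this cleanup every failure is a negative dds_return_t and
dds_strretcode maps it to text; a small usage sketch (the helper name is
made up):

  #include <stdio.h>
  #include "dds/dds.h"

  void check (dds_return_t rc, const char *what)
  {
    /* Success is DDS_RETCODE_OK (0) or a non-negative value (e.g. a handle
       or a count); every error code is negative after this change. */
    if (rc < 0)
      fprintf (stderr, "%s failed: %s\n", what, dds_strretcode (rc));
  }
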
Erik Boasson
a652ecb78e ensure delivery of writes immediately following pub match event (#165)
A long-standing bug of Cyclone is that a sample written immediately
after a publication-matched event may never arrive at the reader that
was just matched.  This happened because the reader need not have
completed discovery of the writer by the time the writer discovers the
reader, at which point the reader ignores the sample because it either
doesn't know the writer at all, or it hasn't yet seen a Heartbeat from
it.

That Heartbeat arrives shortly after, but by then it is too late: the
reader then decides to accept the next sample to be written by the
writer.  (It has no choice, really: either you risk losing some data, or
you will be requesting all historical data, which is emphatically not
what a volatile reader is about ...)

A related issue is the handling of historical data for transient-local
readers: it used to deliver this out-of-order, but that is firstly
against the specification, and secondly, against reasonable expectations
of those who use DDS as a mere publish-subscribe messaging system.  To
add insult to injury, it didn't completely handle some reordering issues
with disposes ...

This commit changes the way writers respond to a request for
retransmission from volatile proxy readers and the way the
in-sync/out-of-sync setting of a reader with respect to a proxy-writer
is used.  The first makes it safe for a Cyclone reader to ask a Cyclone
writer for all data (as these details are not covered in the specs, it
errs on the reasonable side for other vendors, but that may cause the
data loss mentioned above): the writer simply sends a Gap message to the
reader for all the sequence numbers prior to the matching.

The second changes the rule for switching from out-of-sync to in-sync:
that transition is now simply once the next sequence number to be
delivered to the reader equals the next sequence number that will be
delivered directly from the proxy writer object to all readers.  (I.e.,
a much more intuitive notion than reaching some seemingly arbitrary
sequence number.)

To avoid duplicates, the rule for delivery straight from a proxy writer
has changed: where samples were delivered from the proxy writer to all
matching readers, they are now delivered only to the matching readers
that are in-sync.  To avoid ordering problems, the idea that historical
data can be delivered through the asynchronous delivery path even when
the regular data goes through the synchronous delivery path has been
abandoned.  All data now always follows the same path.

As these same mechanisms are used for getting historical data into
transient-local readers, the ordering problem for the historical data
also disappeared.

The test stuff in src/core/xtests/initsampledeliv covers a lot of the
interesting cases: data published before the existence of a reader, after
it, mixes of volatile and transient-local.  Running them takes quite a
bit of time, and they are not yet integrated in the CI builds (if ever,
because of that time).

Note: the "conservative built-in startup" option has been removed,
because it really makes no sense to keep a vague compatibility option
added a decade ago "just in case" that has never been used ...

Note: the workaround in the src/mpt/tests/basic/procs/hello.c (use
transient-local to ensure delivery of data) has been removed, as has
been its workaround for the already-fixed #146.

Signed-off-by: Erik Boasson <eb@ilities.com>
2019-05-29 13:20:37 +02:00
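
The scenario this commit makes reliable can be sketched from the
application side (the Msg type is a hypothetical IDL-generated one;
dds_get_publication_matched_status and dds_write are public API): a sample
written as soon as the publication-matched status shows a reader used to
be at risk of never arriving, and now is delivered.

  #include <stdint.h>
  #include "dds/dds.h"

  struct Msg { int32_t id; };   /* hypothetical IDL-generated type */

  void write_as_soon_as_matched (dds_entity_t writer)
  {
    /* Wait until at least one reader has been matched ... */
    dds_publication_matched_status_t status = { 0 };
    do {
      (void) dds_get_publication_matched_status (writer, &status);
      dds_sleepfor (DDS_MSECS (10));
    } while (status.current_count == 0);

    /* ... and write immediately.  Before this fix the reader could still
       miss this sample if it had not yet completed its own discovery of
       the writer; with the Gap-based handling it is now delivered. */
    struct Msg sample = { 1 };
    (void) dds_write (writer, &sample);
  }
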
Erik Boasson
e822dba9c1 MPT basic: append config, don't hardcode network
If a firewall blocks traffic over one network but not another, and the
one happens to be the default pick of Cyclone, then the MPT basic tests
won't work properly.

With the recent changes to the configuration handling for supporting
multiple sources of configuration, it makes far more sense to remove the
hardcoded network interface selection in the test configurations and to
append the test configuration file to the environment list rather than
to replace it.

Signed-off-by: Erik Boasson <eb@ilities.com>
2019-05-29 13:20:37 +02:00
Erik Boasson
4fc56c5cbe MPT driver incorrectly reporting a waitpid error
The multi-process test driver program waits until all its children have
stopped (or a timeout occurs) by calling ddsrt_proc_waitpids in a loop.  The way
the loop is written, it can try once more when all children have already
been waited for, in which case it reports a spurious error.

This fixes that problem by checking whether the error code was to be
expected.

Signed-off-by: Erik Boasson <eb@ilities.com>
2019-05-29 13:20:37 +02:00
Erik Boasson
62ccd9f7da Move CycloneDDS/DDSI2E/* to CycloneDDS/* in config
The entirely historical "DDSI2E" element within the CycloneDDS
configuration element is herewith eliminated.  All settings contained in
that element (such as General, Discovery, Tracing) are now subelements
of the CycloneDDS top-level element.  Old configurations continue to
work but will print a deprecation warning:

  //CycloneDDS/DDSI2E: settings moved to //CycloneDDS

Any warnings/errors related to an element //CycloneDDS/DDSI2E/x will be
reported as errors for the new location, that is, for //CycloneDDS/x.
As the "settings moved" warning always precedes any other such warning,
confusion will hopefully be avoided.

Signed-off-by: Erik Boasson <eb@ilities.com>
2019-05-05 20:46:50 +08:00
Martin Bremmer
74ca68e550 Improved mpt default timeout.
Signed-off-by: Martin Bremmer <martin.bremmer@adlinktech.com>
2019-04-24 15:00:37 +02:00
Martin Bremmer
973ae87e17 Moved expand_envvars.
Signed-off-by: Martin Bremmer <martin.bremmer@adlinktech.com>
2019-04-24 15:00:37 +02:00
Martin Bremmer
17f9c361ea Multi Process Testing framework
Signed-off-by: Martin Bremmer <martin.bremmer@adlinktech.com>
2019-04-24 14:46:46 +02:00