cyclonedds

Author	SHA1	Message	Date
Erik Boasson	d693d8eac9	limit WHC, serdata, xmsg freelist memory use (#168 ) High sample rates require rather high rates of allocating and freeing WHC nodes, serialised samples (serdata), and RTPS message fragments (xmsg). A bunch of dedicated parallel allocators help take some pressure off the regular malloc/free calls. However, these used to gobble up memory like crazy, in part because of rather generous limits, and in part because there was no restriction on the size of the samples that would be cached, and it could end up caching large numbers of multi-MB samples. It should be noted that there is no benefit to caching large samples anyway, because the sample rate will be that much lower. This commit reduces the maximum number of entries for all three cases, it furthermore limits the maximum size of a serdata or xmsg that can be cached, and finally instead of instantiating a separate allocator for WHC nodes per WHC, it now shares one across all WHCs. Total memory use should now be limited to a couple of MB. The caching can be disabled by setting ``FREELIST_TYPE`` to ``FREELIST_NONE`` in ``q_freelist.h``. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-05-02 20:53:20 +08:00
Jeroen Koekkoek	c9d827e420	Fix warnings related to fixed type integers Signed-off-by: Jeroen Koekkoek <jeroen@koekkoek.nl>	2019-04-29 19:22:11 +02:00
Erik Boasson	b686ba858c	make internal header files more C++ friendly Generally one doesn't need to include any internal header files in an application, but the (unstable) interface for application-defined sample representation and serialization does require including some. It turns out a keyword clash had to be resolved (typename => type_name) and that a whole bunch of them were missing the #ifdef __cplusplus / extern "C" bit. It further turned out that one had to pull in nearly all of the type definitions, including some typedefs that are illegal in C++, e.g., typedef struct os_sockWaitset *os_sockWaitset; C++ is right to forbid this, but Cyclone's header files were wrong to force inclusion of so much irrelevant stuff. This commit leaves these typedefs in place, but eliminates a few header file inclusions to avoid the problem. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-29 11:15:41 +02:00
Erik Boasson	d146716d1d	remove Lease element from test config The element has long been meaningless and got deprecated in commit `c3dca32a2f`. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-24 14:09:30 +02:00
Erik Boasson	06245d0d4a	initial version of permance/network check tool The current situation for performance measurements and checking network behaviour is rather unsatisfactory, as the only tools available are ``pubsub`` and the ``roundtrip`` and ``throughput`` examples. The first can do many things thanks to its thousand-and-one options, but its purpose really is to be able to read/write arbitrary data with arbitrary QoS -- though the arbitrary data bit was lost in the hacked conversion from the original code. The latter two have a terrible user interface, don't perform any verification that the measurement was successful and do not provide the results in a convenient form. Furthermore, the abuse of the two examples as the primary means for measuring performance has resulted in a reduction of their value as an example, e.g., they can do waitset- or listener-based reading (and the throughput one also polling-based), but that kind of complication does not help a new user understand what is going on. Especially not given that these features were simply hacked in. Hence the need for a new tool, one that integrates the common measurements and can be used to verify that the results make sense. It is not quite done yet, in particular it is lacking in a number of aspects: * no measurement of CPU- and network load, memory usage and context switches yet; * very limited statistics (min/max/average, if you're lucky; no interesting things such as jitter on a throughput test yet); * it can't yet gather the data from all participants in the network using DDS; * it doesn't output the data in a convenient file format yet; * it doesn't allow specifying boundaries within which the results must fall for the run to be successful. What it does verify is that all the endpoint matches that should exist given the discovered participant do in fact come into existence, reporting an error (and exiting with an exit status code of 1) if they don't, as well as checking the number of participants. With the way the DDSI protocol works, this is a pretty decent network connectivity check. The raw measurements needed for the desired statistics (apart from system-level measurements) are pretty much made, so the main thing that still needs to be done is exploit them and output them. It can already replace the examples for most benchmarks (only the 50%/90%/99% percentiles are still missing for a complete replacement). Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-24 14:09:30 +02:00
Erik Boasson	46f61e09f5	missing m_observer_lock on (re)setting statuses Most of the places where the status flags were reset, this happened without holding m_observer_lock protecting these status flags. For most of these statuses, they are only ever set/reset while also holding the entity lock, but this is not true for all of them (DATA_AVAILABLE for example), and thus there are some cases where retrieving the status could lead to losing the raising of a (at least a DATA_AVAILABLE) status. The problem was introduced in `ba46cb1140`. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-24 14:09:30 +02:00
Erik Boasson	1a3d5c7aba	Fix DATA_AVAILABLE race condition The DATA_AVAILABLE status was reset by read and take while holding the upper-layer reader lock, but after completing the read/take operation on the RHC. As data can be written into the RHC without holding the upper-layer reader lock, new data could arrive in between the reading/taking and the resetting of the DATA_AVAILABLE status, leading to a missed detection. Resetting DATA_AVAILABLE prior to accessing the RHC solves this. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-24 14:09:30 +02:00
Erik Boasson	1ecad3c047	remove "Error occurred on locking entity" messages Those should not be printed to stderr (or wherever), there are errors returned in these cases ... Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-24 14:09:30 +02:00
Erik Boasson	1672268481	always append 0 byte to user/group/topic data Changes the semantics of dds_qget_{user,group,topic}data to always append a 0 byte to any non-empty value without counting it in the size. (An empty value is always represented by a null pointer and a size of 0). The advantage is that any code treating the data as the octet sequence it formally is will do exactly the same, but any code written with the knowledge that it should be a string can safely interpret it as one. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-24 14:09:30 +02:00
Erik Boasson	6c171a890d	move util library into ddsrt As was the plan with the introduction of ddsrt; this includes renaming the identifiers to match the capitalization style and removes old junk. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-24 14:09:30 +02:00
Erik Boasson	e965df5db7	add participant instance handle to builtin topics Extend the endpoint built-in topic data with the participant instance handle (the GUID was already present). Having the instance handle available makes it trivial to look up the participant, whereas a lookup of the GUID is rather impractical. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-24 14:09:30 +02:00
Erik Boasson	5735b5775d	add setter for partition QoS for a single name This adds dds_qset_partition1 as a convenience function to set the partition QoS to a single name. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-24 14:09:30 +02:00
Erik Boasson	4778d6c5df	add QoS to ignore local readers/writers (#78 ) Adds a new "ignorelocal" QoS to the readers/writers to ignore local matching readers/writers, with three settings: * DDS_IGNORELOCAL_NONE: default * DDS_IGNORELOCAL_PARTICIPANT: ignores readers/writers in the same participant * DDS_IGNORELOCAL_PROCESS: ignores readers/writers in the same process These can be set/got using dds_qset_ignorelocal and dds_qget_ignorelocal. If a matching reader or writer is ignored because of this setting, it is as-if that reader or writer doesn't exist. No traffic will be generated or data retained on its behalf. There are no consequences for interoperability as this is (by definition) a local affair. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-24 14:09:30 +02:00
Erik Boasson	a6b5229510	crash invoking data available on built-in reader The DDSI reader/writer pointers are now returned as out parameters instead of as a return value, so that the upper-layer reference is set before any listener can be invoked. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-24 14:09:30 +02:00
Erik Boasson	0202039f61	remove dds_rhc_fini abomination It was called strangely early in the deleting of the reader, even before the DDSI reader was no longer being accessed by other threads. The immediate and obvious problem is that it resets the pointer to the upper-layer entity even though this can still be dereferenced in invoking a listener, resulting in a crash. Secondly it blocks until there are no listener calls any more (and the resetting of that pointer will prevent any further listener invocations), but a similar piece of logic is already in generic entity code that resets the mask and then waits for all listener invocations to complete. Having both is a problem. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-24 14:09:30 +02:00
Erik Boasson	d6edfada81	fix deadlock between listener, deleting reader, &c If a (proxy) writer delivers data to a reader that has a data_available listener calling read/take while that reader is being deleted, blocked in set_listener waiting for the listeners to complete, then a deadlock can occur: * listener calling read/take then attempt to lock reader; * deleting the reader locks the reader, then waits for the listeners to complete while holding the lock This commits unlocks the reader before waiting for the listeners to complete. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-24 14:09:30 +02:00
Erik Boasson	2dd20c4273	add dds_entity_release counterpart to entity_claim Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-24 14:09:30 +02:00
Erik Boasson	6227fe00b3	eliminate clang static analyzer false positive Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-21 16:05:06 +02:00
Erik Boasson	c3dca32a2f	nestable calls to thread_[state_]awake Remove all the "if asleep then awake ..." stuff from the code by making awake/asleep calls nestable, whereas before it "awake ; awake" really meant a transition through "asleep". This self-evidently necessitates fixing those places where the old behaviour was relied on upon, but fortunately those are few. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-21 16:05:06 +02:00
Erik Boasson	9b3a71e1ab	lift limits on handle allocation and reuse (#95 ) The old entity handle mechanism suffered from a number of problems, the most terrible one being that it would only ever allocate 1000 handles (not even have at most 1000 in use at the same time). Secondarily, it was protected by a single mutex that actually does show up as a limiting factor in, say, a polling-based throughput test with small messages. Thirdly, it tried to provide for various use cases that don't exist in practice but add complexity and overhead. This commit totally rewrites the mechanism, by replacing the old array with a hash table and allowing a near-arbitrary number of handles as well as reuse of handles. It also removes the entity "kind" bits in the most significant bits of the handles, because they only resulted in incorrect checking of argument validity. All that is taken out, but there is still more cleaning up to be done. It furthermore removes an indirection in the handle-to-entity lookup by embedding the "dds_handle_link" structure in the entity. Handle allocation is randomized to avoid the have a high probability of quickly finding an available handle (the total number of handles is limited to a number much smaller than the domain from which they are allocated). The likelihood of handle reuse is still dependent on the number of allocated handles -- the fewer handles there are, the longer the expected time to reuse. Non-randomized handles would give a few guarantees more, though. It moreover moves the code from the "util" to the "core/ddsc" component, because it really is only used for entities, and besides the new implementation relies on the deferred freeing (a.k.a. garbage collection mechanism) implemented in the core. The actual handle management has two variants, selectable with a macro: the preferred embodiment uses a concurrent hash table, the actually used one performs all operations inside a single mutex and uses a non-concurrent version of the hash table. The reason the less-predeferred embodiment is used is that the concurrent version requires the freeing of entity objects to be deferred (much like the GUID-to-entity hash tables in DDSI function, or indeed the key value to instance handle mapping). That is a fair bit of work, and the non-concurrent version is a reasonable intermediate step. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-21 16:05:06 +02:00
Erik Boasson	dd9aceb713	small performance improvement in RHC The introduction of properly functioning query conditions adds some overhead, this commit removes some of that cost by avoiding some calls to update_conditions when there are no query conditions. It also removes the has_changed field from the instance, instead using a local boolean to track whether DATA_AVAILABLE should be raised or not. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-21 16:05:06 +02:00
Erik Boasson	671e73ec98	set DATA_AVAILABLE when deleting writer (#148 ) Deleting a writer causes unregisters (and possibly disposes) in the rest of the network, and these updates to the instances should trigger DATA_AVAILABLE. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-04-11 10:09:35 +02:00
Jeroen Koekkoek	3bdd2a140d	Move md5 from ddsi to ddsrt Signed-off-by: Jeroen Koekkoek <jeroen@koekkoek.nl>	2019-04-11 10:04:06 +02:00
Jeroen Koekkoek	63a5c87baf	Fix format strings and signatures for fixed size integers Signed-off-by: Jeroen Koekkoek <jeroen@koekkoek.nl>	2019-04-11 10:04:06 +02:00
Erik Boasson	f0f76454c7	timely initialization of builtin topics (#138 ) The rtps_init function used to initialize all data structures and start all threads used by the protocol stack, allowing discovery of remote entities before the built-in topic data structures were initialized. (Very) early discovery of a remote participant thus led to a crash. This commit splits the initialisation, providing a separate function for starting, in particular, the threads receiving data from the network. In terms of threads created, it matches exactly with the rtps_stop / rtps_fini split that already existed to address the exact same problem on termination. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-03-27 09:30:15 +01:00
Erik Boasson	fcb6b935ea	support for building/running on OpenIndiana Signed-off-by: Erik Boasson <eb@ilities.com>	2019-03-23 15:40:29 +01:00
Erik Boasson	7540ac8229	make expensive checks in asserts optional (#125 ) This adds an Internal/EnableExpensiveChecks setting for enabling some or all expensive run-time checks to avoid a massive slowdown when assertions are enabled at compile-time. Currently these cover only the writer and reader-history cache checking. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-03-23 15:40:29 +01:00
Jeroen Koekkoek	cd6742ee12	Rearrange and fixup abstraction layer - Replace os_result by dds_retcode_t and move DDS return code defines down. Eliminates the need to convert between different return code types. - Move dds_time_t down and remove os_time. Eliminates the need to convert between different time representations and reduces code duplication. - Remove use of Microsoft source-code annotation language (SAL). SAL annotations are Microsoft specific and not very well documented. This makes it very difficult for contributers to write. - Rearrange the abstraction layer to be feature-based. The previous layout falsely assumed that the operating system dictates which implementation is best suited. For general purpose operating systems this is mostly true, but embedded targets require a slightly different approach and may not even offer all features. The new layout makes it possible to mix-and-match feature implementations and allows for features to not be implemented at all. - Replace the os prefix by ddsrt to avoid name collisions. - Remove various portions of unused and unwanted code. - Export thread names on all supported platforms. - Return native thread identifier on POSIX compatible platforms. - Add timed wait for condition variables that takes an absolute time. - Remove system abstraction for errno. The os_getErrno and os_setErrno were incorrect. Functions that might fail now simply return a DDS return code instead. - Remove thread-specific memory abstraction. os_threadMemGet and accompanying functions were a mess and their use has been eliminated by other changes in this commit. - Replace attribute (re)defines by ddsrt_ prefixed equivalents to avoid name collisions and problems with faulty __nonnull__ attributes. Signed-off-by: Jeroen Koekkoek <jeroen@koekkoek.nl>	2019-03-22 15:19:09 +01:00
Erik Boasson	753f910aad	consistently use Eclipse Cyclone DDS in API header files Signed-off-by: Erik Boasson <eb@ilities.com>	2019-02-15 11:35:00 +01:00
Erik Boasson	1c963b5c3b	add torture test for read, query conditions The "rhc" test runs a random sequence of operations (writes, reads, &c.) through an RHC with conditions attached to it. All possible state masks are used, and query conditions are tried with a condition that only tests the key value, and one that tests attribute values. It depends on the internal checking logic of the RHC, which is currently enabled only in Debug builds because of the associated run-time overhead. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-02-12 14:22:06 +01:00
Erik Boasson	a1e827cf7e	minor improvements to query condition handling Various details: - index replaced throughout by bitmask - caching a single sample in RHC for deserialising samples when query conditions are used - short-circuiting the trivial case of an instance matched neither before nor after a change to its state - combining inc/dec counters into a single delta in condition evaluation - major speed-up of rhc_check_counts by not checking the condition bit masks every single time Signed-off-by: Erik Boasson <eb@ilities.com>	2019-02-11 15:56:58 +01:00
Erik Boasson	96649c435d	always zero out non-key fields in invalid samples returned by read/take The specification says the data in an invalid sample must not be inspected, but that means retrieving the key values from a disposed/unregistered instance after a take is impossible. So, the elegant thing to do is to always provide the key values in the data. The remaining question is then whether the non-key fields should be left in whatever state they happened to be in, or be set to zero. The latter is more consistent in the interface, and that is almost invariably a good thing. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-02-11 11:12:07 +01:00
Erik Boasson	f0675ca7f1	support for query conditions This also fixes #87. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-02-11 11:12:07 +01:00
Erik Boasson	b21c7f032c	set masks in samples/instances for query conditions Whenever a sample or an instance is added, check it against the attached query conditions and indicate which ones match Signed-off-by: Erik Boasson <eb@ilities.com>	2019-02-10 19:19:41 +01:00
Erik Boasson	7739341e71	some errors on locking entities are not worth logging Signed-off-by: Erik Boasson <eb@ilities.com>	2019-02-10 18:23:51 +01:00
Erik Boasson	eee8f6cc59	assign an index to query conditions This index can then be used as an index into a bitmap to keep track which query conditions are matched by a sample or an instance. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-02-10 18:23:51 +01:00
Erik Boasson	1d746a866d	remove unused conds_lock from RHC Signed-off-by: Erik Boasson <eb@ilities.com>	2019-02-10 18:23:51 +01:00
Erik Boasson	de5021dc55	change initialisation of a read condition to set the query condition as well The read condition and the query condition are represented by the same data type internally, and a read condition therefore has a "m_filter" attribute. It makes more sense to initialise this properly as part of the read condition, instead of initialisation-by-memset in the dds_create_readcond, then overwriting it in dds_create_querycond. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-02-10 18:23:51 +01:00
Erik Boasson	fcdffa8cf8	add a separate RHC tracing category The RHC tracing produces so much junk that is hardly ever useful that a normal trace should definitely not include it. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-02-10 18:23:51 +01:00
Erik Boasson	e0e1e67f24	read condition on view "old" not triggered when reading only "read" samples A read restricted to samples in "read" state would not enter the condition update code on the false assumption that no read conditions could become triggered if the number of read samples remained the same, but it is nonetheless possible that the instance was transitions from "new" to "old" as a consequence, at least in my interpretation of the spec and the current implementation of read() in Cyclone. This commit brings consistency to the implementation without the intention of confirming the current behaviour as being desirable. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-02-10 18:23:51 +01:00
eboasson	4483807e71	Merge pull request #100 from eboasson/master the "tracing" category in the configuration should only affect the "trace" sink and leave the "log" sink alone	2019-01-22 09:14:43 +01:00
Erik Boasson	c35c5f9190	the "tracing" category in the configuration should only affect the "trace" sink and leave the "log" sink alone Fixing that produces a lot of noise on stderr because of inappropriate use of the "info" category in various place and, on macOS, because of a rather stupid way of messing with thread scheduling priorities even when none have been specified explicitly in the configuration. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-01-21 12:53:55 +01:00
Erik Boasson	8157d3bec8	fix taking only some samples from an instance using a query condition and add a test case Signed-off-by: Erik Boasson <eb@ilities.com>	2019-01-18 12:14:21 +01:00
Erik Boasson	98f757a5ab	let dds_get_qos reset the input qos first Getting a QoS from an entity is akin to reading, and all read/take operations reuse or free/reallocate memory to avoid memory leaks, and so it is a reasonable assumption that calling dds_get_qos repeatedly without intervening calls to dds_reset_qos would not leak any memory either. (This was actually an assumption in the builtin topics test.) Therefore, it is reasonable to first call dds_reset_qos in dds_get_qos. All operations in the API that yield or modify a QoS object result in a properly initialised one, therefore the input to dds_get_qos is necessarily initialised, and so this is safe. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-01-17 12:34:37 +01:00
Erik Boasson	54b5bed8d2	use enum with values log2(STATUS) for identifying status/listener Signed-off-by: Erik Boasson <eb@ilities.com>	2019-01-17 10:18:14 +01:00
Erik Boasson	ba46cb1140	rework listener invocation and entity status flags Listener/status management invocation was rather expensive, and especially the cost of checking listeners, then setting status flags and triggering waitsets ran into severe lock contention. A major cost was the repeated use of dds_entity_lock and dds_entity_unlock, these have been eliminated. Another cost was that each time an event occurred (with DATA_AVAILABLE the most problematic one) it would walk the chain of ancestors to see if any had a relevant listener, and only if none of them had any, it would set the status flags. The locking/unlocking of the entity has been eliminated by moving the listener/status flag manipulation from the general entity lock to its m_observers_lock. That lock has a much smaller scope, and consequently contention has been significantly reduced. Instead of walking the entity hierarchy looking for listeners, an entity now inherits the ancestors' listeners. The set_listener operation has been made a little more complicated by the need to not only set the listeners for the specified entity, but to also update any inherited listeners its descendants. The commit is a bit larger than strictly needed ... I've started reformatting the code to reduce the variety of styles ... as there I haven't been able to find a single tool that does what I want, it may well end up as manual work. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-01-17 10:18:14 +01:00
Erik Boasson	2e5ecb2e76	requiring checking the return value of dds_{get,set}_listener is pedantry Signed-off-by: Erik Boasson <eb@ilities.com>	2019-01-17 10:18:14 +01:00
Erik Boasson	e4360d25a0	code cleanup: replacement of lots of function-like macros by inline functions, removal of unnecessary casts Signed-off-by: Erik Boasson <eb@ilities.com>	2019-01-17 10:18:14 +01:00
Erik Boasson	336a9da8b4	fix read of uninitialized program name in dds_init dds_init generates an entity name for the participant derived from the process name and process id, but the name is currently uninitialised (temporarily so, caused by a recent update to the OS abstraction layer). Signed-off-by: Erik Boasson <eb@ilities.com>	2019-01-15 15:07:40 +01:00
Erik Boasson	66076817e1	rework built-in topics again Move details of built-in topics out of the DDSI core (so the only hooks remain). For this, rtps_term had to be split, so now it is "stop" followed by "fini". Add a notion of local writers that are not bound to a participant ("local orphans"), so that the local built-in topic writers can be created during initialization. This eliminates the "builtin" participant. This uncovered in inconsistency in the unit tests: on the one hand, a newly created participant is expected to have no child entities; on the other hand, the built-in topics were expected to be returned by find_topic ... This inconsistency has been resolved by creating them lazily and accepting that find_topic can't return them until they have been created. Special code was in place in dds_create_reader anyway, so it is not expected to have any real consequence for applications. Use a special WHC implementation that regenerates the data on the fly using the internal discovery tables of DDSI, so that the samples are only stored by readers. This eliminates the memory overhead of that existed previously when the WHC of the writers stored the data. No longer return topic name and type name in the built-in topics, they have been extracted already and are not accessible through the normal interface but do cause problems when comparing QoS. Signed-off-by: Erik Boasson <eb@ilities.com>	2019-01-09 08:38:56 +01:00

1 2 3 4 5

219 commits