Diffusion 6.8 Release Notes

6.8.10 (11 April 2024)

Fixes in 6.8.10

.NET Client

DIF-1177: Client ignores maximum message size set by user

The .NET client ignored the maximum message size set by the user. This has been fixed.

DIF-1196: GetSessionProperties does not throw NoSuchSessionException

Calling GetSessionProperties for a session that has been closed did not throw NoSuchSessionException. This has been fixed.

DIF-238: Authentication Control example throws exception before completing

If SetAuthenticationHandlerAsync was called with await, creating a session threw SessionEstablishmentTransientException before the authenticator was called. This has been fixed.

DIF-63: SessionClosedException thrown when not in closed state

If a timeout occurred during reconnection, the client could hang. This has been fixed.
A connection attempt did not time out if there was no response from the server. This has been fixed.
Connection and reconnection failures now transition the session state to CLOSED_FAILED.

FB-30738: Unable to connect successfully through a load balancer

Addresses an issue where the .NET client was unable to connect through a GCP (Google Cloud Platform) load balancer.

Apple Client

DIF-1560: Apple SDK unable to decode 16 Bit Float

The Apple SDK raised "Parsing of type 25 is unhandled." when attempting to parse a 16-bit floating-point value. This issue has now been resolved.
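
For context, a minimal Python sketch of decoding such a value, assuming "type 25" refers to CBOR additional information 25 (a half-precision float, initial byte 0xF9). This is an illustration only and is not the Apple SDK's code; the function name decode_half_float is hypothetical.

```python
import struct

# CBOR initial byte for a half-precision float:
# major type 7 (0b111) with additional information 25 (0b11001).
HALF_FLOAT_INITIAL_BYTE = (7 << 5) | 25  # 0xF9


def decode_half_float(data: bytes) -> float:
    """Decode a 3-byte CBOR item holding a 16-bit IEEE 754 float."""
    if len(data) != 3 or data[0] != HALF_FLOAT_INITIAL_BYTE:
        raise ValueError("not a CBOR half-precision float item")
    # '>e' is the struct format for a big-endian binary16 value.
    (value,) = struct.unpack(">e", data[1:])
    return value


print(decode_half_float(bytes([0xF9, 0x3C, 0x00])))  # 1.0
```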

DIF-1642: iOS SDK Stability of PTDiffusionRequestStreamRegistry and PTDiffusionTopicRoutingStreamRegistry

Several stability improvements were made to PTDiffusionRequestStreamRegistry and PTDiffusionTopicRoutingStreamRegistry.

DIF-1710: iOS crash in PTDiffusionConversation openWithConversationId:

An NSInternalInconsistencyException was raised in PTDiffusionConversation when the session disconnected while the app was in the background. This issue has been resolved.

DIF-779: Missing Session State Listener

The Session State Listener was missing from PTDiffusionSessionConfiguration. This issue has been fixed.

DIF-797: Invalid URL string when opening a session in the Apple SDK raises uncaught NSInvalidArgumentException

When an invalid URL string was passed when opening a session, an NSInvalidArgumentException was raised.
The exception can be trapped in Objective-C but not in Swift, so it resulted in an uncaught exception that crashed the application. This has been fixed.

DIF-847: Reconnection Timeout closes successfully recovered connection

When a successful reconnection occurred, the reconnection timeout was not properly cleared and resulted in a disconnection. This issue has been resolved.

Console

DIF-180: Topic views errors do not show the correct column position

The console topic view editor provided incorrect column information when presenting syntax error feedback for topic views. This has been fixed.

Java Tests

DIF-1537: Time series topic edit events are not correctly mapped through topic views

Due to a bug in previous releases, time series edit events could fail to be reflected through some types of topic view. The bug has been fixed in this release.

JavaScript Client

DIF-20: Javascript API has no mechanism for setting trusted certificates

Additional options have been added to diffusion.connect, allowing modification of the underlying TLS configuration. The new options are passed directly to tls.createSecureContext().

DIF-479: Shared session has initialising state immediately after connecting

Shared sessions would return the connected session before updating the state to connected. This could cause a new session to appear to still be in the `initialising` state. The order of internal state updates has been corrected. A new session will now reliably be in the `connected` state.

FB-30473: Incorrect close reason type on SessionPropertiesListener.onSessionClose

The TypeScript definition of SessionPropertiesListener.onSessionClose() did not specify the correct type for the close reason. A new ClientCloseReason type has been introduced to the signature of the callback.

Persistence

DIF-1536: Edit events of persisted replicated time series topics are not recovered from file persistence

Due to a bug in previous releases, edit events of replicated time series topics would not be correctly recovered from file persistence. If a cluster was restarted, the edit events and their corresponding original events would be silently discarded. The bug has been fixed in this release.

FB-30716: Topic persistence operations could fail silently

Fixed an issue that could cause commit of topic operations to fail silently.

Python Client

DIF-30: No support for Python 3.10-12

The Python client now supports Python versions 3.8 to 3.12 on macOS (arm64 and x86_64), Windows, and any x86_64 Linux distribution supported by the Manylinux project (https://github.com/pypa/manylinux).
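
An application that wants to fail fast on an unsupported interpreter could guard its start-up with a check like the sketch below. The version bounds come from the note above; the helper name is_supported is hypothetical and is not part of the Diffusion Python client.

```python
import sys

# Supported interpreter range from the release note (inclusive).
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 12)


def is_supported(version=None) -> bool:
    """Return True if the (major, minor) version is within the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if not is_supported():
    print("Warning: this Python version is outside the supported 3.8 to 3.12 range")
```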

Replication

DIF-149: Error while handling a multiplexer event - IllegalStateException: Has already connected

In previous releases, when new servers joined a cluster, a server could log a stack trace with the error message "IllegalStateException: Has already connected". The error is harmless but looked alarming. From this release, the error message is no longer logged.

FB-30613: If cluster topology changes during topic removal, topics can be left in a state where they can never be removed

If the cluster topology changes, that is, a server joins or leaves the cluster, in-flight operations can fail with a transient CLUSTER_REPARTITION error. Due to a bug introduced in Diffusion 6.5, topic removal operations that fail if the cluster topology changes can leave the topics in an invalid state such that they can never be removed.

The bug has been fixed in this release.
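
Because CLUSTER_REPARTITION failures are transient, callers commonly retry the failed operation. The Python sketch below shows one way to do that under stated assumptions: ClusterRepartitionError and retry_transient are hypothetical names used for illustration, not Diffusion API types.

```python
import time


class ClusterRepartitionError(Exception):
    """Hypothetical stand-in for a transient CLUSTER_REPARTITION failure."""


def retry_transient(operation, attempts=3, delay_seconds=0.0):
    """Call operation(), retrying when it raises ClusterRepartitionError."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ClusterRepartitionError:
            if attempt == attempts:
                raise  # give up after the final attempt
            time.sleep(delay_seconds * attempt)  # simple linear backoff


# Usage: a fake removal that fails twice before succeeding.
calls = {"count": 0}


def flaky_removal():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ClusterRepartitionError()
    return "removed"


result = retry_transient(flaky_removal)
```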

FB-30699: Failure to start connector because readiness conditions not triggered

Fixed an issue that could result in Diffusion instances joining a cluster failing to reach readiness.

Server

DIF-1023: Corruption of client queue index can cause unnecessary CPU use

Due to a bug in previous releases, long lived sessions that subscribed to many different topics could perform unnecessary processing. The bug has been fixed in this release.

DIF-1574: Memory leak caused by session data retention

Over time, memory usage could increase, and class histograms showed far more client session-related objects (for example, ClientInfo and SessionData) than there were connected sessions. This problem has now been resolved.

DIF-1759: Deadlock when session closed prevents multiplexer processing

Due to a bug in previous releases, session close processing could deadlock the server. The bug has been fixed in this release.

DIF-555: Hazelcast Security Vulnerabilities

Due to security vulnerabilities in Hazelcast (detailed in CVE-2022-36437 and CVE-2023-33265), we have upgraded Hazelcast from 4.2 to 4.2.8.

DIF-917: Server can fail to accept updates at a high rate

Due to a bug in previous releases, a server could fail to accept updates at a high rate (more than tens of thousands of updates a second). The bug has been fixed in this release.

FB-30703: Problems when restarting a cluster member with an out of date persistence file

Fixed an issue that could cause unexpected restarts when an instance with an out-of-date persistence file joined a cluster.

Session Trees

DIF-608: Range queries fail on timeseries topics exposed through session trees

An attempt to perform a range query on a topic exposed by session trees would fail with the error "Range query failed because topic 'xxx' does not exist". This has now been resolved, so range queries can be performed on session tree time series topics.

Topic Views

FB-30688: Mapping single value topic to time series topic does not respect specified retained range

Using topic views to map a single value topic to a time series topic would always assume the default retained range (10) regardless of what was specified in the with properties clause. This was because the retained range for time series reference topics is always limited to the source retained range so that cross cluster inconsistencies do not occur. This has now been changed so that when mapping a single value topic to a time series topic any specified retained range is respected rather than assuming the default.
However, there is a separate known issue with this type of mapping: mapping a single value topic to a time series target can lead to inconsistent views of the target topic's events across a cluster, so such a mapping is not suitable for use in a clustered environment. Even in a single-server environment, note that the retained event range of the target time series topic will not be restored on server restart.

Topics

DIF-849: The UNSUBSCRIBE conflation policy does not handle branch mapping subscriptions correctly

Due to a bug in previous releases, the UNSUBSCRIBE conflation policy did not correctly handle branch mapping subscriptions. This has been fixed in this release.

6.8.9 (19 May 2023)

Improvements in 6.8.9

Java & Android Client

FB-30024: Java Client and Diffusion server now use HTTP/1.0 standard headers

The Java client and the Diffusion server now follow HTTP/1.0 standards, including an ASCII space following the colon in HTTP headers. Other Diffusion clients already follow this practice and are unchanged.

Fixes in 6.8.9

.NET Client

FB-30091: NoReconnection does not correctly disable reconnection

Addresses an issue where setting NoReconnection on a .NET session did not disable reconnection. This has now been resolved.

FB-30310: .NET client incorrectly accepts empty string as valid JSON

The .NET client incorrectly accepted empty strings as valid JSON. This has now been fixed.

Apple Client

FB-30171: Frequent crash in Diffusion iOS client library

Fixed a bug that could crash the Diffusion Apple client in PTDiffusionRequestStreamRegistry.

C Client

FB-29809: Session properties listener seg faults when an update event is received

Addresses an issue where a C client application could crash (segfault) if a session close event was received in a session properties listener.

Documentation

FB-29952: Documentation incorrectly states that Topic View 'process' conditionals support 'pointer operator pointer'

The documentation for the topic views 'process' transformation incorrectly suggests that a condition can take the form 'pointer operator [pointer/constant]'. This is incorrect as only 'pointer operator constant' is supported. The documentation has been corrected for this release. The ability to use 'pointer operator pointer' will be included in a future release.

Federation

FB-29735: Remote server connections can block indefinitely

Networking issues between clustered Diffusion instances could cause the server to stall.

This has been fixed in this release.

JavaScript Client

FB-29874: MutableRecordModel.add not working as expected

MutableRecordModel.add was not working as expected and records were not added. This has been fixed.

Security

FB-29929: Users with no permissions can view the server logs in the console

Users without adequate permissions were able to view server logs in the Diffusion Management Console. This has now been resolved.

FB-30280: A path permission assignment for a role would sometimes not override permissions inherited from a parent path

Due to a bug in previous releases, if a security role had path permission assignments for paths X and Y, where X is a parent path of Y, the path permission assignments for Y were sometimes not applied correctly.

The bug has been fixed in this release.
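
The intended behaviour can be pictured with a simplified model in which the most specific assigned path wins. The Python sketch below is an illustration under that assumption, not the server's implementation; effective_permissions and the sample assignments are hypothetical.

```python
def effective_permissions(assignments, path):
    """Return the permissions of the most specific assigned ancestor of path."""
    parts = path.split("/")
    # Walk from the full path up towards its top-level ancestor.
    for depth in range(len(parts), 0, -1):
        candidate = "/".join(parts[:depth])
        if candidate in assignments:
            return assignments[candidate]
    return frozenset()  # no assignment applies to this path


# Illustrative assignments for one role: X is a parent path of X/Y.
role_assignments = {
    "X": frozenset({"READ_TOPIC", "UPDATE_TOPIC"}),
    "X/Y": frozenset({"READ_TOPIC"}),
}

# The assignment for X/Y overrides what would be inherited from X.
below_y = effective_permissions(role_assignments, "X/Y/Z")
below_x = effective_permissions(role_assignments, "X/other")
```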

Server

FB-29683: Multiplexer stuck processing selection of a missing topic notification handler

Due to a concurrency bug in previous releases, the server could stall while trying to select a control session to handle a request.

The bug has been fixed in this release.

FB-30367: StringIndexOutOfBoundsException in PathSelector.confirmSelects

In certain circumstances, the Diffusion server could report a StringIndexOutOfBoundsException in PathSelector.confirmSelects. This has now been resolved.

Topic Views

FB-29997: Topic view INSERT can incorrectly insert the value of a topic with DONT_RETAIN_VALUE=true

Due to a bug in previous releases, a topic view INSERT transformation could insert a stale value of a stateless topic (that is, a topic with its DONT_RETAIN_VALUE property set to true). This would happen if the topic view was added after the stateless topic, and there were other topic views with topic selectors that referenced the stateless topic.

The bug has been fixed in this release. An INSERT transformation will never insert the value of a stateless topic.

FB-30001: Reference topics can incorrectly use the stale value of a topic with DONT_RETAIN_VALUE=true

Due to a bug in previous releases, a topic view could create reference topics using a stale value of a stateless topic (that is, a topic with its DONT_RETAIN_VALUE property set to true). This would happen if the topic view was added after the stateless topic, and there were other topic views with topic selectors that referenced the stateless topic.

The bug has been fixed in this release.

Topics

FB-29992: Subscriptions fail to be re-evaluated if topic selections match topics in the source branch of a removed branch mapping

Due to a bug in previous releases, if a subscription to a topic was redirected by a branch mapping, and the branch mapping was removed, the session may not be resubscribed to a topic at the original topic path. The bug has been fixed in this release.

6.8.8 (17 November 2022)

Fixes in 6.8.8

.NET Client

FB-29317: Topic update performance varies significantly across sessions

The performance of topic updates could vary considerably across sessions. This has been fixed.

Java & Android Client

FB-29569: Client does not mask websocket frames

Implemented section 5.3 of RFC 6455, resolving an issue where intermediate proxies expected client WebSocket frames to have a mask.
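
As a reminder of what RFC 6455 section 5.3 requires, the Python sketch below applies the masking transform: each payload octet is XORed with the masking key octet at index i mod 4. It illustrates the RFC algorithm only and is not the Java client's code; mask_payload is a hypothetical name.

```python
import os


def mask_payload(payload: bytes, masking_key: bytes) -> bytes:
    """Apply the RFC 6455 section 5.3 masking transform to a frame payload."""
    if len(masking_key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    return bytes(octet ^ masking_key[i % 4] for i, octet in enumerate(payload))


# A client picks a fresh random 32-bit key for each frame.
key = os.urandom(4)
masked = mask_payload(b"hello", key)

# The transform is its own inverse, so unmasking uses the same function.
unmasked = mask_payload(masked, key)
```

The same function both masks and unmasks because XOR with a fixed key is self-inverse.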

Persistence

FB-29587: Persistence restore stops on encountering file inconsistency, leading to loss of topic data

Previously, restore from persistence files would stop on encountering a file inconsistency, possibly due to some file corruption (write errors). This meant that data (topics) after the file inconsistency would be lost. The restore process will now log such file errors but will continue to restore everything after.

6.8.7 (8 September 2022)

Fixes in 6.8.7

C Client

FB-29221: Value stream does not fire on_error when session is closed

Closing a session with attached value streams did not generate an on_error callback indicating that the session had been closed. This has been fixed.

JavaScript Client

FB-29271: The promise returned by session.closeSession() will never resolve when closeSession() is called a second time

Previously, the promise returned when calling session.closeSession() on a closed session would never resolve. This has been fixed.

Replication

FB-29228: Replication data corruption on migration (Hazelcast serialisation exception)

In previous releases, due to a replication bug, data shared across a cluster could become corrupted if the cluster topology changed (a server joined or left the cluster). This would be logged to the server log as a HazelcastSerializationException.

The bug has been fixed in this release. HazelcastSerializationExceptions and the associated data corruption should no longer occur.

FB-29246: Removing a non-replicated topic throws a ClusterRepartitionException

Due to bugs in previous releases, a request to remove a single, unreplicated topic could fail with a ClusterRepartitionException. Additionally, a removal request of a single replicated topic path would always return a removed topic count of 1, even if there was no such topic.

Both bugs have been fixed in this release.

Topics

FB-29159: JSON patch applied to a file recovered topic not persisted correctly

The fix to bug 28940 in release 6.8.4 was incomplete. The fix allowed a JSON patch to be applied to a topic newly recovered from the cluster or file persistence. However, the patch could be persisted incorrectly, preventing the topic from being restored after a further restart or cluster migration.
The bug has been fixed in this release.

6.8.6 (11 August 2022)

Fixes in 6.8.6

C Client

FB-29035: Unable to provide session context before session creation

The C client could not accept a user context as a parameter before session creation. This has now been addressed.

FB-29052: C library dependency clashes

Since v6.8.0, the C client library has internally packaged OpenSSL (and other third party libraries since before v6.8.0). As a result of this, applications have been unable to link their own versions of these libraries as it would result in symbol clashes when linking the application. This issue has now been addressed.

FB-29060: hash_clear not correctly clearing memory allocations

A double free was possible when using the hash_clear function, and then using hash_free on the same HASH_T *. This issue has now been resolved.

FB-29065: missing argument in on_handler_error handler for session_listener

The callback function on_handler_error on the session listener was not being passed the message that caused the error. This issue has now been addressed.

FB-29066: Buffer read functions not protected against NULL pointer

Certain buf_read* functions would cause a segmentation fault if passed a NULL pointer. This issue has now been addressed.

FB-29069: Buffer read functions not protected against storing into a NULL argument

Certain buf_read_* functions would cause a memory error when the assignment variable was NULL. This issue has been resolved.

FB-29089: APR symbol renaming

Since v6.8.0, the C client library has internally packaged APR libraries. As a result of this, applications have been unable to link their own versions of these libraries as it would result in symbol clashes when linking the application. This issue has now been addressed.

Console

FB-28548: Remote servers tab does not appear if license contains FANOUT features

If the Diffusion license in use declared FANOUT_SERVER or FANOUT_CLIENT instead of REMOTE_CONNECTIONS, the remote servers tab did not appear in the console, even though these feature options are logically equivalent.
This has been fixed so that the remote servers tab will appear if any of the named options are present in the license.

JavaScript Client

FB-29072: Principal cannot be null

Previously, when an empty string or null was passed as the principal when connecting, it was treated as undefined resulting in a connection being attempted without principal AND credentials, even if credentials were explicitly set. Now, an empty string is interpreted as a valid principal and a connection will be attempted with credentials in this case.

System Monitoring/Statistics

FB-29079: Prometheus output contains redundant HELP and TYPE information

Due to a bug in previous releases, the Prometheus HTTP gateway would produce repeated HELP and TYPE lines for each unique combination of dimension labels a metric had. This information was redundant, and caused problems for downstream tools such as New Relic.

The bug has been resolved in this release. Each metric now has a single HELP and TYPE line, regardless of the number of dimensions.
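
The corrected shape of the output can be sketched in Python as follows: one HELP line and one TYPE line per metric, followed by one sample line per label combination. This is an illustration of the Prometheus text exposition format only, not the gateway's implementation; render_metric and the metric name example_sessions_open are hypothetical.

```python
def render_metric(name, help_text, metric_type, samples):
    """Render one metric: a single HELP and TYPE line, then one line per sample."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        label_text = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        if label_text:
            lines.append(f"{name}{{{label_text}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


output = render_metric(
    "example_sessions_open",
    "Number of open sessions.",
    "gauge",
    [({"server": "a"}, 3), ({"server": "b"}, 5)],
)
print(output)
```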

6.8.5 (14 July 2022)

Fixes in 6.8.5

Persistence

FB-28963: Memory Leak : Persistence compaction retains references to removed topic values after removal

A memory leak in persistence file compaction caused references to removed topic values to be retained. This would lead to increasing heap usage which was especially noticeable when using time series topics. This problem has now been resolved.

6.8.4 (7 July 2022)

Fixes in 6.8.4

Server

FB-28897: Removal of a single replicated topic could silently fail

Due to a bug in previous releases, an API operation to remove a single topic could erroneously report success when topic removal failed due to the cluster repartitioning. The bug has been fixed in this release.

Topics

FB-28918: Valid JSON patch can be rejected

Due to a bug in previous releases, the server could incorrectly reject a valid JSON patch. The bug has been fixed in this release.

FB-28940: JSON patch can't be applied to a newly recovered topic

Due to a bug in previous releases, a JSON patch couldn't be applied to topics newly recovered from the cluster or file persistence. Attempting to do so would fail with an IncompatibleTopicStateException error with a message such as "state FILE cannot have delta applied by APPLICATION".

The bug has been fixed in this release.

6.8.3 (8 June 2022)

Fixes in 6.8.3

C Client

FB-28697: OpenSSL Library failure in the C Client for Windows

The C Client for Windows would fail because the embedded OpenSSL library still depended on an external OpenSSL DLL.
A fully self-contained OpenSSL library is now embedded in the C Client for Windows.

FB-28801: Usage of reserved C++ keyword in C API

The reserved keyword for C++, export, was being used in the C Client public facing API. This has now been fixed.

Persistence

FB-28837: PersistenceException - Failed to remove REMOVE_TOPIC during compaction

A PersistenceException stating "Failed to remove REMOVE_TOPIC" could occur when compacting persistence files. This led to the compaction stopping, which in turn could lead to disk space filling.
This is likely due to some previous file corruption. The processing has now been changed so that if a remove operation is found for a topic that there is no record of then the compaction will log an error message and proceed.

Server

FB-28651: Server deadlock in DefaultTimeoutSupervisor

In previous releases, a server-side deadlock could occur between the DefaultTimeoutSupervisor thread and a thread from the background thread pool. The only remedy was to restart the server.

The bug has been fixed in this release.

FB-28664: NullpointerException in TopicLoadMessage due to null value provided for reference topic

If a primitive (String, Int64, Double) topic was updated to null (removing its value) and the topic was selected as a source for a topic view, this would cause a NullPointerException to be logged in com.pushtechnology.diffusion.multiplexer.server.subscription.TopicLoadMessage.

This has now been resolved and such an update will now be published to clients as the CBOR null value (hex F6).
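
For reference, the hex value F6 follows from the CBOR encoding rules: the initial byte combines major type 7 (simple values) with additional information 22 (null). The short Python sketch below shows the arithmetic; the helper name cbor_initial_byte is hypothetical.

```python
MAJOR_TYPE_SIMPLE = 7   # CBOR major type for simple values and floats
SIMPLE_VALUE_NULL = 22  # additional information denoting null


def cbor_initial_byte(major_type: int, additional_info: int) -> int:
    """Pack a 3-bit major type and 5-bit additional information into one byte."""
    return (major_type << 5) | additional_info


CBOR_NULL = bytes([cbor_initial_byte(MAJOR_TYPE_SIMPLE, SIMPLE_VALUE_NULL)])
print(CBOR_NULL.hex())  # f6
```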

FB-28827: NegativeArraySizeException in IBytesOutputStreamImpl

A NegativeArraySizeException in IBytesOutputStreamImpl could occur if a partition was being migrated in a clustered environment and one or more topics within the partition were exceptionally large (most likely time series topics with a large number of events).

The size of data that can be accommodated in the partition log has now been doubled, so this is far less likely to happen. However, the partition log size is currently limited by the size of a Java integer.

In the future, the partition log capability will be extended to accommodate such very large values. However, in the meantime, users must be aware that very large values in a single topic could possibly lead to such problems.

FB-28850: NullPointerException in ServerMultiplexerStateImpl when adding topics

Due to a bug in previous releases, an internal subscription index could be corrupted when a session closed. This could cause subscriptions by unrelated sessions to fail, and was apparent from PUSH-000229 or PUSH-000872 messages in the server log. The bug has been fixed in this release.

System Monitoring/Statistics

FB-28771: Correct Prometheus metric name to diffusion_topics_subscriber_update_compressed_bytes

In previous releases, the subscriber_updated_compressed_bytes metric was exported to Prometheus under an incorrect name ("diffusion_topics_subscriber_update_message_bytes"). From this release, the Prometheus metric name has been corrected to "diffusion_subscriber_updated_compressed_bytes".

6.8.2 (27 April 2022)

Fixes in 6.8.2

Federation

FB-28647: Newly connecting remote servers can overload the primary server outbound queues

Previously the use of remote topic views could cause a very large number of subscription and initial value notifications to be queued in the primary server outbound queue. If the queue was of an insufficient size to accommodate these notifications the secondary server would be disconnected. The secondary server would then retry the connection causing the same problem to repeat indefinitely.

This problem has now been mitigated by ensuring that the subscription for each distinct remote topic view is executed in series, so each topic view would only start to subscribe when the previous one was complete. This reduces the number of notifications that need to be queued in the primary server at any one time. It is therefore now within the user's control to divide up the subscriptions in remote topic views and thus control the notification rate. It should be noted that any individual remote topic view should still not request a subscription that will request more notifications (up to 2 per topic) than the primary server's outbound queue is able to cope with.

Persistence

FB-28616: Compaction stopped due to "has invalid timestamp for time series topic" error

In certain circumstances the recovery of persistence files containing time series topics could fail with an error reporting "has invalid timestamp for time series topic" and this in turn could lead to further persistence compaction failures.
This issue has now been resolved.

Server

FB-28539: Failure to update to a topic can be logged twice

In previous releases, a PUSH-000464 message could be logged twice for a topic update failure. The bug has been corrected in this release.

6.8.1 (22 March 2022)

Fixes in 6.8.1

Apple Client

FB-13630: Apple SDK not available via Swift Package Manager

You can now add the Diffusion Apple SDK to your Xcode project via Swift Package Manager. The package is available at https://github.com/pushtechnology/diffusion-swift

Console

FB-28368: Cross site scripting vulnerability in the console logs tab

A remote code execution vulnerability in the server logs tab of the Diffusion Management Console has been resolved in this release.

FB-28416: Gateway endpoints tab not visible if no endpoints defined

If a gateway application supported endpoints but none had as yet been defined in the configuration then the endpoints tab was not made visible in the console and so it was not possible to define new endpoints from the console. This has now been resolved.

Environment

FB-28351: Correct parameterisation of Diffusion start scripts

Due to a packaging error, enhancements to the start scripts made under case 28366 were not released in 6.8.0. This has been rectified. The server start scripts can be customised using the additional environment variables DIFFUSION_EXT_DIR, LOG4J_CONFIGURATION, JVM_LOG_DIR, and EXTRA_JAVA_PARAMETERS. See the explanatory comments in the scripts for more details.

Java Tests

DIF-1537: Time series topic edit events are not correctly mapped through topic views

Due to a bug in previous releases, time series edit events could fail to be reflected through some types of topic view. The bug has been fixed in this release.

JavaScript Client

DIF-20: Javascript API has no mechanism for setting trusted certificates

Additional options have been added to diffusion.connect, allowing modification of the underlying TLS configuration. The new options are passed directly to tls.createSecureContext()

DIF-479: Shared session has initialising state immediately after connecting

Shared sessions would return the connected session before updating the state to connected. This could lead to a new session to appear to still be in the `initialising` state. The order of internal state updates has been corrected. A new session will now reliably be in `connected` state.

FB-30473: Incorrect close reason type on SessionPropertiesListener.onSessionClose

The TypeScript definition of SessionPropertiesListener.onSessionClose() did not specify the correct type for the close reason. A new ClientCloseReason type has been introduced to the signature of the callback.

Persistence

DIF-1536: Edit events of persisted replicated time series topics are not recovered from file persistence

Due to a bug in previous releases, edit events of replicated time series topics would not be correctly recovered from file persistence. If a cluster was restarted, the edit events and their corresponding original events would be silently discarded. The bug has been fixed in this release.

FB-30716: Topic persistence operations could fail silently

Fixed an issue that could cause commit of topic operations to fail silently.

Python Client

DIF-30: No support for Python 3.10-12

The Python client now supports Python versions 3.8-3.12 on MacOS (arm64 and x86_64), Windows and any x86_64 Linux distribution supported by the Manylinux project (https://github.com/pypa/manylinux).

Replication

DIF-149: Error while handling a multiplexer event - IllegalStateException: Has already connected

In previous release, when new servers joined a cluster, a server could log a stack trace with the error message "IllegalStateException: Has already connected". This is harmless, but overly concerning. From this release, the error message will no longer be logged.

FB-30613: If cluster topology changes during topic removal, topics can be left in a state where they can never be removed

If the cluster topology changes, that is, a server joins or leaves the cluster, in-flight operations can fail with a transient CLUSTER_REPARTITION error. Due to a bug introduced in Diffusion 6.5, topic removal operations that fail if the cluster topology changes can leave the topics in an invalid state such that they can never be removed.

The bug has been fixed in this release.

FB-30699: Failure to start connector because readiness conditions not triggered

Fixed an issue that could result in Diffusion instances joining a cluster failing to reach readiness.

Server

DIF-1023: Corruption of client queue index can cause unnecessary CPU use

Due to a bug in previous releases, long lived sessions that subscribed to many different topics could perform unnecessary processing. The bug has been fixed in this release.

DIF-1574: Memory leak caused by session data retention

Over some time it could be observed that memory usage was increasing and class histograms showed client session-related objects (e.g. ClientInfo and SessionData) numbering far more than the number of connected sessions. This problem has now been resolved.

DIF-1759: Deadlock when session closed prevents multiplexer processing

Due to a bug in previous releases, session close processing could deadlock the server. The bug has been fixed in this release.

DIF-555: Hazelcast Security Vulnerabilities

Due to security vulnerabilities in Hazelcast (detailed in CVE-2022-36437 and CVE-2023-33265), we have upgraded Hazelcast from version 4.2 to 4.2.8.

DIF-917: Server can fail to accept updates at a high rate

Due to a bug in previous releases, a server could fail to accept updates at a high rate (more than tens of thousands of updates a second). The bug has been fixed in this release.

FB-30703: Problems when restarting a cluster member with an out of date persistence file

Fixed an issue that could cause unexpected restarts when an instance with an out-of-date persistence file joined a cluster.

Session Trees

DIF-608: Range queries fail on timeseries topics exposed through session trees

An attempt to perform a range query on a topic exposed by session trees would fail with "Range query failed because topic 'xxx' does not exist.". This has now been resolved so that range queries can be performed on session tree time series topics.

Topic Views

FB-30688: Mapping single value topic to time series topic does not respect specified retained range

Using topic views to map a single value topic to a time series topic would always assume the default retained range (10) regardless of what was specified in the with properties clause. This was because the retained range for time series reference topics is always limited to the source retained range so that cross cluster inconsistencies do not occur. This has now been changed so that when mapping a single value topic to a time series topic any specified retained range is respected rather than assuming the default.
However, there is a separate known issue with this type of mapping: mapping a single value topic to a time series target can lead to inconsistent views of the target topic events across a cluster, so such a mapping is not suitable for use in a clustered environment. Even in a single server environment, note that the retained event range of the target time series topic will not be restored on server restart.

Topics

DIF-849: The UNSUBSCRIBE conflation policy does not handle branch mapping subscriptions correctly

Due to a bug in previous releases, the UNSUBSCRIBE conflation policy did not correctly handle branch mapping subscriptions. This has been fixed in this release.

6.8.0 (28 February 2022)

Improvements in 6.8.0

.NET Client

FB-26071: New SessionEstablishmentTransientException from SessionFactory.open

A new exception called SessionEstablishmentTransientException has been introduced which can be returned from SessionFactory.Open. This exception indicates a transient failure and the client application can reasonably retry the connection.
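
The retry pattern this exception enables can be sketched as follows. The sketch is in Python for illustration only and does not use the .NET client API: TransientEstablishmentError and the open_session callable are hypothetical stand-ins for SessionEstablishmentTransientException and a call that opens a session.

```python
import time


class TransientEstablishmentError(Exception):
    """Hypothetical stand-in for SessionEstablishmentTransientException."""


def open_with_retry(open_session, attempts=3, delay_seconds=0.0):
    """Call open_session, retrying only when the failure is transient."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return open_session()
        except TransientEstablishmentError:
            # Give up once the final attempt has failed.
            if attempt == attempts - 1:
                raise
            time.sleep(delay_seconds)
```

Any other exception propagates immediately, so only failures marked as transient are retried.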

Adapters

FB-24656: Changes to Adapter Security Permissions

From this release users will need specific permissions to control adapters (Kafka, CDC, and JMS) and to implement adapters.
A console user that controls adapters will need VIEW_SERVER permissions to view connected adapters and CONTROL_SERVER permissions to manipulate them.
The principal used to implement an adapter will need REGISTER_HANDLER permission.

C Client

FB-27588: Include OpenSSL in C Client

OpenSSL is now internally linked in the Diffusion C Client.

Configuration

FB-27820: Deprecated WhoIs Service has been removed

The deprecated WhoIs service has been removed.

FB-27822: Deprecated store directory removed from PersistenceConfig and Server.xml

The deprecated store-directory item from the persistence element in Server.xsd has been removed along with the corresponding deprecated methods in the PersistenceConfig interface of the server configuration API.

Environment

FB-26833: Additional environment variables for the Diffusion start scripts

From this release, the server start scripts can be customised using the additional environment variables DIFFUSION_EXT_DIR, LOG4J_CONFIGURATION, JVM_LOG_DIR, and EXTRA_JAVA_PARAMETERS. See the explanatory comments in the scripts for more details.

Java Client

FB-28021: New subclasses of SessionSecurityException - AuthenticationException and PermissionsException

SessionSecurityException now has subclasses of AuthenticationException and PermissionsException to allow for differentiation between the two possible reasons for the security exception.

JavaScript Client

FB-27553: All existing members on enum-like types deprecated

All members on enum-like types have been deprecated. Affected types are CloseReason, ErrorReason, UnsubscribeReason, UpdateFailReason, and TopicAddFailReason.

Kafka Adapter

FB-27942: Include headers of Kafka records in content of Diffusion topic updates

Headers in Kafka records can now be included in Diffusion updates. A new configuration parameter, 'headers', has been introduced for 'regexSubscriptions' and 'topicSubscriptions' in the publisher configuration. It expects a list of header keys; the values of those headers are looked up in each Kafka record and published to Diffusion together with the key, value and partition details. If "$all" is used as the first item in the list, all headers are included.
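
The header selection rule can be sketched as below. This is a Python illustration, not the adapter's implementation: select_headers is a hypothetical helper, and the sketch assumes that configured keys missing from a record are simply omitted.

```python
ALL_HEADERS = "$all"


def select_headers(record_headers, configured_keys):
    """Pick the headers to publish for one Kafka record.

    record_headers: mapping of header key to value for the record.
    configured_keys: the list given for the 'headers' parameter.
    """
    # "$all" as the first item selects every header on the record.
    if configured_keys and configured_keys[0] == ALL_HEADERS:
        return dict(record_headers)
    # Otherwise keep only the configured keys that the record carries
    # (assumption: absent keys are skipped rather than reported).
    return {
        key: record_headers[key]
        for key in configured_keys
        if key in record_headers
    }
```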

Logging

FB-27365: Journal feature

The new Diffusion journal allows certain 'actions' to be written out to a log file. Actions that are written contain data about what is being performed, along with which principal performed the action. The journal uses Log4J2 allowing the file output to be configured as required.

Please refer to the user manual for more details on how to configure the journal.

FB-27821: Deprecated Diffusion logging library has been removed

Diffusion uses Log4j2 as its default logging library.

Previous releases included a legacy logging library, which was deprecated in Diffusion 6.4. The legacy library is no longer supported and has been removed from this release.

Python Client

FB-27531: Python Core repo with CBOR and Delta bindings

This new package provides Python bindings for native functionality in the Python Client.

This includes:
1. the CBOR support previously provided by diffusion-cbor
2. Myers-Diff Binary deltas, used for deltas in Diffusion.

At present, we provide binary wheels *only*. We cover all Manylinux platforms, as well as Python 3.7-3.9 on MacOS (10.14-11.1) and Windows.

Other binaries can be built as required, although the supported platforms should cover the vast majority of use cases.

Replication

FB-28074: Improve memory footprint of cluster partition log compaction

The log compaction process has been tuned, significantly reducing the working memory required to handle a series of large messages sent to a replicated topic.

System Monitoring/Statistics

FB-23364: Topic metric grouping by path segments

A topic metric collector can be configured to partition its results into groups based on topic path. If the new "group by path segments" setting is configured to be a positive number, the metrics will be grouped by path prefix. The setting specifies the number of path segments in the prefix. This avoids the need to create and maintain separate metric collectors for each child path. The setting can be changed using the console or the client API.

In the path a/x, the path segments are "a" and "x". A topic metric collector with the topic selector of ?a// will produce a single set of aggregated metrics for the topics with paths starting a/. If the metric collector is altered to set group by path segments to 2, it will produce separate aggregated metrics for the topic with the path a, for topics with paths starting with a/x, for topics with paths starting a/y, and so on.
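
The grouping in this example can be sketched as follows. This is a Python illustration of the path-prefix rule described above, not the server's implementation; group_key and group_paths are hypothetical names.

```python
from collections import defaultdict


def group_key(path, segments):
    """Return the prefix made of the first `segments` path segments."""
    if segments < 1:
        raise ValueError("segments must be a positive number")
    return "/".join(path.split("/")[:segments])


def group_paths(paths, segments):
    """Partition topic paths into groups keyed by path prefix."""
    groups = defaultdict(list)
    for path in paths:
        groups[group_key(path, segments)].append(path)
    return dict(groups)
```

With the paths a, a/x, a/x/1 and a/y/2 and a setting of 2, the sketch yields the groups a, a/x and a/y, matching the description above. A path shorter than the configured number of segments forms its own group.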

See also FB-27330 for a complementary, separate new setting to limit the number of groups created by a metric collector.

FB-27330: Option to limit the number of groups created by a metric collector

A single metric collector can produce many sets of metrics. For example, a session metric collector can be configured to group by $SessionId, which will create a separate set of metrics for every unique session. Similarly, from this release (see FB-23364), a topic metric collector can group by path segments to create a separate set of metrics for every branch of the topic tree having a unique path prefix with the configured number of segments.

A new "maximum groups" setting has been added to both session and topic metric collectors to place an upper limit on the number of groups created. This provides protection against a metric collector creating an arbitrary number of metric sets, potentially impacting system performance. The setting can be changed using the console or the client API.

Topic Views

FB-20654: New 'process' transformation - providing conditionals and calculations

This release introduces a new feature to 'topic views'.

There is now a new 'process' transformation that can be used within a topic view to perform calculations on fields within JSON input and set the results in the output JSON. Conditional processing is also supported, so it is possible to generate reference topics only if certain conditions (based upon the JSON input) are true. Conditions and calculations can be used together, so it is possible to conditionally set fields in the output based upon calculations performed upon the input.

See the SDK documentation or the user manual for full details of how to use the new 'process' transformation.

FB-27870: New getTopicView method in the TopicViews feature

The TopicViews feature of the Client APIs now has a new getTopicView method allowing a single topic view to be retrieved by name.

Fixes in 6.8.0

.NET Client

FB-27630: Disconnection due to "Http fragmentation and extension not supported"

An issue was identified with the .NET client's handling of partial reads. This has now been addressed.

C Client

FB-27527: hash_num_new is using minimum slots instead of maximum slots provided as parameter

hash_num_new now correctly uses the maximum number of slots when creating a hash map.

Console

FB-17776: Console shows fractional users connected

The Diffusion management console previously displayed some metrics in graphs with unnecessary decimal places. Graphs consisting only of integer metrics will no longer have fractional ticks on the Y axis.

FB-27314: Unable to set remote server missing topic notification filter through console

The Diffusion management console did not allow the configuration of a topic notification filter while creating remote servers, functionality which was added to the server in Diffusion 6.7.0. This setting can now be configured through the console user interface.

FB-27371: Topic paths with trailing spaces not handled correctly

In the Diffusion management console, there was no provision made for distinguishing topics whose paths differed in trailing whitespace. Behaviour in such cases has been improved.

FB-27377: Topic view editor discards patch when attempting to edit existing topic view

When using the Diffusion management console to view existing topic views with a JSON patch clause, the console could incorrectly display the topic view without the JSON patch clause. This has been fixed.

FB-27860: Console does not allow connection timeout to be specified

The Diffusion management console did not allow a connection timeout to be specified at login. This option has been added.

FB-27913: Nonsense on license page for commercial license

The Diffusion management console's license page could show some contradictory text when deployed with a commercial license. This has been resolved.

Federation

FB-28042: Remote server connection failure to connect stalls multiplexer

A problem with thread locking could cause multiplexers to stall if secondary remote servers fail to connect. This problem has now been resolved.

FB-28180: Inbound threads can be indefinitely blocked by Remote Server API calls

If a remote server connection was blocking for a long time due to other issues then other calls to the remote server feature (create, remove, check, get) could also block inbound threads indefinitely.
This has now been changed so that such calls will time out if unable to proceed due to locks being held by remote server connections.
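
The general technique of timing out instead of blocking indefinitely on a held lock can be sketched as below. This is a Python illustration using the standard threading module, not the server's code; call_with_lock_timeout is a hypothetical helper.

```python
import threading


def call_with_lock_timeout(lock, action, timeout_seconds):
    """Run action while holding lock, or fail if the lock stays busy."""
    # acquire() returns False if the lock was not obtained in time.
    if not lock.acquire(timeout=timeout_seconds):
        raise TimeoutError("lock not acquired within the timeout")
    try:
        return action()
    finally:
        lock.release()
```

The caller receives an error after a bounded wait, so a long-running holder of the lock cannot stall other callers forever.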

Java & Android Client

FB-27491: Java examples do not build out of the box

It was not possible to build the Java examples with "mvn package" without first adding a dependency for jackson-annotations. This has now been resolved.

Java Client

FB-27783: Memory leak in Java client on multiple reconnections

Repeated reconnections from the Java client could cause a memory leak of session related objects in the client VM. This has now been resolved.

JavaScript Client

FB-24910: Sessions can reconnect even if explicitly closed by another session

An issue has been resolved where the server allowed clients to reconnect during the reconnection timeout, even if they had been explicitly closed by another session. This would only occur if session replication was enabled.

FB-26449: Unresponsive shared session prevents login

When connecting to a shared session, a timeout has been introduced for the case where the SharedWorker is unresponsive.

FB-27510: TypeScript definition for RemoteServerBuilder.missingTopicNotificationFilter doesn't allow null parameter

The documentation of RemoteServerBuilder.missingTopicNotificationFilter states that a null parameter can be used to clear the filter. The TypeScript definition didn't allow null to be passed. Now, the type definition has been updated to allow a null parameter.

FB-27859: Connection timeout not configurable

The JavaScript client was missing an option to specify the connection timeout on establishing a session. This has been rectified.

FB-27884: Authenticator throws 'Cannot read properties of null'

When closing an authenticator, it would throw a "REGISTERED_HANDLER_EXCEPTION TypeError: Cannot read properties of null". This has been fixed.

Kafka Adapter

FB-27187: Time series topic creation does not work in Kafka adapter

Fixed a bug where creating time series Diffusion topics did not work when publishing to Diffusion from Kafka topics.

FB-27260: Editing Diffusion topic related detail in Kafka adapter from console does not work

Fixed a bug where updating the Diffusion publisher service configuration at runtime prevented updates from being published to the updated Diffusion topics.

Persistence

FB-28159: Persistence restore failure due to file corruption restores no topics

There is the possibility for topic persistence files to become corrupt. The most likely cause of this is some resource issue (memory or disk space) at the time of writing which can lead to a truncated file.
Previously, when restarting a server with such corrupt files the restore would be abandoned, files would be moved to the recovery directory, and the server started with no topics restored.
This fix allows the server to proceed with topics restored so far if a file corruption is detected. The faulty files will still be copied to the recovery directory but the current state already read from files will be written back to the persistence directory as a compacted file.
An error will be logged if this occurs, but as file corruption typically occurs at the end of the persistence files then in most cases this will mean that all topic state at the point of failure, except for the very last write, will be restored successfully.
Files copied to recovery are for diagnostic purposes only and should be manually deleted to save space. However, before deletion, they may be sent to Push Technology support for analysis.
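
The recovery behaviour can be illustrated with a sketch that keeps every complete record read before a truncation is detected. The record layout used here (a 4-byte big-endian length followed by the payload) is purely hypothetical and is not the Diffusion persistence file format; restore_records is likewise an illustrative name.

```python
import struct


def restore_records(data):
    """Read length-prefixed records, stopping at the first truncation.

    Returns (records, corrupt): the records read so far and a flag that
    is True if the data ended part-way through a record.
    """
    records = []
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            return records, True  # truncated length prefix
        (length,) = struct.unpack_from(">I", data, offset)
        start = offset + 4
        end = start + length
        if end > len(data):
            return records, True  # truncated payload
        records.append(data[start:end])
        offset = end
    return records, False
```

Because truncation affects only the tail of the data, everything before the last, incomplete write is still returned.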

Python Client

FB-27347: Recursively decode Model-based objects

Fixes an issue where some pydantic.BaseModel-based objects were not being fully decoded from the CBOR. This only affected Session Metrics.

FB-27678: python/client-docs/docs/usage.md embedded code is invalid

Updated API Documentation usage example to reflect breaking changes in the API.

Replication

FB-27283: Correct cluster recovery of time-series topics

In previous releases, due to a coding error, re-distributing time series topic data when servers join and leave a cluster did not scale to large numbers of time series events. This could cause protracted instability whenever the cluster topology changed. The problem has been fixed in this release.

FB-27321: "replicated-topics-restored" start condition does not work

Connectors can be configured not to accept connections until a set of conditions is satisfied.

Due to a coding error in previous releases, the "replicated-topics-restored" condition was never triggered. This has been corrected in this release. The condition is satisfied after a server has joined the cluster and received all of the topic data from existing members of the cluster. The server will log a PUSH-000834 INFO message when this happens.
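
The gating behaviour can be sketched as below. This is a Python illustration, not the server's implementation: ConnectorGate is a hypothetical class, and "example-condition" is a made-up condition name used alongside "replicated-topics-restored".

```python
class ConnectorGate:
    """Accept connections only once every required condition is met."""

    def __init__(self, required_conditions):
        self._pending = set(required_conditions)

    def satisfy(self, condition):
        """Record that a condition has been triggered."""
        self._pending.discard(condition)

    @property
    def accepting(self):
        """True when no required condition is still pending."""
        return not self._pending
```

In this model, a condition that is never triggered keeps the connector closed indefinitely.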

FB-27970: A server joining a stable cluster should not merge topics recovered from file persistence

When servers configured for topic replication first form a cluster, the replicated topics are initialised from the servers' persistent files. Each server that forms the initial cluster is responsible for recovering a proportion of the topics. While the cluster continues to run, persistent files are written but not read again.

Due to a bug in previous releases, a server joining a stable cluster could add topics from its persistent files. This bug has been fixed in this release.

FB-27986: Unnecessary assertion from compaction

Due to a bug in previous releases, if the server was run with assertions enabled (-ea), topic replication could fail due to an assertion error during compaction of the persistence log. The bug has been fixed in this release.

FB-28136: Connecting sessions that time out due to Hazelcast blocking never complete and remain in memory and metrics

When running in a cluster, connecting sessions could time out if Hazelcast never responded during the connection phase. This led to the session remaining in an unconnected state in server memory and still showing in session metrics. This problem has now been resolved so that if the Hazelcast interaction does not complete, the session closes tidily.

FB-28201: Servers in a cluster are unresponsive when loading a large persistence file

When restoring from very large persistence files, replicated servers starting in a Diffusion cluster could slow down to such a degree that they appeared to have stalled completely. This was due to unnecessary delta calculations occurring when restoring the cluster.
This has now been resolved.

FB-28266: A server recovering from a persistent file can corrupt a cluster's replicated topic data

Due to a bug in previous releases, a server joining a cluster and recovering topics from a persistent store could corrupt the topic data in the cluster.

This bug had several symptoms, including inconsistencies between the cluster members' topic trees, and internal failures to apply delta updates. (E.g. PUSH-000843 ... state REPLICA cannot have delta applied by REPLICA).

The bug has been fixed in this release.

Security

FB-27968: Setting READ_TOPIC permissions update could fail with an IllegalArgumentException

Due to a bug in previous releases, applying particular combinations of path permission assignments using the security control API could fail with an IllegalArgumentException. The bug has been fixed in this release.

FB-27972: Concurrency issue could lead to a corrupt permissions index

In previous releases, a bug in the code that creates the internal index of security permissions could leave the index in a corrupt form. The bug has been fixed in this release.

FB-27976: Upgrade log4j2 to address CVE-2021-44228 security vulnerability

The log4j2 logging library used by Diffusion has been upgraded to version 2.15.0. This addresses a critical security bug [CVE-2021-44228] in log4j2. See https://logging.apache.org/log4j/2.x/security.html for details.

FB-28009: Upgrade log4j2 to address CVE-2021-45046 security vulnerability

The log4j2 logging library used by Diffusion has been upgraded to version 2.17.0. This addresses a critical security bug [CVE-2021-45046] in log4j2. See https://logging.apache.org/log4j/2.x/security.html for details.

FB-28053: Changes to the security configuration fail in a cluster if the security store file is read-only

A bug introduced in Diffusion 6.6.0, and present in later releases, corrupted the propagation of security configuration changes across a cluster if the corresponding security store file (SystemAuthentication.store, Security.store) was read-only.

Changing the file permissions so the security store files can be read but not written is supported, and can be useful if a separate mechanism is used to seed security configuration after a cluster is cold-started.

The bug has been fixed in this release. Security store changes are again propagated correctly across the cluster, regardless of whether the security store file is read-only.

Server

FB-27605: Possible leak of sessions (and session metrics) that time out during connection

In certain situations, a client session failing to connect due to a timeout could lead to a memory leak where the server side client object remains. This would also affect metrics as the failed session would remain in the 'open' and 'connected' counts.

This problem has now been resolved.

FB-28247: New subscription inadvertently removed existing subscriptions

Due to a bug in previous releases, the topic selectors maintained by the server for a session could be corrupted by subscription and unsubscription operations. Specifically, the problem could be triggered if a session subscribed to a topic selector with a descendant pattern qualifier ("/", or "//"), for example "?a//", then later redundantly subscribed to a topic selector that is a strict sub-selector of the first one, for example "a/b". The bug could cause another topic selector to be removed in ways that were hard to predict.

The bug has been fixed in this release.

Topic Views

FB-27865: Inserts before patch clauses can cause indeterminate results

It was possible that having an 'insert' clause in a topic view specification before a 'patch' clause could produce indeterminate results and in some situations even lead to orphaned reference topics.
The validation of topic views has now been changed to ensure that any 'insert' clauses happen after 'patch' clauses. A failure will occur when parsing a topic view specification if this is not the case.
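
The ordering rule can be sketched as follows. This is a Python illustration that operates on a plain list of clause names; it is not the topic view parser, and validate_clause_order is a hypothetical helper.

```python
def validate_clause_order(clauses):
    """Raise ValueError if an 'insert' clause precedes a 'patch' clause."""
    seen_insert = False
    for clause in clauses:
        if clause == "insert":
            seen_insert = True
        elif clause == "patch" and seen_insert:
            raise ValueError(
                "'insert' clauses must come after 'patch' clauses"
            )
```

For example, the sequence patch, insert passes the check, while insert, patch is rejected.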

Topics

FB-28170: JSON patch exception message is misleading

The error message given by an applyJSONPatch operation or a patch operation within a topic view could be misleading. The message would read 'failed on operation [1] of [2]' if the second operation failed because it was using the index of the failed operation rather than its number.
This has now been changed so that if the second operation fails then it will read 'failed on operation [2] of [2]'.
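
The difference between a zero-based index and a one-based operation number can be sketched as below. This is a Python illustration only: apply_all and the apply_operation callable are hypothetical, and only the message wording is taken from the note above.

```python
def apply_all(operations, apply_operation):
    """Apply operations in order and describe the first failure, if any."""
    total = len(operations)
    for index, operation in enumerate(operations):
        try:
            apply_operation(operation)
        except ValueError:
            # index is zero-based; the message reports the operation
            # number, which is one-based.
            return f"failed on operation [{index + 1}] of [{total}]"
    return None
```

Reporting index rather than index + 1 reproduces the misleading message described above.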

FB-28309: IllegalArgumentException in TopicTreeNodeImpl

A concurrency bug in previous releases could corrupt topics in the topic tree. One side-effect is that a subsequent attempt to add a topic could fail with an IllegalArgumentException. The bug has been fixed in this release.

Known Issues

Topic Views

DIF-166: Reference topics retained by 'preserve topics' are not persisted across server instances or cluster

A new 'preserve topics' clause was introduced to topic views in release 6.6. This clause means that reference topics created by a view (that have a path dependent upon the source topic value) are retained until the source topic is removed or the topic view is removed. Though this is true in the context of a single server instance, it is not the case if the server is restarted as all such topics created during the previous server instance will be lost. It is also not the case if a new server enters a cluster as the new server will only have reference topics generated from the point in time where it joined the cluster and will not reflect reference topics previously created within other cluster peers.
This issue occurs because reference topics are not persisted, either to file or across the cluster.

DIF-167: Restrictions on mapping single value topics to time series reference topics

The ability to specify a target type in a topic view was introduced in release 6.7.0.
When using this feature to map a single value topic to a time series topic there are the following restrictions.
1) The retained events in the target topic are not replicated across the cluster (as reference topics are not replicated). This means that a new server joining the cluster will not have the same number of retained events as other cluster members. For this reason mapping single value to time series topics should not be used in a clustered environment.
2) Retained events in the target topic are not persisted. Therefore, when a server is restarted, the target time series will initially start with a single event and will only grow as the source topic is updated.

The ability to map single-value topics to time series topics will be withdrawn in a future release.