Diffusion 6.9 Release Notes

6.9.6 (13 March 2024)

Fixes in 6.9.6

Apple Client

DIF-1560: Apple SDK unable to decode 16 Bit Float

The Apple SDK raised "Parsing of type 25 is unhandled." when attempting to parse a 16-bit floating point value. This issue has now been resolved.

Persistence

DIF-1536: Edit events of persisted replicated time series topics are not recovered from file persistence

Due to a bug in previous releases, edit events of replicated time series topics would not be correctly recovered from file persistence. If a cluster was restarted, the edit events and their corresponding original events would be silently discarded. The bug has been fixed in this release.

6.9.5 (26 February 2024)

Fixes in 6.9.5

.NET Client

DIF-1177: Client ignores maximum message size set by user

The .NET client ignored the maximum message size set by the user. This has been fixed.

DIF-1196: GetSessionProperties does not throw NoSuchSessionException

Calling GetSessionProperties for a session that has been closed did not throw NoSuchSessionException. This has been fixed.

Console

DIF-180: Topic views errors do not show the correct column position

The console topic view editor provided incorrect column information when presenting syntax error feedback for topic views.

Server

DIF-1023: Corruption of client queue index can cause unnecessary CPU use

Due to a bug in previous releases, long lived sessions that subscribed to many different topics could perform unnecessary processing. The bug has been fixed in this release.

DIF-917: Server can fail to accept updates at a high rate

Due to a bug in previous releases, a server could fail to accept updates at a high rate (more than tens of thousands of updates a second). The bug has been fixed in this release.

6.9.4 (13 December 2023)

Fixes in 6.9.4

.NET Client

DIF-238: Authentication Control example throws exception before completing

If SetAuthenticationHandlerAsync was called with await, creating a session threw SessionEstablishmentTransientException before the authenticator was called. This has been fixed.

Apple Client

DIF-779: Missing Session State Listener

The Session State Listener was missing from PTDiffusionSessionConfiguration. This issue has been fixed.

DIF-797: Invalid URL string when opening a session in the Apple SDK raises uncaught NSInvalidArgumentException

When an invalid URL string was passed when opening a session, an NSInvalidArgumentException was raised.
In Objective-C the exception can be trapped, but in Swift it cannot, so the uncaught exception crashed the application. This has been fixed.

DIF-847: Reconnection Timeout closes successfully recovered connection

When a successful reconnection occurred, the reconnection timeout was not properly cleared and resulted in a disconnection. This issue has been resolved.

Console

DIF-792: Metrics table can fail to pad collector metrics correctly

When a metrics table is displaying metrics from multiple collectors with different numbers of dimensions, rows for those collectors with fewer dimensions are incorrectly shifted left by one or more columns.

JavaScript Client

DIF-479: Shared session has initialising state immediately after connecting

Shared sessions would return the connected session before updating the state to connected. This could cause a new session to appear to still be in the `initialising` state. The order of internal state updates has been corrected. A new session will now reliably be in the `connected` state.

Python Client

DIF-30: No support for Python 3.10-12

The Python client now supports Python versions 3.8-3.12 on macOS (arm64 and x86_64), Windows and any x86_64 Linux distribution supported by the Manylinux project (https://github.com/pypa/manylinux).

Server

DIF-556: Multiplexers can encounter errors when topics are updated during partition recovery

In a rare situation during cluster reconfiguration an update to a replicated topic could fail to be sent to clients. This has been fixed.

DIF-849: The UNSUBSCRIBE conflation policy does not handle branch mapping subscriptions correctly

Due to a bug in previous releases, the UNSUBSCRIBE conflation policy did not correctly handle branch mapping subscriptions. This has been fixed in this release.

Session Trees

DIF-608: Range queries fail on timeseries topics exposed through session trees

An attempt to perform a range query on a topic exposed by session trees would fail with "Range query failed because topic 'xxx' does not exist.". This has now been resolved so that range queries can be performed on session tree time series topics.

6.9.3 (27 September 2023)

Fixes in 6.9.3

.NET Client

DIF-63: SessionClosedException thrown when not in closed state

If a timeout occurred during reconnection, the client could hang. This has been fixed.
A connection attempt did not time out if there was no response from the server. This has been fixed.
Connection and reconnection failures now transition the session state to CLOSED_FAILED.

Server

DIF-513: Update CAS failure after killing cluster member

When using a cluster with a quorum a Diffusion instance could reject further updates if another instance was restarted. This issue has now been resolved.

DIF-547: NullPointerException in RemoteServerManagerImpl.loadRemoteServers

In release 6.9.2 a NullPointerException was reported in RemoteServerManagerImpl.loadRemoteServers. This effectively prevented the use of remote servers. This has been fixed in this release.

6.9.2 (15 September 2023)

Fixes in 6.9.2

.NET Client

FB-30208: SessionIdFromString method is missing

The .NET client can now create a session ID from a string with the SessionIdFromString method (as in the Java client).

FB-30309: .NET client incorrectly accepts empty string as valid JSON

The .NET client incorrectly accepted empty strings as valid JSON. This has now been fixed.

FB-30739: Unable to connect successfully through a load balancer

Addresses an issue where the .NET client was unable to connect through a GCP (Google Cloud Platform) load balancer.

Apple Client

FB-30273: Frequent crash in Diffusion iOS client library

We have addressed a bug that crashed the Diffusion Apple Client in PTDiffusionRequestStreamRegistry.
This issue is now resolved.

FB-30305: No XCFramework found

The previous implementation of the Swift Package for the Apple Client would result in the error "No XCFramework found". This has now been resolved.

JavaScript Client

DIF-20: Javascript API has no mechanism for setting trusted certificates

Additional options have been added to diffusion.connect, allowing modification of the underlying TLS configuration. The new options are passed directly to tls.createSecureContext().

FB-30472: Incorrect close reason type on SessionPropertiesListener.onSessionClose

The TypeScript definition of SessionPropertiesListener.onSessionClose() did not specify the correct type for the close reason. A new ClientCloseReason type has been introduced to the signature of the callback.

Persistence

FB-30708: Topic persistence operations could fail silently

Fixed an issue that could cause commit of topic operations to fail silently.

Replication

DIF-149: Error while handling a multiplexer event - IllegalStateException: Has already connected

In previous releases, when new servers joined a cluster, a server could log a stack trace with the error message "IllegalStateException: Has already connected". The error was harmless but alarming. From this release, the error message is no longer logged.

FB-30612: If cluster topology changes during topic removal, topics can be left in a state where they can never be removed

If the cluster topology changes, that is, a server joins or leaves the cluster, in-flight operations can fail with a transient CLUSTER_REPARTITION error. Due to a bug introduced in Diffusion 6.5, topic removal operations that fail if the cluster topology changes can leave the topics in an invalid state such that they can never be removed.

The bug has been fixed in this release.

FB-30698: Failure to start connector because readiness conditions not triggered

Fixed an issue that could result in Diffusion instances joining a cluster failing to reach readiness.

FB-30712: Hazelcast Security Vulnerabilities

Due to security vulnerabilities in Hazelcast (detailed in CVE-2022-36437 and CVE-2023-33265), we have upgraded our version from v5.1.2 to v5.1.7.

Security

FB-30279: A path permission assignment for a role would sometimes not override permissions inherited from a parent path

Due to a bug in previous releases, if a security role had path permission assignments for paths X and Y, where X is a parent path of Y, the path permission assignments for Y were sometimes not applied correctly.

The bug has been fixed in this release.

Server

FB-30366: StringIndexOutOfBoundsException in PathSelector.confirmSelects

In certain circumstances, the Diffusion server could report a StringIndexOutOfBoundsException in PathSelector.confirmSelects. This has now been resolved.

FB-30702: Problems when restarting a cluster member with an out of date persistence file

Fixed an issue that could cause unexpected restarts when an instance with an out-of-date persistence file joined a cluster.

Topic Views

FB-30687: Mapping single value topic to time series topic does not respect specified retained range

Using topic views to map a single value topic to a time series topic always assumed the default retained range (10), regardless of what was specified in the with properties clause. This was because the retained range for time series reference topics is always limited to the source retained range so that cross-cluster inconsistencies do not occur. This has now been changed so that, when mapping a single value topic to a time series topic, any specified retained range is respected rather than the default being assumed.
However, there is a separate known issue with this type of mapping: mapping a single value topic to a time series target can lead to inconsistent views of the target topic events across a cluster, so such a mapping is not suitable for use in a clustered environment. Even in a single server environment, note that the retained event range of the target time series topic will not be restored on server restart.

6.9.1 (27 March 2023)

Improvements in 6.9.1

Java & Android Client

FB-30023: Java Client and Diffusion server now use HTTP/1.0 standard headers

The Java client and the Diffusion server now follow HTTP/1.0 standards, including an ASCII space following the colon in HTTP headers. Other Diffusion clients already follow this practice and are unchanged.

Fixes in 6.9.1

.NET Client

FB-30090: NoReconnection does not correctly disable reconnection

Addresses an issue where setting NoReconnection on a .NET session did not disable reconnection. This has now been resolved.

C Client

FB-29808: Session properties listener seg faults when an update event is received

Addresses an issue where a C client application could crash (segfault) if a session close event was received in a session properties listener.

Documentation

FB-29946: Documentation incorrectly states that Topic View 'process' conditionals support 'pointer operator pointer'

The documentation for the topic views 'process' transformation incorrectly suggests that a condition can take the form 'pointer operator [pointer/constant]'. This is incorrect as only 'pointer operator constant' is supported. The documentation has been corrected for this release. The ability to use 'pointer operator pointer' will be included in a future release.

Federation

FB-29734: Remote server connections can block indefinitely

Networking issues between clustered Diffusion instances could cause the server to stall.

This has been fixed in this release.

JavaScript Client

FB-29873: MutableRecordModel.add not working as expected

MutableRecordModel.add was not working as expected and records were not added. This has been fixed.

Replication

FB-29969: Cluster repartitioning during a request to add a topic can cause the request to hang

If cluster repartitioning occurred while a client was adding a topic then instead of the client being notified with an exception the operation would never complete. This has been fixed.

Security

FB-29927: Users with no permissions can view the server logs in the console

Users without adequate permissions were able to view server logs in the Diffusion Management Console. This has now been resolved.

Topic Views

FB-29996: Topic view INSERT can incorrectly insert the value of a topic with DONT_RETAIN_VALUE=true

Due to a bug in previous releases, a topic view INSERT transformation could insert a stale value of a stateless topic (that is, a topic with its DONT_RETAIN_VALUE property set to true). This would happen if the topic view was added after the stateless topic, and there were other topic views with topic selectors that referenced the stateless topic.

The bug has been fixed in this release. An INSERT transformation will never insert the value of a stateless topic.

FB-30000: Reference topics can incorrectly use the stale value of a topic with DONT_RETAIN_VALUE=true

Due to a bug in previous releases, a topic view could create reference topics using a stale value of a stateless topic (that is, a topic with its DONT_RETAIN_VALUE property set to true). This would happen if the topic view was added after the stateless topic, and there were other topic views with topic selectors that referenced the stateless topic.

The bug has been fixed in this release.

Topics

FB-29991: Subscriptions fail to be re-evaluated if topic selections match topics in the source branch of a removed branch mapping

Due to a bug in previous releases, if a subscription to a topic was redirected by a branch mapping, and the branch mapping was removed, the session may not be resubscribed to a topic at the original topic path. The bug has been fixed in this release.

6.9.0 (14 December 2022)

Improvements in 6.9.0

.NET Client

FB-28973: The SendRequestToFilter IFilteredRequestCallback no longer extends IStream

As noted in deprecation notices from release 6.6 the IFilteredRequestCallback interface used in the IMessaging.SendRequestToFilterAsync method no longer inherits from the Callbacks.IStream interface.

Adapters

FB-29528: Bundled Kafka adapter has been removed

The Kafka adapter is not bundled together with the Diffusion server any more. With the release of the Gateway Framework, a new version of the Kafka adapter is available which uses the Framework and is available as a standalone application. See the Gateway framework manual on the website for more detail.

FB-29529: Bundled CDC adapter has been removed

The CDC adapter is not bundled together with the Diffusion server any more. With the release of the Gateway Framework, a new version of the CDC adapter is available which uses the Framework and is available as a standalone application. See the Gateway framework manual on the website for more detail.

C Client

FB-29098: Deprecation of on_handler_error handler from session_listener

The callback function on_handler_error on the session listener has been deprecated and will no longer be called by the SDK. The callback function will be removed in the future.

Client

FB-28680: Initial Connection Retry Strategy

Previously, if a Diffusion client failed to connect to a server, it would have to detect a transient connection exception, and then retry the connection. This led to boilerplate code being necessary for all client applications.
With this release, client applications have the ability to define an initial connection retry strategy which allows the client connection to be automatically retried a number of times, or until it succeeds.
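
The boilerplate that an initial connection retry strategy replaces can be pictured with a minimal, library-independent sketch. The names connect_with_retry, TransientConnectionError, attempts and interval_seconds are hypothetical and are not the Diffusion client API; refer to each SDK's API documentation for the actual retry strategy settings.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientConnectionError(Exception):
    """Hypothetical stand-in for a client's transient connection failure."""


def connect_with_retry(connect: Callable[[], T],
                       attempts: int,
                       interval_seconds: float,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call connect() up to `attempts` times, pausing between failures."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except TransientConnectionError:
            if attempt == attempts:
                raise  # Retries exhausted: surface the last failure.
            sleep(interval_seconds)
    raise AssertionError("unreachable")
```

With a built-in strategy, the application supplies only the attempt count and interval instead of maintaining a loop like this one itself.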

FB-29694: New update stream builder

A new update stream builder has been provided and a new method has been added to the topic update feature to create the builder. Previous createUpdateStream methods have now been deprecated.

FB-29695: TopicUpdate - createUpdateStream methods deprecated

The createUpdateStream methods of the TopicUpdate feature have now been deprecated in favor of the new update stream builder.

Configuration

FB-28206: Logs.xml is deprecated

Logs.xml is deprecated and will be removed in a future release. default-log-directory and console-monitored-log should now be set in Server.xml.

FB-28374: The deprecated date formatting server API has been removed

The server date formatting API and configuration which were deprecated in Diffusion 6.5.0 have been removed.

FB-28549: New persisted-topics-restored start condition

A new readiness condition has been added to connector configuration which can be used to prevent a connector from starting until persistence restore is complete. This is enabled using the persisted-topics-restored condition in start-conditions in Connectors.xml.

Console

FB-24683: Sorting of the metric tables by collector dimension values

The Diffusion management console now sorts displayed metrics for collector subdimensions, such as those created by the grouping properties of metric collectors, by the values of the subdimension.

FB-28283: Basic support for Topic Views 'process' in Console

The Diffusion management console now has an extra tab in the topic view editor for editing optional process clauses.

Federation

FB-26412: Provide ability for Primary servers to initiate Remote Server connections

Previously remote server connections would be initiated from the secondary server, connecting to a primary server (or cluster) with no need for any configuration at the primary server.

At this release, a new mechanism for the connection of remote servers has been introduced which may be used in situations where inbound connections to the back-end (primary) server are not allowed for reasons of security. Using this new mechanism a 'primary initiator' remote server can be configured on the primary server (or cluster) and a 'secondary acceptor' remote server configured on the secondary server (or cluster). The primary server (or a single member of a primary cluster) will initiate a connection to the secondary server (or all servers in the secondary cluster) and the secondary acceptor will accept the connection, establish a secure (SSL) connection over the same socket channel, and authenticate with the primary server.

Once such a connection has been established the behaviour of the secondary acceptor is the same as if it were a secondary initiator.

Unlike secondary initiators (the previous mechanism) which only maintain a physical connection if there are topic views that use them, a primary initiator/secondary acceptor connection is maintained as long as both servers are running. If the connection is lost the primary initiator will periodically retry the connection(s), unlike the previous mechanism where it was the secondary that retried the connection.

Secondary initiators use the reconnection buffer mechanism to recover from brief loss of connection so that reference topics at the secondary are not torn down due to such a loss of connection. The nature of primary to secondary connections means that this is not feasible and as such all reference topics are torn down at the secondary when there is any loss of connection.

Installation

FB-21604: Improved installer experience on macOS

Installer experience has been greatly improved with the creation of a Swift Package that contains the Diffusion xcframework. This can be found at https://github.com/diffusiondata/diffusion-swift.

Java & Android Client

FB-28972: Messaging Feature : sendRequestToFilter FilteredRequestCallback no longer extends Stream

As noted in deprecation notices from release 6.5 the FilteredRequestCallback interface used in the Messaging.sendRequestToFilter method no longer extends the Stream interface.

JavaScript Client

FB-28975: The onError and onClose methods of the sendRequestToFilter FilteredResponseHandler are removed

The onError and onClose methods of the sendRequestToFilter FilteredResponseHandler were deprecated and have been removed.

Licensing

FB-27836: Exceeding licence CPU limit no longer causes multiplexers to reset to 1

Running Diffusion on a machine with more cores than are licensed no longer limits the server to a single multiplexer. Multiplexers are now limited to the number of cores specified in the license or the number of multiplexers specified in the Server.xml, whichever is lower.
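
As a worked illustration of the rule above, the following sketch computes the resulting multiplexer limit. The function name and parameters are hypothetical; the sketch assumes both values are known and does not model what the server does when Server.xml specifies no multiplexer count.

```python
def effective_multiplexers(licensed_cores: int,
                           configured_multiplexers: int) -> int:
    """Return the lower of the licensed core count and the configured count."""
    if licensed_cores < 1 or configured_multiplexers < 1:
        raise ValueError("both values must be positive")
    return min(licensed_cores, configured_multiplexers)


# Licence for 8 cores, Server.xml asks for 16 multiplexers.
limit = effective_multiplexers(licensed_cores=8, configured_multiplexers=16)
```

In this example the limit is 8, where earlier releases would have fallen back to a single multiplexer on a machine with more than 8 cores.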

Logging

FB-28360: Enable session lock logging in the journal

Users can now specify which session properties should be included in the journal logs using the system property: diffusion.journal.session.properties

This can be one of:

`$all`

Includes all session properties.

`$none`

Only default session properties are included i.e. id, principal, connection type. This is the default if not set.

`$fixed`

All fixed session properties are included.

`$user`

All user properties are included.

`$x,y,z`

A list of session properties to be included. This can include $fixed and $user.

`$not,x,y,z`

A list of session properties which should not be included. This can include $fixed and $user. All other properties will be included.

Persistence

FB-29084: Streaming file compaction

Reduced memory overhead of file persistence.

Python Client

FB-24373: New TopicSpecification class

This adds a TopicSpecification class for describing additional properties of a topic other than the (mandatory) Data Type. It is now preferred to use this when specifying a topic type, e.g. for topic creation. However, where a TopicSpecification is now expected, any Diffusion Data Type will be implicitly converted to an appropriate TopicSpecification.

There is full documentation in the updated API docs, and we have endeavoured to make relevant parameters as discoverable as possible for each Data Type.

FB-26152: Time Series

The Python client now supports the "time series" topic type. Time series topics are useful for collaborative applications, for example chat rooms. Multiple users can concurrently update a time series topic.
Each event has a value and associated metadata including a sequence number, a timestamp, and author. New subscribers are sent a configurable window of the latest events, followed by new events as they happen. A separate query API allows events to be retrieved from the log maintained by the topic.

FB-26160: New session state listener

The Python client now provides the Session Listener API offered in other Diffusion clients. This allows the user to add listeners to a session which will be called whenever the state of that session changes.

FB-29351: Session Locks

The API for per-session locks from the other clients has been added to the Python client. See diffusion.session.Session.lock().

FB-29567: Improved Update API

The client API now includes an improved 'update' API which allows clients more flexible options when updating topics. It adds conditional updates, non-exclusive update streams and the creation of missing topics through the update API.

FB-29568: Session Lock Constraint

The Python Client now supports topic update constraints on Session Lock ownership. This instructs the server to enforce ownership of a given lock when updating a topic. This constraint type (LockConstraint) can be accessed via diffusion.features.topics.update.constraint_factory.ConstraintFactory().locked.

Security

FB-28390: Cluster communication optionally supports TLS

Communication between servers in a cluster can now be protected using Transport Layer Security.

To achieve this, configure each server as follows. First ensure the replication connector (specified by the <connector> element in Replication.xml) is configured with a key store. Second, ensure the trust store used by the server contains a certificate chain that matches the certificates used by the servers in the cluster. For example, if you are using the provided sample keystore, edit diffusion.sh to add the following to the command line:

-Djavax.net.ssl.trustStore=${DIFFUSION_HOME}/etc/sample.keystore

If the trust store does not contain an appropriate certificate chain, servers will reject connection attempts and log a PUSH-000081 message for the handshake failure. This will happen repeatedly unless the replication connector is configured to require TLS (that is, the <key-store> configuration has its "mandatory" attribute set to true), otherwise communication between servers will fall back to plain connections.

System Monitoring/Statistics

FB-26337: The Prometheus service now supports OpenMetrics

The Prometheus support has been updated to support OpenMetrics (https://github.com/OpenObservability/OpenMetrics).

From this release, the HTTP service uses the Accept request header to determine whether to use the legacy Prometheus 0.0.4 or the OpenMetrics 1.0.0 text format (https://prometheus.io/docs/instrumenting/exposition_formats/).
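
A scraper can select the format through the Accept header, as in the following minimal sketch. The media type strings are those published in the Prometheus exposition format documentation; the URL is an assumption for illustration and should be replaced with the address of your server's metrics endpoint. The sketch only builds the request and does not send it.

```python
from urllib.request import Request

# Media types from the Prometheus exposition format documentation.
OPENMETRICS = "application/openmetrics-text; version=1.0.0; charset=utf-8"
PROMETHEUS_TEXT = "text/plain; version=0.0.4"


def metrics_request(url: str, openmetrics: bool) -> Request:
    """Build a GET request asking for one of the two text formats."""
    accept = OPENMETRICS if openmetrics else PROMETHEUS_TEXT
    return Request(url, headers={"Accept": accept})


# Assumed endpoint address, for illustration only.
request = metrics_request("http://localhost:8080/metrics", openmetrics=True)
```

Passing the request to urllib.request.urlopen would perform the scrape; the response Content-Type indicates which format the server chose.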

The support has resulted in some minor changes to the metric names to conform to OpenMetrics conventions. These changes are only present when requesting metrics in the OpenMetrics format.

- All counter metrics now have "_total" suffixes.
- The 'diffusion_topics_count' gauge metric has been renamed to 'diffusion_topics_current'.
- The 'diffusion_log_events_count' counter metric has been renamed to 'diffusion_log_events'.
- The 'diffusion_release' gauge metric has been renamed to 'diffusion_release_info'.
- The 'diffusion_license' gauge metric has been renamed to 'diffusion_license_info'.

Separately, a bug (28770) has been corrected that renames the 'diffusion_topics_subscriber_update_message_bytes' counter metric to 'diffusion_topics_subscriber_update_compressed_bytes_total'.

Additional information is available to integrations using the OpenMetrics text format:
- Metrics that represent bytes have "units" metadata.
- The 'diffusion_release' and 'diffusion_license' metrics are of type 'info'.

Each counter metric now has a "created" timestamp, available in both Prometheus and OpenMetrics text formats. This is set when the metric is first created and can be used to reconcile metrics across server restarts.

FB-27485: Metrics Improvements

Several new metrics have been added, covering:
* The number of topic values stored
* The memory overhead relating to each remote server
* The number of bytes used for file persistence

The bytes topic metric no longer double-counts bytes shared between a reference topic and a source topic.

Topic metrics can now be grouped by topic view.

Fixes in 6.9.0

.NET Client

FB-29041: Topic update performance varies significantly across sessions

The performance of topic updates can vary considerably across sessions. This has been fixed.

Apple Client

FB-28402: Apple SDK not available via Swift Package Manager

You can now easily add the Apple Diffusion SDK to your Xcode solution via Swift Package Manager, available at https://github.com/pushtechnology/diffusion-swift.

C Client

FB-28729: OpenSSL Library failure in the C Client for Windows

The C Client for Windows would fail because the embedded OpenSSL library still depended on an external OpenSSL DLL.
A fully self-contained OpenSSL library is now embedded in the C Client for Windows.

FB-29063: C library dependency clashes

Since v6.8.0, the C client library has internally packaged OpenSSL (and other third party libraries since before v6.8.0). As a result of this, applications have been unable to link their own versions of these libraries as it would result in symbol clashes when linking the application. This issue has now been addressed.

FB-29093: Unable to provide session context before session creation

The C Client could not accept a user context as a parameter before session creation. This has now been addressed.

FB-29094: hash_clear not correctly clearing memory allocations

A double free was possible when using the hash_clear function, and then using hash_free on the same HASH_T *. This issue has now been resolved.

FB-29100: APR symbol renaming

Since v6.8.0, the C client library has internally packaged APR libraries. As a result of this, applications have been unable to link their own versions of these libraries as it would result in symbol clashes when linking the application. This issue has now been addressed.

FB-29293: Value stream does not fire on_error when session is closed

An issue was detected where closing a session with attached value streams did not generate an on_error callback indicating the session had been closed. This has been fixed.

Console

FB-28362: Cross site scripting vulnerability in the console logs tab

A remote code execution vulnerability in the server logs tab of the Diffusion Management Console has been resolved at this release.

FB-28403: Gateway endpoints tab not visible if no endpoints defined

If a gateway application supported endpoints but none had as yet been defined in the configuration then the endpoints tab was not made visible in the console and so it was not possible to define new endpoints from the console. This has now been resolved.

FB-28513: Config in service view does not update when services are paused/resumed

In the Diffusion management console, the gateway service configuration view did not always update when services were paused or resumed. This has been resolved.

FB-28536: Remote servers tab does not appear if license contains FANOUT features

If the Diffusion license in use declares FANOUT_SERVER or FANOUT_CLIENT instead of REMOTE_CONNECTIONS the remote servers tab does not appear in the console even though these feature options are logically equivalent.

This has been fixed so that the remote servers tab will appear if any of the named options are present in the license.

Environment

FB-28366: Correct parameterisation of Diffusion start scripts

Due to a packaging error, enhancements to the start scripts made under case 28366 were not released in 6.8.0. This has been rectified. The server start scripts can be customised using the additional environment variables DIFFUSION_EXT_DIR, LOG4J_CONFIGURATION, JVM_LOG_DIR, and EXTRA_JAVA_PARAMETERS. See the explanatory comments in the scripts for more details.

Federation

FB-28343: Newly connecting remote servers can overload the primary server outbound queues

Previously the use of remote topic views could cause a very large number of subscription and initial value notifications to be queued in the primary server outbound queue. If the queue was of an insufficient size to accommodate these notifications the secondary server would be disconnected. The secondary server would then retry the connection causing the same problem to repeat indefinitely.

This problem has now been mitigated by ensuring that the subscription for each distinct remote topic view is executed in series, so each topic view would only start to subscribe when the previous one was complete. This reduces the number of notifications that need to be queued in the primary server at any one time. It is therefore now within the user's control to divide up the subscriptions in remote topic views and thus control the notification rate. It should be noted that any individual remote topic view should still not request a subscription that will request more notifications (up to 2 per topic) than the primary server's outbound queue is able to cope with.
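
The sizing guidance above can be sketched as a simple check. The function names are hypothetical, the queue capacity is assumed to be measured in messages, and the sketch ignores any other traffic sharing the queue, so treat the result as an optimistic upper bound.

```python
NOTIFICATIONS_PER_TOPIC = 2  # Upper bound stated in the note above.


def fits_outbound_queue(topic_count: int, queue_capacity: int) -> bool:
    """True if one remote topic view's notifications fit in the queue."""
    return topic_count * NOTIFICATIONS_PER_TOPIC <= queue_capacity


def max_topics_per_view(queue_capacity: int) -> int:
    """Largest topic count a single remote topic view should select."""
    return queue_capacity // NOTIFICATIONS_PER_TOPIC
```

For example, with a queue capacity of 10,000 messages a single remote topic view selecting 6,000 topics would not fit, whereas two views selecting 3,000 topics each would be processed in series and each would fit.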

Java & Android Client

FB-29536: Client does not mask websocket frames

Implemented section 5.3 of RFC 6455, resolving an issue where intermediate proxies expected client WebSocket frames to have a mask.
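
For reference, the masking transformation defined in section 5.3 of RFC 6455 XORs each payload byte with one byte of a 4-byte masking key. The sketch below illustrates that algorithm only and is not the Diffusion client's implementation; the function names are chosen for this example.

```python
import os


def new_masking_key() -> bytes:
    """Generate a random 4-byte masking key for one frame."""
    return os.urandom(4)


def mask_payload(payload: bytes, masking_key: bytes) -> bytes:
    """XOR byte i of the payload with byte (i mod 4) of the key."""
    if len(masking_key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    return bytes(b ^ masking_key[i % 4] for i, b in enumerate(payload))


# Masked "Hello" example from section 5.7 of RFC 6455.
key = bytes.fromhex("37fa213d")
masked = mask_payload(b"Hello", key)
```

Because XOR is its own inverse, applying mask_payload again with the same key recovers the original payload, which is how the receiving side unmasks a frame.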

JavaScript Client

FB-28930: Principal cannot be null

Previously, when an empty string or null was passed as the principal when connecting, it was treated as undefined resulting in a connection being attempted without principal AND credentials, even if credentials were explicitly set. Now, an empty string is interpreted as a valid principal and a connection will be attempted with credentials in this case.

FB-29071: The promise returned by session.closeSession() will never resolve when closeSession() is called a second time

Previously, the promise returned when calling session.closeSession() on a closed session would never resolve. This has been fixed.

Persistence

FB-27940: Compaction stopped due to "has invalid timestamp for time series topic" error

In certain circumstances, the recovery of persistence files containing time series topics could fail with an error reporting "has invalid timestamp for time series topic", and this in turn could lead to further persistence compaction failures. This issue has now been resolved.

FB-28787: PersistenceException - Failed to remove REMOVE_TOPIC during compaction

A PersistenceException stating "Failed to remove REMOVE_TOPIC" could occur when compacting persistence files. This led to the compaction stopping, which in turn could lead to disk space filling. This is likely due to some previous file corruption. The processing has now been changed so that if a remove operation is found for a topic of which there is no record, the compaction will log an error message and proceed.

FB-28952: Memory Leak : Persistence compaction retains references to removed topic values after removal

A memory leak in persistence file compaction caused references to removed topic values to be retained. This would lead to increasing heap usage which was especially noticeable when using time series topics. This problem has now been resolved.

FB-29572: Persistence restore stops on encountering file inconsistency, leading to loss of topic data

Previously, restore from persistence files would stop on encountering a file inconsistency, possibly due to some file corruption (write errors). This meant that data (topics) after the file inconsistency would be lost. The restore process will now log such file errors and continue to restore the data that follows.

Replication

FB-28253: A server recovering from a persistent file can corrupt a cluster's replicated topic data

Due to a bug in previous releases, a server joining a cluster and recovering topics from a persistent store could corrupt the topic data in the cluster.

This bug had several symptoms, including inconsistencies between the cluster members' topic trees and internal failures to apply delta updates (e.g. PUSH-000843 ... state REPLICA cannot have delta applied by REPLICA).

The bug has been fixed in this release.

FB-29225: Replication data corruption on migration (Hazelcast serialization exception)

In previous releases, due to a replication bug, data shared across a cluster could become corrupted if the cluster topology changed (a server joined or left the cluster). This would be logged to the server log as a HazelcastSerializationException.

The bug has been fixed in this release. HazelcastSerializationExceptions and the associated data corruption should no longer occur.

FB-29241: Removing a non-replicated topic throws a ClusterRepartitionException

Due to bugs in previous releases, a request to remove a single, unreplicated topic could fail with a ClusterRepartitionException. Additionally, a removal request of a single replicated topic path would always return a removed topic count of 1, even if there was no such topic.

Both bugs have been fixed in this release.

Server

FB-28193: NullPointerException in ServerMultiplexerStateImpl when adding topics

Due to a bug in previous releases, an internal subscription index could be corrupted when a session closed. This could cause subscriptions by unrelated sessions to fail, and was apparent from PUSH-000229 or PUSH-000872 messages in the server log. The bug has been fixed in this release.

FB-28538: Failure to update to a topic can be logged twice

In previous releases, a PUSH-000464 message could be logged twice for a topic update failure. The bug has been corrected in this release.

FB-28650: Server deadlock in DefaultTimeoutSupervisor

In previous releases, a server-side deadlock could occur between the DefaultTimeoutSupervisor thread and a thread from the background thread pool. The only remedy was to restart the server.

The bug has been fixed in this release.

FB-28663: NullPointerException in TopicLoadMessage due to null value provided for reference topic

If a primitive (String, Int64, Double) topic was updated to null (removing its value) and the topic was selected as a source for a topic view, this would cause a NullPointerException to be logged in com.pushtechnology.diffusion.multiplexer.server.subscription.TopicLoadMessage.

This has now been resolved and such an update will now be published to clients as the CBOR null value (hex F6).
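For clients that inspect raw bytes, the CBOR null value is a single byte. In CBOR (RFC 8949) it is major type 7 with simple value 22, which gives hex F6. The sketch below only illustrates that encoding; the helper name is hypothetical and is not part of any Diffusion client API.

```python
# CBOR initial byte = (major type << 5) | additional information.
# null is major type 7, simple value 22.
CBOR_MAJOR_TYPE_SIMPLE = 7
CBOR_SIMPLE_NULL = 22

CBOR_NULL = bytes([(CBOR_MAJOR_TYPE_SIMPLE << 5) | CBOR_SIMPLE_NULL])


def is_cbor_null(encoded: bytes) -> bool:
    """Return True if the encoded value is exactly the one-byte CBOR null."""
    return encoded == CBOR_NULL


null_hex = CBOR_NULL.hex().upper()
```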

FB-28798: NegativeArraySizeException in IBytesOutputStreamImpl

A NegativeArraySizeException in IBytesOutputStreamImpl could occur if a partition was being migrated in a clustered environment and one or more topics within the partition were exceptionally large (most likely time series topics with a large number of events).

The size of data that can be accommodated in the partition log has now been doubled, so this is far less likely to happen. However, the partition log size is currently limited by the size of a Java integer.

In the future, the partition log capability will be extended to accommodate such very large values. However, in the meantime, users must be aware that very large values in a single topic could possibly lead to such problems.

FB-28896: Removal of a single replicated topic could silently fail

Due to a bug in previous releases, an API operation to remove a single topic could erroneously report success when topic removal failed due to the cluster repartitioning. The bug has been fixed in this release.

FB-28916: Valid JSON patch can be rejected

Due to a bug in previous releases, the server could incorrectly reject a valid JSON patch. The bug has been fixed in this release.

FB-29028: Internal server IP addresses exposed to clients in error messages

Internal server IP addresses were being exposed to clients in some error messages. Such IP addresses are now obfuscated in error messages.

FB-29682: Multiplexer stuck processing selection of a missing topic notification handler

Due to a concurrency bug in previous releases, the server could stall while trying to select a control session to handle a request.

The bug has been fixed in this release.

System Monitoring/Statistics

FB-28770: Correct Prometheus metric name to diffusion_topics_subscriber_update_compressed_bytes

In previous releases, the subscriber_updated_compressed_bytes metric was exported to Prometheus under an incorrect name ("diffusion_topics_subscriber_update_message_bytes"). From this release, the Prometheus metric name has been corrected to "diffusion_subscriber_updated_compressed_bytes".

FB-29031: Prometheus output contains redundant HELP and TYPE information

Due to a bug in previous releases, the Prometheus HTTP gateway would produce repeated HELP and TYPE lines for each unique combination of dimension labels a metric had. This information was redundant, and caused problems for downstream tools such as New Relic.

The bug has been resolved in this release. Each metric now has a single HELP and TYPE line, regardless of the number of dimensions.
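The corrected shape of the output can be sketched as follows: one HELP line and one TYPE line per metric, followed by one sample line per combination of dimension labels. This is an illustrative renderer for the Prometheus text exposition format, not the server's implementation; the metric name, help text, and labels are hypothetical.

```python
def render_metric(name, help_text, metric_type, samples):
    """Render one metric: a single HELP/TYPE header, then one line per label set."""
    lines = [
        f"# HELP {name} {help_text}",
        f"# TYPE {name} {metric_type}",
    ]
    for labels, value in samples:
        # Sort label names so the output is stable between scrapes.
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric with two combinations of dimension labels.
output = render_metric(
    "example_subscriber_count",
    "Number of subscribers (illustrative).",
    "gauge",
    [
        ({"topic_type": "JSON", "server": "a"}, 12),
        ({"topic_type": "STRING", "server": "a"}, 3),
    ],
)
```

However many label combinations are passed in, the header lines are emitted once.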

Topics

FB-28270: IllegalArgumentException in TopicTreeNodeImpl

A concurrency bug in previous releases could corrupt topics in the topic tree. One side-effect is that a subsequent attempt to add a topic could fail with an IllegalArgumentException. The bug has been fixed in this release.

FB-28939: JSON patch can't be applied to a newly recovered topic

Due to a bug in previous releases, a JSON patch couldn't be applied to topics newly recovered from the cluster or file persistence. Attempting to do so would fail with an IncompatibleTopicStateException error with a message such as "state FILE cannot have delta applied by APPLICATION".

The bug has been fixed in this release.

Known Issues

Topic Views

DIF-166: Reference topics retained by 'preserve topics' are not persisted across server instances or cluster

A new 'preserve topics' clause was introduced to topic views in release 6.6. This clause means that reference topics created by a view (that have a path dependent upon the source topic value) are retained until the source topic is removed or the topic view is removed. Though this is true in the context of a single server instance, it is not the case if the server is restarted, as all such topics created during the previous server instance will be lost. It is also not the case if a new server enters a cluster, as the new server will only have reference topics generated from the point in time at which it joined the cluster and will not reflect reference topics previously created within other cluster peers.
This issue occurs because reference topics are not persisted, either to file or across the cluster.

DIF-167: Restrictions on mapping single value topics to time series reference topics

The ability to specify a target type in a topic view was introduced in release 6.7.0.
When using this feature to map a single value topic to a time series topic, the following restrictions apply:
1) The retained events in the target topic are not replicated across the cluster (as reference topics are not replicated). This means that a new server joining the cluster will not have the same number of retained events as other cluster members. For this reason, mapping single value topics to time series topics should not be used in a clustered environment.
2) Retained events in the target topic are not persisted. Therefore, when a server is restarted, the target time series will initially start with a single event and will only grow as the source topic is updated.

The ability to map single-value topics to time series topics will be withdrawn in a future release.