Support TCP for protocol messages #3636

Open
softins wants to merge 41 commits into jamulussoftware:main from softins:tcp-protocol

softins commented Mar 11, 2026

Short description of changes

Support fallback to TCP for protocol messages, in order to overcome potential loss of large messages due to UDP fragmentation. Currently an incomplete draft, for comment as development continues.

CHANGELOG: Client/Server: Support TCP fallback for protocol messages.

Context: Fixes an issue?

Discussed in issue #3242.

Does this change need documentation? What needs to be documented and how?

It will need documentation once design and development are complete. In particular, the firewall requirements for a server or directory need to be explained.

Status of this Pull Request

Complete and ready for review and testing (earlier status: incomplete, with the main server side complete and working and client-side development in progress). Still marked draft as it needs some of the debug messages to be commented out before merging.

What is missing until this pull request can be merged?

A lot of testing of both server and client. Intended for Jamulus 4.0.0.

Checklist

  • I've verified that this Pull Request follows the general code principles
  • I tested my code and it does what I want
  • My code follows the style guide
  • I waited some time after this Pull Request was opened and all GitHub checks completed without errors.
  • I've filled all the content above

softins added this to the Release 4.0.0 milestone Mar 11, 2026
softins self-assigned this Mar 11, 2026

softins commented Mar 11, 2026

So far, this implements the server side of the design described here and here

softins force-pushed the tcp-protocol branch 4 times, most recently from 5e1a658 to 0ae51e2, March 16, 2026 13:05
softins linked an issue Mar 16, 2026 that may be closed by this pull request
softins added the feature request label Mar 16, 2026
softins force-pushed the tcp-protocol branch 3 times, most recently from 7ad1d1f to d939e5b, March 26, 2026 17:38

softins commented Mar 28, 2026

So the next stage of implementation has been achieved: client-side support in the Connect dialog.

  1. If the server list has not been received via UDP when the associated message indicating TCP support has arrived, the client will retry fetching the server list over TCP.
  2. If the client list for a server has not been received via UDP when the associated message indicating TCP support has arrived, the client will retry fetching the client list over TCP, and will continue to use TCP for that server while the Connect dialog is open.
  3. A directory or server that does not have TCP support will not send the TCP supported message, and will continue to be handled as in current versions.
  4. If the server list or client list is successfully received over UDP, there is no need for the client to try TCP.
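
The four rules above can be sketched as a small per-request state holder. This is only an illustration of the decision logic as described in this comment, not code from the PR: the struct and method names are hypothetical, and it ignores timeouts and the case where the UDP list arrives late, after the TCP retry has started.

```cpp
// Hypothetical sketch of the Connect dialog fallback rules; names are
// illustrative and do not come from the Jamulus source.
struct ListFetchState
{
    bool bListReceived = false; // the expected list has arrived over UDP
    bool bUseTcp       = false; // rule 2: stays set for this peer while the dialog is open

    // Rule 4: a list that arrives over UDP means TCP is not needed.
    void OnListReceivedViaUdp() { bListReceived = true; }

    // Called when the "TCP supported" message arrives. A peer without TCP
    // support never sends it (rule 3), so this is then never called.
    // Returns true if the caller should now retry the request over TCP.
    bool OnTcpSupportedReceived()
    {
        if ( bListReceived )
        {
            return false; // list already here, nothing to retry
        }

        bUseTcp = true; // rules 1 and 2: the list was probably lost
        return true;
    }
};
```

The key point is the ordering: the "TCP supported" message is only useful as a loss indicator because it is sent right after the list, so a missing list at that moment is treated as lost.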

It has been tested by using nft to drop outbound Jamulus UDP messages with a specific message ID, to simulate loss due to fragmentation.

Examples for a directory-enabled server running on port 22120:

  • drop UDP server list: nft add rule inet filter output udp sport 22120 @ih,16,16 0xee03 drop
  • drop UDP client list: nft add rule inet filter output udp sport 22120 @ih,16,16 0xf503 drop
  • drop UDP "TCP supported" msg: nft add rule inet filter output udp sport 22120 @ih,16,16 0xfb03 drop

Note that nft rules require network byte order (big-endian), but Jamulus IDs are little-endian:

  • CLM_SERVER_LIST = 1006 = 0x03ee => 0xee03 (LE byte order)
  • CLM_RED_SERVER_LIST = 1018 = 0x03fa => 0xfa03 (LE byte order)
  • CLM_CONN_CLIENTS_LIST = 1013 = 0x03f5 => 0xf503 (LE byte order)
  • CLM_TCP_SUPPORTED = 1019 = 0x03fb => 0xfb03 (LE byte order)
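
For anyone wanting to derive the match value for another message ID, the conversion is just a 16-bit byte swap. A minimal sketch follows; the helper names are made up for illustration and are not part of Jamulus or nft. It assumes, as the rules above do, that the 16 bits at payload bit offset 16 hold the message ID.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Swap the two bytes of a 16-bit value, e.g. 0x03ee -> 0xee03.
constexpr std::uint16_t SwapBytes16 ( std::uint16_t v )
{
    return static_cast<std::uint16_t> ( ( ( v & 0x00ffu ) << 8 ) | ( ( v >> 8 ) & 0x00ffu ) );
}

// Compile-time check against the values listed above.
static_assert ( SwapBytes16 ( 1006 ) == 0xee03, "CLM_SERVER_LIST" );
static_assert ( SwapBytes16 ( 1019 ) == 0xfb03, "CLM_TCP_SUPPORTED" );

// Format the value to use after "@ih,16,16" in an nft rule.
// (Hypothetical helper, for illustration only.)
std::string NftMatchValue ( std::uint16_t iMessageId )
{
    char buf[8]; // "0x" + 4 hex digits + terminating NUL fits in 8 bytes
    std::snprintf ( buf, sizeof ( buf ), "0x%04x", static_cast<unsigned> ( SwapBytes16 ( iMessageId ) ) );
    return std::string ( buf );
}
```

The swap is needed because the ID goes on the wire low byte first (ee 03 for 1006), while nft interprets the same two bytes as one big-endian number.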

softins commented Mar 28, 2026

The next step is to try implementing the connected-mode TCP described here

ann0see self-requested a review April 7, 2026 14:51
Comment thread src/tcpserver.h
Comment thread src/main.cpp
bool bUseTranslation = true;
bool bCustomPortNumberGiven = false;
bool bEnableIPv6 = false;
bool bEnableTcp = false;
Member

Since we'll have a long time for the 4.0 release, I'd enable it by default soon (of course once we've tested that the basics work)

Member Author

No, I disagree. It's a server-only option, and most server operators will not need to enable TCP support. Only those running large directories or large servers will need to, and they also need to understand and configure their firewall requirements.

TCP support in the client will indeed be enabled by default, but will only take effect when talking to a directory or server that has enabled it.

If a server operator enables TCP without having configured their firewall correctly, client users could have problems as the server would advertise TCP support to the client, but the client could be unable to connect.

Member

Can we not give an error message or fallback procedure in case the TCP connection timed out?

Member Author

Yes, I'm sure we can. I haven't yet tested that scenario.

But it doesn't negate my view that server-side TCP support needs to be an explicit option.

Comment thread src/connectdlg.cpp Outdated
Comment thread src/connectdlg.cpp Outdated
ann0see added the bug label Apr 9, 2026
github-project-automation (bot) moved this to Triage in Tracking Apr 9, 2026
ann0see moved this from Triage to In Progress in Tracking Apr 9, 2026

softins commented Apr 9, 2026

Well I've finished implementing everything I intended to, for directory, server and client, so it's ready for reviewing and trying out, as and when time permits (post 3.12.0).

I have a private directory and server built and running with TCP support, at newjam.softins.co.uk on the standard port 22124.

In order to demonstrate the use of TCP in a new client's connect dialog, it will be necessary to use custom firewall filters on the client end to temporarily drop incoming UDP Jamulus protocol messages containing a server list or connected clients list.

There is full forward and backward compatibility between clients and servers built with TCP support and older versions.

softins marked this pull request as ready for review April 9, 2026 22:48
softins marked this pull request as draft April 10, 2026 06:30

softins commented Apr 10, 2026

Keeping as draft, because it will need quite a few debug messages removed before merging.

softins commented May 25, 2026

This is rebased and ready for thorough review now. A few comments:

  • I know there are a lot of commits. I plan to squash to a lot fewer (maybe even one?) before merging, but I thought it might be useful to get the granular view while reviewing.
  • The TCP connections for fetching server lists from directories, and client lists from servers, are only used where UDP has failed. This is only likely to happen with very large lists, and most operators of small directories and servers will never need to specify --enabletcp.
  • A server or directory operator who does specify --enabletcp must also allow the host's firewall to accept TCP connections on the same port as for UDP.
  • The client always has TCP support enabled, and will only try to use TCP when it is enabled on the server.
    • When fetching server or client lists in the Connect Dialog, if the client detects that a list message has probably been lost, it will use a short-lived TCP connection for each request and then close it.
    • When the client establishes an audio connection to a TCP-enabled server, it will establish a long-lived TCP connection, so that the server can reliably send updated client lists to the client during the session. It will close the TCP connection on disconnect.
  • In order to simulate UDP message loss for testing, I have created a repo of test scripts and documentation at https://github.com/softins/jamulus-tcp-tests. As they will likely only be used during review, I didn't want to include them in the main repo, but am happy to do so if considered useful.
  • The Wireshark Jamulus dissector understands the CLM_TCP_SUPPORTED message, but does not yet support dissecting Jamulus messages sent over TCP.
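
To make the short-lived versus long-lived distinction above concrete, here is a minimal sketch of the choice the client makes. It only restates the bullets in code form: the enum and function names are hypothetical and are not taken from the PR.

```cpp
// Hypothetical sketch; these types and names are not from the Jamulus source.
enum class ETcpUse
{
    None,       // stay on UDP only
    ShortLived, // one TCP connection per list request, then close
    LongLived   // keep TCP open for the whole audio session
};

enum class EClientContext
{
    ConnectDialog, // fetching server or client lists
    AudioSession   // connected to a server for audio
};

ETcpUse ChooseTcpUse ( EClientContext eContext, bool bServerTcpEnabled, bool bListProbablyLost )
{
    // The client only ever uses TCP when the server has enabled it.
    if ( !bServerTcpEnabled )
    {
        return ETcpUse::None;
    }

    // During an audio session the connection stays open so that updated
    // client lists can be delivered reliably; it is closed on disconnect.
    if ( eContext == EClientContext::AudioSession )
    {
        return ETcpUse::LongLived;
    }

    // In the Connect dialog, TCP is a fallback for a list that was probably lost.
    return bListProbablyLost ? ETcpUse::ShortLived : ETcpUse::None;
}
```

With a server that has not been started with --enabletcp, the function always returns None, which matches the behaviour of current versions.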

pljones commented May 25, 2026

I'd rather wait now until #3710 is completed, so we can include this new feature in the message being proposed there.

Other than that, I can't see anything wrong with the code: that doesn't mean I fully understand it :).

softins commented May 25, 2026

> I'd rather wait now until #3710 is completed, so we can include this new feature in the message being proposed there.

I disagree. The CLM_TCP_SUPPORTED message is not just to let the client know we support TCP. It also enables the client to know that it didn't receive the expected message via UDP that was sent immediately before it.

(The point is that the client only uses TCP when necessary, not every time in any case)

So I've no problem with using #3710 purely for information that the server supports TCP connection, but that can't replace the CLM_TCP_SUPPORTED mechanism.

So this PR can, or maybe even should, precede #3710, and the latter can then include the fact the feature is enabled in its own information.

pljones commented May 25, 2026

> The client always has TCP support enabled, and will only try to use TCP when it is enabled on the server.
>
>   • When fetching server or client lists in the Connect Dialog, if the client detects that a list message has probably been lost, it will use a short-lived TCP connection for each request and then close it.
>   • When the client establishes an audio connection to a TCP-enabled server, it will establish a long-lived TCP connection, so that the server can reliably send updated client lists to the client during the session. It will close the TCP connection on disconnect.

OK, I guess I don't follow this. Somewhere (docs/tcp.md or something), could you add an explanation with a flow diagram of how the different scenarios will work, showing the protocol messages exchanged and comparing the current flow with the proposed new flow?

(As you've got new files, I'd like them in after the AGPL 3.0+ change, too, so you can be first to put just the AGPL 3.0+ banner on a file to show how it's done.)

Labels

  • bug: Something isn't working
  • feature request: Feature request
  • needs documentation: PRs requiring documentation changes or additions

Projects

Status: Waiting on Team

Development

Successfully merging this pull request may close these issues.

Support TCP for protocol messages

3 participants