Enables us to use `test.WithAllDatabases` when testing
internal HTTP APIs, as this would otherwise result in Prometheus
complaining about already registered metric names.
Adds wakeup broadcast handling to the pinecone demos.
On receiving a wakeup broadcast from a peer, this resets that peer's
blacklist status and interrupts any federation queue backoff currently in
progress for it.
The end result is that any queued events are sent to the peer quickly,
even if it had disconnected while we were attempting to send events to it.
Adds `PUT
/_matrix/client/v3/directory/list/appservice/{networkId}/{roomId}` and
`DELETE
/_matrix/client/v3/directory/list/appservice/{networkId}/{roomId}`
support, as well as the ability to filter `/publicRooms` by network ID
or to include all networks.
This is a refactor of the federation destination queues.
It fixes a few things, namely:
- actually retry outgoing events with backoff behaviour
- obtain enough events from the database to fill messages as much as
possible
- minimize the number of running goroutines
- use pure timers for backoff
- don't restart queue unless necessary
- close the background task when backing off
- increase the maximum number of EDUs in a transaction to match the spec
- clean up timers more aggressively to reduce memory usage
- add jitter to backoff timers to reduce resource spikes
- add a bunch of tests (with real and fake databases) to ensure
everything is working
This fixes some edge cases where federation queue backoffs and
blacklisting weren't behaving as expected.
It also adds new tests for the federation queues to ensure they
continue to behave correctly.
This ensures that the joined hosts in the federation API are correct
after the state is rewritten. This might fix some races around the time
of joining federated rooms.
If the private key file is lost, it's often possible to retrieve the
public key from another server elsewhere, so we should make it possible
to configure it in that way.
Some tweaks for the send-to-device consumers/producers:
- use `json.RawMessage` without marshalling it first
- try further devices (if available) if we failed to `PublishMsg` in the
producers
- some logging changes (to better debug E2EE issues)
We were calling `json.Unmarshal` on the EDU and then `json.Marshal` right
before sending the EDU to the stream. Both calls are now removed and the
consumer calls `json.Unmarshal` once.
`If a device list update goes missing, the server resyncs on the next
one` was failing because a previous test would receive a `waitTime` of
1h, resulting in the test timing out.
This now handles the returned errors differently, e.g. by using
the default `waitTime` of 2s. It also stops trying further users in the
list if one of the errors would cause a longer `waitTime`.
This makes the following changes:
* The various `Defaults` functions are now responsible for setting sane defaults if `generate` is specified, rather than hiding them in `generate-config`
* Some configuration options have been marked as `omitempty` so that they don't appear in generated configs unnecessarily (monolith-specific vs. polylith-specific options)
* A new option `-polylith` has been added to `generate-config` to create a config that makes sense for polylith deployments (i.e. including the internal/external API listeners and per-component database sections)
* A new option `-normalise` has been added to `generate-config` to take an existing file and add any missing options and/or defaults
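To show the effect of `omitempty` on generated output, here is a minimal sketch. It uses `encoding/json` from the standard library as a stand-in for the project's actual config encoder, and the struct names, field names and listener address are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// listener is a hypothetical polylith-only section.
type listener struct {
	Listen string `json:"listen,omitempty"`
}

// config is a hypothetical, heavily reduced configuration struct.
type config struct {
	ServerName  string    `json:"server_name"`
	InternalAPI *listener `json:"internal_api,omitempty"` // omitted when nil
}

// generateConfig fills in defaults; the polylith-only section is populated
// only when polylith is true, so it is absent from monolith output.
func generateConfig(polylith bool) ([]byte, error) {
	c := config{ServerName: "example.org"}
	if polylith {
		c.InternalAPI = &listener{Listen: "http://localhost:7771"}
	}
	return json.Marshal(c)
}

func main() {
	for _, polylith := range []bool{false, true} {
		out, err := generateConfig(polylith)
		if err != nil {
			panic(err)
		}
		fmt.Printf("polylith=%t: %s\n", polylith, out)
	}
}
```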