See issue: [#2718](https://github.com/matrix-org/dendrite/issues/2718)
for more details.
The fix assumes that if the number of transaction items are different,
then the txnid should be different.
txnid := OriginalServerTS()_len(transactions)
The case that it doesn't address is if the txnid generated this way is
the same for 2 different batches of events which have the same
OriginalServerTS and the same array length.
Another option:
txnid := OriginalServerTS()_hash(transactions)
Would love to hear other ideas and ways to fix this.
### Pull Request Checklist
* [x ] I have added added tests for PR _or_ I have justified why this PR
doesn't need tests.
* [x ] Pull request includes a [sign
off](https://github.com/matrix-org/dendrite/blob/main/docs/CONTRIBUTING.md#sign-off)
Signed-off-by: `Tak Wai Wong <tak@hntlabs.com>`
Co-authored-by: Tak Wai Wong <tak@hntlabs.com>
This should hopefully fix an entire class of problems where components
downstream from the roomserver (i.e. the sync API) could just lose a
whole bunch of state after a rewrite operation like a federated join.
The root of the bug is that we set `RewritesState` in the output event
which instructs downstream components to purge their copy of any room
state, but then didn't send the entire state snapshot in
`adds_state_event_ids` so the downstream state ends up being incomplete
as a result.
Previously `LoadMembershipAtEvent` would fail if the state before one of
the events was not known, i.e. because it was an outlier. This modifies
it so that it gracefully handles not knowing the state and returns no
memberships instead, so that history visibility doesn't freak out and
kill `/sync` requests dead.
Some tweaks for the send-to-device consumers/producers:
- use `json.RawMessage` without marshalling it first
- try further devices (if available) if we failed to `PublishMsg` in the
producers
- some logging changes (to better debug E2EE issues)
This should avoid unnecessary logging on startup if the migration (were
we need `InsertMigration`) was already executed.
This now checks for "unique constraint errors" for SQLite and Postgres
and fails the startup process if the migration couldn't be manually
inserted for some other reason.
This changes the detection of already executed migrations for the
roomserver state block and keychange refactor. It now uses schema tables
provided by the database engine to check if the column was already
removed. We now also store the migration in the migrations table.
This should stop e.g. Postgres from logging errors like `ERROR: column
"event_nid" does not exist at character 8`.
This PR
- adds tests for `evaluatePushrules`
- removes the need for the UserAPI on the `OutputStreamEventConsumer`
(for easier testing)
- adds a method to get the pushrules from the database
- adds a new default pushrule for `m.reaction` events (and some other
tweaks)
This adds the main component of the fulltext search.
This PR doesn't do anything yet, besides creating an empty fulltextindex
folder if enabled. Indexing events is done in a separate PR.
We were `json.Unmarshal`ing the EDU and `json.Marshal`ing right before
sending the EDU to the stream. Those are now removed and the consumer
does `json.Unmarshal` once.
`If a device list update goes missing, the server resyncs on the next
one` was failing because a previous test would receive a `waitTime` of
1h, resulting in the test timing out.
This now tries to handle the returned errors differently, e.g. by using
the default `waitTime` of 2s. Also doesn't try further users in the
list, if one of the errors would cause a longer `waitTime`.