Commit Graph

35 Commits

Author SHA1 Message Date
Till
eb29a31550
Optimize /sync and history visibility (#2961)
Should fix the following issues or make a lot less worse when using
Postgres:

The main issue behind #2911: The client gives up after a certain time,
causing a cascade of context errors, because the response couldn't be
built up fast enough. This mostly happens on accounts with many rooms,
due to the inefficient way we're getting recent events and current state

For #2777: The queries for getting the membership events for history
visibility were being executed for each room (I think 185?), resulting
in a whooping 2k queries for membership events. (Getting the
statesnapshot -> block nids -> actual wanted membership event)

Both should now be better by:
- Using a LATERAL join to get all recent events for all joined rooms in
one go (TODO: maybe do the same for room summary and current state etc)
- If we're lazy loading on initial syncs, we're now not getting the
whole current state, just to drop the majority of it because we're lazy
loading members - we add a filter to exclude membership events on the
first call to `CurrentState`.
- Using an optimized query to get the membership events needed to
calculate history visibility

---------

Co-authored-by: kegsay <kegan@matrix.org>
2023-02-07 14:31:23 +01:00
Till
0d0280cf5f
/sync performance optimizations (#2927)
Since #2849 there is no limit for the current state we fetch to
calculate history visibility. In large rooms this can cause us to fetch
thousands of membership events we don't really care about.
This now only gets the state event types and senders in our timeline,
which should significantly reduce the amount of events we fetch from the
database.

Also removes `MaxTopologicalPosition`, as it is an unnecessary DB call,
given we use the result in `topological_position < $1` calls.
2023-01-17 10:08:23 +01:00
Till
0491a8e343
Fix room summary returning wrong heroes (#2930)
This should fix #2910.
Probably makes Sytest/Complement a bit upset, since this not using
`sort.Strings` anymore.
2023-01-12 10:06:03 +01:00
Neil Alexander
6348486a13
Transactional isolation for /sync (#2745)
This should transactional snapshot isolation for `/sync` etc requests.

For now we don't use repeatable read due to some odd test failures with
invites.
2022-09-30 12:48:10 +01:00
Till
05cafbd197
Implement history visibility on /messages, /context, /sync (#2511)
* Add possibility to set history_visibility and user AccountType

* Add new DB queries

* Add actual history_visibility changes for /messages

* Add passing tests

* Extract check function

* Cleanup

* Cleanup

* Fix build on 386

* Move ApplyHistoryVisibilityFilter to internal

* Move queries to topology table

* Add filtering to /sync and /context
Some cleanup

* Add passing tests; Remove failing tests :(

* Re-add passing tests

* Move filtering to own function to avoid duplication

* Re-add passing test

* Use newly added GMSL HistoryVisibility

* Update gomatrixserverlib

* Set the visibility when creating events

* Default to shared history visibility

* Remove unused query

* Update history visibility checks to use gmsl
Update tests

* Remove unused statement

* Update migrations to set "correct" history visibility

* Add method to fetch the membership at a given event

* Tweaks and logging

* Use actual internal rsAPI, default to shared visibility in tests

* Revert "Move queries to topology table"

This reverts commit 4f0d41be9c194a46379796435ce73e79203edbd6.

* Remove noise/unneeded code

* More cleanup

* Try to optimize database requests

* Fix imports

* PR peview fixes/changes

* Move setting history visibility to own migration, be more restrictive

* Fix unit tests

* Lint

* Fix missing entries

* Tweaks for incremental syncs

* Adapt generic changes

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
Co-authored-by: kegsay <kegan@matrix.org>
2022-08-11 18:23:35 +02:00
Till
df5d4dc7a3
Delete correct Send-to-Device messages (#2608)
* Add send-to-device tests

* Update tests, fix message deletion

* PR comments
2022-08-02 17:00:16 +02:00
Till
a7e92f8cb9
History visibility database changes (#2533)
* Add new history_visibility column

* Update SQL queries to include history_visibility

* Store the history visibilty calculated by the roomserver

* Update GMSL

* Update migrations

* Fix migration

* Update GMSL

* Fix `go.sum`

* Update GMSL to use sql.Scanner & sql.Valuer

* Re-order migration/table creation

* Update gomatrixserverlib

* Add history_visibility column to current_room_state

* Fix migrations

* Return error instead of Fatal log

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-07-18 14:46:15 +02:00
Till
09f0ff14c8
Minor SendToDevice fix (#2565)
* Avoid unnecessary marshalling if sending to the local server

* Fix ordering of ToDevice messages

* Revive SendToDevice test
2022-07-12 08:23:58 +02:00
kegsay
6de29c1cd2
bugfix: E2EE device keys could sometimes not be sent to remote servers (#2466)
* Fix flakey sytest 'Local device key changes get to remote servers'

* Debug logs

* Remove internal/test and use /test only

Remove a lot of ancient code too.

* Use FederationRoomserverAPI in more places

* Use more interfaces in federationapi; begin adding regression test

* Linting

* Add regression test

* Unbreak tests

* ALL THE LOGS

* Fix a race condition which could cause events to not be sent to servers

If a new room event which rewrites state arrives, we remove all joined hosts
then re-calculate them. This wasn't done in a transaction so for a brief period
we would have no joined hosts. During this interim, key change events which arrive
would not be sent to destination servers. This would sporadically fail on sytest.

* Unbreak new tests

* Linting
2022-05-17 13:23:35 +01:00
Neil Alexander
4ad5f9c982
Global database connection pool (for monolith mode) (#2411)
* Allow monolith components to share a single database pool

* Don't yell about missing connection strings

* Rename field

* Setup tweaks

* Fix panic

* Improve configuration checks

* Update config

* Fix lint errors

* Update comments
2022-05-03 16:35:06 +01:00
Till
29f2168789
Make /messages filterable (#2347)
* Make /messages filterable
Fix bug when determining if an event contains an URL

* Add newly passing test

* Fix test
2022-04-13 13:16:02 +02:00
kegsay
6d25bd6ca5
syncapi: add more tests; fix more bugs (#2338)
* syncapi: add more tests; fix more bugs

bugfixes:
 - The postgres impl of TopologyTable.SelectEventIDsInRange did not use the provided txn
 - The postgres impl of EventsTable.SelectEvents did not preserve the ordering of the input event IDs in the output events slice
 - The sqlite impl of EventsTable.SelectEvents did not use a bulk `IN ($1)` query.

Added tests:
 - `TestGetEventsInRangeWithTopologyToken`
 - `TestOutputRoomEventsTable`
 - `TestTopologyTable`

* -p 1 for now
2022-04-08 17:53:24 +01:00
kegsay
7499147550
Add test infrastructure code for dendrite unit/integ tests (#2331)
* Add test infrastructure code for dendrite unit/integ tests

Start re-enabling some syncapi storage tests in the process.

* Linting

* Add postgres service to unit tests

* dendrite not syncv3

* Skip test which doesn't work

* Linting

* Add `jetstream.PrepareForTests`

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-04-08 10:12:30 +01:00
Neil Alexander
b5a8935042
Sync refactor — Part 1 (#1688)
* It's half-alive

* Wakeups largely working

* Other tweaks, typing works

* Fix bugs, add receipt stream

* Delete notifier, other tweaks

* Dedupe a bit, add a template for the invite stream

* Clean up, add templates for other streams

* Don't leak channels

* Bring forward some more PDU logic, clean up other places

* Add some more wakeups

* Use addRoomDeltaToResponse

* Log tweaks, typing fixed?

* Fix timed out syncs

* Don't reset next batch position on timeout

* Add account data stream/position

* End of day

* Fix complete sync for receipt, typing

* Streams package

* Clean up a bit

* Complete sync send-to-device

* Don't drop errors

* More lightweight notifications

* Fix typing positions

* Don't advance position on remove again unless needed

* Device list updates

* Advance account data position

* Use limit for incremental sync

* Limit fixes, amongst other things

* Remove some fmt.Println

* Tweaks

* Re-add notifier

* Fix invite position

* Fixes

* Notify account data without advancing PDU position in notifier

* Apply account data position

* Get initial position for account data

* Fix position update

* Fix complete sync positions

* Review comments @Kegsay

* Room consumer parameters
2021-01-08 16:59:06 +00:00
Neil Alexander
50963b724b
More sane next batch handling, typing notification tweaks, give invites their own stream position, device list fix (#1641)
* Update sync responses

* Fix positions, add ApplyUpdates

* Fix MarshalText as non-pointer, PrevBatch is optional

* Increment by number of read receipts

* Merge branch 'master' into neilalexander/devicelist

* Tweak typing

* Include keyserver position tweak

* Fix typing next position in all cases

* Tweaks

* Fix typo

* Tweaks, restore StreamingToken.MarshalText which somehow went missing?

* Rely on positions from notifier rather than manually advancing them

* Revert "Rely on positions from notifier rather than manually advancing them"

This reverts commit 53112a62cc3bfd9989acab518e69eeb27938117a.

* Give invites their own position, fix other things

* Fix test

* Fix invites maybe

* Un-whitelist tests that look to be genuinely wrong

* Use real receipt positions

* Ensure send-to-device uses real positions too
2020-12-18 11:11:21 +00:00
Neil Alexander
9c03b0a4fa
Refactor sync tokens (#1628)
* Refactor sync tokens

* Comment out broken notifier test

* Update types, sytest-whitelist

* More robust token checking

* Remove New functions for streaming tokens

* Export Logs in StreamingToken

* Fix tests
2020-12-10 18:57:10 +00:00
Neil Alexander
b5aa7ca3ab
Top-level setup package (#1605)
* Move config, setup, mscs into "setup" top-level folder

* oops, forgot the EDU server

* Add setup

* goimports
2020-12-02 17:41:00 +00:00
Neil Alexander
20a01bceb2
Pass pointers to events — reloaded (#1583)
* Pass events as pointers

* Fix lint errors

* Update gomatrixserverlib

* Update gomatrixserverlib

* Update to matrix-org/gomatrixserverlib#240
2020-11-16 15:44:53 +00:00
Kegsay
279044cd90
Add history visibility guards (#1470)
* Add history visibility guards

Default to 'joined' visibility to avoid leaking events, until we get
around to implementing history visibility completely. Related #617

* Don't apply his vis checks on shared rooms

* Fix order of checks

* Linting and remove another misleading check

* Update whitelist
2020-10-02 17:08:13 +01:00
S7evinK
3e01db0049
Fix golangci-lint issues (#1464)
* Fix S1039: unnecessary use of fmt.Sprintf

* Fix S1036: unnecessary guard around map access

Signed-off-by: Till Faelligen <tfaelligen@gmail.com>
2020-10-01 20:00:56 +01:00
Neil Alexander
4b09f445c9
Configuration format v1 (#1230)
* Initial pass at refactoring config (not finished)

* Don't forget current state and EDU servers

* More shifting around

* Update server key API tests

* Fix roomserver test

* Fix more tests

* Further tweaks

* Fix current state server test (sort of)

* Maybe fix appservices

* Fix client API test

* Include database connection string in database options

* Fix sync API build

* Update config test

* Fix unit tests

* Fix federation sender build

* Fix gobind build

* Set Listen address for all services in HTTP monolith mode

* Validate config, reinstate appservice derived in directory, tweaks

* Tweak federation API test

* Set MaxOpenConnections/MaxIdleConnections to previous values

* Update generate-config
2020-08-10 14:18:04 +01:00
Kegsay
0fdd4f14d1
Add support for logs in StreamingToken (#1229)
* Add support for logs in StreamingToken

Tokens now end up looking like `s11_22|dl-0-123|ab-0-12224`
where `dl` and `ab` are log names, `0` is the partition and
`123` and `12224` are the offsets.

* Also test reserialisation

* s/|/./g so tokens url escape nicely
2020-07-29 19:00:04 +01:00
Neil Alexander
b6bc132485
Use TransactionWriter in other component SQLite (#1209)
* Use TransactionWriter on other component SQLites

* Fix sync API tests

* Fix panic in media API

* Fix a couple of transactions

* Fix wrong query, add some logging output

* Add debug logging into StoreEvent

* Adjust InsertRoomNID

* Update logging
2020-07-21 15:48:21 +01:00
Kegsay
4897beabee
Finish implementing retiring invites (#1166)
* Pass retired invites to the syncapi with the event ID of the invite

* Implement retire invite streaming

* Update whitelist
2020-06-26 11:07:52 +01:00
Kegsay
9c77022513
Make userapi responsible for checking access tokens (#1133)
* Make userapi responsible for checking access tokens

There's still plenty of dependencies on account/device DBs, but this
is a start. This is a breaking change as it adds a required config
value `listen.user_api`.

* Cleanup

* Review comments and test fix
2020-06-16 14:10:55 +01:00
Neil Alexander
a5d822004d
Send-to-device support (#1072)
* Groundwork for send-to-device messaging

* Update sample config

* Add unstable routing for now

* Send to device consumer in sync API

* Start the send-to-device consumer

* fix indentation in dendrite-config.yaml

* Create send-to-device database tables, other tweaks

* Add some logic for send-to-device messages, add them into sync stream

* Handle incoming send-to-device messages, count them with EDU stream pos

* Undo changes to test

* pq.Array

* Fix sync

* Logging

* Fix a couple of transaction things, fix client API

* Add send-to-device test, hopefully fix bugs

* Comments

* Refactor a bit

* Fix schema

* Fix queries

* Debug logging

* Fix storing and retrieving of send-to-device messages

* Try to avoid database locks

* Update sync position

* Use latest sync position

* Jiggle about sync a bit

* Fix tests

* Break out the retrieval from the update/delete behaviour

* Comments

* nolint on getResponseWithPDUsForCompleteSync

* Try to line up sync tokens again

* Implement wildcard

* Add all send-to-device tests to whitelist, what could possibly go wrong?

* Only care about wildcard when targeted locally

* Deduplicate transactions

* Handle tokens properly, return immediately if waiting send-to-device messages

* Fix sync

* Update sytest-whitelist

* Fix copyright notice (need to do more of this)

* Comments, copyrights

* Return errors from Do, fix dendritejs

* Review comments

* Comments

* Constructor for TransactionWriter

* defletions

* Update gomatrixserverlib, sytest-blacklist
2020-06-01 17:50:19 +01:00
Neil Alexander
02fe38e1f7
Per-user-per-device sync streams (#1068)
* Per-user-per-device sync streams

* Tweaks

* Tweaks

* Pass full device into CompleteSync

* Set user IDs and device IDs properly in tests

* Add new test, fix TestNewEventAndWasPreviouslyJoinedToRoom

* nolint a function that is not used yet

* Add test for waking up single device

* Hopefully unstick test

* Try to ensure that TestCorrectStreamWakeup doesn't block forever

* Update tests
2020-05-28 10:05:04 +01:00
Kegsay
8db60c90bb
Fix a bug whereby backfilling could leak events across rooms (#1043)
* Fix a bug whereby backfilling could leak events across rooms

Caused by a faulty SQL query. With tests now.

* comment
2020-05-15 16:27:34 +01:00
Kegsay
7ca230e931
Cleanup syncapi topology logic (#1035)
* Cleanup syncapi topology logic

* Variable renaming

* comments
2020-05-14 17:30:16 +01:00
Kegsay
1b34130a5b
Finish merging syncserver.go (#1033)
* Refactor all postgres tables; start work on sqlite

* wip sqlite merges; database is locked errors to investigate and failing tests

* Revert "wip sqlite merges; database is locked errors to investigate and failing tests"

This reverts commit 26cbfc5b75ae2dc4fb31a838b917aa39d758f162.

* convert current room state table

* port over sqlite topology table

* remove a few functions

* remove more functions

* Share more code

* factor out completesync and a bit more

* Remove remaining code
2020-05-14 16:11:37 +01:00
Kegsay
5e9dce1c0c
syncapi: Rename and split out tokens (#1025)
* syncapi: Rename and split out tokens

Previously we used the badly named `PaginationToken` which was
used for both `/sync` and `/messages` requests. This quickly
became confusing because named fields like `PDUPosition` meant
different things depending on the token type. Instead, we now have
two token types: `TopologyToken` and `StreamingToken`, both of
which have fields which make more sense for their specific situations.

Updated the codebase to use one or the other. `PaginationToken` still
lives on as `syncToken`, an unexported type which both tokens rely on.
This allows us to guarantee that the specific mappings of positions
to a string remain solely under the control of the `types` package.
This enables us to move high-level conceptual things like
"decrement this topological token" to function calls e.g
`TopologicalToken.Decrement()`.

Currently broken because `/messages` seemingly used both stream and
topological tokens, though I need to confirm this.

* final tweaks/hacks

* spurious logging

* Review comments and linting
2020-05-13 12:14:50 +01:00
Kegsay
36bbb25561
Fix ordering when backfilling (#1000)
* Fix ordering when backfilling

The problem was that we weren't sorting the returned events
by depth when sending them back to the caller, instead we
were sorting by prev_events which is not the same thing.

* Fixup tests
2020-05-01 16:41:13 +01:00
Kegsay
17e046f18f
Fix prev_batch tokens (#999) 2020-05-01 12:41:38 +01:00
Kegsay
b28674435e
Correctly generate backpagination tokens for events which have the same depth (#996)
* Correctly generate backpagination tokens for events which have the same depth

With tests. Unfortunately the code around here is hard to understand.
There will be a subsequent PR which fixes this up now that we have a test
harness in place.

* Add postgres impl

* More linting

* Fix psql statement so it actually works
2020-05-01 11:01:34 +01:00
Kegsay
ebbfc12592
Add tests for the storage interface (#995)
* Move docs to interface

* Add tests around syncing

* Add topology token test

* Linting
2020-04-30 17:15:29 +01:00