Replaced with types.HeaderedEvent _for now_. In reality we want to move
them all to gmsl.Event and only use HeaderedEvent when we _need_ to
bundle the version/event ID with the event (seriailsation boundaries,
and even then only when we don't have the room version).
Requires https://github.com/matrix-org/gomatrixserverlib/pull/373
Should fix the following issues or make a lot less worse when using
Postgres:
The main issue behind #2911: The client gives up after a certain time,
causing a cascade of context errors, because the response couldn't be
built up fast enough. This mostly happens on accounts with many rooms,
due to the inefficient way we're getting recent events and current state
For #2777: The queries for getting the membership events for history
visibility were being executed for each room (I think 185?), resulting
in a whooping 2k queries for membership events. (Getting the
statesnapshot -> block nids -> actual wanted membership event)
Both should now be better by:
- Using a LATERAL join to get all recent events for all joined rooms in
one go (TODO: maybe do the same for room summary and current state etc)
- If we're lazy loading on initial syncs, we're now not getting the
whole current state, just to drop the majority of it because we're lazy
loading members - we add a filter to exclude membership events on the
first call to `CurrentState`.
- Using an optimized query to get the membership events needed to
calculate history visibility
---------
Co-authored-by: kegsay <kegan@matrix.org>
This is apparently some incorrect behaviour that we built as a result of
a spec bug (matrix-org/matrix-spec#1314) where we were applying a filter
to the `"state"` section of the `/sync` response incorrectly. The client
then has no way to know that the state was limited.
This PR removes the state limiting, which probably also helps #2842.
This fixes a temporary workaround with the `selectEventsWithEventIDsSQL`
queries where fields need to be artificially added to the queries so the
row results match the format of the `syncapi_output_room_events` table.
I made similar functions that accept row results from the
`syncapi_current_room_state` table and convert them into StreamEvents
without the fields that are specific to output room events.
There is also a unit test in the first commit to ensure the resulting
behavior doesn't change from the modified queries and functions.
Fixes#601.
### Pull Request Checklist
<!-- Please read docs/CONTRIBUTING.md before submitting your pull
request -->
* [x] I have added tests for PR _or_ I have justified why this PR
doesn't need tests.
* [x] Pull request includes a [sign
off](https://github.com/matrix-org/dendrite/blob/main/docs/CONTRIBUTING.md#sign-off)
Signed-off-by: `Ashley Nelson <fant@shley.email>`
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
Recently I have observed that dendrite spends a lot of time (~390s) in
`selectRoomIDsWithAnyMembershipSQL` query
```
dendrite_syncapi=# select total_exec_time, left(query,100) from pg_stat_statements order by total_exec_time desc limit 5 ;
total_exec_time | left
--------------------+------------------------------------------------------------------------------------------------------
747826.5800519128 | SELECT event_id, id, headered_event_json, session_id, exclude_from_sync, transaction_id, history_vis
389130.5490339942 | SELECT DISTINCT room_id, membership FROM syncapi_current_room_state WHERE type = $2 AND state_key =
376104.17514700035 | SELECT psd.datname, xact_commit, xact_rollback, blks_read, blks_hit, tup_returned, tup_fetched, tup_
363644.164092031 | SELECT event_type_nid, event_state_key_nid, event_nid FROM roomserver_events WHERE event_nid = ANY($
58570.48104699995 | SELECT event_id, headered_event_json FROM syncapi_current_room_state WHERE room_id = $1 AND ( $2::te
(5 rows)
```
Explain analyze showed correct usage of `syncapi_room_state_unique`
index:
```
dendrite_syncapi=#
explain analyze SELECT distinct room_id, membership FROM syncapi_current_room_state WHERE type = 'm.room.member' AND state_key = '@qjfl:dendrite.stg.globekeeper.com';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=2749.38..2749.56 rows=24 width=52) (actual time=2.933..2.956 rows=65 loops=1)
-> Sort (cost=2749.38..2749.44 rows=24 width=52) (actual time=2.932..2.937 rows=65 loops=1)
Sort Key: room_id, membership
Sort Method: quicksort Memory: 34kB
-> Index Scan using syncapi_room_state_unique on syncapi_current_room_state (cost=0.41..2748.83 rows=24 width=52) (actual time=0.030..2.890 rows=65 loops=1)
Index Cond: ((type = 'm.room.member'::text) AND (state_key = '@qjfl:dendrite.stg.globekeeper.com'::text))
Planning Time: 0.140 ms
Execution Time: 2.990 ms
(8 rows)
```
Multi-column indexes in Postgres shall perform well for leftmost
columns, but I gave it a try and created
`syncapi_current_room_state_type_state_key_idx` index. I could observe
significant performance improvement. Execution time dropped from 2.9 ms
to 0.24 ms:
```
explain analyze SELECT distinct room_id, membership FROM syncapi_current_room_state WHERE type = 'm.room.member' AND state_key = '@qjfl:dendrite.stg.globekeeper.com';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=96.46..96.64 rows=24 width=52) (actual time=0.199..0.218 rows=65 loops=1)
-> Sort (cost=96.46..96.52 rows=24 width=52) (actual time=0.199..0.202 rows=65 loops=1)
Sort Key: room_id, membership
Sort Method: quicksort Memory: 34kB
-> Bitmap Heap Scan on syncapi_current_room_state (cost=4.53..95.91 rows=24 width=52) (actual time=0.048..0.139 rows=65 loops=1)
Recheck Cond: ((type = 'm.room.member'::text) AND (state_key = '@qjfl:dendrite.stg.globekeeper.com'::text))
Heap Blocks: exact=59
-> Bitmap Index Scan on syncapi_current_room_state_type_state_key_idx (cost=0.00..4.53 rows=24 width=0) (actual time=0.037..0.037 rows=65 loops=1)
Index Cond: ((type = 'm.room.member'::text) AND (state_key = '@qjfl:dendrite.stg.globekeeper.com'::text))
Planning Time: 0.236 ms
Execution Time: 0.242 ms
(11 rows)
```
Next improvement is skipping DISTINCT and rely on map assignment in
`SelectRoomIDsWithAnyMembership`. Execution time drops by almost half:
```
explain analyze SELECT room_id, membership FROM syncapi_current_room_state WHERE type = 'm.room.member' AND state_key = '@qjfl:dendrite.stg.globekeeper.com';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on syncapi_current_room_state (cost=4.53..95.91 rows=24 width=52) (actual time=0.032..0.113 rows=65 loops=1)
Recheck Cond: ((type = 'm.room.member'::text) AND (state_key = '@qjfl:dendrite.stg.globekeeper.com'::text))
Heap Blocks: exact=59
-> Bitmap Index Scan on syncapi_current_room_state_type_state_key_idx (cost=0.00..4.53 rows=24 width=0) (actual time=0.021..0.021 rows=65 loops=1)
Index Cond: ((type = 'm.room.member'::text) AND (state_key = '@qjfl:dendrite.stg.globekeeper.com'::text))
Planning Time: 0.087 ms
Execution Time: 0.136 ms
(7 rows)
```
In our env we spend only 1s on inserting to table, so the write penalty
of creating an index should be small.
```
dendrite_syncapi=# select total_exec_time, left(query,100) from pg_stat_statements where query like '%INSERT%syncapi_current_room_state%' order by total_exec_time desc;
total_exec_time | left
--------------------+------------------------------------------------------------------------------------------------------
1139.9057619999971 | INSERT INTO syncapi_current_room_state (room_id, event_id, type, sender, contains_url, state_key, he
(1 row)
```
This PR does not require test modifications.
### Pull Request Checklist
<!-- Please read docs/CONTRIBUTING.md before submitting your pull
request -->
* [x] I have added added tests for PR _or_ I have justified why this PR
doesn't need tests.
* [x] Pull request includes a [sign
off](https://github.com/matrix-org/dendrite/blob/main/docs/CONTRIBUTING.md#sign-off)
Signed-off-by: `Piotr Kozimor <p1996k@gmail.com>`
* Fix query issue, only add "changed" users if we actually share a room
* Avoid log spam if context is done
* Undo changes to filterSharedUsers
* Add logging again..
* Fix SQLite shared users query
* Change query to include invited users
* Add new db migration
* Update migrations
Remove goose
* Add possibility to test direct upgrades
* Try to fix WASM test
* Add checks for specific migrations
* Remove AddMigration
Use WithTransaction
Add Dendrite version to table
* Fix linter issues
* Update tests
* Update comments, outdent if
* Namespace migrations
* Add direct upgrade tests, skipping over one version
* Split migrations
* Update go version in CI
* Fix copy&paste mistake
* Use contexts in migrations
Co-authored-by: kegsay <kegan@matrix.org>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Add function to the sync API storage package for filtering shared users
* Use the database instead of asking the RS API
* Fix unit tests
* Fix map handling in `filterSharedUsers`
* Only load members of newly joined rooms
* Comment that the query is prepared at runtime
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Don't request state events if we already have the timeline events (Postgres only)
* Rename variable
* nocyclo
* Add SQLite
* Tweaks
* Revert query change
* Don't dedupe if asking for full state
* Update query
* Don't bail so quickly in fetchMissingStateEvents
* Don't recalculate event IDs so often in sync API
* Add comments
* Fix comments
* Update to matrix-org/gomatrixserverlib@eb6a890
* SendEventWithState events as new
* Use cumulative state IDs for final event
* Error wrapping in calculateAndSetState
* Handle overwriting same event type and state key
* Hacky way to spot historical events
* Don't exclude from sync
* Don't generate output events when rewriting forward extremities
* Update output event check
* Historical output events
* Define output room event type
* Notify key changes on state
* Don't send our membership event twice
* Deduplicate state entries
* Tweaks
* Remove unnecessary nolint
* Fix current state upsert in sync API
* Send auth events as outliers, state events as rewrite
* Sync API don't consume state events
* Process events actually
* Improve outlier check
* Fix local room check
* Remove extra room check, it seems to break the whole damn world
* Fix federated join check
* Fix nil pointer exception
* Better comments on DeduplicateStateEntries
* Reflow forced federated joins
* Don't force federated join for possibly even local invites
* Comment SendEventWithState better
* Rewrite room state in sync API storage
* Add TODO
* Clean up all room data when receiving create event
* Don't generate output events for rewrites, but instead notify that state is rewritten on the final new event
* Rename to PurgeRoom
* Exclude backfilled messages from /sync
* Split out rewriting state from updating state from state res
Co-authored-by: Kegan Dougal <kegan@matrix.org>
* Refactor all postgres tables; start work on sqlite
* wip sqlite merges; database is locked errors to investigate and failing tests
* Revert "wip sqlite merges; database is locked errors to investigate and failing tests"
This reverts commit 26cbfc5b75ae2dc4fb31a838b917aa39d758f162.
* convert current room state table
* port over sqlite topology table
* remove a few functions
* remove more functions
* Share more code
* factor out completesync and a bit more
* Remove remaining code
* Use HeaderedEvent in syncapi
* Update notifier test
* Fix persisting headered event
* Clean up unused API function
* Fix overshadowed err from linter
* Write headered JSON to invites table too
* Rename event_json to headered_event_json in syncapi database schemae
* Fix invites_table queries
* Update QueryRoomVersionCapabilitiesResponse comment
* Fix syncapi SQLite
* Upgrade gomatrixserverlib dependency
Signed-off-by: Thibaut CHARLES cromfr@gmail.com
* Added missing passing sytest
Signed-off-by: Thibaut CHARLES cromfr@gmail.com
* Fix login using identifier key
Not a full fix, it only really supports logging in with
the localpart of an mxid.
Signed-off-by: Serra Allgood <serra@allgood.dev>
* Replace deprecated prometheus.InstrumentHandler and unsafe time.Ticker
* goimports
* re-add temporarily missing deps?
* Refactor InstrumentHandlerCounter definition
* URL decode args
* Return server names (#833)
* Remove unnecessary map->array processing
* Return server names in room federation directory query
* Knock off a TODO
* Fix /send_join and /send_leave (#821)
Fix the /send_join and /send_leave endpoints, so that they use the v2 endpoints as mandated by MSC1802. Also comment out the SyTest tests that are failing because of lack of support for the v1 endpoints.
* Refuse /send_join without m.room.create (#824)
Signed-off-by: Abhishek Kumar <abhishekkumar2718@gmail.com>
* AS should use the v1 endpoint, rather than r0 (#827)
* docker: Passthrough parameters to dendrite-monolith-server
* Fix copy & paste error (#812)
* Use gomatrixserverlib.Transaction instead of local type (#590) (#811)
* Move files back if linting fails (#810)
* replaced gometalinter description with golangci-lint (#837)
* Amend syncapi SQL queries to return missing columns (#840)
* This commit updates a couple of the syncapi SQL queries to return additional columns that are required/expected by rowsToStreamEvents in output_room_events_table.go.
It's not exactly clear to me yet what transaction_id and session_id do, but these being added n #367 results in state events breaking the /sync endpoint.
This is a temporary fix. We need to come up with a better solution.
* gomatrix to gomatrixserverlib on some weird line change
* Tweaks from @babolivier review comments
* Implement storage interfaces (#841)
* Implement interfaces for federationsender storage
* Implement interfaces for mediaapi storage
* Implement interfaces for publicroomsapi storage
* Implement interfaces for roomserver storage
* Implement interfaces for syncapi storage
* Implement interfaces for keydb storage
* common.PartitionStorer in publicroomsapi interface
* Update copyright notices
* make cmd directory path absolute in build.sh (#830)
* Resync testfile with current sytest pass/fail (#832)
* Resync testfile with current sytest pass/fail
* Add displayname test
* Fall back to postgres when database connection string parsing fails (#842)
* Fall back to postgres when parsing the database connection string for a URI schema fails
* Fix behaviour so that it really tries postgres when URL parsing fails and it complains about unknown schema if it succeeds
* Fix#842
* Fix#842 - again...
* Federation fixes (#845)
* Update gomatrixserverlib to p2p commit 92c0338, other tweaks
* Update gomatrixserverlib to p2p commit e5dcc65
* Rewrite getAuthChain
* Update gomatrixserverlib in go.mod/go.sum
* Correct a couple of package refs for updated gmsl/gomatrix
* Update gomatrixserverlib ref in go.mod/go.sum
* Update getAuthChain comments following @babolivier review
* Add a Sytest blacklist file (#849)
* Add more passing tests to the testfile, add test blacklist file (#848)
* CS API: Support for /messages, fixes for /sync (#847)
* Merge forward
* Tidy up a bit
* TODO: What to do with NextBatch here?
* Replace SyncPosition with PaginationToken throughout syncapi
* Fix PaginationTokens
* Fix lint errors
* Add a couple of missing functions into the syncapi external storage interface
* Some updates based on review comments from @babolivier
* Some updates based on review comments from @babolivier
* argh whitespacing
* Fix opentracing span
* Remove dead code
* Don't overshadow err (fix lint issue)
* Handle extremities after inserting event into topology
* Try insert event topology as ON CONFLICT DO NOTHING
* Prevent OOB error in addRoomDeltaToResponse
* Thwarted by gocyclo again
* Fix NewPaginationTokenFromString, define unit test for it
* Update pagination token test
* Update sytest-whitelist
* Hopefully fix some of the sync batch tokens
* Remove extraneous sync position func
* Revert to topology tokens in addRoomDeltaToResponse etc
* Fix typo
* Remove prevPDUPos as dead now that backwardTopologyPos is used instead
* Fix selectEventsWithEventIDsSQL
* Update sytest-blacklist
* Update sytest-whitelist
* Some fixes for #847 (#850)
* Fix a couple of cases where backfilling events we already had causes panics, hopefully fix ordering of events, update GMSL dependency for backfill URL fixes
* Remove commented out lines from output_room_events_table schema
* Wire up publicroomsapi for roomserver events (#851)
* Wire up publicroomsapi to roomserver events
* Remove parameter that was incorrectly brought over from p2p work
* nolint containsBackwardExtremity for now
* Store our own keys in the keydb (#853)
* Store our own keys in the keydb
The DirectKeyFetcher makes the assumption that you can always reach the key/v2/server endpoint of any server, including our own. We previously haven't bothered to store our own keys in the keydb so this would mean we end up making key requests to ourselves.
In the libp2p world as an example, self-dialling is not possible, therefore this would render it impossible to get our own keys.
This commit adds our own keys into the keydb so that we don't create unnecessarily (and maybe impossible) requests.
* Use golang.org/x/crypto/ed25519 instead of crypto/ed25519 for pre-Go 1.13
* More sync fixes (#854)
* Further sync tweaks
* Remove unnecessary blank line
* getBackwardTopologyPos always returns a usable value
* Revert order fixing
* Implement GET endpoints for account_data in clientapi (#861)
* Implement GET endpoints for account_data in clientapi
* Fix accountDB parameter
* Remove fmt.Println
* Add empty push rules into account data on account creation (#862)
* Handle kind=guest query parameter on /register (#860)
* Handle kind=guest query parameter on /register
* Reorganized imports
* Pass device_id as nil
* Added tests to systest-whitelist
* Update sytest-whitelist
* Blacklist 'displayname updates affect room member events' (#859)
* Room version abstractions (#865)
* Rough first pass at adding room version abstractions
* Define newer room versions
* Update room version metadata
* Fix roomserver/versions
* Try to fix whitespace in roomsSchema
* Implement room version capabilities in CS API (#866)
* Add wiring for querying the roomserver for the default room version
* Try to implement /capabilities for room versions
* Update copyright notices
* Update sytests, add /capabilities endpoint into CS API
* Update sytest-whitelist
* Add GetDefaultRoomVersion
* Fix cases where state package was shadowed
* Fix version formatting
* Update Dockerfile to Go 1.13.6
* oh yes types I remember
* And fix the default too
* Update documentation for Go 1.13 (#867)
* Pass cfg by reference around the codebase (#819)
* Pass cfg by reference around the codebase
* Merge branch 'master' into pass-cfg-by-ref
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Implement missing device management features (#835)
* Implement missing device management features
Signed-off-by: Till Faelligen <tfaelligen@gmail.com>
* Add a little more documentation
* Undo changes
* Use non-anonymous struct to decode devices list
* Update sytest-whitelist
* Update sytest-whitelist
* Update sytest-blacklist
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Adding sslmode: disable to sytest server config (#813)
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Fix AppService bind addrs in test (#805)
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Always defer *sql.Rows.Close and consult with Err (#844)
* Always defer *sql.Rows.Close and consult with Err
database/sql.Rows.Next() makes sure to call Close only after exhausting
result rows which would NOT happen when returning early from a bad Scan.
Close being idempotent makes it a great candidate to get always deferred
regardless of what happens later on the result set.
This change also makes sure call Err() after exhausting Next() and
propagate non-nil results from it as the documentation advises.
Closes#764
Signed-off-by: Kiril Vladimiroff <kiril@vladimiroff.org>
* Override named result parameters in last returns
Signed-off-by: Kiril Vladimiroff <kiril@vladimiroff.org>
* Do the same over new changes that got merged
Signed-off-by: Kiril Vladimiroff <kiril@vladimiroff.org>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Clean up
Co-authored-by: Serra Allgood <serra@allgood.dev>
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
Co-authored-by: Brendan Abolivier <github@brendanabolivier.com>
Co-authored-by: Abhishek Kumar <31231064+abhishekkumar2718@users.noreply.github.com>
Co-authored-by: Will Hunt <will@half-shot.uk>
Co-authored-by: S7evinK <tfaelligen@gmail.com>
Co-authored-by: Arshpreet <30545756+arsh-7@users.noreply.github.com>
Co-authored-by: Prateek Sachan <42961174+prateek2211@users.noreply.github.com>
Co-authored-by: Behouba Manassé <behouba@gmail.com>
Co-authored-by: aditsachde <23707194+aditsachde@users.noreply.github.com>
Co-authored-by: Kiril Vladimiroff <kiril@vladimiroff.org>
* Always defer *sql.Rows.Close and consult with Err
database/sql.Rows.Next() makes sure to call Close only after exhausting
result rows which would NOT happen when returning early from a bad Scan.
Close being idempotent makes it a great candidate to get always deferred
regardless of what happens later on the result set.
This change also makes sure call Err() after exhausting Next() and
propagate non-nil results from it as the documentation advises.
Closes#764
Signed-off-by: Kiril Vladimiroff <kiril@vladimiroff.org>
* Override named result parameters in last returns
Signed-off-by: Kiril Vladimiroff <kiril@vladimiroff.org>
* Do the same over new changes that got merged
Signed-off-by: Kiril Vladimiroff <kiril@vladimiroff.org>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Merge forward
* Tidy up a bit
* TODO: What to do with NextBatch here?
* Replace SyncPosition with PaginationToken throughout syncapi
* Fix PaginationTokens
* Fix lint errors
* Add a couple of missing functions into the syncapi external storage interface
* Some updates based on review comments from @babolivier
* Some updates based on review comments from @babolivier
* argh whitespacing
* Fix opentracing span
* Remove dead code
* Don't overshadow err (fix lint issue)
* Handle extremities after inserting event into topology
* Try insert event topology as ON CONFLICT DO NOTHING
* Prevent OOB error in addRoomDeltaToResponse
* Thwarted by gocyclo again
* Fix NewPaginationTokenFromString, define unit test for it
* Update pagination token test
* Update sytest-whitelist
* Hopefully fix some of the sync batch tokens
* Remove extraneous sync position func
* Revert to topology tokens in addRoomDeltaToResponse etc
* Fix typo
* Remove prevPDUPos as dead now that backwardTopologyPos is used instead
* Fix selectEventsWithEventIDsSQL
* Update sytest-blacklist
* Update sytest-whitelist