Commit Graph

240 Commits

Author SHA1 Message Date
S7evinK
161f145176
Add NATS JetStream support (#1866)
* Add NATS JetStream support
Update shopify/sarama

* Fix addresses

* Don't change Addresses in Defaults

* Update saramajetstream

* Add missing error check

Keep typing events for at least one minute

* Use all configured NATS addresses

* Update saramajetstream

* Try setting up with NATS

* Make sure NATS uses own persistent directory (TODO: make this configurable)

* Update go.mod/go.sum

* Jetstream package

* Various other refactoring

* Build fixes

* Config tweaks, make random jetstream storage path for CI

* Disable interest policies

* Try to sane default on jetstream base path

* Try to use in-memory for CI

* Restore storage/retention

* Update nats.go dependency

* Adapt changes to config

* Remove unneeded TopicFor

* Dep update

* Revert "Remove unneeded TopicFor"

This reverts commit f5a4e4a339b6f94ec215778dca22204adaa893d1.

* Revert changes made to streams

* Fix build problems

* Update nats-server

* Update go.mod/go.sum

* Roomserver input API queuing using NATS

* Fix topic naming

* Prometheus metrics

* More refactoring to remove saramajetstream

* Add missing topic

* Don't try to populate map that doesn't exist

* Roomserver output topic

* Update go.mod/go.sum

* Message acknowledgements

* Ack tweaks

* Try to resume transaction re-sends

* Try to resume transaction re-sends

* Update to matrix-org/gomatrixserverlib@91dadfb

* Remove internal.PartitionStorer from components that don't consume keychanges

* Try to reduce re-allocations a bit in resolveConflictsV2

* Tweak delivery options on RS input

* Publish send-to-device messages into correct JetStream subject

* Async and sync roomserver input

* Update dendrite-config.yaml

* Remove roomserver tests for now (they need rewriting)

* Remove roomserver test again (was merged back in)

* Update documentation

* Docker updates

* More Docker updates

* Update Docker readme again

* Fix lint issues

* Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)

* Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that

* Go 1.16 instead of Go 1.13 for upgrade tests and Complement

* Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"

This reverts commit 368675283fc44501f227639811bdb16dd5deef8c.

* Don't report any errors on `/send` to see what fun that creates

* Fix panics on closed channel sends

* Enforce state key matches sender

* Do the same for leave

* Various tweaks to make tests happier

Squashed commit of the following:

commit 13f9028e7a63662759ce7c55504a9d2423058668
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 15:47:14 2022 +0000

    Do the same for leave

commit e6be7f05c349fafbdddfe818337a17a60c867be1
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 15:33:42 2022 +0000

    Enforce state key matches sender

commit 85ede6d64bf10ce9b91cdd6d80f87350ee55242f
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 14:07:04 2022 +0000

    Fix panics on closed channel sends

commit 9755494a98bed62450f8001d8128e40481d27e15
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 13:38:22 2022 +0000

    Don't report any errors on `/send` to see what fun that creates

commit 3bb4f87b5dd56882febb4db5621db484c8789b7c
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 13:00:26 2022 +0000

    Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"

    This reverts commit 368675283fc44501f227639811bdb16dd5deef8c.

commit fe2673ed7be9559eaca134424e403a4faca100b0
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 12:09:34 2022 +0000

    Go 1.16 instead of Go 1.13 for upgrade tests and Complement

commit 368675283fc44501f227639811bdb16dd5deef8c
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 11:51:45 2022 +0000

    Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that

commit b028dfc08577bcf52e6cb498026e15fa5d46d07c
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 10:29:08 2022 +0000

    Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)

* Merge in NATS Server v2.6.6 and nats.go v1.13 into the in-process connection fork

* Add `jetstream.WithJetStreamMessage` to make ack/nak-ing less messy, use process context in consumers

* Fix consumer component name in  federation API

* Add comment explaining where streams are defined

* Tweaks to roomserver input with comments

* Finish that sentence that I apparently forgot to finish in INSTALL.md

* Bump version number of config to 2

* Add comments around asynchronous sends to roomserver in processEventWithMissingState

* More useful error message when the config version does not match

* Set version in generate-config

* Fix version in config.Defaults

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-01-05 17:44:49 +00:00
Neil Alexander
3113210f17
Fix keyring regressions in previous P2P demo 2021-12-13 13:24:49 +00:00
Neil Alexander
c3dda0779d
Return event NID from StoreEvent, match PSQL vs SQLite behaviour, tweak backfill persistence (#2071) 2021-12-09 15:03:26 +00:00
Neil Alexander
c9419e51af
Don't populate config defaults where it doesn't make sense (#2058)
* Don't populate config defaults where it doesn't make sense

* Fix dendritejs builds
2021-11-24 11:57:39 +00:00
Neil Alexander
ec716793eb
Merge federationapi, federationsender, signingkeyserver components (#2055)
* Initial federation sender -> federation API refactoring

* Move base into own package, avoids import cycle

* Fix build errors

* Fix tests

* Add signing key server tables

* Try to fold signing key server into federation API

* Fix dendritejs builds

* Update embedded interfaces

* Fix panic, fix lint error

* Update configs, docker

* Rename some things

* Reuse same keyring on the implementing side

* Fix federation tests, `NewBaseDendrite` can accept freeform options

* Fix build

* Update create_db, configs

* Name tables back

* Don't rename federationsender consumer for now
2021-11-24 10:45:23 +00:00
Neil Alexander
6e93531e94
Don't persist transaction IDs in the roomserver (#2048) 2021-11-22 09:13:12 +00:00
Neil Alexander
403498a85b
Only return non-stub rooms from GetKnownRooms (#2049)
* Only return non-stub rooms from `GetKnownRooms`

This should stop a bunch of errors at startup with invalid server ACLs.

* Fix query
2021-11-18 11:34:19 +00:00
Neil Alexander
b9a575919a
Try to reduce re-allocations a bit in resolveConflictsV2 2021-11-04 10:55:07 +00:00
PiotrKozimor
dec05c3347
Run gofmt on dendrite - apply go 1.17 preferred build tags (#2021) 2021-11-02 16:48:48 +00:00
Neil Alexander
b99f594a93
Fix #2028 (#2036) 2021-11-02 16:47:39 +00:00
Ryan W
a624eab309
- Removed double imports (#1989)
- Lower cased error messages

Signed-off-by: Ryan Whittington <twentybitdev@gmail.com>

Co-authored-by: kegsay <kegan@matrix.org>
2021-09-08 17:31:03 +01:00
kegsay
7dc8fb1fe7
Add more logs (#2005)
* Add more logs

To help debug the migration issue in #1924 along with manual data-loss-inducing fixes.
Also log the origin server on processed txns to help debug buggy server origins.

* Fix query
2021-09-07 15:07:14 +01:00
Neil Alexander
51b119107c
Don't return nonsense canonical room aliases in the public rooms responses (#1992) 2021-08-27 16:50:30 +01:00
kegsay
4cc8b28b7f
Ensure all create events have a snapshot NID of 0 (#1961)
Fixes #1924 for postgres users, though the underlying cause of why
they aren't 0 in the first place is unresolved.
2021-08-04 17:48:23 +01:00
kegsay
ed04eed441
Fix sqlite migration issues (#1960)
* Do not store 'null' in the database for empty JSON arrays

This can cause issues, though it should be noted that the majority
of the time this will marshal/unmarshal just fine, see
https://play.golang.org/p/Doe2NZUgv7Q

* bugfix: sqlite migration should handle create events as having no 'before' snapshot

The state snapshot for any given event in the roomserver represents the state _before_
the event. For the create event, this is nothing, so the state snapshot nid should be 0.

In some cases this wasn't happening, resulting in a nice mix of possible options including:
 - A state snapshot without any state blocks `[]` or `null`.
 - A state snapshot with a single state block with a single event, the create event, causing
   a circular loop. This is incorrect as it represents the state before the event, not after.

* Add state key check
2021-08-04 17:08:17 +01:00
Kegan Dougal
ed4097825b Factor out StatementList to sqlutil and use it in userapi
It helps with the boilerplate.
2021-07-28 18:30:04 +01:00
kegsay
16bf94f239
Not finding the snapshot is not fatal (#1940) 2021-07-26 12:30:44 +01:00
Neil Alexander
39e8d1cc6f
Track knocking in membership updater (#1935)
* Topologically sort outliers in SendEventWithState

* Knock in membership updater

* Update gomatrixserverlib

* Update gomatrixserverlib

* Get the NID of the knock event properly for the membership updater
2021-07-22 12:26:58 +01:00
Neil Alexander
c1447a58e5
Various alias fixes (#1934)
* Generate m.room.canonical_alias instead of legacy m.room.aliases

* Add omitempty tags

* Add aliases endpoint to client API

* Check power levels when setting aliases

* Don't return null on /aliases

* Don't return error if the state event fails

* Update sytest-whitelist

* Don't send updated m.room.canonical_alias events

* Don't check PLs after all because for local aliases they are apparently irrelevant

* Fix some bugs

* Allow deleting a local alias with enough PL

* Fix some more bugs

* Update sytest-whitelist

* Fix copyright notices

* Review comments
2021-07-21 16:53:50 +01:00
Neil Alexander
f0f8c7f055
Optimise QueryServerJoinedToRoom (#1933)
* Optimise checking if a server is in a room

* Fix queries

* Fix queries
2021-07-21 13:06:32 +01:00
Neil Alexander
f63068df3b
Only include go-sqlite3 on the relevant binaries (#1900)
* Only include go-sqlite3 on the relevant binaries

* The driver name is always sqlite3 now

* Update to matrix-org/go-sqlite3-js@e537baa
2021-07-20 11:18:14 +01:00
David Spenler
8d8fe485b4
Fix failing ban tests (#1884)
* Add room membership and powerlevel checks for func SendBan

* Added non-error return to func GetStateEvent when no state events with the specified state key are found

* Add passing tests to whitelist

* Fixed formatting

* Update roomserver/storage/shared/storage.go

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
Co-authored-by: kegsay <kegan@matrix.org>
Co-authored-by: kegsay <kegsay@gmail.com>
2021-07-19 18:33:05 +01:00
Neil Alexander
09d3bab838
Metric fixes
Squashed commit of the following:

commit c6eb4d8bbf80320ec2b6d416c77659b0343e5e47
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Mon Jul 19 16:52:57 2021 +0100

    Fix bug

commit d420966d9ac44936728960a8d38602662b58f1c3
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Mon Jul 19 16:46:12 2021 +0100

    Update metric

commit 0ad6e37846e2ebbbd0e33a38274094bd15b8f11b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Mon Jul 19 16:30:14 2021 +0100

    Fix observe for calculateStateDurations
2021-07-19 17:20:55 +01:00
Neil Alexander
eb2a8e4c0b
Set buckets for dendrite_roomserver_calculate_state_duration_microseconds 2021-07-19 16:07:06 +01:00
Neil Alexander
b20d402f39
dendrite_roomserver_calculate_state_duration_microseconds as histogram rather than summary 2021-07-19 15:34:12 +01:00
kegsay
e80098e186
bugfix: retire invites even when we cannot talk to the remote server to make/send_leave (#1918)
* bugfix: retire invites even when we cannot talk to the remote server to make/send_leave

Also modify the leave response in /sync to include a fake event as this is ultimately
what clients (and sytest) will use to determine leave-ness.

* hash the event ID

* Base64 not hex
2021-07-14 10:39:17 +01:00
kegsay
f8ae391a5b
Expose more data when outputting output room events (#1916)
* Add more logging for content fields

* Fix fields
2021-07-13 11:19:21 +01:00
Neil Alexander
acec6fa979
Move a couple of callers to helpers.IsServerCurrentlyInRoom over to the query API (#1912) 2021-07-09 17:49:59 +01:00
Neil Alexander
c8408a6387
Add more optimised code path for checking if we're in a room (#1909)
* Add more optimised code path for checking if we're in a room

* Fix database queries

* Fix federation API test

* Fix logging

* Review comments

* Make separate API call for room membership
2021-07-09 16:36:45 +01:00
kegsay
3e50bac944
bugfix: order the state blocks so recreating state snapshots works correctly (#1908)
* Logging

* Revert "Logging"

This reverts commit 23ce334182d8a70f8e0381786f9dea77a9b91ed8.

* bugfix: order the state blocks so recreating state snapshots works correctly
2021-07-09 10:49:49 +01:00
Neil Alexander
816e1a402b
Fix bug when rejecting invites (#1907)
* Fix rejecting invites maybe

* Remove comment that is no longer correct

* Review comment on performFederatedRejectInvite
2021-07-08 14:54:03 +01:00
kegsay
5a09290c32
db migration: handle create events with no state blocks from v0.1.0 (#1904) 2021-07-07 17:07:33 +01:00
Neil Alexander
192a7a7923
Roomserver input backpressure metric
Squashed commit of the following:

commit 56e934ac0aeedcfb2c072010959ba49734d4e0cb
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Fri Jul 2 09:39:30 2021 +0100

    Fix metric

commit 3911f3a0c17b164b012e881c085ceca30f5de408
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Fri Jul 2 09:36:29 2021 +0100

    Register correct metric

commit a9ddbfaed421538a701151801e9451198a8be4f3
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Fri Jul 2 09:33:33 2021 +0100

    Try to capture RS input backpressure metric
2021-07-02 09:48:55 +01:00
kegsay
c849e74dfc
db migration: fix #1844 and add additional assertions (#1889)
* db migration: fix #1844 and add additional assertions

- Migration scripts will now check to see if there are any unconverted
  snapshot IDs and fail the migration if there are any. This should
  prevent people from getting a corrupt database in the event the root
  cause is still unknown.
- Add an ORDER BY clause when doing batch queries in the postgres
  migration. LIMIT and OFFSET without ORDER BY are undefined and must
  not be relied upon to produce a deterministic ordering (e.g row order).
  See https://www.postgresql.org/docs/current/queries-limit.html

* Linting

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2021-06-29 11:25:17 +01:00
Neil Alexander
7c3991ee2f
Use a custom FIFO queue for the RS input API (#1888)
* Use a FIFO queue instead of a channel to reduce backpressure

* Make sure someone wakes up

* Tweaks

* Add comments
2021-06-28 15:11:36 +01:00
Neil Alexander
5357df36c9
Fix panic in roomserver 2021-06-21 09:41:12 +01:00
Neil Alexander
c67d8da3eb
Fix bug in SQLite migration 2021-04-26 13:45:47 +01:00
Neil Alexander
5ce1fe80de
State storage refactor (#1839)
* Hash-deduplicated state storage (and migrations) for PostgreSQL and SQLite

* Refactor droomserver database setup for migrations

* Fix conflict statements

* Update migration names

* Set a boundary for old to new block/snapshot IDs so we don't rewrite them more than once accidentally

* Create sequence if not exists

* Fix boundary queries

* Fix boundary queries

* Use Query

* Break out queries a bit

* More sequence tweaks

* Query parameters are not playing the game

* Injection escaping may not work for CREATE SEQUENCE after all

* Fix snapshot sequence name

* Use boundaried IDs in SQLite too

* Use IFNULL for SQLite

* Use COALESCE in PostgreSQL

* Review comments @Kegsay
2021-04-26 13:25:57 +01:00
Kegsay
af41f6d454
Add Sentry support (#1803)
* Add Sentry support

* Use HTTP Sentry properly maybe

* Capture panics

* Log fed Sentry stuff correctly

* British english linter
2021-03-24 10:25:24 +00:00
Neil Alexander
01267a34b9
Fix nil pointer crash in QueryMembershipsForRoom 2021-03-17 13:58:04 +00:00
Kegsay
3c419be6af
roomserver: don't make_join with ourselves if clients ask us to (#1797)
* roomserver: don't make_join with ourselves if clients ask us to

* delete properly
2021-03-08 18:16:28 +00:00
Will Hunt
9557ccada4
Fix appsevice alias queries part 2 (#1684)
* Check membership of room

* Use QueryStateAfterEventsResponse

* Fix complexity

* Add field ShouldHitAppservice to GetRoomIDForAlias

* Hit appservice when trying to join a non-existent alias

* remove unused

* Changes that I made a long time ago

* Rename to appserviceJoinedAtEvent

* Check membership in GetMemberships

* Update QueryMembershipsForRoom

* Tweaks in client API

* Update appserviceJoinedAtEvent

* Comments

* Try QueryMembershipForUser instead

* Undo some changes to client API that shouldn't be needed

* More /event tweaks

* Refactor /event bit

* Go back to QueryMembershipsForRoom because appservices are hard

* Fix bugs in onMessage

* Add comments

* More logical naming, clean up a bit

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2021-03-03 17:00:31 +00:00
Will Hunt
a2773922d2
Send events to appservice based on room membership (#1680)
* Check membership of room

* Use QueryStateAfterEventsResponse

* Fix complexity

* Changes that I made a long time ago

* Rename to appserviceJoinedAtEvent

* Check membership in GetMemberships

* Update QueryMembershipsForRoom

* Tweaks in client API

* Update appserviceJoinedAtEvent

* Comments

* Try QueryMembershipForUser instead

* Undo some changes to client API that shouldn't be needed

* More /event tweaks

* Refactor /event bit

* Go back to QueryMembershipsForRoom because appservices are hard

* Fix bugs in onMessage

* Add comments

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2021-03-03 16:27:44 +00:00
Neil Alexander
d15836e260
Increase gocyclo complexity to 25 (and remove all but 2 golint directives related to it) (#1783) 2021-03-03 14:35:57 +00:00
Neil Alexander
5d74a1757f
Don't query for servers so often in /send (#1766)
* Look up servers less often, don't hit API for missing auth events unless there are actually missing auth events

* Remove ResolveConflictsAdhoc (since it is already in GMSL), other tweaks

* Update gomatrixserverlib to matrix-org/gomatrixserverlib#254

* Fix resolve-state

* Initialise t.servers on first use
2021-02-16 17:12:17 +00:00
Neil Alexander
02e6d89cc2
Fix crash in membership updater (#1753)
* Fix nil pointer exception in membership updater

* goimports
2021-02-06 11:49:18 +00:00
Neil Alexander
de5f22a469
Remove redundant check (#1748) 2021-02-04 11:12:52 +00:00
Kegsay
6d1c6f29e0
Add m.room.create to invite stripped state (#1740)
MSC1772 needs this because the create event contains info on if
the room is a space or not. The create event itself isn't sensitive
so other people may find this useful too.
2021-01-29 11:36:26 +00:00
Neil Alexander
64fb6de6d4
Don't retrieve same state events over and over again (#1737) 2021-01-26 09:12:17 +00:00
Matthew Hodgson
0571d395b5
Peeking over federation via MSC2444 (#1391)
* a very very WIP first cut of peeking via MSC2753.

doesn't yet compile or work.
needs to actually add the peeking block into the sync response.
checking in now before it gets any bigger, and to gather any initial feedback on the vague shape of it.

* make PeekingDeviceSet private

* add server_name param

* blind stab at adding a `peek` section to /sync

* make it build

* make it launch

* add peeking to getResponseWithPDUsForCompleteSync

* cancel any peeks when we join a room

* spell out how to runoutside of docker if you want speed

* fix SQL

* remove unnecessary txn for SelectPeeks

* fix s/join/peek/ cargocult fail

* HACK: Track goroutine IDs to determine when we write by the wrong thread

To use: set `DENDRITE_TRACE_SQL=1` then grep for `unsafe`

* Track partition offsets and only log unsafe for non-selects

* Put redactions in the writer goroutine

* Update filters on writer goroutine

* wrap peek storage in goid hack

* use exclusive writer, and MarkPeeksAsOld more efficiently

* don't log ascii in binary at sql trace...

* strip out empty roomd deltas

* re-add txn to SelectPeeks

* re-add accidentally deleted field

* reject peeks for non-worldreadable rooms

* move perform_peek

* fix package

* correctly refactor perform_peek

* WIP of implementing MSC2444

* typo

* Revert "Merge branch 'kegan/HACK-goid-sqlite-db-is-locked' into matthew/peeking"

This reverts commit 3cebd8dbfbccdf82b7930b7b6eda92095ca6ef41, reversing
changes made to ed4b3a58a7855acc43530693cc855b439edf9c7c.

* (almost) make it build

* clean up bad merge

* support SendEventWithState with optional event

* fix build & lint

* fix build & lint

* reinstate federated peeks in the roomserver (doh)

* fix sql thinko

* todo for authenticating state returned by /peek

* support returning current state from QueryStateAndAuthChain

* handle SS /peek

* reimplement SS /peek to prod the RS to tell the FS about the peek

* rename RemotePeeks as OutboundPeeks

* rename remote_peeks_table as outbound_peeks_table

* add perform_handle_remote_peek.go

* flesh out federation doc

* add inbound peeks table and hook it up

* rename ambiguous RemotePeek as InboundPeek

* rename FSAPI's PerformPeek as PerformOutboundPeek

* setup inbound peeks db correctly

* fix api.SendEventWithState with no event

* track latestevent on /peek

* go fmt

* document the peek send stream race better

* fix SendEventWithRewrite not to bail if handed a non-state event

* add fixme

* switch SS /peek to use SendEventWithRewrite

* fix comment

* use reverse topo ordering to find latest extrem

* support postgres for federated peeking

* go fmt

* back out bogus go.mod change

* Fix performOutboundPeekUsingServer

* Fix getAuthChain -> GetAuthChain

* Fix build issues

* Fix build again

* Fix getAuthChain -> GetAuthChain

* Don't repeat outbound peeks for the same room ID to the same servers

* Fix lint

* Don't omitempty to appease sytest

Co-authored-by: Kegan Dougal <kegan@matrix.org>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2021-01-22 14:55:08 +00:00