Last week the UK government published a draft of the proposed Online Safety Bill, after having initially introduced formal proposals for said bill in early 2020. With this post we aim to shed some light on its potential impacts and explain why we think that this bill - despite having great intentions - may actually be setting a dangerous precedent when it comes to our rights to privacy, freedom of expression and self determination.
The proposed bill aims to provide a legal framework to address illegal and harmful content online. This focus on “not illegal, but harmful” content is at the centre of our concerns - it puts responsibility on organisations themselves to arbitrarily decide what might be harmful, without any legal backing. The bill itself does not actually provide a definition of harmful, instead relying on service providers to assess and decide on this. This requirement to identify what is “likely to be harmful” applies to all users, children and adults. Our question here is - would you trust a service provider to decide what might be harmful to you and your children, with zero input from you as a user?
Additionally, the bill incentivises the use of privacy-invasive age verification processes which come with their own set of problems. This complete disregard of people’s right to privacy is a reflection of the privileged perspectives of those in charge of the drafting of this bill, which fails to acknowledge how actually harmful it would be for certain groups of the population to have their real life identity associated with their online identity.
Our view of the world, and of the internet, is largely different from the one presented by this bill. Now, this categorically does not mean we don’t care about online safety (it is quite literally our bread and butter) - we just fundamentally disagree with the approach taken.
Whilst we sympathise with the government’s desire to show action in this space and to do something about children’s safety (everyone’s safety really), we cannot possibly agree with the methods.
Back in October of 2020 we presented our proposed approach to online safety - ironically also in response to a government proposal, albeit about encryption backdoors. In it, we briefly discussed the dangers of absolute determinations of morality from a single cultural perspective:
As uncomfortable as it may be, one man’s terrorist is another man’s freedom fighter, and different jurisdictions have different laws - and it’s not up to the Matrix.org Foundation to play God and adjudicate.
We now find ourselves reading a piece of legislation that essentially demands these determinations from tech companies. The beauty of the human experience lies with its diversity and when we force technology companies to make calls about what is right or wrong - or what is “likely to have adverse psychological or physical impacts” on children - we end up in a dangerous place of centralising and regulating relative morals. Worst of all, when the consequence of getting it wrong is criminal liability for senior managers what do we think will happen?
Regardless of how omnipresent it is in our daily lives, technology is still not a solution for human problems. Forcing organisations to be judge and jury of human morals for the sake of “free speech” will, ironically, have severe consequences on free speech, as risk profiles will change for fear of liability.
Forcing a “duty of care” responsibility on organisations which operate online will not only drown small and medium sized companies in administrative tasks and costs, it will further accentuate the existing monopolies by Big Tech. Plainly, Big Tech can afford the regulatory burden - small start-ups can’t. Future creators will have their wings clipped from the offset and we might just miss out on new ideas and projects for fear of legal repercussions. This is a threat to the technology sector, particularly those building on emerging technologies like Matrix. In some ways, it is a threat to democracy and some of the freedoms this bill claims to protect.
These are, quite frankly, steps towards an authoritarian dystopia. If Trust & Safety managers start censoring something as natural as a nipple on the off chance it might cause “adverse psychological impacts” on children, whose freedom of expression are we actually protecting here?
More specifically on the issue of content moderation: the impact assessment provided by the government alongside this bill predicts that the additional costs for companies directly related to the bill will be in the billions, over the course of 10 years. The cost for the government? £400k, in every proposed policy option. Our question is - why are these responsibilities being placed on tech companies, when evidently this is a societal problem?
We are not saying it is up to the government to single-handedly end the existence of Child Sexual Abuse and Exploitation (CSAE) or extremist content online. What we are saying is that it takes more than content filtering, risk assessments and (faulty) age verification processes for it to end. More funding for tech literacy organisations and schools, to give children (and parents) the tools to stay safe is the first thing that comes to mind. Further investment in law enforcement cyber units and the judicial system, improving tech companies’ routes for abuse reporting and allowing the actual judges to do the judging seems pretty sensible too. What is absolutely egregious is the degradation of the digital rights of the majority, due to the wrongdoings of a few.
Our goal with this post is not to be dramatic or alarmist. However, we want to add our voices to the countless digital rights campaigners, individuals and organisations that have been raising the alarm since the early days of this bill. Just like with coercive control and abuse, the degradation of our rights does not happen all at once. It is a slippery slope that starts with something as (seemingly) innocuous as mandatory content scanning for CSAE content and ends with authoritarian surveillance infrastructure. It is our duty to put a stop to this before it even begins.Twitter card image credit from Brazil, which feels all too familiar right now.
As many know, over the years we've experimented with how to let users locate and curate sets of users and rooms in Matrix. Back in Nov 2017 we added 'groups' (aka 'communities') as a custom mechanism for this - introducing identifiers beginning with a + symbol to represent sets of rooms and users, like +matrix:matrix.org.
However, it rapidly became obvious that Communities had some major shortcomings. They ended up being an extensive and entirely new API surface (designed around letting you dynamically bridge the membership of a group through to a single source of truth like LDAP) - while in practice groups have enormous overlap with rooms: managing membership, inviting by email, access control, power levels, names, topics, avatars, etc. Meanwhile the custom groups API re-invented the wheel for things like pushing updates to the client (causing a whole suite of problems). So clients and servers alike ended up reimplementing large chunks of similar functionality for both rooms and groups.
And so almost before Communities were born, we started thinking about whether it would make more sense to model them as a special type of room, rather than being their own custom primitive. MSC1215 had the first thoughts on this in 2017, and then a formal proposal emerged at MSC1772 in Jan 2019. We started working on this in earnest at the end of 2020, and christened the new way of handling groups of rooms and users as... Spaces!
Spaces work as follows:
Today, we're ridiculously excited to be launching Space support as a beta in matrix-react-sdk and matrix-android-sdk2 (and thus Element Web/Desktop and Element Android) and Synapse 1.34.0 - so head over to your nearest Element, make sure it's connected to the latest Synapse (and that Synapse has Spaces enabled in its config) and find some Space to explore! #community:matrix.org might be a good start :)
The beta today gives us the bare essentials: and we haven't yet finished space-based access controls such as setting powerlevels in rooms based on space membership (MSC2962) or limiting who can join a room based on their space membership (MSC3083) - but these will be coming asap. We also need to figure out how to implement Flair on top of Spaces rather than Communities.
This is also a bit of a turning point in Matrix's architecture: we are now using rooms more and more as a generic way of modelling new features in Matrix. For instance, rooms could be used as a structured way of storing files (MSC3089); Reputation data (MSC2313) is stored in rooms; Threads can be stored in rooms (MSC2836); Extensible Profiles are proposed as rooms too (MSC1769). As such, this pushes us towards ensuring rooms are as lightweight as possible in Matrix - and that things like sync and changing profile scale independently of the number of rooms you're in. Spaces effectively gives us a way of creating a global decentralised filesystem hierarchy on top of Matrix - grouping the existing rooms of all flavours into an epic multiplayer tree of realtime data. It's like USENET had a baby with the Web!
For lots more info from the Element perspective, head over to the Element blog. Finally, the point of the beta is to gather feedback and fix bugs - so please go wild in Element reporting your first impressions and help us make Spaces as awesome as they deserve to be!
Thanks for flying Matrix into Space;
Matthew & the whole Spaces (and Matrix) team.
Since the end of 2019, we have spent quite a bit of time thinking about and exploring different technologies whilst building various demos for P2P Matrix. Our mission for P2P Matrix is to evolve Matrix into a hybrid between today's server-oriented network and a pure P2P network - empowering users to have total autonomy and privacy over their data if they want (by storing it in P2P Matrix, by embedding their server into their Matrix client), while also letting users store their data in serverside nodes if they so desire.
The goal is to protect metadata much better (as users no longer have to depend on a server run by someone else to communicate), as well as drive new features such as account portability, multi-homed accounts, low-bandwidth Matrix and smarter federation transports - and provide support for internet-less mesh communication via Matrix which can also interoperate with the wider network. You can read more about it in our Introducing P2P Matrix blog post from last summer, or watch our FOSDEM 2021 talk where we previewed Pinecone. It's important to note that this has been a small but important long-term project for Matrix, and has been progressing entirely outside our business-as-usual work of improving the core protocol and reference implementations.
As the project has progressed, we've built a variety of prototypes using existing libraries (go-libp2p, js-libp2p and Yggdrasil), demonstrating what an early P2P Matrix might feel like if it were running on a mobile device, in the web browser and so on using such an overlay network. Each of these demos has taught us something new, and so in October 2020 we decided to take this knowledge to build an experimental new overlay network of our own.
Pinecone is designed to provide end-to-end encrypted communications between devices, regardless of how they are connected to one another, in a lightweight and self-arranging fashion. The routing protocol is a hybrid, taking inspiration from Yggdrasil by building a global spanning tree, but rather than forwarding all traffic using the spanning tree topology, we use it as a bootstrap routing mechanism for a line/snake topology, ordered by their ed25519 public keys, which we have affectionately named SNEK (Sequentially Networked Edwards Key) routing.
Nodes seek out their closest keyspace neighbours on the network and paths are built between these pairs of nodes, similar to how a Chord DHT functions, populating the routing tables of intermediate nodes in the process. These paths are then used to forward traffic without having to perform up-front searches, allowing for very fast connection setups between overlay nodes. These paths are resilient to network topology changes and handle node mobility considerably better than any other name-independent routing scheme that we have seen — early results are very promising so far. We have also been experimenting with a combination of the μTP (Micro Transport Protocol) and TLS to provide stateful connection setup, congestion control and end-to-end encryption for all federation traffic carried over the Pinecone network.
If Pinecone works out, our intention is to collaborate with the libp2p and IPFS team to incorporate Pinecone routing into libp2p (if they'll have us!) while incorporating their gossipsub routing to improve Matrix federation... and get the best of both worlds :)
Today we're releasing the source code for our current early implementation of Pinecone — you can get it from GitHub right now! It's very experimental still and not very well optimised yet, but it is the foundation of our latest mobile P2P Matrix demos, which support P2P Matrix over both Bluetooth Low Energy mesh networks, multicast DNS discovery within a LAN, and/or by routing through static Pinecone peers on the Internet:
Building a routing overlay is only the first step in the journey towards P2P Matrix. We will also be looking closely in the coming months at improving the Matrix federation protocol to work well in mixed-connectivity scenarios (rather than the full mesh approach used today) as well as decentralised identities, hybrid deployments with existing homeservers and getting Dendrite (the Matrix homeserver which is embedded into the current P2P demos) more stable and feature-complete.
The long-term plan could look something like this:
Most discussion around P2P Matrix takes place in #p2p:matrix.org, so if you are interested in what's going on, please join us there!
Back in June we wrote about our plans to tighten up data privacy in Matrix after some areas for improvement were brought to our attention. To quickly recap: the primary concern was that the default config for Riot specifies identity servers and integration managers run by New Vector (the company which the original Matrix team set up to build Riot and fund Matrix dev) - and so folks using a standalone homeserver may end up using external services without realising it. There were some other legitimate issues raised too (e.g. contact information should be obfuscated when checking if your contacts are on Matrix; Riot defaulted to using Google for STUN (firewall detection) if no TURN server had been set up on their server; Synapse defaults to using matrix.org as a key notary server).
We’ve been working away at this fairly solidly over the last few months. Some of the simpler items shipped quickly (e.g. Riot/Web had a stupid bug where it kept incorrectly loading the integration manager; Riot/Android wasn’t clear enough about when contact discovery was happening; Riot/Web wasn’t clear enough about the fact device names are publicly visible; etc) - but other bits have turned out to be incredibly time-consuming to get right.
However, we’re in the process today of releasing Synapse 1.4.0 and Riot/Web 1.4.0 (it’s coincidence the version numbers have lined up!) which resolve the majority of the remaining issues. The main changes are as follows:
Riot no longer automatically uses identity servers by default. Identity servers are only useful when inviting users by email address, or when discovering whether your contacts are on Matrix. Therefore, we now wait until the user tries to perform one of these operations before explaining that they need an identity server to do so, and we prompt them to select one if they want to proceed. This makes it abundantly clear that the user is connecting to an independent service, and why.
Synapse no longer uses identity servers for verifying registrations or verifying password reset. Originally, Synapse made use of the fact that the Identity Service contains email/msisdn verification logic to handle identity verification in general on behalf of the homeserver. However, in retrospect this was a mistake: why should the entity running your identity server have the right to verify password resets or registration details on your homeserver? So, we have moved this logic into Synapse. This means Synapse 1.4.0 requires new configuration for email/msisdn verification to work - please see the upgrade notes for full details.
Sydent now supports discovering contacts based on hashed identifiers. MSC2134 specifies entirely new IS APIs for discovering contacts using a hash of their identifier rather than directly exposing the raw identifiers being searched for. This is implemented in Riot/iOS and Riot/Android and should be in the next major release; Riot/Web 1.4.0 has it already.
Synapse now warns in its logs if you are using matrix.org as a default trusted key server, in case you wish to use a different server to help discover other servers’ keys.
Synapse now garbage collects redacted messages after N days (7 days by default). (It doesn’t yet garbage collect attachments referenced from redacted messages; we’re still working on that).
Synapse now deletes account access data (IP addresses and User Agent) after N days (28 days by default) of a device being deleted.
Riot warns before falling back to using STUN (and defaults to turn.matrix.org rather than stun.google.com) for firewall discovery (STUN) when placing VoIP calls, and makes it clear that this is an emergency fallback for misconfigured servers which are missing TURN support. (We originally deleted the fallback entirely, but this broke things for too many people, so we’ve kept it but warn instead).
All of this is implemented in Riot/Web 1.4.0 and Synapse 1.4.0. Riot/Web 1.4.0 shipped today (Fri Sept 27th) and we have a release candidate for Synapse 1.4 (1.4.0rc1) today which fully ship on Monday.
Riot/Mobile is following fast behind - most of the above has been implemented and everything should land in the next release. RiotX/Android doesn’t really have any changes to make given it hadn’t yet implemented Identity Service or Integration Manager APIs.
This has involved a surprisingly large amount of spec work; no fewer than 9 new Matrix Spec Changes (MSC) have been required as part of the project. In particular, this results in a massive update to the Identity Service API, which will be released very shortly with the new MSCs. You can see the upcoming changes on the unstable branch and compare with the previous 0.2.1 stable release, as well as checking the detailed MSCs as follows:
This said, there is still some work remaining for us to do here. The main things which haven’t made it into this release are:
We’ll continue to work on these as part of our ongoing maintenance backlog.
Finally, it’s really worth calling out the amount of effort that went into this project. Huge huge thanks to everyone involved (given it’s cut across pretty much every project & subteam we have working on the core of Matrix) who have soldiered through the backlog. We’ve been tracking progress using our feature-dashboard tool which summarises Github issues based on labels & issue lifecycle, and for better or worse it’s ended up being the biggest project board we’ve ever had. You can see the live data here (warning, it takes tens of seconds to spider Github to gather the data) - or, for posterity and ease of reference, I’ve included the current issue list below. The issues which are completed have “done” after them; the ones still in progress say “in progress”, and ones which haven’t started yet have nothing. We split the project into 3 phases - phases 1 and 2 represent the items needed to fully solve the privacy concerns, phase 3 is right now a mix of "nice to have" polish and some more speculative items. At this point we’ve effectively finished phase 1 on Synapse & Riot/Web, and Riot/Mobile is following close behind. We're continuing to work on phase 2, and we’ll work through phase 3 (where appropriate) as part of our general maintenance backlog.
I hope this gives suitable visibility on how we’re considering privacy; after all, Matrix is useless as an open communication protocol if the openness comes at the expense of user privacy. We’ll give another update once the remaining straggling issues are closed out; and meanwhile, now the bulk of the privacy work is out of the way on Riot/Web, we can finally get back to implementing the UI E2E cross-signing verification and improving first time user experience.
Thanks for your patience and understanding while we’ve sorted this stuff out; and thanks once again for flying Matrix :)
id_access_tokenin CS API when required (done)
id_access_tokenin CS API when required (done)
2019 is a big year for Matrix, in the next month we will have shipped:
Today we are sharing the Matrix core team's backend roadmap. The idea is that this will make it easier for anyone to understand where the project is going, what we consider to be important, and why.
To see the roadmap in its full glory, take a look here.
Our roadmap is not a delivery plan - there are explicitly no dates. The reason for this is that we know that other projects will emerge, developers will be needed to support other urgent initiatives, matrix.org use continues to grow exponentially and will require performance tweaking.
So simply, based on what we know now, this is the order we will work on our projects.
We are often asked ‘Why are you not working on X, it is really important' where the answer is often ‘We agree that X is really important, but A, B and C are more important and must come first'.
The point of sharing the roadmap is to make that priority trade off more transparent and consumable.
We also had Ben (benpa) contribute from a community perspective and took input from speaking to so many of you at FOSDEM.
In the end we filled a wall with post-its, each post-it representing a sizeable project. The position of the post-it was significant in that the vertical axis being a sense of how valuable we thought the task would be, and the horizontal axis being a rough guess on how complex we considered it to be.
We found this sort of grid approach to be really helpful in determining relative priority.
After many hours and plenty of blood, sweat and tears we ended up with something we could live with and wrote it up in the shared board.
Matrix's reference homeserver, Synapse, is written in Python and uses the Twisted networking framework to power its bitslinging across the Internet. The Python version used has been strictly Python 2.7, the last supported version of Python 2, but as of this week that changes! Since Twisted and our other upstream dependencies now support the newest version of Python, Python 3, we are now able to finish the jump and port Synapse to use it by default. The port has been done in a backwards compatible way, written in a subset of Python that is usable in both Python 2 and Python 3, meaning your existing Synapse installs still work on Python 2, while preparing us for a Python 3 future.
The Python 3 port has benefits other than just preparing for the End of Life of Python 2.7. Successive versions of Python 3 have improved the standard library, provided newer and clearer syntax for asynchronous code, added opt-in static typing to reduce bugs, and contained incremental performance and memory management improvements. These features, once Synapse stops supporting Python 2, can then be fully utilised to make Synapse's codebase clearer and more performant. One bonus that we get immediately, though, is Python 3's memory compaction of Unicode strings. Rather than storing as UCS-2/UTF-16 or UCS-4/UTF-32, it will instead store it in the smallest possible representation giving a 50%-75% memory improvement for strings only containing Latin-1 characters, such as nearly all dictionary keys, hashes, IDs, and a large proportion of messages being processed from English speaking countries. Non-English text will also see a memory improvement, as it can be commonly stored in only two bytes instead of the four in a UCS-4 “wide” Python 2 build.Editor's note: If you were wondering how this fits in with Dendrite (the next-gen golang homeserver): our plan is to use Synapse as the reference homeserver for all the current work going on with landing a 1.0 release of the Matrix spec: it makes no sense to try to iterate and converge on 1.0 on both Synapse and Dendrite in parallel. In order to prove that the 1.0 spec is indeed fit for purpose we then also need Synapse to exit beta and hit a 1.0 too, hence the investment to get it there. It's worth noting that over the last year we've been plugging away solidly improving Synapse in general (especially given the increasing number of high-profile deployments out there), so we're committed to getting Synapse to a formal production grade release and supporting it in the long term. Meanwhile, Dendrite development is still progressing - currently acting as a place to experiment with more radical blue-sky architectural changes, especially in low-footprint or even clientside homeservers. We expect it to catch up with Synapse once 1.0 is out the door; and meanwhile Synapse is increasingly benefiting from performance work inspired by Dendrite.
See 10/15, ~20:00 for the Python 3 migration. This is on some of the Synchrotrons on matrix.org.
See ~11/8 for the Python 3 migration. This is on the Synapse master on matrix.org.
We have also noticed some better CPU utilisation:See 21:30 for the migration of federation reader 1, and 21:55 for the others. The federation reader is a particular pathological case, where the replacement of lists with iterators internally on Python 3 has given us some big boosts.
See 10/15, 4:00.The CPU utilisation has gone down on synchrotron 1 after the Python 3 migration, but not as dramatically as the federation reader. Synchrotron 3 was migrated a few days later.
As some extra data-points, my personal HS consumes about 300MB now at initial start, and grows to approximately 800MB -- under Python 2 the growth would be near-immediate to roughly 1.4GB.
Amber Brown (hawkowl)