feat(webrtc): add WebRTC (prev. browser-to-browser) spec (#497)

Introduces the webrtc protocol - a libp2p transport protocol enabling two private nodes (e.g. two browsers) to establish a direct connection.
2026-01-06 22:13:52 -05:00 · 2023-04-12 09:30:21 +02:00
parent 6634ca7abb
commit f8f32f73d1
4 changed files with 458 additions and 322 deletions
--- a/README.md
+++ b/README.md
@@ -99,7 +99,7 @@ see [#465](https://github.com/libp2p/specs/issues/465).
 - [secio][spec_secio] - SECIO, a transport security protocol for libp2p
 - [tls][spec_tls] - The libp2p TLS Handshake (TLS 1.3+)
 - [quic][spec_quic] - The libp2p QUIC Handshake
-- [webrtc][spec_webrtc] - The libp2p WebRTC transport
+- [webrtc][spec_webrtc] - The libp2p WebRTC transports
 - [WebTransport][spec_webtransport] - Using WebTransport in libp2p


--- a/webrtc/README.md
+++ b/webrtc/README.md
@@ -1,8 +1,8 @@
 # WebRTC

-| Lifecycle Stage | Maturity                  | Status | Latest Revision |
-|-----------------|---------------------------|--------|-----------------|
-| 2A              | Candidate Recommendation  | Active | r0, 2022-10-14  |
+| Lifecycle Stage | Maturity                 | Status | Latest Revision |
+|-----------------|--------------------------|--------|-----------------|
+| 2A              | Candidate Recommendation | Active | r1, 2023-04-12  |

 Authors: [@mxinden]

@@ -11,171 +11,27 @@ Interest Group: [@marten-seemann]
 [@marten-seemann]: https://github.com/marten-seemann
 [@mxinden]: https://github.com/mxinden/

-<!-- markdown-toc start - Don't edit this section. Run M-x markdown-toc-refresh-toc -->
-**Table of Contents**
+WebRTC flavors in libp2p:

-- [WebRTC](#webrtc)
-  - [Motivation](#motivation)
-  - [Addressing](#addressing)
-  - [Connection Establishment](#connection-establishment)
-    - [Browser to public Server](#browser-to-public-server)
-  - [Multiplexing](#multiplexing)
-    - [Ordering](#ordering)
-    - [Head-of-line blocking](#head-of-line-blocking)
-    - [`RTCDataChannel` negotiation](#rtcdatachannel-negotiation)
-    - [`RTCDataChannel` label](#rtcdatachannel-label)
-  - [Connection Security](#connection-security)
-  - [Previous, ongoing and related work](#previous-ongoing-and-related-work)
-  - [Test vectors](#test-vectors)
-    - [Noise prologue](#noise-prologue)
-      - [Both client and server use SHA-256](#both-client-and-server-use-sha-256)
-- [FAQ](#faq)
+1. [WebRTC](./webrtc.md)

-<!-- markdown-toc end -->
+   libp2p transport protocol enabling two private nodes (e.g. two browsers) to
+   establish a direct connection.

-## Motivation
+2. [WebRTC Direct](./webrtc-direct.md)

-1. **No need for trusted TLS certificates.** Enable browsers to connect to
-   public server nodes without those server nodes providing a TLS certificate
-   within the browser's trustchain. Note that we can not do this today with our
-   Websocket transport as the browser requires the remote to have a trusted TLS
-   certificate. Nor can we establish a plain TCP or QUIC connection from within
-   a browser. We can establish a WebTransport connection from the browser (see
-   [WebTransport specification](../webtransport)).
+   libp2p transport protocol **without the need for trusted TLS certificates.**
+   Enables browsers to connect to public server nodes without those server nodes
+   providing a TLS certificate within the browser's trust chain. Note that we
+   cannot do this today with our WebSocket transport, as the browser requires
+   the remote to have a trusted TLS certificate. Nor can we establish a plain
+   TCP or QUIC connection from within a browser. We can establish a WebTransport
+   connection from the browser (see [WebTransport
+   specification](../webtransport)).

-## Addressing
+## Shared concepts

-WebRTC multiaddresses are composed of an IP and UDP address component, followed
-by `/webrtc` and a multihash of the certificate that the node uses.
-
-Examples:
-
-- `/ip4/192.0.2.0/udp/1234/webrtc/certhash/<hash>/p2p/<peer-id>`
-- `/ip6/fe80::1ff:fe23:4567:890a/udp/1234/webrtc/certhash/<hash>/p2p/<peer-id>`
-
-The TLS certificate fingerprint in `/certhash` is a
-[multibase](https://github.com/multiformats/multibase) encoded
-[multihash](https://github.com/multiformats/multihash).
-
-For compatibility implementations MUST support hash algorithm
-[`sha-256`](https://github.com/multiformats/multihash) and base encoding
-[`base64url`](https://github.com/multiformats/multibase). Implementations MAY
-support other hash algorithms and base encodings, but they may not be able to
-connect to all other nodes.
-
-## Connection Establishment
-
-### Browser to public Server
-
-Scenario: Browser _A_ wants to connect to server node _B_ where _B_ is publicly
-reachable but _B_ does not have a TLS certificate trusted by _A_.
-
-1. Server node _B_ generates a TLS certificate, listens on a UDP port and
-   advertises the corresponding multiaddress (see [#addressing]) through some
-   external mechanism.
-
-   Given that _B_ is publicly reachable, _B_ acts as a [ICE
-   Lite](https://www.rfc-editor.org/rfc/rfc5245) agent. It binds to a UDP port
-   waiting for incoming STUN and SCTP packets and multiplexes based on source IP
-   and source port.
-
-2. Browser _A_ discovers server node _B_'s multiaddr, containing _B_'s IP, UDP
-  port, TLS certificate fingerprint and optionally libp2p peer ID (e.g.
-  `/ip6/2001:db8::/udp/1234/webrtc/certhash/<hash>/p2p/<peer-id>`), through some
-  external mechanism.
-
-3. _A_ instantiates a `RTCPeerConnection`. See
-   [`RTCPeerConnection()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/RTCPeerConnection).
-
-   _A_ (i.e. the browser) SHOULD NOT reuse the same certificate across
-   `RTCPeerConnection`s. Reusing the certificate can be used to identify _A_
-   across connections by on-path observers given that WebRTC uses TLS 1.2.
-
-4. _A_ constructs _B_'s SDP answer locally based on _B_'s multiaddr.
-
-   _A_ generates a random string prefixed with "libp2p+webrtc+v1/". The prefix
-   allows us to use the ufrag as an upgrade mechanism to role out a new version
-   of the libp2p WebRTC protocol on a live network. While a hack, this might be
-   very useful in the future. _A_ sets the string as the username (_ufrag_ or _username fragment_)
-   and password on the SDP of the remote's answer.
-
-   _A_ MUST set the `a=max-message-size:16384` SDP attribute. See reasoning
-   [multiplexing](#multiplexing) for rational.
-
-   Finally _A_ sets the remote answer via
-   [`RTCPeerConnection.setRemoteDescription()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/setRemoteDescription).
-
-5. _A_ creates a local offer via
-   [`RTCPeerConnection.createOffer()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/createOffer).
-   _A_ sets the same username and password on the local offer as done in (4) on
-   the remote answer.
-
-   _A_ MUST set the `a=max-message-size:16384` SDP attribute. See reasoning
-   [multiplexing](#multiplexing) for rational.
-
-   Finally _A_ sets the modified offer via
-   [`RTCPeerConnection.setLocalDescription()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/setLocalDescription).
-
-   Note that this process, oftentimes referred to as "SDP munging" is disallowed
-   by the specification, but not enforced across the major browsers (Safari,
-   Firefox, Chrome) due to use-cases in the wild. See also
-   <https://bugs.chromium.org/p/chromium/issues/detail?id=823036>
-
-6. Once _A_ sets the SDP offer and answer, it will start sending STUN requests
-   to _B_. _B_ reads the _ufrag_ from the incoming STUN request's _username_
-   field. _B_ then infers _A_'s SDP offer using the IP, port, and _ufrag_ of the
-   request as follows:
-
-   1. _B_ sets the the `ice-ufrag` and `ice-pwd` equal to the value read from
-      the `username` field.
-
-   2. _B_ sets an arbitrary sha-256 digest as the remote fingerprint as it does
-      not verify fingerprints at this point.
-
-   3. _B_ sets the connection field (`c`) to the IP and port of the incoming
-      request `c=IN <ip> <port>`.
-
-   4. _B_ sets the `a=max-message-size:16384` SDP attribute. See reasoning
-      [multiplexing](#multiplexing) for rational.
-
-   _B_ sets this offer as the remote description. _B_ generates an answer and
-   sets it as the local description.
-
-   The _ufrag_ in combination with the IP and port of _A_ can be used by _B_
-   to identify the connection, i.e. demultiplex incoming UDP datagrams per
-   incoming connection.
-
-   Note that this step requires _B_ to allocate memory for each incoming STUN
-   message from _A_. This could be leveraged for a DOS attack where _A_ is
-   sending many STUN messages with different ufrags using different UDP source
-   ports, forcing _B_ to allocate a new peer connection for each. _B_ SHOULD
-   have a rate limiting mechanism in place as a defense measure. See also
-   <https://datatracker.ietf.org/doc/html/rfc5389#section-16.1.2>.
-
-7. _A_ and _B_ execute the DTLS handshake as part of the standard WebRTC
-   connection establishment.
-
-   At this point _B_ does not know the TLS certificate fingerprint of _A_. Thus
-   _B_ can not verify _A_'s TLS certificate fingerprint during the DTLS
-   handshake. Instead _B_ needs to _disable certificate fingerprint
-   verification_ (see e.g. [Pion's `disableCertificateFingerprintVerification`
-   option](https://github.com/pion/webrtc/blob/360b0f1745c7244850ed638f423cda716a81cedf/settingengine.go#L62)).
-
-   On success of the DTLS handshake the connection provides confidentiality and
-   integrity but not authenticity. The latter is guaranteed through the
-   succeeding Noise handshake. See [Connection Security
-   section](#connection-security).
-
-8. Messages on each `RTCDataChannel` are framed using the message
-   framing mechanism described in [Multiplexing](#multiplexing).
-
-9. The remote is authenticated via an additional Noise handshake. See
-   [Connection Security section](#connection-security).
-
-WebRTC can run both on UDP and TCP. libp2p WebRTC implementations MUST support
-UDP and MAY support TCP.
-
-## Multiplexing
+### Multiplexing

 The WebRTC browser APIs do not support half-closing of streams nor resets of the
 sending part of streams.
@@ -234,14 +90,14 @@ limits"](https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/Using_data_
 Implementations MAY choose to send smaller messages, e.g. to reduce delays
 sending _flagged_ messages.

-### Ordering
+#### Ordering

 Implementations MAY expose an unordered byte stream abstraction to the user by
 overriding the default value of `ordered` `true` to `false` when creating a new
 data channel via
 [`RTCPeerConnection.createDataChannel`](https://www.w3.org/TR/webrtc/#dom-peerconnection-createdatachannel).

-### Head-of-line blocking
+#### Head-of-line blocking

 WebRTC datachannels and the underlying SCTP is message-oriented and not
 stream-oriented (e.g. see
@@ -272,7 +128,7 @@ IPv4 and 1224 bytes on IPv6.
 Long term we hope to be able to give better recommendations based on
 real-world experiments.

-### `RTCDataChannel` negotiation
+#### `RTCDataChannel` negotiation

 `RTCDataChannel`s are negotiated in-band by the WebRTC user agent (e.g. Firefox,
 Pion, ...). In other words libp2p WebRTC implementations MUST NOT change the
@@ -294,7 +150,7 @@ containing user data without waiting for the reception of the corresponding
 DATA_CHANNEL_ACK message", thus using `negotiated: false` does not imply an
 additional round trip for each new `RTCDataChannel`.

-### `RTCDataChannel` label
+#### `RTCDataChannel` label

 `RTCPeerConnection.createDataChannel()` requires passing a `label` for the
 to-be-created `RTCDataChannel`. When calling `createDataChannel` implementations
@@ -303,66 +159,6 @@ MUST pass an empty string. When receiving an `RTCDataChannel` via
 an empty string. This allows future versions of this specification to make use
 of the `RTCDataChannel` `label` property.

-## Connection Security
-
-Note that the below uses the message framing described in
-[multiplexing](#multiplexing).
-
-While WebRTC offers confidentiality and integrity via TLS, one still needs to
-authenticate the remote peer by its libp2p identity.
-
-After [Connection Establishment](#connection-establishment):
-
-1. _A_ and _B_ open a WebRTC data channel with `id: 0` and `negotiated: true`
-   ([`pc.createDataChannel("", {negotiated: true, id:
-   0});`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/createDataChannel)).
-
-2. _B_ starts a Noise `XX` handshake on the new channel. See
-   [noise-libp2p](https://github.com/libp2p/specs/tree/master/noise).
-
-   _A_ and _B_ use the [Noise
-   Prologue](https://noiseprotocol.org/noise.html#prologue) mechanism. More
-   specifically _A_ and _B_ set the Noise _Prologue_ to
-   `<PREFIX><FINGERPRINT_A><FINGERPRINT_B>` before starting the actual Noise
-   handshake. `<PREFIX>` is the UTF-8 byte representation of the string
-   `libp2p-webrtc-noise:`. `<FINGERPRINT_A><FINGERPRINT_B>` is the concatenation
-   of the two TLS fingerprints of _A_ (Noise handshake responder) and then _B_
-   (Noise handshake initiator), in their multihash byte representation.
-
-   On Chrome _A_ can access its TLS certificate fingerprint directly via
-   `RTCCertificate#getFingerprints`. Firefox does not allow _A_ to do so. Browser
-   compatibility can be found
-   [here](https://developer.mozilla.org/en-US/docs/Web/API/RTCCertificate). In
-   practice, this is not an issue since the fingerprint is embedded in the local
-   SDP string.
-
-3. On success of the authentication handshake, the used datachannel is
-   closed and the plain WebRTC connection is used with its multiplexing
-   capabilities via datachannels. See [Multiplexing](#multiplexing).
-
-Note: WebRTC supports different hash functions to hash the TLS certificate (see
-<https://datatracker.ietf.org/doc/html/rfc8122#section-5>). The hash function used
-in WebRTC and the hash function used in the multiaddr `/certhash` component MUST
-be the same. On mismatch the final Noise handshake MUST fail.
-
-_A_ knows _B_'s fingerprint hash algorithm through _B_'s multiaddr. _A_ MUST use
-the same hash algorithm to calculate the fingerprint of its (i.e. _A_'s) TLS
-certificate. _B_ assumes _A_ to use the same hash algorithm it discovers through
-_B_'s multiaddr. For now implementations MUST support sha-256. Future iterations
-of this specification may add support for other hash algorithms.
-
-Implementations SHOULD setup all the necessary callbacks (e.g.
-[`ondatachannel`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/datachannel_event))
-before starting the Noise handshake. This is to avoid scenarios like one where
-_A_ initiates a stream before _B_ got a chance to set the `ondatachannel`
-callback. This would result in _B_ ignoring all the messages coming from _A_
-targeting that stream.
-
-Implementations MAY open streams before completion of the Noise handshake.
-Applications MUST take special care what application data they send, since at
-this point the peer is not yet authenticated. Similarly, the receiving side MAY
-accept streams before completion of the handshake.
-
 ## Previous, ongoing and related work

 - Completed implementations of this specification:
@@ -375,68 +171,7 @@ accept streams before completion of the handshake.
    WASM): <https://github.com/wngr/libp2p-webrtc>
  - WebRTC using STUN and TURN: <https://github.com/libp2p/js-libp2p-webrtc-star>

-## Test vectors
-
-### Noise prologue
-
-All of these test vectors represent hex-encoded bytes.
-
-#### Both client and server use SHA-256
-
-Here client is _A_ and server is _B_.
-
-```
-client_fingerprint = "3e79af40d6059617a0d83b83a52ce73b0c1f37a72c6043ad2969e2351bdca870"
-server_fingerprint = "30fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99"
-
-prologue = "6c69627032702d7765627274632d6e6f6973653a12203e79af40d6059617a0d83b83a52ce73b0c1f37a72c6043ad2969e2351bdca870122030fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99"
-```
-
-# FAQ
-
-- _Why exchange the TLS certificate fingerprint in the multiaddr? Why not
-  base it on the libp2p public key?_
-
-  Browsers do not allow loading a custom certificate. One can only generate a
-  certificate via
-  [rtcpeerconnection-generatecertificate](https://www.w3.org/TR/webrtc/#dom-rtcpeerconnection-generatecertificate).
-
-- _Why not embed the peer ID in the TLS certificate, thus rendering the
-  additional "peer certificate" exchange obsolete?_
-
-  Browsers do not allow editing the properties of the TLS certificate.
-
-- _How about distributing the multiaddr in a signed peer record, thus rendering
-  the additional "peer certificate" exchange obsolete?_
-
-  Signed peer records are not yet rolled out across the many libp2p protocols.
-  Making the libp2p WebRTC protocol dependent on the former is not deemed worth
-  it at this point in time. Later versions of the libp2p WebRTC protocol might
-  adopt this optimization.
-
-  Note, one can role out a new version of the libp2p WebRTC protocol through a
-  new multiaddr protocol, e.g. `/webrtc-2`.
-
-- _Why exchange fingerprints in an additional authentication handshake on top of
-  an established WebRTC connection? Why not only exchange signatures of ones TLS
-  fingerprints signed with ones libp2p private key on the plain WebRTC
-  connection?_
-
-  Once _A_ and _B_ established a WebRTC connection, _A_ sends
-  `signature_libp2p_a(fingerprint_a)` to _B_ and vice versa. While this has the
-  benefit of only requring two messages, thus one round trip, it is prone to a
-  key compromise and replay attack. Say that _E_ is able to attain
-  `signature_libp2p_a(fingerprint_a)` and somehow compromise _A_'s TLS private
-  key, _E_ can now impersonate _A_ without knowing _A_'s libp2p private key.
-
-  If one requires the signatures to contain both fingerprints, e.g.
-  `signature_libp2p_a(fingerprint_a, fingerprint_b)`, the above attack still
-  works, just that _E_ can only impersonate _A_ when talking to _B_.
-
-  Adding a cryptographic identifier of the unique connection (i.e. session) to
-  the signature (`signature_libp2p_a(fingerprint_a, fingerprint_b,
-  connection_identifier)`) would protect against this attack. To the best of our
-  knowledge the browser does not give us access to such identifier.
+## FAQ

 - _Why use Protobuf for WebRTC message framing. Why not use our own,
  potentially smaller encoding schema?_
@@ -451,44 +186,11 @@ prologue = "6c69627032702d7765627274632d6e6f6973653a12203e79af40d6059617a0d83b83
  going forward. Using Protobuf is consistent with the many other libp2p
  protocols. These benefits outweigh the drawback of additional overhead.

-- _Can a browser know upfront its UDP port which it is listening for incoming
-  connections on? Does the browser reuse the UDP port across many WebRTC
-  connections? If that is the case one could connect to any public node, with
-  the remote telling the local node what port it is perceived on. Thus one could
-  use libp2p's identify and AutoNAT protocol instead of relying on STUN._
-
-  No, a browser uses a new UDP port for each `RTCPeerConnection`.
-
-- _Why not load a remote node's certificate into one's browser trust-store and
-  then connect e.g. via WebSocket._
-
-  This would require a mechanism to discover remote node's certificates upfront.
-  More importantly, this does not scale with the number of connections a typical
-  peer-to-peer application establishes.
-
 - _Why not use a central TURN servers? Why rely on libp2p's Circuit Relay v2
  instead?_

  As a peer-to-peer networking library, libp2p should rely as little as possible
  on central infrastructure.

-- _Can an attacker launch an amplification attack with the STUN endpoint of
-  the server?_
-
-  We follow the reasoning of the QUIC protocol, namely requiring:
-
-  > an endpoint MUST limit the amount of data it sends to the unvalidated
-  > address to three times the amount of data received from that address.
-
-  <https://datatracker.ietf.org/doc/html/rfc9000#section-8>
-
-  This is the case for STUN response messages which are only slight larger than
-  the request messages. See also
-  <https://datatracker.ietf.org/doc/html/rfc5389#section-16.1.2>.
-
-- _Why does B start the Noise handshake and not A?_
-
-  Given that WebRTC uses DTLS 1.2, _B_ is the one that can send data first.
-
 [QUIC RFC]: https://www.rfc-editor.org/rfc/rfc9000.html
 [uvarint-spec]: https://github.com/multiformats/unsigned-varint
--- a/webrtc/webrtc-direct.md
+++ b/webrtc/webrtc-direct.md
@@ -0,0 +1,312 @@
+# WebRTC Direct
+
+| Lifecycle Stage | Maturity                  | Status | Latest Revision |
+|-----------------|---------------------------|--------|-----------------|
+| 2A              | Candidate Recommendation  | Active | r1, 2023-04-12  |
+
+Authors: [@mxinden]
+
+Interest Group: [@marten-seemann]
+
+[@marten-seemann]: https://github.com/marten-seemann
+[@mxinden]: https://github.com/mxinden/
+
+## Motivation
+
+**No need for trusted TLS certificates.** Enable browsers to connect to public
+server nodes without those server nodes providing a TLS certificate within the
+browser's trust chain. Note that we cannot do this today with our WebSocket
+transport, as the browser requires the remote to have a trusted TLS certificate.
+Nor can we establish a plain TCP or QUIC connection from within a browser. We
+can establish a WebTransport connection from the browser (see [WebTransport
+specification](../webtransport)).
+
+## Addressing
+
+WebRTC Direct multiaddresses are composed of an IP and UDP address component, followed
+by `/webrtc-direct` and a multihash of the certificate that the node uses.
+
+Examples:
+- `/ip4/1.2.3.4/udp/1234/webrtc-direct/certhash/<hash>/p2p/<peer-id>`
+- `/ip6/fe80::1ff:fe23:4567:890a/udp/1234/webrtc-direct/certhash/<hash>/p2p/<peer-id>`
+
+The TLS certificate fingerprint in `/certhash` is a
+[multibase](https://github.com/multiformats/multibase) encoded
+[multihash](https://github.com/multiformats/multihash).
+
+For compatibility implementations MUST support hash algorithm
+[`sha-256`](https://github.com/multiformats/multihash) and base encoding
+[`base64url`](https://github.com/multiformats/multibase). Implementations MAY
+support other hash algorithms and base encodings, but they may not be able to
+connect to all other nodes.
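
The `/certhash` encoding described above can be sketched in Python as follows. This is a minimal illustration, not part of the specification. The function names are hypothetical, and the sketch assumes that the fingerprint is the SHA-256 digest of the DER-encoded certificate, that `0x12` is the multihash code for sha2-256, and that `u` is the multibase prefix for unpadded base64url.

```python
import base64
import hashlib

SHA2_256_CODE = 0x12        # assumed multihash code for sha2-256
MULTIBASE_BASE64URL = "u"   # assumed multibase prefix for unpadded base64url


def multihash_sha256(digest: bytes) -> bytes:
    """Wrap a raw SHA-256 digest as <code><length><digest>."""
    if len(digest) != 32:
        raise ValueError("a SHA-256 digest must be 32 bytes")
    return bytes([SHA2_256_CODE, len(digest)]) + digest


def certhash_component(cert_der: bytes) -> str:
    """Return the value that follows /certhash/ for a DER-encoded certificate."""
    multihash = multihash_sha256(hashlib.sha256(cert_der).digest())
    encoded = base64.urlsafe_b64encode(multihash).rstrip(b"=").decode("ascii")
    return MULTIBASE_BASE64URL + encoded


def webrtc_direct_multiaddr(ip4: str, udp_port: int, cert_der: bytes) -> str:
    """Assemble an IPv4 WebRTC Direct multiaddr without the optional /p2p part."""
    certhash = certhash_component(cert_der)
    return f"/ip4/{ip4}/udp/{udp_port}/webrtc-direct/certhash/{certhash}"
```

A dialer would do the reverse: strip the multibase prefix, decode the multihash, and compare the contained digest with the fingerprint of the certificate presented by the remote.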
+
+## Connection Establishment
+
+### Browser to public Server
+
+Scenario: Browser _A_ wants to connect to server node _B_ where _B_ is publicly
+reachable but _B_ does not have a TLS certificate trusted by _A_.
+
+1. Server node _B_ generates a TLS certificate, listens on a UDP port and
+   advertises the corresponding multiaddress (see [#addressing]) through some
+   external mechanism.
+
+   Given that _B_ is publicly reachable, _B_ acts as an [ICE
+   Lite](https://www.rfc-editor.org/rfc/rfc5245) agent. It binds to a UDP port
+   waiting for incoming STUN and SCTP packets and multiplexes based on source IP
+   and source port.
+
+2. Browser _A_ discovers server node _B_'s multiaddr, containing _B_'s IP, UDP
+  port, TLS certificate fingerprint and optionally libp2p peer ID (e.g.
+  `/ip6/2001:db8::/udp/1234/webrtc-direct/certhash/<hash>/p2p/<peer-id>`), through some
+  external mechanism.
+
+3. _A_ instantiates an `RTCPeerConnection`. See
+   [`RTCPeerConnection()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/RTCPeerConnection).
+
+   _A_ (i.e. the browser) SHOULD NOT reuse the same certificate across
+   `RTCPeerConnection`s. Given that WebRTC uses DTLS 1.2, on-path observers can
+   use a reused certificate to identify _A_ across connections.
+
+4. _A_ constructs _B_'s SDP answer locally based on _B_'s multiaddr.
+
+   _A_ generates a random string prefixed with "libp2p+webrtc+v1/". The prefix
+   allows us to use the ufrag as an upgrade mechanism to roll out a new version
+   of the libp2p WebRTC protocol on a live network. While a hack, this might be
+   very useful in the future. _A_ sets the string as both the username (_ufrag_
+   or _username fragment_) and the password on the SDP of the remote's answer.
+
+   _A_ MUST set the `a=max-message-size:16384` SDP attribute. See
+   [multiplexing] for the rationale.
+
+   Finally _A_ sets the remote answer via
+   [`RTCPeerConnection.setRemoteDescription()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/setRemoteDescription).
+
+5. _A_ creates a local offer via
+   [`RTCPeerConnection.createOffer()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/createOffer).
+   _A_ sets the same username and password on the local offer as done in (4) on
+   the remote answer.
+
+   _A_ MUST set the `a=max-message-size:16384` SDP attribute. See
+   [multiplexing] for the rationale.
+
+   Finally _A_ sets the modified offer via
+   [`RTCPeerConnection.setLocalDescription()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/setLocalDescription).
+
+   Note that this process, oftentimes referred to as "SDP munging", is
+   disallowed by the specification, but not enforced across the major browsers
+   (Safari, Firefox, Chrome) due to use-cases in the wild. See also
+   https://bugs.chromium.org/p/chromium/issues/detail?id=823036
+
+6. Once _A_ sets the SDP offer and answer, it will start sending STUN requests
+   to _B_. _B_ reads the _ufrag_ from the incoming STUN request's _username_
+   field. _B_ then infers _A_'s SDP offer using the IP, port, and _ufrag_ of the
+   request as follows:
+
+   1. _B_ sets the `ice-ufrag` and `ice-pwd` equal to the value read from
+      the `username` field.
+
+   2. _B_ sets an arbitrary sha-256 digest as the remote fingerprint as it does
+      not verify fingerprints at this point.
+
+   3. _B_ sets the connection field (`c`) to the IP and port of the incoming
+      request `c=IN <ip> <port>`.
+
+   4. _B_ sets the `a=max-message-size:16384` SDP attribute. See
+      [multiplexing] for the rationale.
+
+   _B_ sets this offer as the remote description. _B_ generates an answer and
+   sets it as the local description.
+
+   The _ufrag_ in combination with the IP and port of _A_ can be used by _B_
+   to identify the connection, i.e. demultiplex incoming UDP datagrams per
+   incoming connection.
+
+   Note that this step requires _B_ to allocate memory for each incoming STUN
+   message from _A_. This could be leveraged for a DoS attack where _A_ sends
+   many STUN messages with different ufrags from different UDP source ports,
+   forcing _B_ to allocate a new peer connection for each. _B_ SHOULD have a
+   rate limiting mechanism in place as a defense measure. See also
+   https://datatracker.ietf.org/doc/html/rfc5389#section-16.1.2.
+
+7. _A_ and _B_ execute the DTLS handshake as part of the standard WebRTC
+   connection establishment.
+
+   At this point _B_ does not know the TLS certificate fingerprint of _A_. Thus
+   _B_ can not verify _A_'s TLS certificate fingerprint during the DTLS
+   handshake. Instead _B_ needs to _disable certificate fingerprint
+   verification_ (see e.g. [Pion's `disableCertificateFingerprintVerification`
+   option](https://github.com/pion/webrtc/blob/360b0f1745c7244850ed638f423cda716a81cedf/settingengine.go#L62)).
+
+   On success of the DTLS handshake the connection provides confidentiality and
+   integrity but not authenticity. The latter is guaranteed through the
+   succeeding Noise handshake. See [Connection Security
+   section](#connection-security).
+
+8. Messages on each `RTCDataChannel` are framed using the message
+   framing mechanism described in [Multiplexing].
+
+9. The remote is authenticated via an additional Noise handshake. See
+   [Connection Security section](#connection-security).
+
+WebRTC can run both on UDP and TCP. libp2p WebRTC implementations MUST support
+UDP and MAY support TCP.
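
The ufrag handling in steps 4 to 6 can be sketched in Python as follows. This is a hedged illustration, not a reference implementation. All function names are hypothetical, the 32-character alphanumeric suffix is an arbitrary choice (the text above only requires a random string), and appending the `max-message-size` attribute at the end assumes an SDP with a single data-channel media section.

```python
import secrets
import string

UFRAG_PREFIX = "libp2p+webrtc+v1/"
MAX_MESSAGE_SIZE_ATTR = "a=max-message-size:16384"
_ALPHABET = string.ascii_letters + string.digits


def new_ufrag(random_len: int = 32) -> str:
    """Generate the shared ufrag/password value used by _A_ (length is an assumption)."""
    suffix = "".join(secrets.choice(_ALPHABET) for _ in range(random_len))
    return UFRAG_PREFIX + suffix


def munge_sdp(sdp: str, ufrag: str) -> str:
    """Set ufrag and password to the same value and pin max-message-size."""
    out = []
    for line in sdp.split("\r\n"):
        if line.startswith("a=ice-ufrag:"):
            out.append(f"a=ice-ufrag:{ufrag}")
        elif line.startswith("a=ice-pwd:"):
            out.append(f"a=ice-pwd:{ufrag}")
        elif line.startswith("a=max-message-size:"):
            continue  # dropped here, re-added below with the required value
        elif line:
            out.append(line)
    # Assumes a single media section, so the attribute lands inside it.
    out.append(MAX_MESSAGE_SIZE_ATTR)
    return "\r\n".join(out) + "\r\n"


def is_supported_ufrag(ufrag: str) -> bool:
    """Server-side check that a ufrag read from STUN carries the v1 prefix."""
    return ufrag.startswith(UFRAG_PREFIX) and len(ufrag) > len(UFRAG_PREFIX)
```

A server could call `is_supported_ufrag` before allocating any per-connection state, which keeps the version check cheap and complements the rate limiting recommended in step 6.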
+
+
+## Connection Security
+
+Note that the below uses the message framing described in
+[multiplexing].
+
+While WebRTC offers confidentiality and integrity via TLS, one still needs to
+authenticate the remote peer by its libp2p identity.
+
+After [Connection Establishment](#connection-establishment):
+
+1. _A_ and _B_ open a WebRTC data channel with `id: 0` and `negotiated: true`
+   ([`pc.createDataChannel("", {negotiated: true, id:
+   0});`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/createDataChannel)).
+
+2. _B_ starts a Noise `XX` handshake on the new channel. See
+   [noise-libp2p](https://github.com/libp2p/specs/tree/master/noise).
+
+   _A_ and _B_ use the [Noise
+   Prologue](https://noiseprotocol.org/noise.html#prologue) mechanism. More
+   specifically _A_ and _B_ set the Noise _Prologue_ to
+   `<PREFIX><FINGERPRINT_A><FINGERPRINT_B>` before starting the actual Noise
+   handshake. `<PREFIX>` is the UTF-8 byte representation of the string
+   `libp2p-webrtc-noise:`. `<FINGERPRINT_A><FINGERPRINT_B>` is the concatenation
+   of the two TLS fingerprints of _A_ (Noise handshake responder) and then _B_
+   (Noise handshake initiator), in their multihash byte representation.
+
+   On Chrome _A_ can access its TLS certificate fingerprint directly via
+   `RTCCertificate#getFingerprints`. Firefox does not allow _A_ to do so. Browser
+   compatibility can be found
+   [here](https://developer.mozilla.org/en-US/docs/Web/API/RTCCertificate). In
+   practice, this is not an issue since the fingerprint is embedded in the local
+   SDP string.
+
+3. On success of the authentication handshake, the used datachannel is
+   closed and the plain WebRTC connection is used with its multiplexing
+   capabilities via datachannels. See [Multiplexing].
+
+Note: WebRTC supports different hash functions to hash the TLS certificate (see
+https://datatracker.ietf.org/doc/html/rfc8122#section-5). The hash function used
+in WebRTC and the hash function used in the multiaddr `/certhash` component MUST
+be the same. On mismatch the final Noise handshake MUST fail.
+
+_A_ knows _B_'s fingerprint hash algorithm through _B_'s multiaddr. _A_ MUST use
+the same hash algorithm to calculate the fingerprint of its (i.e. _A_'s) TLS
+certificate. _B_ assumes _A_ to use the same hash algorithm it discovers through
+_B_'s multiaddr. For now implementations MUST support sha-256. Future iterations
+of this specification may add support for other hash algorithms.
+
+Implementations SHOULD set up all the necessary callbacks (e.g.
+[`ondatachannel`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/datachannel_event))
+before starting the Noise handshake. This is to avoid scenarios like one where
+_A_ initiates a stream before _B_ got a chance to set the `ondatachannel`
+callback. This would result in _B_ ignoring all the messages coming from _A_
+targeting that stream.
+
+Implementations MAY open streams before completion of the Noise handshake.
+Applications MUST take special care with the application data they send, since
+at this point the peer is not yet authenticated. Similarly, the receiving side
+MAY accept streams before completion of the handshake.
+
+## Test vectors
+
+### Noise prologue
+
+All of these test vectors represent hex-encoded bytes.
+
+#### Both client and server use SHA-256
+
+Here client is _A_ and server is _B_.
+
+```
+client_fingerprint = "3e79af40d6059617a0d83b83a52ce73b0c1f37a72c6043ad2969e2351bdca870"
+server_fingerprint = "30fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99"
+
+prologue = "6c69627032702d7765627274632d6e6f6973653a12203e79af40d6059617a0d83b83a52ce73b0c1f37a72c6043ad2969e2351bdca870122030fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99"
+```
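
The test vector above can be reproduced with a short Python sketch. The function and constant names are hypothetical. The sketch assumes both fingerprints are raw SHA-256 digests that are wrapped as multihashes (`0x12 0x20` followed by the digest) before concatenation, as described in the Connection Security section.

```python
NOISE_PROLOGUE_PREFIX = b"libp2p-webrtc-noise:"
SHA2_256_MULTIHASH_HEADER = bytes([0x12, 0x20])  # sha2-256 code, 32-byte length


def noise_prologue(fingerprint_a: bytes, fingerprint_b: bytes) -> bytes:
    """Build <PREFIX><FINGERPRINT_A><FINGERPRINT_B> from raw SHA-256 digests.

    fingerprint_a belongs to _A_ (client, Noise responder),
    fingerprint_b belongs to _B_ (server, Noise initiator).
    """
    for fingerprint in (fingerprint_a, fingerprint_b):
        if len(fingerprint) != 32:
            raise ValueError("expected a 32-byte SHA-256 digest")
    return (
        NOISE_PROLOGUE_PREFIX
        + SHA2_256_MULTIHASH_HEADER + fingerprint_a
        + SHA2_256_MULTIHASH_HEADER + fingerprint_b
    )


client_fingerprint = bytes.fromhex(
    "3e79af40d6059617a0d83b83a52ce73b0c1f37a72c6043ad2969e2351bdca870"
)
server_fingerprint = bytes.fromhex(
    "30fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99"
)
prologue = noise_prologue(client_fingerprint, server_fingerprint)
```

Comparing `prologue.hex()` with the `prologue` value listed above is a quick interoperability check for a new implementation.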
+
+## FAQ
+
+- _Why exchange the TLS certificate fingerprint in the multiaddr? Why not
+  base it on the libp2p public key?_
+
+  Browsers do not allow loading a custom certificate. One can only generate a
+  certificate via
+  [rtcpeerconnection-generatecertificate](https://www.w3.org/TR/webrtc/#dom-rtcpeerconnection-generatecertificate).
+
+- _Why not embed the peer ID in the TLS certificate, thus rendering the
+  additional "peer certificate" exchange obsolete?_
+
+  Browsers do not allow editing the properties of the TLS certificate.
+
+- _How about distributing the multiaddr in a signed peer record, thus rendering
+  the additional "peer certificate" exchange obsolete?_
+
+  Signed peer records are not yet rolled out across the many libp2p protocols.
+  Making the libp2p WebRTC protocol dependent on the former is not deemed worth
+  it at this point in time. Later versions of the libp2p WebRTC protocol might
+  adopt this optimization.
+
+  Note, one can roll out a new version of the libp2p WebRTC protocol through a
+  new multiaddr protocol, e.g. `/webrtc-direct-2`.
+
+- _Why exchange fingerprints in an additional authentication handshake on top of
+  an established WebRTC connection? Why not only exchange signatures of one's TLS
+  fingerprints signed with one's libp2p private key on the plain WebRTC
+  connection?_
+
+  In that scheme, once _A_ and _B_ have established a WebRTC connection, _A_
+  sends `signature_libp2p_a(fingerprint_a)` to _B_ and vice versa. While this
+  has the benefit of only requiring two messages, thus one round trip, it is
+  prone to a key compromise and replay attack. If _E_ is able to obtain
+  `signature_libp2p_a(fingerprint_a)` and somehow compromise _A_'s TLS private
+  key, _E_ can impersonate _A_ without knowing _A_'s libp2p private key.
+
+  If one requires the signatures to contain both fingerprints, e.g.
+  `signature_libp2p_a(fingerprint_a, fingerprint_b)`, the above attack still
+  works, just that _E_ can only impersonate _A_ when talking to _B_.
+
+  Adding a cryptographic identifier of the unique connection (i.e. session) to
+  the signature (`signature_libp2p_a(fingerprint_a, fingerprint_b,
+  connection_identifier)`) would protect against this attack. To the best of our
+  knowledge the browser does not give us access to such an identifier.
+
+- _Can a browser know upfront the UDP port on which it listens for incoming
+  connections? Does the browser reuse the UDP port across many WebRTC
+  connections? If that is the case one could connect to any public node, with
+  the remote telling the local node what port it is perceived on. Thus one could
+  use libp2p's identify and AutoNAT protocol instead of relying on STUN._
+
+  No, a browser uses a new UDP port for each `RTCPeerConnection`.
+
+- _Why not load a remote node's certificate into one's browser trust-store and
+  then connect e.g. via WebSocket?_
+
+  This would require a mechanism to discover remote nodes' certificates upfront.
+  More importantly, this does not scale with the number of connections a typical
+  peer-to-peer application establishes.
+
+- _Can an attacker launch an amplification attack with the STUN endpoint of
+  the server?_
+
+  We follow the reasoning of the QUIC protocol, namely requiring:
+
+  > an endpoint MUST limit the amount of data it sends to the unvalidated
+  > address to three times the amount of data received from that address.
+
+  https://datatracker.ietf.org/doc/html/rfc9000#section-8
+
+  This is the case for STUN response messages which are only slightly larger than
+  the request messages. See also
+  https://datatracker.ietf.org/doc/html/rfc5389#section-16.1.2.
+
+- _Why does B start the Noise handshake and not A?_
+
+  Given that WebRTC uses DTLS 1.2, _B_ is the one that can send data first.
+
+[multiplexing]: ./README.md#multiplexing
--- a/webrtc/webrtc.md
+++ b/webrtc/webrtc.md
@@ -0,0 +1,122 @@
+# WebRTC
+
+| Lifecycle Stage | Maturity                 | Status | Latest Revision |
+|-----------------|--------------------------|--------|-----------------|
+| 2A              | Candidate Recommendation | Active | r0, 2023-04-12  |
+
+Authors: [@mxinden]
+
+## Motivation
+
+A libp2p transport protocol enabling two private nodes (e.g. two browsers) to establish a direct connection.
+
+Browser node _A_ wants to connect to browser node _B_ with the help of server node _R_.
+Neither _A_ nor _B_ can listen for incoming connections, both because they run in a constrained environment (i.e. a browser) whose only transport capability is the W3C WebRTC `RTCPeerConnection` API and because they are behind a NAT and/or firewall.
+Note that _A_ and/or _B_ may as well be non-browser nodes behind NATs and/or firewalls.
+However, for two non-browser nodes, TCP or QUIC hole punching with [DCUtR] will be the more efficient way to establish a direct connection.
+
+On a historical note, this specification replaces the existing [libp2p WebRTC star](https://github.com/libp2p/js-libp2p-webrtc-star) and [libp2p WebRTC direct](https://github.com/libp2p/js-libp2p-webrtc-direct) protocols.
+
+## Connection Establishment
+
+1. _B_ advertises support for the WebRTC browser-to-browser protocol by appending `/webrtc` to its relayed multiaddr, meaning it takes the form of `<relayed-multiaddr>/webrtc/p2p/<b-peer-id>`.
+
+2. Upon discovery of _B_'s multiaddress, _A_ learns that _B_ supports the WebRTC transport and knows how to establish a relayed connection to _B_ to run the `/webrtc-signaling` protocol on top.
+
+3. _A_ establishes a relayed connection to _B_.
+   Note that further steps depend on the relayed connection being authenticated, i.e. that data sent on the relayed connection can be trusted.
+
+4. _A_ (outbound side of relayed connection) creates an `RTCPeerConnection` provided by a W3C compliant WebRTC implementation (e.g. a browser).
+   See [STUN](#stun) section on what STUN servers to configure at creation time.
+   _A_ creates an SDP offer via `RTCPeerConnection.createOffer()`.
+   _A_ initiates the signaling protocol to _B_ via the relayed connection from (3), see [Signaling Protocol](#signaling-protocol), and sends the offer to _B_.
+   Note that _A_ being the initiator of the stream is merely a convention that prevents both nodes from simultaneously initiating a new connection, which could result in two WebRTC connections.
+   _A_ MUST as well be able to handle an incoming signaling protocol stream to support the case where _B_ initiates the signaling process.
+
+5. On reception of the incoming stream, _B_ (inbound side of relayed connection) creates an `RTCPeerConnection`.
+   Again see [STUN](#stun) section on what STUN servers to configure at creation time.
+   _B_ receives _A_'s offer sent in (4) via the signaling protocol stream and provides the offer to its `RTCPeerConnection` via `RTCPeerConnection.setRemoteDescription`.
+   _B_ then creates an answer via `RTCPeerConnection.createAnswer` and sends it to _A_ via the existing signaling protocol stream (see [Signaling Protocol](#signaling-protocol)).
+
+6. _A_ receives _B_'s answer via the signaling protocol stream and sets it locally via `RTCPeerConnection.setRemoteDescription`.
+
+7. _A_ and _B_ send their local ICE candidates via the existing signaling protocol stream to enable trickle ICE.
+   Both nodes continuously read from the stream, adding incoming remote candidates via `RTCPeerConnection.addIceCandidate()`.
+
+8. On successful establishment of the direct connection, _B_ and _A_ close the signaling protocol stream.
+   On failure _B_ and _A_ reset the signaling protocol stream.
+
+   Behavior for transferring data on a relayed connection, in the case where the direct connection failed, is out of scope for this specification and dependent on the application.
+
+9. Messages on `RTCDataChannel`s on the established `RTCPeerConnection` are framed using the message framing mechanism described in [multiplexing].
+
+## STUN
+
+A node needs to discover its public IP and port, which is forwarded to the remote node in order to connect to the local node.
+On non-browser libp2p nodes doing a hole punch with TCP or QUIC, the libp2p node discovers its public address via the [identify] protocol.
+One cannot use the [identify] protocol on browser nodes to discover one's public IP and port, given that the browser uses a new port for each connection.
+For example, say that the local browser node establishes a WebRTC connection C1 via browser-to-server to a server node and runs the [identify] protocol.
+The returned observed public port P1 will most likely (depending on the NAT) be a different port than the port observed on another connection C2.
+The only browser-supported mechanism to discover one's public IP and port for a given WebRTC connection is the non-libp2p protocol STUN.
+This is why this specification depends on STUN, and thus the availability of one or more STUN servers for _A_ and _B_ to discover their public addresses.
+
+Implementations MAY use one of the publicly available STUN servers, or deploy a dedicated server for a given libp2p network.
+Further specification of the usage of STUN is out of scope for this specification.
+
+It is not necessary for _A_ and _B_ to use the same STUN server when establishing a WebRTC connection.
+
+## Signaling Protocol
+
+The protocol id is `/webrtc-signaling`.
+Messages are sent prefixed with the message length in bytes, encoded as an unsigned variable length integer as defined by the [multiformats unsigned-varint spec][uvarint-spec].
+
+``` protobuf
+syntax = "proto3";
+
+message Message {
+    // Specifies type in `data` field.
+    enum Type {
+        // String of `RTCSessionDescription.sdp`
+        SDP_OFFER = 0;
+        // String of `RTCSessionDescription.sdp`
+        SDP_ANSWER = 1;
+        // String of `RTCIceCandidate.toJSON()`
+        ICE_CANDIDATE = 2;
+    }
+
+    optional Type type = 1;
+    optional string data = 2;
+}
+```
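The length-prefix framing described above can be sketched in Python as follows. This is an illustrative sketch, not a reference implementation: the helper names (`encode_uvarint`, `decode_uvarint`, `frame`, `unframe`) are hypothetical, the payload is treated as opaque bytes standing in for a serialized `Message`, and the sketch does not enforce any maximum varint length that the unsigned-varint spec may impose.

```python
from typing import Tuple


def encode_uvarint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("uvarint cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_uvarint(buf: bytes) -> Tuple[int, int]:
    """Decode a uvarint from the start of buf; return (value, bytes consumed)."""
    value = 0
    shift = 0
    for index, byte in enumerate(buf):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, index + 1
        shift += 7
    raise ValueError("truncated uvarint")


def frame(payload: bytes) -> bytes:
    """Prefix an already-serialized message with its length in bytes."""
    return encode_uvarint(len(payload)) + payload


def unframe(buf: bytes) -> Tuple[bytes, bytes]:
    """Split one framed message off the front of buf; return (payload, rest)."""
    length, used = decode_uvarint(buf)
    end = used + length
    if len(buf) < end:
        raise ValueError("incomplete message")
    return buf[used:end], buf[end:]
```

A reader would typically call `unframe` in a loop over the bytes received on the signaling stream, handing each payload to the protobuf decoder.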
+
+## FAQ
+
+- Why is there no additional Noise handshake needed?
+
+  This specification (browser-to-browser) requires _A_ and _B_ to exchange their SDP offer and answer over an authenticated channel.
+  Offer and answer contain the TLS certificate fingerprint.
+  The browser validates the TLS certificate fingerprint through the DTLS handshake during the WebRTC connection establishment.
+
+  In contrast, the browser-to-server specification allows exchange of the server's multiaddr, containing the server's TLS certificate fingerprint, over unauthenticated channels.
+  In other words, the browser-to-server specification does not consider the TLS certificate fingerprint in the server's multiaddr to be trusted.
+
+- Why use a custom signaling protocol? Why not use [DCUtR]?
+
+  DCUtR offers time synchronization through a two-step protocol (first `Connect`, then `Sync`).
+  This is not needed for WebRTC.
+
+  DCUtR does not provide a mechanism to trickle local address candidates to the remote as they are discovered.
+  Trickling candidates just-in-time allows for faster WebRTC connection establishment.
+
+- Why does _A_ and not _B_ initiate the signaling protocol?
+
+  In [DCUtR] _B_ (inbound side of the relayed connection) initiates the [DCUtR] protocol by opening the [DCUtR] protocol stream.
+  The reason is that in case _A_ is publicly reachable, _B_ might be able to use connection reversal to connect to _A_ directly.
+  This reason does not apply to the WebRTC browser-to-browser protocol.
+  Given that _A_ and _B_ at this point already have a relayed connection established, they might as well use it to exchange SDP, instead of using connection reversal and WebRTC browser-to-server.
+  Thus, for the WebRTC browser-to-browser protocol, _A_ initiates the signaling protocol by opening the signaling protocol stream.
+
+[DCUtR]: ./../relay/DCUtR.md
+[identify]: ./../identify/README.md
+[multiplexing]: ./README.md#multiplexing
+[uvarint-spec]: https://github.com/multiformats/unsigned-varint