feat(webrtc): add WebRTC (prev. browser-to-browser) spec (#497)
Introduces the webrtc protocol - a libp2p transport protocol enabling two private nodes (e.g. two browsers) to establish a direct connection.
@@ -99,7 +99,7 @@ see [#465](https://github.com/libp2p/specs/issues/465).
|
|||||||
- [secio][spec_secio] - SECIO, a transport security protocol for libp2p
|
- [secio][spec_secio] - SECIO, a transport security protocol for libp2p
|
||||||
- [tls][spec_tls] - The libp2p TLS Handshake (TLS 1.3+)
|
- [tls][spec_tls] - The libp2p TLS Handshake (TLS 1.3+)
|
||||||
- [quic][spec_quic] - The libp2p QUIC Handshake
|
- [quic][spec_quic] - The libp2p QUIC Handshake
|
||||||
- [webrtc][spec_webrtc] - The libp2p WebRTC transport
|
- [webrtc][spec_webrtc] - The libp2p WebRTC transports
|
||||||
- [WebTransport][spec_webtransport] - Using WebTransport in libp2p
|
- [WebTransport][spec_webtransport] - Using WebTransport in libp2p
|
||||||
|
|
||||||
|
|
||||||
|
|||||||
344
webrtc/README.md
@@ -1,8 +1,8 @@
|
|||||||
# WebRTC
|
# WebRTC
|
||||||
|
|
||||||
| Lifecycle Stage | Maturity | Status | Latest Revision |
|
| Lifecycle Stage | Maturity | Status | Latest Revision |
|
||||||
|-----------------|---------------------------|--------|-----------------|
|
|-----------------|--------------------------|--------|-----------------|
|
||||||
| 2A | Candidate Recommendation | Active | r0, 2022-10-14 |
|
| 2A | Candidate Recommendation | Active | r1, 2023-04-12 |
|
||||||
|
|
||||||
Authors: [@mxinden]
|
Authors: [@mxinden]
|
||||||
|
|
||||||
@@ -11,171 +11,27 @@ Interest Group: [@marten-seemann]
|
|||||||
[@marten-seemann]: https://github.com/marten-seemann
|
[@marten-seemann]: https://github.com/marten-seemann
|
||||||
[@mxinden]: https://github.com/mxinden/
|
[@mxinden]: https://github.com/mxinden/
|
||||||
|
|
||||||
<!-- markdown-toc start - Don't edit this section. Run M-x markdown-toc-refresh-toc -->
|
WebRTC flavors in libp2p:
|
||||||
**Table of Contents**
|
|
||||||
|
|
||||||
- [WebRTC](#webrtc)
|
1. [WebRTC](./webrtc.md)
|
||||||
- [Motivation](#motivation)
|
|
||||||
- [Addressing](#addressing)
|
|
||||||
- [Connection Establishment](#connection-establishment)
|
|
||||||
- [Browser to public Server](#browser-to-public-server)
|
|
||||||
- [Multiplexing](#multiplexing)
|
|
||||||
- [Ordering](#ordering)
|
|
||||||
- [Head-of-line blocking](#head-of-line-blocking)
|
|
||||||
- [`RTCDataChannel` negotiation](#rtcdatachannel-negotiation)
|
|
||||||
- [`RTCDataChannel` label](#rtcdatachannel-label)
|
|
||||||
- [Connection Security](#connection-security)
|
|
||||||
- [Previous, ongoing and related work](#previous-ongoing-and-related-work)
|
|
||||||
- [Test vectors](#test-vectors)
|
|
||||||
- [Noise prologue](#noise-prologue)
|
|
||||||
- [Both client and server use SHA-256](#both-client-and-server-use-sha-256)
|
|
||||||
- [FAQ](#faq)
|
|
||||||
|
|
||||||
<!-- markdown-toc end -->
|
libp2p transport protocol enabling two private nodes (e.g. two browsers) to
|
||||||
|
establish a direct connection.
|
||||||
|
|
||||||
## Motivation
|
2. [WebRTC Direct](./webrtc-direct.md)
|
||||||
|
|
||||||
1. **No need for trusted TLS certificates.** Enable browsers to connect to
|
libp2p transport protocol **without the need for trusted TLS certificates.**
|
||||||
public server nodes without those server nodes providing a TLS certificate
|
Enable browsers to connect to public server nodes without those server nodes
|
||||||
within the browser's trustchain. Note that we can not do this today with our
|
providing a TLS certificate within the browser's trustchain. Note that we can
|
||||||
Websocket transport as the browser requires the remote to have a trusted TLS
|
not do this today with our Websocket transport as the browser requires the
|
||||||
certificate. Nor can we establish a plain TCP or QUIC connection from within
|
remote to have a trusted TLS certificate. Nor can we establish a plain TCP or
|
||||||
a browser. We can establish a WebTransport connection from the browser (see
|
QUIC connection from within a browser. We can establish a WebTransport
|
||||||
[WebTransport specification](../webtransport)).
|
connection from the browser (see [WebTransport
|
||||||
|
specification](../webtransport)).
|
||||||
|
|
||||||
## Addressing
|
## Shared concepts
|
||||||
|
|
||||||
WebRTC multiaddresses are composed of an IP and UDP address component, followed
|
### Multiplexing
|
||||||
by `/webrtc` and a multihash of the certificate that the node uses.
|
|
||||||
|
|
||||||
Examples:
|
|
||||||
|
|
||||||
- `/ip4/192.0.2.0/udp/1234/webrtc/certhash/<hash>/p2p/<peer-id>`
|
|
||||||
- `/ip6/fe80::1ff:fe23:4567:890a/udp/1234/webrtc/certhash/<hash>/p2p/<peer-id>`
|
|
||||||
|
|
||||||
The TLS certificate fingerprint in `/certhash` is a
|
|
||||||
[multibase](https://github.com/multiformats/multibase) encoded
|
|
||||||
[multihash](https://github.com/multiformats/multihash).
|
|
||||||
|
|
||||||
For compatibility implementations MUST support hash algorithm
|
|
||||||
[`sha-256`](https://github.com/multiformats/multihash) and base encoding
|
|
||||||
[`base64url`](https://github.com/multiformats/multibase). Implementations MAY
|
|
||||||
support other hash algorithms and base encodings, but they may not be able to
|
|
||||||
connect to all other nodes.
|
|
||||||
|
|
||||||
## Connection Establishment
|
|
||||||
|
|
||||||
### Browser to public Server
|
|
||||||
|
|
||||||
Scenario: Browser _A_ wants to connect to server node _B_ where _B_ is publicly
|
|
||||||
reachable but _B_ does not have a TLS certificate trusted by _A_.
|
|
||||||
|
|
||||||
1. Server node _B_ generates a TLS certificate, listens on a UDP port and
|
|
||||||
advertises the corresponding multiaddress (see [#addressing]) through some
|
|
||||||
external mechanism.
|
|
||||||
|
|
||||||
Given that _B_ is publicly reachable, _B_ acts as an [ICE
|
|
||||||
Lite](https://www.rfc-editor.org/rfc/rfc5245) agent. It binds to a UDP port
|
|
||||||
waiting for incoming STUN and SCTP packets and multiplexes based on source IP
|
|
||||||
and source port.
|
|
||||||
|
|
||||||
2. Browser _A_ discovers server node _B_'s multiaddr, containing _B_'s IP, UDP
|
|
||||||
port, TLS certificate fingerprint and optionally libp2p peer ID (e.g.
|
|
||||||
`/ip6/2001:db8::/udp/1234/webrtc/certhash/<hash>/p2p/<peer-id>`), through some
|
|
||||||
external mechanism.
|
|
||||||
|
|
||||||
3. _A_ instantiates an `RTCPeerConnection`. See
|
|
||||||
[`RTCPeerConnection()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/RTCPeerConnection).
|
|
||||||
|
|
||||||
_A_ (i.e. the browser) SHOULD NOT reuse the same certificate across
|
|
||||||
`RTCPeerConnection`s. Reusing the certificate can be used to identify _A_
|
|
||||||
across connections by on-path observers given that WebRTC uses DTLS 1.2.
|
|
||||||
|
|
||||||
4. _A_ constructs _B_'s SDP answer locally based on _B_'s multiaddr.
|
|
||||||
|
|
||||||
_A_ generates a random string prefixed with "libp2p+webrtc+v1/". The prefix
|
|
||||||
allows us to use the ufrag as an upgrade mechanism to roll out a new version
|
|
||||||
of the libp2p WebRTC protocol on a live network. While a hack, this might be
|
|
||||||
very useful in the future. _A_ sets the string as the username (_ufrag_ or _username fragment_)
|
|
||||||
and password on the SDP of the remote's answer.
|
|
||||||
|
|
||||||
_A_ MUST set the `a=max-message-size:16384` SDP attribute. See
|
|
||||||
[multiplexing](#multiplexing) for the rationale.
|
|
||||||
|
|
||||||
Finally _A_ sets the remote answer via
|
|
||||||
[`RTCPeerConnection.setRemoteDescription()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/setRemoteDescription).
|
|
||||||
|
|
||||||
5. _A_ creates a local offer via
|
|
||||||
[`RTCPeerConnection.createOffer()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/createOffer).
|
|
||||||
_A_ sets the same username and password on the local offer as done in (4) on
|
|
||||||
the remote answer.
|
|
||||||
|
|
||||||
_A_ MUST set the `a=max-message-size:16384` SDP attribute. See
|
|
||||||
[multiplexing](#multiplexing) for the rationale.
|
|
||||||
|
|
||||||
Finally _A_ sets the modified offer via
|
|
||||||
[`RTCPeerConnection.setLocalDescription()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/setLocalDescription).
|
|
||||||
|
|
||||||
Note that this process, oftentimes referred to as "SDP munging", is disallowed
|
|
||||||
by the specification, but not enforced across the major browsers (Safari,
|
|
||||||
Firefox, Chrome) due to use-cases in the wild. See also
|
|
||||||
<https://bugs.chromium.org/p/chromium/issues/detail?id=823036>
|
|
||||||
|
|
||||||
6. Once _A_ sets the SDP offer and answer, it will start sending STUN requests
|
|
||||||
to _B_. _B_ reads the _ufrag_ from the incoming STUN request's _username_
|
|
||||||
field. _B_ then infers _A_'s SDP offer using the IP, port, and _ufrag_ of the
|
|
||||||
request as follows:
|
|
||||||
|
|
||||||
1. _B_ sets the `ice-ufrag` and `ice-pwd` equal to the value read from
|
|
||||||
the `username` field.
|
|
||||||
|
|
||||||
2. _B_ sets an arbitrary sha-256 digest as the remote fingerprint as it does
|
|
||||||
not verify fingerprints at this point.
|
|
||||||
|
|
||||||
3. _B_ sets the connection field (`c`) to the IP and port of the incoming
|
|
||||||
request `c=IN <ip> <port>`.
|
|
||||||
|
|
||||||
4. _B_ sets the `a=max-message-size:16384` SDP attribute. See
|
|
||||||
[multiplexing](#multiplexing) for the rationale.
|
|
||||||
|
|
||||||
_B_ sets this offer as the remote description. _B_ generates an answer and
|
|
||||||
sets it as the local description.
|
|
||||||
|
|
||||||
The _ufrag_ in combination with the IP and port of _A_ can be used by _B_
|
|
||||||
to identify the connection, i.e. demultiplex incoming UDP datagrams per
|
|
||||||
incoming connection.
|
|
||||||
|
|
||||||
Note that this step requires _B_ to allocate memory for each incoming STUN
|
|
||||||
message from _A_. This could be leveraged for a DOS attack where _A_ is
|
|
||||||
sending many STUN messages with different ufrags using different UDP source
|
|
||||||
ports, forcing _B_ to allocate a new peer connection for each. _B_ SHOULD
|
|
||||||
have a rate limiting mechanism in place as a defense measure. See also
|
|
||||||
<https://datatracker.ietf.org/doc/html/rfc5389#section-16.1.2>.
|
|
||||||
|
|
||||||
7. _A_ and _B_ execute the DTLS handshake as part of the standard WebRTC
|
|
||||||
connection establishment.
|
|
||||||
|
|
||||||
At this point _B_ does not know the TLS certificate fingerprint of _A_. Thus
|
|
||||||
_B_ can not verify _A_'s TLS certificate fingerprint during the DTLS
|
|
||||||
handshake. Instead _B_ needs to _disable certificate fingerprint
|
|
||||||
verification_ (see e.g. [Pion's `disableCertificateFingerprintVerification`
|
|
||||||
option](https://github.com/pion/webrtc/blob/360b0f1745c7244850ed638f423cda716a81cedf/settingengine.go#L62)).
|
|
||||||
|
|
||||||
On success of the DTLS handshake the connection provides confidentiality and
|
|
||||||
integrity but not authenticity. The latter is guaranteed through the
|
|
||||||
succeeding Noise handshake. See [Connection Security
|
|
||||||
section](#connection-security).
|
|
||||||
|
|
||||||
8. Messages on each `RTCDataChannel` are framed using the message
|
|
||||||
framing mechanism described in [Multiplexing](#multiplexing).
|
|
||||||
|
|
||||||
9. The remote is authenticated via an additional Noise handshake. See
|
|
||||||
[Connection Security section](#connection-security).
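The ufrag handling in steps 4 to 6 can be sketched as follows. This is a minimal Python illustration, not a reference implementation: the helper names (`generate_ufrag`, `ice_credentials`, `sdp_ice_lines`, `connection_key`), the random suffix length of 32, and the alphanumeric alphabet are assumptions made for the example. It also assumes the ufrag has already been extracted from the STUN request's _username_ field.

```python
import secrets
import string

# Prefix taken from step 4; everything else in this sketch is illustrative.
UFRAG_PREFIX = "libp2p+webrtc+v1/"
_ALPHABET = string.ascii_letters + string.digits


def generate_ufrag(random_len: int = 32) -> str:
    """Return a random string prefixed with the protocol version marker.

    The suffix length is an assumption; the text above does not fix one.
    """
    suffix = "".join(secrets.choice(_ALPHABET) for _ in range(random_len))
    return UFRAG_PREFIX + suffix


def ice_credentials() -> tuple[str, str]:
    """Return (ufrag, password); the same string is used for both."""
    ufrag = generate_ufrag()
    return ufrag, ufrag


def sdp_ice_lines(ufrag: str) -> list[str]:
    """SDP attribute lines that A sets on both the offer and the answer."""
    return [
        f"a=ice-ufrag:{ufrag}",
        f"a=ice-pwd:{ufrag}",
        "a=max-message-size:16384",
    ]


def connection_key(source_ip: str, source_port: int, ufrag: str) -> tuple[str, int, str]:
    """Key B could use to demultiplex incoming UDP datagrams per connection."""
    if not ufrag.startswith(UFRAG_PREFIX):
        raise ValueError("unsupported libp2p WebRTC version prefix in ufrag")
    return (source_ip, source_port, ufrag)
```

Checking the prefix on the server side is what would let a future protocol version be distinguished by its ufrag, as described in step 4.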
|
|
||||||
|
|
||||||
WebRTC can run both on UDP and TCP. libp2p WebRTC implementations MUST support
|
|
||||||
UDP and MAY support TCP.
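Step 6 asks _B_ to rate limit incoming STUN messages but does not prescribe an algorithm. One possible approach is a token bucket per source IP, sketched below in Python. The class name `StunRateLimiter`, the default capacity, and the refill rate are assumptions for illustration only. The bucket is keyed by IP rather than by IP and port, because the attack described above varies the UDP source port.

```python
import time
from typing import Callable, Dict, Tuple


class StunRateLimiter:
    """Token bucket per source IP (hypothetical helper, not part of the spec)."""

    def __init__(
        self,
        capacity: float = 10.0,
        refill_per_second: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._capacity = capacity
        self._refill = refill_per_second
        self._clock = clock
        # source IP -> (remaining tokens, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, source_ip: str) -> bool:
        """Return True if a new peer connection may be allocated for this IP."""
        now = self._clock()
        tokens, last = self._buckets.get(source_ip, (self._capacity, now))
        # Refill proportionally to elapsed time, capped at the bucket capacity.
        tokens = min(self._capacity, tokens + (now - last) * self._refill)
        if tokens < 1.0:
            self._buckets[source_ip] = (tokens, now)
            return False
        self._buckets[source_ip] = (tokens - 1.0, now)
        return True
```

A real deployment would likely also need to evict idle buckets, since the dictionary itself grows with the number of distinct source addresses.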
|
|
||||||
|
|
||||||
## Multiplexing
|
|
||||||
|
|
||||||
The WebRTC browser APIs do not support half-closing of streams nor resets of the
|
The WebRTC browser APIs do not support half-closing of streams nor resets of the
|
||||||
sending part of streams.
|
sending part of streams.
|
||||||
@@ -234,14 +90,14 @@ limits"](https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/Using_data_
|
|||||||
Implementations MAY choose to send smaller messages, e.g. to reduce delays
|
Implementations MAY choose to send smaller messages, e.g. to reduce delays
|
||||||
sending _flagged_ messages.
|
sending _flagged_ messages.
|
||||||
|
|
||||||
### Ordering
|
#### Ordering
|
||||||
|
|
||||||
Implementations MAY expose an unordered byte stream abstraction to the user by
|
Implementations MAY expose an unordered byte stream abstraction to the user by
|
||||||
overriding the default value of `ordered` `true` to `false` when creating a new
|
overriding the default value of `ordered` `true` to `false` when creating a new
|
||||||
data channel via
|
data channel via
|
||||||
[`RTCPeerConnection.createDataChannel`](https://www.w3.org/TR/webrtc/#dom-peerconnection-createdatachannel).
|
[`RTCPeerConnection.createDataChannel`](https://www.w3.org/TR/webrtc/#dom-peerconnection-createdatachannel).
|
||||||
|
|
||||||
### Head-of-line blocking
|
#### Head-of-line blocking
|
||||||
|
|
||||||
WebRTC datachannels and the underlying SCTP are message-oriented and not
|
WebRTC datachannels and the underlying SCTP are message-oriented and not
|
||||||
stream-oriented (e.g. see
|
stream-oriented (e.g. see
|
||||||
@@ -272,7 +128,7 @@ IPv4 and 1224 bytes on IPv6.
|
|||||||
Long term we hope to be able to give better recommendations based on
|
Long term we hope to be able to give better recommendations based on
|
||||||
real-world experiments.
|
real-world experiments.
|
||||||
|
|
||||||
### `RTCDataChannel` negotiation
|
#### `RTCDataChannel` negotiation
|
||||||
|
|
||||||
`RTCDataChannel`s are negotiated in-band by the WebRTC user agent (e.g. Firefox,
|
`RTCDataChannel`s are negotiated in-band by the WebRTC user agent (e.g. Firefox,
|
||||||
Pion, ...). In other words libp2p WebRTC implementations MUST NOT change the
|
Pion, ...). In other words libp2p WebRTC implementations MUST NOT change the
|
||||||
@@ -294,7 +150,7 @@ containing user data without waiting for the reception of the corresponding
|
|||||||
DATA_CHANNEL_ACK message", thus using `negotiated: false` does not imply an
|
DATA_CHANNEL_ACK message", thus using `negotiated: false` does not imply an
|
||||||
additional round trip for each new `RTCDataChannel`.
|
additional round trip for each new `RTCDataChannel`.
|
||||||
|
|
||||||
### `RTCDataChannel` label
|
#### `RTCDataChannel` label
|
||||||
|
|
||||||
`RTCPeerConnection.createDataChannel()` requires passing a `label` for the
|
`RTCPeerConnection.createDataChannel()` requires passing a `label` for the
|
||||||
to-be-created `RTCDataChannel`. When calling `createDataChannel` implementations
|
to-be-created `RTCDataChannel`. When calling `createDataChannel` implementations
|
||||||
@@ -303,66 +159,6 @@ MUST pass an empty string. When receiving an `RTCDataChannel` via
|
|||||||
an empty string. This allows future versions of this specification to make use
|
an empty string. This allows future versions of this specification to make use
|
||||||
of the `RTCDataChannel` `label` property.
|
of the `RTCDataChannel` `label` property.
|
||||||
|
|
||||||
## Connection Security
|
|
||||||
|
|
||||||
Note that the below uses the message framing described in
|
|
||||||
[multiplexing](#multiplexing).
|
|
||||||
|
|
||||||
While WebRTC offers confidentiality and integrity via TLS, one still needs to
|
|
||||||
authenticate the remote peer by its libp2p identity.
|
|
||||||
|
|
||||||
After [Connection Establishment](#connection-establishment):
|
|
||||||
|
|
||||||
1. _A_ and _B_ open a WebRTC data channel with `id: 0` and `negotiated: true`
|
|
||||||
([`pc.createDataChannel("", {negotiated: true, id:
|
|
||||||
0});`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/createDataChannel)).
|
|
||||||
|
|
||||||
2. _B_ starts a Noise `XX` handshake on the new channel. See
|
|
||||||
[noise-libp2p](https://github.com/libp2p/specs/tree/master/noise).
|
|
||||||
|
|
||||||
_A_ and _B_ use the [Noise
|
|
||||||
Prologue](https://noiseprotocol.org/noise.html#prologue) mechanism. More
|
|
||||||
specifically _A_ and _B_ set the Noise _Prologue_ to
|
|
||||||
`<PREFIX><FINGERPRINT_A><FINGERPRINT_B>` before starting the actual Noise
|
|
||||||
handshake. `<PREFIX>` is the UTF-8 byte representation of the string
|
|
||||||
`libp2p-webrtc-noise:`. `<FINGERPRINT_A><FINGERPRINT_B>` is the concatenation
|
|
||||||
of the two TLS fingerprints of _A_ (Noise handshake responder) and then _B_
|
|
||||||
(Noise handshake initiator), in their multihash byte representation.
|
|
||||||
|
|
||||||
On Chrome _A_ can access its TLS certificate fingerprint directly via
|
|
||||||
`RTCCertificate#getFingerprints`. Firefox does not allow _A_ to do so. Browser
|
|
||||||
compatibility can be found
|
|
||||||
[here](https://developer.mozilla.org/en-US/docs/Web/API/RTCCertificate). In
|
|
||||||
practice, this is not an issue since the fingerprint is embedded in the local
|
|
||||||
SDP string.
|
|
||||||
|
|
||||||
3. On success of the authentication handshake, the used datachannel is
|
|
||||||
closed and the plain WebRTC connection is used with its multiplexing
|
|
||||||
capabilities via datachannels. See [Multiplexing](#multiplexing).
|
|
||||||
|
|
||||||
Note: WebRTC supports different hash functions to hash the TLS certificate (see
|
|
||||||
<https://datatracker.ietf.org/doc/html/rfc8122#section-5>). The hash function used
|
|
||||||
in WebRTC and the hash function used in the multiaddr `/certhash` component MUST
|
|
||||||
be the same. On mismatch the final Noise handshake MUST fail.
|
|
||||||
|
|
||||||
_A_ knows _B_'s fingerprint hash algorithm through _B_'s multiaddr. _A_ MUST use
|
|
||||||
the same hash algorithm to calculate the fingerprint of its (i.e. _A_'s) TLS
|
|
||||||
certificate. _B_ assumes _A_ to use the same hash algorithm it discovers through
|
|
||||||
_B_'s multiaddr. For now implementations MUST support sha-256. Future iterations
|
|
||||||
of this specification may add support for other hash algorithms.
|
|
||||||
|
|
||||||
Implementations SHOULD set up all the necessary callbacks (e.g.
|
|
||||||
[`ondatachannel`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/datachannel_event))
|
|
||||||
before starting the Noise handshake. This is to avoid scenarios like one where
|
|
||||||
_A_ initiates a stream before _B_ got a chance to set the `ondatachannel`
|
|
||||||
callback. This would result in _B_ ignoring all the messages coming from _A_
|
|
||||||
targeting that stream.
|
|
||||||
|
|
||||||
Implementations MAY open streams before completion of the Noise handshake.
|
|
||||||
Applications MUST take special care with what application data they send, since at
|
|
||||||
this point the peer is not yet authenticated. Similarly, the receiving side MAY
|
|
||||||
accept streams before completion of the handshake.
|
|
||||||
|
|
||||||
## Previous, ongoing and related work
|
## Previous, ongoing and related work
|
||||||
|
|
||||||
- Completed implementations of this specification:
|
- Completed implementations of this specification:
|
||||||
@@ -375,68 +171,7 @@ accept streams before completion of the handshake.
|
|||||||
WASM): <https://github.com/wngr/libp2p-webrtc>
|
WASM): <https://github.com/wngr/libp2p-webrtc>
|
||||||
- WebRTC using STUN and TURN: <https://github.com/libp2p/js-libp2p-webrtc-star>
|
- WebRTC using STUN and TURN: <https://github.com/libp2p/js-libp2p-webrtc-star>
|
||||||
|
|
||||||
## Test vectors
|
## FAQ
|
||||||
|
|
||||||
### Noise prologue
|
|
||||||
|
|
||||||
All of these test vectors represent hex-encoded bytes.
|
|
||||||
|
|
||||||
#### Both client and server use SHA-256
|
|
||||||
|
|
||||||
Here client is _A_ and server is _B_.
|
|
||||||
|
|
||||||
```
|
|
||||||
client_fingerprint = "3e79af40d6059617a0d83b83a52ce73b0c1f37a72c6043ad2969e2351bdca870"
|
|
||||||
server_fingerprint = "30fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99"
|
|
||||||
|
|
||||||
prologue = "6c69627032702d7765627274632d6e6f6973653a12203e79af40d6059617a0d83b83a52ce73b0c1f37a72c6043ad2969e2351bdca870122030fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99"
|
|
||||||
```
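The prologue layout can be checked against the vector above. The following is a minimal Python sketch, assuming both fingerprints are raw SHA-256 digests; the helper names `sha256_multihash` and `noise_prologue` are illustrative and not defined by this specification.

```python
# Illustrative sketch: helper names are hypothetical, not defined by the spec.
PREFIX = "libp2p-webrtc-noise:".encode("utf-8")

# Multihash header for sha-256: function code 0x12, digest length 0x20 (32 bytes).
SHA256_MULTIHASH_HEADER = bytes([0x12, 0x20])


def sha256_multihash(digest: bytes) -> bytes:
    """Wrap a raw SHA-256 digest in its multihash byte representation."""
    if len(digest) != 32:
        raise ValueError("a SHA-256 digest must be exactly 32 bytes")
    return SHA256_MULTIHASH_HEADER + digest


def noise_prologue(fingerprint_a: bytes, fingerprint_b: bytes) -> bytes:
    """Build <PREFIX><FINGERPRINT_A><FINGERPRINT_B> from raw SHA-256 digests."""
    return PREFIX + sha256_multihash(fingerprint_a) + sha256_multihash(fingerprint_b)


# Values copied from the test vector above (client is A, server is B).
client_fingerprint = bytes.fromhex(
    "3e79af40d6059617a0d83b83a52ce73b0c1f37a72c6043ad2969e2351bdca870"
)
server_fingerprint = bytes.fromhex(
    "30fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99"
)
prologue = noise_prologue(client_fingerprint, server_fingerprint)
```

Both sides must feed the same byte string into the Noise handshake, so the order (first _A_, then _B_) matters.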
|
|
||||||
|
|
||||||
# FAQ
|
|
||||||
|
|
||||||
- _Why exchange the TLS certificate fingerprint in the multiaddr? Why not
|
|
||||||
base it on the libp2p public key?_
|
|
||||||
|
|
||||||
Browsers do not allow loading a custom certificate. One can only generate a
|
|
||||||
certificate via
|
|
||||||
[rtcpeerconnection-generatecertificate](https://www.w3.org/TR/webrtc/#dom-rtcpeerconnection-generatecertificate).
|
|
||||||
|
|
||||||
- _Why not embed the peer ID in the TLS certificate, thus rendering the
|
|
||||||
additional "peer certificate" exchange obsolete?_
|
|
||||||
|
|
||||||
Browsers do not allow editing the properties of the TLS certificate.
|
|
||||||
|
|
||||||
- _How about distributing the multiaddr in a signed peer record, thus rendering
|
|
||||||
the additional "peer certificate" exchange obsolete?_
|
|
||||||
|
|
||||||
Signed peer records are not yet rolled out across the many libp2p protocols.
|
|
||||||
Making the libp2p WebRTC protocol dependent on the former is not deemed worth
|
|
||||||
it at this point in time. Later versions of the libp2p WebRTC protocol might
|
|
||||||
adopt this optimization.
|
|
||||||
|
|
||||||
Note, one can roll out a new version of the libp2p WebRTC protocol through a
|
|
||||||
new multiaddr protocol, e.g. `/webrtc-2`.
|
|
||||||
|
|
||||||
- _Why exchange fingerprints in an additional authentication handshake on top of
|
|
||||||
an established WebRTC connection? Why not only exchange signatures of one's TLS
|
|
||||||
fingerprints signed with one's libp2p private key on the plain WebRTC
|
|
||||||
connection?_
|
|
||||||
|
|
||||||
Once _A_ and _B_ established a WebRTC connection, _A_ sends
|
|
||||||
`signature_libp2p_a(fingerprint_a)` to _B_ and vice versa. While this has the
|
|
||||||
benefit of only requiring two messages, thus one round trip, it is prone to a
|
|
||||||
key compromise and replay attack. Say that _E_ is able to attain
|
|
||||||
`signature_libp2p_a(fingerprint_a)` and somehow compromise _A_'s TLS private
|
|
||||||
key, _E_ can now impersonate _A_ without knowing _A_'s libp2p private key.
|
|
||||||
|
|
||||||
If one requires the signatures to contain both fingerprints, e.g.
|
|
||||||
`signature_libp2p_a(fingerprint_a, fingerprint_b)`, the above attack still
|
|
||||||
works, just that _E_ can only impersonate _A_ when talking to _B_.
|
|
||||||
|
|
||||||
Adding a cryptographic identifier of the unique connection (i.e. session) to
|
|
||||||
the signature (`signature_libp2p_a(fingerprint_a, fingerprint_b,
|
|
||||||
connection_identifier)`) would protect against this attack. To the best of our
|
|
||||||
knowledge the browser does not give us access to such identifier.
|
|
||||||
|
|
||||||
- _Why use Protobuf for WebRTC message framing? Why not use our own,
|
- _Why use Protobuf for WebRTC message framing? Why not use our own,
|
||||||
potentially smaller encoding schema?_
|
potentially smaller encoding schema?_
|
||||||
@@ -451,44 +186,11 @@ prologue = "6c69627032702d7765627274632d6e6f6973653a12203e79af40d6059617a0d83b83
|
|||||||
going forward. Using Protobuf is consistent with the many other libp2p
|
going forward. Using Protobuf is consistent with the many other libp2p
|
||||||
protocols. These benefits outweigh the drawback of additional overhead.
|
protocols. These benefits outweigh the drawback of additional overhead.
|
||||||
|
|
||||||
- _Can a browser know upfront its UDP port which it is listening for incoming
|
|
||||||
connections on? Does the browser reuse the UDP port across many WebRTC
|
|
||||||
connections? If that is the case one could connect to any public node, with
|
|
||||||
the remote telling the local node what port it is perceived on. Thus one could
|
|
||||||
use libp2p's identify and AutoNAT protocol instead of relying on STUN._
|
|
||||||
|
|
||||||
No, a browser uses a new UDP port for each `RTCPeerConnection`.
|
|
||||||
|
|
||||||
- _Why not load a remote node's certificate into one's browser trust-store and
|
|
||||||
then connect e.g. via WebSocket?_
|
|
||||||
|
|
||||||
This would require a mechanism to discover remote nodes' certificates upfront.
|
|
||||||
More importantly, this does not scale with the number of connections a typical
|
|
||||||
peer-to-peer application establishes.
|
|
||||||
|
|
||||||
- _Why not use central TURN servers? Why rely on libp2p's Circuit Relay v2
|
- _Why not use central TURN servers? Why rely on libp2p's Circuit Relay v2
|
||||||
instead?_
|
instead?_
|
||||||
|
|
||||||
As a peer-to-peer networking library, libp2p should rely as little as possible
|
As a peer-to-peer networking library, libp2p should rely as little as possible
|
||||||
on central infrastructure.
|
on central infrastructure.
|
||||||
|
|
||||||
- _Can an attacker launch an amplification attack with the STUN endpoint of
|
|
||||||
the server?_
|
|
||||||
|
|
||||||
We follow the reasoning of the QUIC protocol, namely requiring:
|
|
||||||
|
|
||||||
> an endpoint MUST limit the amount of data it sends to the unvalidated
|
|
||||||
> address to three times the amount of data received from that address.
|
|
||||||
|
|
||||||
<https://datatracker.ietf.org/doc/html/rfc9000#section-8>
|
|
||||||
|
|
||||||
This is the case for STUN response messages which are only slightly larger than
|
|
||||||
the request messages. See also
|
|
||||||
<https://datatracker.ietf.org/doc/html/rfc5389#section-16.1.2>.
|
|
||||||
|
|
||||||
- _Why does B start the Noise handshake and not A?_
|
|
||||||
|
|
||||||
Given that WebRTC uses DTLS 1.2, _B_ is the one that can send data first.
|
|
||||||
|
|
||||||
[QUIC RFC]: https://www.rfc-editor.org/rfc/rfc9000.html
|
[QUIC RFC]: https://www.rfc-editor.org/rfc/rfc9000.html
|
||||||
[uvarint-spec]: https://github.com/multiformats/unsigned-varint
|
[uvarint-spec]: https://github.com/multiformats/unsigned-varint
|
||||||
|
|||||||
312
webrtc/webrtc-direct.md
Normal file
@@ -0,0 +1,312 @@
|
|||||||
|
# WebRTC Direct
|
||||||
|
|
||||||
|
| Lifecycle Stage | Maturity | Status | Latest Revision |
|
||||||
|
|-----------------|---------------------------|--------|-----------------|
|
||||||
|
| 2A | Candidate Recommendation | Active | r1, 2023-04-12 |
|
||||||
|
|
||||||
|
Authors: [@mxinden]
|
||||||
|
|
||||||
|
Interest Group: [@marten-seemann]
|
||||||
|
|
||||||
|
[@marten-seemann]: https://github.com/marten-seemann
|
||||||
|
[@mxinden]: https://github.com/mxinden/
|
||||||
|
|
||||||
|
## Motivation
|
||||||
|
|
||||||
|
**No need for trusted TLS certificates.** Enable browsers to connect to public
|
||||||
|
server nodes without those server nodes providing a TLS certificate within the
|
||||||
|
browser's trustchain. Note that we can not do this today with our Websocket
|
||||||
|
transport as the browser requires the remote to have a trusted TLS certificate.
|
||||||
|
Nor can we establish a plain TCP or QUIC connection from within a browser. We
|
||||||
|
can establish a WebTransport connection from the browser (see [WebTransport
|
||||||
|
specification](../webtransport)).
|
||||||
|
|
||||||
|
## Addressing
|
||||||
|
|
||||||
|
WebRTC Direct multiaddresses are composed of an IP and UDP address component, followed
|
||||||
|
by `/webrtc-direct` and a multihash of the certificate that the node uses.
|
||||||
|
|
||||||
|
Examples:
|
||||||
|
- `/ip4/1.2.3.4/udp/1234/webrtc-direct/certhash/<hash>/p2p/<peer-id>`
|
||||||
|
- `/ip6/fe80::1ff:fe23:4567:890a/udp/1234/webrtc-direct/certhash/<hash>/p2p/<peer-id>`
|
||||||
|
|
||||||
|
The TLS certificate fingerprint in `/certhash` is a
|
||||||
|
[multibase](https://github.com/multiformats/multibase) encoded
|
||||||
|
[multihash](https://github.com/multiformats/multihash).
|
||||||
|
|
||||||
|
For compatibility implementations MUST support hash algorithm
|
||||||
|
[`sha-256`](https://github.com/multiformats/multihash) and base encoding
|
||||||
|
[`base64url`](https://github.com/multiformats/multibase). Implementations MAY
|
||||||
|
support other hash algorithms and base encodings, but they may not be able to
|
||||||
|
connect to all other nodes.
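The `/certhash` encoding can be sketched as follows, assuming the mandatory `sha-256` and `base64url` combination. This is a minimal Python illustration; the function names `encode_certhash` and `decode_certhash` are hypothetical. It relies on the multihash header for sha-256 being the bytes `0x12 0x20` and on `u` being the multibase prefix for unpadded base64url.

```python
import base64

# Multihash header for sha-256: function code 0x12, digest length 0x20.
SHA256_MULTIHASH_HEADER = bytes([0x12, 0x20])
# Multibase prefix for base64url without padding.
BASE64URL_PREFIX = "u"


def encode_certhash(digest: bytes) -> str:
    """Encode a raw SHA-256 certificate digest as a /certhash value."""
    if len(digest) != 32:
        raise ValueError("a SHA-256 digest must be exactly 32 bytes")
    multihash = SHA256_MULTIHASH_HEADER + digest
    body = base64.urlsafe_b64encode(multihash).rstrip(b"=").decode("ascii")
    return BASE64URL_PREFIX + body


def decode_certhash(certhash: str) -> bytes:
    """Decode a /certhash value back into the raw SHA-256 digest."""
    if not certhash.startswith(BASE64URL_PREFIX):
        raise ValueError("only base64url multibase is handled in this sketch")
    body = certhash[1:]
    # Restore the padding that multibase base64url omits.
    multihash = base64.urlsafe_b64decode(body + "=" * (-len(body) % 4))
    if len(multihash) != 34 or multihash[:2] != SHA256_MULTIHASH_HEADER:
        raise ValueError("only sha-256 multihashes are handled in this sketch")
    return multihash[2:]


# Example digest chosen for illustration only.
digest = bytes.fromhex(
    "30fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99"
)
certhash = encode_certhash(digest)
multiaddr = f"/ip4/192.0.2.0/udp/1234/webrtc-direct/certhash/{certhash}"
```

An implementation that supports other hash algorithms or base encodings would dispatch on the multibase prefix and multihash code instead of rejecting them.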
|
||||||
|
|
||||||
|
## Connection Establishment
|
||||||
|
|
||||||
|
### Browser to public Server
|
||||||
|
|
||||||
|
Scenario: Browser _A_ wants to connect to server node _B_ where _B_ is publicly
|
||||||
|
reachable but _B_ does not have a TLS certificate trusted by _A_.
|
||||||
|
|
||||||
|
1. Server node _B_ generates a TLS certificate, listens on a UDP port and
|
||||||
|
advertises the corresponding multiaddress (see [#addressing]) through some
|
||||||
|
external mechanism.
|
||||||
|
|
||||||
|
Given that _B_ is publicly reachable, _B_ acts as an [ICE
|
||||||
|
Lite](https://www.rfc-editor.org/rfc/rfc5245) agent. It binds to a UDP port
|
||||||
|
waiting for incoming STUN and SCTP packets and multiplexes based on source IP
|
||||||
|
and source port.
|
||||||
|
|
||||||
|
2. Browser _A_ discovers server node _B_'s multiaddr, containing _B_'s IP, UDP
|
||||||
|
port, TLS certificate fingerprint and optionally libp2p peer ID (e.g.
|
||||||
|
`/ip6/2001:db8::/udp/1234/webrtc-direct/certhash/<hash>/p2p/<peer-id>`), through some
|
||||||
|
external mechanism.
|
||||||
|
|
||||||
|
3. _A_ instantiates an `RTCPeerConnection`. See
|
||||||
|
[`RTCPeerConnection()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/RTCPeerConnection).
|
||||||
|
|
||||||
|
_A_ (i.e. the browser) SHOULD NOT reuse the same certificate across
|
||||||
|
`RTCPeerConnection`s. Reusing the certificate can be used to identify _A_
|
||||||
|
across connections by on-path observers given that WebRTC uses DTLS 1.2.
|
||||||
|
|
||||||
|
4. _A_ constructs _B_'s SDP answer locally based on _B_'s multiaddr.
|
||||||
|
|
||||||
|
_A_ generates a random string prefixed with "libp2p+webrtc+v1/". The prefix
|
||||||
|
allows us to use the ufrag as an upgrade mechanism to roll out a new version
|
||||||
|
of the libp2p WebRTC protocol on a live network. While a hack, this might be
|
||||||
|
very useful in the future. _A_ sets the string as the username (_ufrag_ or _username fragment_)
|
||||||
|
and password on the SDP of the remote's answer.
|
||||||
|
|
||||||
|
_A_ MUST set the `a=max-message-size:16384` SDP attribute. See
|
||||||
|
[multiplexing] for the rationale.
|
||||||
|
|
||||||
|
Finally _A_ sets the remote answer via
|
||||||
|
[`RTCPeerConnection.setRemoteDescription()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/setRemoteDescription).
|
||||||
|
|
||||||
|
5. _A_ creates a local offer via
|
||||||
|
[`RTCPeerConnection.createOffer()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/createOffer).
|
||||||
|
_A_ sets the same username and password on the local offer as done in (4) on
|
||||||
|
the remote answer.
|
||||||
|
|
||||||
|
_A_ MUST set the `a=max-message-size:16384` SDP attribute. See
|
||||||
|
[multiplexing] for the rationale.
|
||||||
|
|
||||||
|
Finally _A_ sets the modified offer via
|
||||||
|
[`RTCPeerConnection.setLocalDescription()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/setLocalDescription).
|
||||||
|
|
||||||
|
Note that this process, oftentimes referred to as "SDP munging", is disallowed
|
||||||
|
by the specification, but not enforced across the major browsers (Safari,
|
||||||
|
Firefox, Chrome) due to use-cases in the wild. See also
|
||||||
|
https://bugs.chromium.org/p/chromium/issues/detail?id=823036
|
||||||
|
|
||||||
|
6. Once _A_ sets the SDP offer and answer, it will start sending STUN requests
|
||||||
|
to _B_. _B_ reads the _ufrag_ from the incoming STUN request's _username_
|
||||||
|
field. _B_ then infers _A_'s SDP offer using the IP, port, and _ufrag_ of the
|
||||||
|
request as follows:
|
||||||
|
|
||||||
|
1. _B_ sets the `ice-ufrag` and `ice-pwd` equal to the value read from
|
||||||
|
the `username` field.
|
||||||
|
|
||||||
|
2. _B_ sets an arbitrary sha-256 digest as the remote fingerprint as it does
|
||||||
|
not verify fingerprints at this point.
|
||||||
|
|
||||||
|
3. _B_ sets the connection field (`c`) to the IP and port of the incoming
|
||||||
|
request `c=IN <ip> <port>`.
|
||||||
|
|
||||||
|
4. _B_ sets the `a=max-message-size:16384` SDP attribute. See
[multiplexing] for the rationale.
|
||||||
|
|
||||||
|
_B_ sets this offer as the remote description. _B_ generates an answer and
|
||||||
|
sets it as the local description.
|
||||||
|
|
||||||
|
The _ufrag_ in combination with the IP and port of _A_ can be used by _B_
|
||||||
|
to identify the connection, i.e. demultiplex incoming UDP datagrams per
|
||||||
|
incoming connection.
|
||||||
|
|
||||||
|
Note that this step requires _B_ to allocate memory for each incoming STUN
|
||||||
|
message from _A_. This could be leveraged for a DOS attack where _A_ is
|
||||||
|
sending many STUN messages with different ufrags using different UDP source
|
||||||
|
ports, forcing _B_ to allocate a new peer connection for each. _B_ SHOULD
|
||||||
|
have a rate limiting mechanism in place as a defense measure. See also
|
||||||
|
https://datatracker.ietf.org/doc/html/rfc5389#section-16.1.2.
|
||||||
|
|
||||||
|
7. _A_ and _B_ execute the DTLS handshake as part of the standard WebRTC
|
||||||
|
connection establishment.
|
||||||
|
|
||||||
|
At this point _B_ does not know the TLS certificate fingerprint of _A_. Thus
|
||||||
|
_B_ cannot verify _A_'s TLS certificate fingerprint during the DTLS
|
||||||
|
handshake. Instead _B_ needs to _disable certificate fingerprint
|
||||||
|
verification_ (see e.g. [Pion's `disableCertificateFingerprintVerification`
|
||||||
|
option](https://github.com/pion/webrtc/blob/360b0f1745c7244850ed638f423cda716a81cedf/settingengine.go#L62)).
|
||||||
|
|
||||||
|
On success of the DTLS handshake the connection provides confidentiality and
|
||||||
|
integrity but not authenticity. The latter is guaranteed through the
|
||||||
|
succeeding Noise handshake. See [Connection Security
|
||||||
|
section](#connection-security).
|
||||||
|
|
||||||
|
8. Messages on each `RTCDataChannel` are framed using the message
|
||||||
|
framing mechanism described in [Multiplexing].
|
||||||
|
|
||||||
|
9. The remote is authenticated via an additional Noise handshake. See
|
||||||
|
[Connection Security section](#connection-security).
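
The ufrag handling in steps 4 and 5 above can be sketched as follows. This is a minimal illustration, not part of the specification: the helper names `generate_ufrag` and `munge_sdp`, the 32-character suffix length and the alphanumeric alphabet are assumptions made for the example. Only the `libp2p+webrtc+v1/` prefix, the reuse of the ufrag as password, and the `a=max-message-size:16384` attribute come from the text.

```python
import secrets
import string

UFRAG_PREFIX = "libp2p+webrtc+v1/"  # prefix required by the text above
MAX_MESSAGE_SIZE = 16384            # value required by the text above


def generate_ufrag(random_len: int = 32) -> str:
    """Return the prefix followed by a random suffix.

    The suffix length and alphabet are assumptions for this sketch.
    """
    alphabet = string.ascii_letters + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(random_len))
    return UFRAG_PREFIX + suffix


def munge_sdp(sdp: str, ufrag: str) -> str:
    """Set ice-ufrag and ice-pwd to `ufrag` and force max-message-size.

    Simplification: a missing max-message-size attribute is appended at the
    end, which places it in the last media section of the SDP.
    """
    out = []
    has_max_message_size = False
    for line in sdp.splitlines():
        if line.startswith("a=ice-ufrag:"):
            line = f"a=ice-ufrag:{ufrag}"
        elif line.startswith("a=ice-pwd:"):
            line = f"a=ice-pwd:{ufrag}"
        elif line.startswith("a=max-message-size:"):
            line = f"a=max-message-size:{MAX_MESSAGE_SIZE}"
            has_max_message_size = True
        out.append(line)
    if not has_max_message_size:
        out.append(f"a=max-message-size:{MAX_MESSAGE_SIZE}")
    return "\r\n".join(out) + "\r\n"
```

The same `ufrag` value would be applied to both the locally constructed remote answer and the local offer, so that both descriptions carry identical credentials.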
|
||||||
|
|
||||||
|
WebRTC can run both on UDP and TCP. libp2p WebRTC implementations MUST support
|
||||||
|
UDP and MAY support TCP.
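
Step 6 of the connection establishment asks _B_ to demultiplex incoming requests by IP, port and _ufrag_, and to rate limit new connections. The sketch below shows one possible shape for that bookkeeping. The class `StunDemuxer`, its per-IP sliding window and its default limits are hypothetical choices for illustration. The specification only says that a rate limiting mechanism SHOULD exist. The ufrag is assumed to have already been extracted from the STUN `username` field.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

ConnKey = Tuple[str, int, str]  # (source IP, source port, ufrag)


@dataclass
class StunDemuxer:
    max_new_per_ip: int = 8   # hypothetical budget of new connections per window
    window_secs: float = 1.0  # hypothetical window length
    conns: Dict[ConnKey, dict] = field(default_factory=dict)
    recent: Dict[str, List[float]] = field(default_factory=dict)

    def on_stun_request(self, ip: str, port: int, ufrag: str,
                        now: Optional[float] = None) -> Optional[dict]:
        """Return the connection state for this request, or None if rejected."""
        now = time.monotonic() if now is None else now
        key = (ip, port, ufrag)
        if key in self.conns:
            return self.conns[key]  # known connection, no new allocation
        if not ufrag.startswith("libp2p+webrtc+v1/"):
            return None             # not a libp2p WebRTC ufrag
        # Keep only the allocation timestamps that fall inside the window.
        stamps = [t for t in self.recent.get(ip, []) if now - t < self.window_secs]
        if len(stamps) >= self.max_new_per_ip:
            self.recent[ip] = stamps
            return None             # over budget, refuse to allocate
        stamps.append(now)
        self.recent[ip] = stamps
        # Inferred offer parameters: ufrag and password are both the ufrag.
        conn = {"remote": (ip, port), "ice_ufrag": ufrag, "ice_pwd": ufrag}
        self.conns[key] = conn
        return conn
```

Keying the budget on the source IP is a simplification, since UDP source addresses can be spoofed. A real deployment would likely combine this with a global cap on pending connections.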
|
||||||
|
|
||||||
|
|
||||||
|
## Connection Security
|
||||||
|
|
||||||
|
Note that the below uses the message framing described in
|
||||||
|
[multiplexing].
|
||||||
|
|
||||||
|
While WebRTC offers confidentiality and integrity via TLS, one still needs to
|
||||||
|
authenticate the remote peer by its libp2p identity.
|
||||||
|
|
||||||
|
After [Connection Establishment](#connection-establishment):
|
||||||
|
|
||||||
|
1. _A_ and _B_ open a WebRTC data channel with `id: 0` and `negotiated: true`
|
||||||
|
([`pc.createDataChannel("", {negotiated: true, id:
|
||||||
|
0});`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/createDataChannel)).
|
||||||
|
|
||||||
|
2. _B_ starts a Noise `XX` handshake on the new channel. See
|
||||||
|
[noise-libp2p](https://github.com/libp2p/specs/tree/master/noise).
|
||||||
|
|
||||||
|
_A_ and _B_ use the [Noise
|
||||||
|
Prologue](https://noiseprotocol.org/noise.html#prologue) mechanism. More
|
||||||
|
specifically _A_ and _B_ set the Noise _Prologue_ to
|
||||||
|
`<PREFIX><FINGERPRINT_A><FINGERPRINT_B>` before starting the actual Noise
|
||||||
|
handshake. `<PREFIX>` is the UTF-8 byte representation of the string
|
||||||
|
`libp2p-webrtc-noise:`. `<FINGERPRINT_A><FINGERPRINT_B>` is the concatenation
|
||||||
|
of the two TLS fingerprints of _A_ (Noise handshake responder) and then _B_
|
||||||
|
(Noise handshake initiator), in their multihash byte representation.
|
||||||
|
|
||||||
|
On Chrome _A_ can access its TLS certificate fingerprint directly via
|
||||||
|
`RTCCertificate#getFingerprints`. Firefox does not allow _A_ to do so. Browser
|
||||||
|
compatibility can be found
|
||||||
|
[here](https://developer.mozilla.org/en-US/docs/Web/API/RTCCertificate). In
|
||||||
|
practice, this is not an issue since the fingerprint is embedded in the local
|
||||||
|
SDP string.
|
||||||
|
|
||||||
|
3. On success of the authentication handshake, the used datachannel is
|
||||||
|
closed and the plain WebRTC connection is used with its multiplexing
|
||||||
|
capabilities via datachannels. See [Multiplexing].
|
||||||
|
|
||||||
|
Note: WebRTC supports different hash functions to hash the TLS certificate (see
|
||||||
|
https://datatracker.ietf.org/doc/html/rfc8122#section-5). The hash function used
|
||||||
|
in WebRTC and the hash function used in the multiaddr `/certhash` component MUST
|
||||||
|
be the same. On mismatch the final Noise handshake MUST fail.
|
||||||
|
|
||||||
|
_A_ knows _B_'s fingerprint hash algorithm through _B_'s multiaddr. _A_ MUST use
|
||||||
|
the same hash algorithm to calculate the fingerprint of its (i.e. _A_'s) TLS
|
||||||
|
certificate. _B_ assumes _A_ to use the same hash algorithm it discovers through
|
||||||
|
_B_'s multiaddr. For now implementations MUST support sha-256. Future iterations
|
||||||
|
of this specification may add support for other hash algorithms.
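
As noted above, a node can read its own fingerprint from the local SDP string. The following sketch extracts an `a=fingerprint:sha-256 ...` line and converts it to the multihash byte representation used in the Noise prologue. The function name `sdp_fingerprint_to_multihash` is hypothetical, and the sketch only handles sha-256, the one algorithm the text requires. The `0x12 0x20` header is the multihash code and digest length for sha-256, which matches the test vectors in this document.

```python
SHA256_MULTIHASH_HEADER = bytes([0x12, 0x20])  # sha2-256 code, 32-byte length


def sdp_fingerprint_to_multihash(sdp: str) -> bytes:
    """Return the multihash bytes of the first sha-256 fingerprint in `sdp`."""
    prefix = "a=fingerprint:"
    for line in sdp.splitlines():
        if not line.startswith(prefix):
            continue
        algorithm, hex_pairs = line[len(prefix):].strip().split(" ", 1)
        if algorithm.lower() != "sha-256":
            raise ValueError(f"unsupported hash algorithm: {algorithm}")
        # SDP encodes the digest as colon-separated hex pairs, e.g. "3E:79:...".
        digest = bytes.fromhex(hex_pairs.replace(":", ""))
        if len(digest) != 32:
            raise ValueError("sha-256 digest must be 32 bytes")
        return SHA256_MULTIHASH_HEADER + digest
    raise ValueError("no a=fingerprint attribute found")
```

Rejecting any algorithm other than sha-256 mirrors the rule that a hash function mismatch must not go unnoticed.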
|
||||||
|
|
||||||
|
Implementations SHOULD set up all the necessary callbacks (e.g.
|
||||||
|
[`ondatachannel`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/datachannel_event))
|
||||||
|
before starting the Noise handshake. This is to avoid scenarios like one where
|
||||||
|
_A_ initiates a stream before _B_ got a chance to set the `ondatachannel`
|
||||||
|
callback. This would result in _B_ ignoring all the messages coming from _A_
|
||||||
|
targeting that stream.
|
||||||
|
|
||||||
|
Implementations MAY open streams before completion of the Noise handshake.
|
||||||
|
Applications MUST take special care with what application data they send, since at
|
||||||
|
this point the peer is not yet authenticated. Similarly, the receiving side MAY
|
||||||
|
accept streams before completion of the handshake.
|
||||||
|
|
||||||
|
## Test vectors
|
||||||
|
|
||||||
|
### Noise prologue
|
||||||
|
|
||||||
|
All of these test vectors represent hex-encoded bytes.
|
||||||
|
|
||||||
|
#### Both client and server use SHA-256
|
||||||
|
|
||||||
|
Here client is _A_ and server is _B_.
|
||||||
|
|
||||||
|
```
|
||||||
|
client_fingerprint = "3e79af40d6059617a0d83b83a52ce73b0c1f37a72c6043ad2969e2351bdca870"
|
||||||
|
server_fingerprint = "30fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99"
|
||||||
|
|
||||||
|
prologue = "6c69627032702d7765627274632d6e6f6973653a12203e79af40d6059617a0d83b83a52ce73b0c1f37a72c6043ad2969e2351bdca870122030fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99"
|
||||||
|
```
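
The test vector above can be reproduced with a few lines of code. This is a sketch under the assumption that both fingerprints are sha-256 digests. The function name `noise_prologue` is hypothetical. The prefix string and the order (first _A_, then _B_) are taken from the Connection Security section.

```python
PREFIX = b"libp2p-webrtc-noise:"               # UTF-8 bytes of the prefix
SHA256_MULTIHASH_HEADER = bytes([0x12, 0x20])  # sha2-256 code, 32-byte length


def noise_prologue(fingerprint_a_hex: str, fingerprint_b_hex: str) -> bytes:
    """Build <PREFIX><FINGERPRINT_A><FINGERPRINT_B> from hex sha-256 digests."""
    parts = [PREFIX]
    for fingerprint_hex in (fingerprint_a_hex, fingerprint_b_hex):
        digest = bytes.fromhex(fingerprint_hex)
        if len(digest) != 32:
            raise ValueError("sha-256 digest must be 32 bytes")
        parts.append(SHA256_MULTIHASH_HEADER + digest)
    return b"".join(parts)


client_fingerprint = "3e79af40d6059617a0d83b83a52ce73b0c1f37a72c6043ad2969e2351bdca870"
server_fingerprint = "30fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99"
prologue = noise_prologue(client_fingerprint, server_fingerprint)
print(prologue.hex())
```

Both sides must compute the prologue with the fingerprints in the same order, otherwise the Noise handshake fails.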
|
||||||
|
|
||||||
|
# FAQ
|
||||||
|
|
||||||
|
- _Why exchange the TLS certificate fingerprint in the multiaddr? Why not
|
||||||
|
base it on the libp2p public key?_
|
||||||
|
|
||||||
|
Browsers do not allow loading a custom certificate. One can only generate a
|
||||||
|
certificate via
|
||||||
|
[rtcpeerconnection-generatecertificate](https://www.w3.org/TR/webrtc/#dom-rtcpeerconnection-generatecertificate).
|
||||||
|
|
||||||
|
- _Why not embed the peer ID in the TLS certificate, thus rendering the
|
||||||
|
additional "peer certificate" exchange obsolete?_
|
||||||
|
|
||||||
|
Browsers do not allow editing the properties of the TLS certificate.
|
||||||
|
|
||||||
|
- _How about distributing the multiaddr in a signed peer record, thus rendering
|
||||||
|
the additional "peer certificate" exchange obsolete?_
|
||||||
|
|
||||||
|
Signed peer records are not yet rolled out across the many libp2p protocols.
|
||||||
|
Making the libp2p WebRTC protocol dependent on the former is not deemed worth
|
||||||
|
it at this point in time. Later versions of the libp2p WebRTC protocol might
|
||||||
|
adopt this optimization.
|
||||||
|
|
||||||
|
Note, one can roll out a new version of the libp2p WebRTC protocol through a
|
||||||
|
new multiaddr protocol, e.g. `/webrtc-direct-2`.
|
||||||
|
|
||||||
|
- _Why exchange fingerprints in an additional authentication handshake on top of
|
||||||
|
an established WebRTC connection? Why not only exchange signatures of one's TLS
|
||||||
|
fingerprints signed with one's libp2p private key on the plain WebRTC
|
||||||
|
connection?_
|
||||||
|
|
||||||
|
Once _A_ and _B_ established a WebRTC connection, _A_ sends
|
||||||
|
`signature_libp2p_a(fingerprint_a)` to _B_ and vice versa. While this has the
|
||||||
|
benefit of only requiring two messages, thus one round trip, it is prone to a
|
||||||
|
key compromise and replay attack. Say that _E_ is able to attain
|
||||||
|
`signature_libp2p_a(fingerprint_a)` and somehow compromise _A_'s TLS private
|
||||||
|
key, _E_ can now impersonate _A_ without knowing _A_'s libp2p private key.
|
||||||
|
|
||||||
|
If one requires the signatures to contain both fingerprints, e.g.
|
||||||
|
`signature_libp2p_a(fingerprint_a, fingerprint_b)`, the above attack still
|
||||||
|
works, just that _E_ can only impersonate _A_ when talking to _B_.
|
||||||
|
|
||||||
|
Adding a cryptographic identifier of the unique connection (i.e. session) to
|
||||||
|
the signature (`signature_libp2p_a(fingerprint_a, fingerprint_b,
|
||||||
|
connection_identifier)`) would protect against this attack. To the best of our
|
||||||
|
knowledge the browser does not give us access to such an identifier.
|
||||||
|
|
||||||
|
- _Can a browser know upfront its UDP port which it is listening for incoming
|
||||||
|
connections on? Does the browser reuse the UDP port across many WebRTC
|
||||||
|
connections? If that is the case one could connect to any public node, with
|
||||||
|
the remote telling the local node what port it is perceived on. Thus one could
|
||||||
|
use libp2p's identify and AutoNAT protocol instead of relying on STUN._
|
||||||
|
|
||||||
|
No, a browser uses a new UDP port for each `RTCPeerConnection`.
|
||||||
|
|
||||||
|
- _Why not load a remote node's certificate into one's browser trust-store and
|
||||||
|
then connect e.g. via WebSocket._
|
||||||
|
|
||||||
|
This would require a mechanism to discover remote node's certificates upfront.
|
||||||
|
More importantly, this does not scale with the number of connections a typical
|
||||||
|
peer-to-peer application establishes.
|
||||||
|
|
||||||
|
- _Can an attacker launch an amplification attack with the STUN endpoint of
|
||||||
|
the server?_
|
||||||
|
|
||||||
|
We follow the reasoning of the QUIC protocol, namely requiring:
|
||||||
|
|
||||||
|
> an endpoint MUST limit the amount of data it sends to the unvalidated
|
||||||
|
> address to three times the amount of data received from that address.
|
||||||
|
|
||||||
|
https://datatracker.ietf.org/doc/html/rfc9000#section-8
|
||||||
|
|
||||||
|
This is the case for STUN response messages which are only slightly larger than
|
||||||
|
the request messages. See also
|
||||||
|
https://datatracker.ietf.org/doc/html/rfc5389#section-16.1.2.
|
||||||
|
|
||||||
|
- _Why does B start the Noise handshake and not A?_
|
||||||
|
|
||||||
|
Given that WebRTC uses DTLS 1.2, _B_ is the one that can send data first.
|
||||||
|
|
||||||
|
[multiplexing]: ./README.md#multiplexing
|
||||||
122
webrtc/webrtc.md
Normal file
@@ -0,0 +1,122 @@
|
|||||||
|
# WebRTC
|
||||||
|
|
||||||
|
| Lifecycle Stage | Maturity | Status | Latest Revision |
|
||||||
|
|-----------------|--------------------------|--------|-----------------|
|
||||||
|
| 2A | Candidate Recommendation | Active | r0, 2023-04-12 |
|
||||||
|
|
||||||
|
Authors: [@mxinden]
|
||||||
|
|
||||||
|
## Motivation
|
||||||
|
|
||||||
|
libp2p transport protocol enabling two private nodes (e.g. two browsers) to establish a direct connection.
|
||||||
|
|
||||||
|
Browser node _A_ wants to connect to browser node _B_ with the help of server node _R_.
|
||||||
|
Both _A_ and _B_ cannot listen for incoming connections due to running in a constrained environment (i.e. a browser) with its only transport capability being the W3C WebRTC `RTCPeerConnection` API and being behind a NAT and/or firewall.
|
||||||
|
Note that _A_ and/or _B_ may as well be non-browser nodes behind NATs and/or firewalls.
|
||||||
|
However, for two non-browser nodes using TCP or QUIC hole punching with [DCUtR] will be the more efficient way to establish a direct connection.
|
||||||
|
|
||||||
|
On a historical note, this specification replaces the existing [libp2p WebRTC star](https://github.com/libp2p/js-libp2p-webrtc-star) and [libp2p WebRTC direct](https://github.com/libp2p/js-libp2p-webrtc-direct) protocols.
|
||||||
|
|
||||||
|
## Connection Establishment
|
||||||
|
|
||||||
|
1. _B_ advertises support for the WebRTC browser-to-browser protocol by appending `/webrtc` to its relayed multiaddr, meaning it takes the form of `<relayed-multiaddr>/webrtc/p2p/<b-peer-id>`.
|
||||||
|
|
||||||
|
2. Upon discovery of _B_'s multiaddress, _A_ learns that _B_ supports the WebRTC transport and knows how to establish a relayed connection to _B_ to run the `/webrtc-signaling` protocol on top.
|
||||||
|
|
||||||
|
3. _A_ establishes a relayed connection to _B_.
|
||||||
|
Note that further steps depend on the relayed connection to be authenticated, i.e. that data sent on the relayed connection can be trusted.
|
||||||
|
|
||||||
|
4. _A_ (outbound side of relayed connection) creates an `RTCPeerConnection` provided by a W3C compliant WebRTC implementation (e.g. a browser).
|
||||||
|
See [STUN](#stun) section on what STUN servers to configure at creation time.
|
||||||
|
_A_ creates an SDP offer via `RTCPeerConnection.createOffer()`.
|
||||||
|
_A_ initiates the signaling protocol to _B_ via the relayed connection from (3), see [Signaling Protocol](#signaling-protocol), and sends the offer to _B_.
|
||||||
|
Note that _A_ being the initiator of the stream is merely a convention that prevents both nodes from simultaneously initiating a new connection, which could result in two WebRTC connections.
|
||||||
|
_A_ MUST as well be able to handle an incoming signaling protocol stream to support the case where _B_ initiates the signaling process.
|
||||||
|
|
||||||
|
5. On reception of the incoming stream, _B_ (inbound side of relayed connection) creates an `RTCPeerConnection`.
|
||||||
|
Again see [STUN](#stun) section on what STUN servers to configure at creation time.
|
||||||
|
_B_ receives _A_'s offer sent in (4) via the signaling protocol stream and provides the offer to its `RTCPeerConnection` via `RTCPeerConnection.setRemoteDescription`.
|
||||||
|
_B_ then creates an answer via `RTCPeerConnection.createAnswer` and sends it to _A_ via the existing signaling protocol stream (see [Signaling Protocol](#signaling-protocol)).
|
||||||
|
|
||||||
|
6. _A_ receives _B_'s answer via the signaling protocol stream and sets it locally via `RTCPeerConnection.setRemoteDescription`.
|
||||||
|
|
||||||
|
7. _A_ and _B_ send their local ICE candidates via the existing signaling protocol stream to enable trickle ICE.
|
||||||
|
Both nodes continuously read from the stream, adding incoming remote candidates via `RTCPeerConnection.addIceCandidate()`.
|
||||||
|
|
||||||
|
8. On successful establishment of the direct connection, _B_ and _A_ close the signaling protocol stream.
|
||||||
|
On failure _B_ and _A_ reset the signaling protocol stream.
|
||||||
|
|
||||||
|
Behavior for transferring data on a relayed connection, in the case where the direct connection failed, is out of scope for this specification and dependent on the application.
|
||||||
|
|
||||||
|
9. Messages on `RTCDataChannel`s on the established `RTCPeerConnection` are framed using the message framing mechanism described in [multiplexing].
|
||||||
|
|
||||||
|
## STUN
|
||||||
|
|
||||||
|
A node needs to discover its public IP and port, which is forwarded to the remote node in order to connect to the local node.
|
||||||
|
On non-browser libp2p nodes doing a hole punch with TCP or QUIC, the libp2p node discovers its public address via the [identify] protocol.
|
||||||
|
One cannot use the [identify] protocol on browser nodes to discover one's public IP and port given that the browser uses a new port for each connection.
|
||||||
|
For example say that the local browser node establishes a WebRTC connection C1 via browser-to-server to a server node and runs the [identify] protocol.
|
||||||
|
The returned observed public port P1 will most likely (depending on the NAT) be a different port than the port observed on another connection C2.
|
||||||
|
The only browser-supported mechanism to discover one's public IP and port for a given WebRTC connection is the non-libp2p protocol STUN.
|
||||||
|
This is why this specification depends on STUN, and thus the availability of one or more STUN servers for _A_ and _B_ to discover their public addresses.
|
||||||
|
|
||||||
|
Implementations MAY use one of the publicly available STUN servers, or deploy a dedicated server for a given libp2p network.
|
||||||
|
Further specification of the usage of STUN is out of scope for this specification.
|
||||||
|
|
||||||
|
It is not necessary for _A_ and _B_ to use the same STUN server when establishing a WebRTC connection.
|
||||||
|
|
||||||
|
## Signaling Protocol
|
||||||
|
|
||||||
|
The protocol id is `/webrtc-signaling`.
|
||||||
|
Messages are sent prefixed with the message length in bytes, encoded as an unsigned variable length integer as defined by the [multiformats unsigned-varint spec][uvarint-spec].
|
||||||
|
|
||||||
|
``` protobuf
|
||||||
|
syntax = "proto3";
|
||||||
|
|
||||||
|
message Message {
|
||||||
|
// Specifies type in `data` field.
|
||||||
|
enum Type {
|
||||||
|
// String of `RTCSessionDescription.sdp`
|
||||||
|
SDP_OFFER = 0;
|
||||||
|
// String of `RTCSessionDescription.sdp`
|
||||||
|
SDP_ANSWER = 1;
|
||||||
|
// String of `RTCIceCandidate.toJSON()`
|
||||||
|
ICE_CANDIDATE = 2;
|
||||||
|
}
|
||||||
|
|
||||||
|
optional Type type = 1;
|
||||||
|
optional string data = 2;
|
||||||
|
}
|
||||||
|
```
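
The wire format described above (an unsigned varint length prefix followed by the protobuf-encoded `Message`) can be sketched without a protobuf library, since the message has only two fields. The helper names below are hypothetical, and the hand-written field encoding is an illustration of what a generated protobuf encoder would emit for this schema. A real implementation would normally use generated code.

```python
from enum import IntEnum
from typing import Optional, Tuple


class MessageType(IntEnum):
    SDP_OFFER = 0
    SDP_ANSWER = 1
    ICE_CANDIDATE = 2


def encode_uvarint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (LEB128)."""
    if n < 0:
        raise ValueError("uvarint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_uvarint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode an unsigned varint at `pos`; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_message(msg_type: MessageType, data: str) -> bytes:
    """Length-prefix a Message with field 1 (type) and field 2 (data) set."""
    payload = data.encode("utf-8")
    body = (b"\x08" + encode_uvarint(int(msg_type))     # field 1, varint
            + b"\x12" + encode_uvarint(len(payload)) + payload)  # field 2, bytes
    return encode_uvarint(len(body)) + body


def decode_message(frame: bytes) -> Tuple[Optional[MessageType], Optional[str]]:
    """Decode one length-prefixed Message; absent fields come back as None."""
    length, pos = decode_uvarint(frame)
    body = frame[pos:pos + length]
    if len(body) != length:
        raise ValueError("truncated frame")
    msg_type, data, i = None, None, 0
    while i < len(body):
        tag = body[i]
        i += 1
        if tag == 0x08:
            value, i = decode_uvarint(body, i)
            msg_type = MessageType(value)
        elif tag == 0x12:
            size, i = decode_uvarint(body, i)
            data = body[i:i + size].decode("utf-8")
            i += size
        else:
            raise ValueError(f"unexpected field tag: {tag:#x}")
    return msg_type, data
```

For example, `encode_message(MessageType.SDP_OFFER, "v=0")` yields the bytes `07 08 00 12 03 76 3d 30`: a one-byte length prefix of 7, followed by the two encoded fields.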
|
||||||
|
|
||||||
|
## FAQ
|
||||||
|
|
||||||
|
- Why is there no additional Noise handshake needed?
|
||||||
|
|
||||||
|
This specification (browser-to-browser) requires _A_ and _B_ to exchange their SDP offer and answer over an authenticated channel.
|
||||||
|
Offer and answer contain the TLS certificate fingerprint.
|
||||||
|
The browser validates the TLS certificate fingerprint through the DTLS handshake during the WebRTC connection establishment.
|
||||||
|
|
||||||
|
In contrast, the browser-to-server specification allows exchange of the server's multiaddr, containing the server's TLS certificate fingerprint, over unauthenticated channels.
|
||||||
|
In other words, the browser-to-server specification does not consider the TLS certificate fingerprint in the server's multiaddr to be trusted.
|
||||||
|
|
||||||
|
- Why use a custom signaling protocol? Why not use [DCUtR]?
|
||||||
|
|
||||||
|
DCUtR offers time synchronization through a two-step protocol (first `Connect`, then `Sync`).
|
||||||
|
This is not needed for WebRTC.
|
||||||
|
|
||||||
|
DCUtR does not provide a mechanism to trickle local address candidates to the remote as they are discovered.
|
||||||
|
Trickling candidates just-in-time allows for faster WebRTC connection establishment.
|
||||||
|
|
||||||
|
- Why does _A_ and not _B_ initiate the signaling protocol?
|
||||||
|
|
||||||
|
In [DCUtR] _B_ (inbound side of the relayed connection) initiates the [DCUtR] protocol by opening the [DCUtR] protocol stream.
|
||||||
|
The reason is that in case _A_ is publicly reachable, _B_ might be able to use connection reversal to connect to _A_ directly.
|
||||||
|
This reason does not apply to the WebRTC browser-to-browser protocol.
|
||||||
|
Given that _A_ and _B_ at this point already have a relayed connection established, they might as well use it to exchange SDP, instead of using connection reversal and WebRTC browser-to-server.
|
||||||
|
Thus, for the WebRTC browser-to-browser protocol, _A_ initiates the signaling protocol by opening the signaling protocol stream.
|
||||||
|
|
||||||
|
[DCUtR]: ./../relay/DCUtR.md
|
||||||
|
[identify]: ./../identify/README.md
|
||||||
|
[multiplexing]: ./README.md#multiplexing
|
||||||
|
[uvarint-spec]: https://github.com/multiformats/unsigned-varint
|
||||||