Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split error code space #722

Merged
merged 19 commits into from Oct 5, 2017
Merged

Split error code space #722

merged 19 commits into from Oct 5, 2017

Conversation

martinthomson
Copy link
Member

@martinthomson martinthomson commented Aug 11, 2017

This creates two orthogonal spaces for error codes. Application error codes
can be used for both connection and stream errors and are under the control of
the application protocol. Transport error codes are QUIC-controlled, but can
only terminate the connection.

To fix this, I had to add IANA considerations for error codes, plus a few extra
tweaks. I think that the error code space could be narrowed to 16 bits after
this, if only to keep things sane.

Closes #485, #184.

(I've stacked this on top of #721 because I don't like managing conflicts and this touches the same general area.)

@martinthomson martinthomson added -http -tls -transport design An issue that affects the design of the protocol; resolution requires consensus. labels Aug 11, 2017
Copy link
Contributor

@janaiyengar janaiyengar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments inline. Can you also please not stack these PRs? Github is really difficult to use for review of large changes like this one, and stacking PRs makes it even harder. I understand the pains of merge conflicts, but honestly, that's a one-time pain during merge, not a continual pain during review.

0xC0000000-0xFFFFFFFF. The following error codes are defined when TLS is used
for the crypto handshake:
This document defines error codes from the error code space used in
{{QUIC-TRANSPORT}}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This document --> This section.
Also, editorial nit: I like the mention of "crypto handshake when TLS is used for it".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The crypto handshake bit is on the very next line. The point of separation is that the codes are consumed even if TLS isn't used.


Error Code:

: A 32-bit error code which indicates the reason for closing this connection.
TRANSPORT_CLOSE uses codes from the space defined in {{error-codes}};
APPLICATION_CLOSE uses codes from the application protocol error code space
({{app-error-codes}}).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems unnecessary to mention APPLICATION_CLOSE in the definitoion of TRANSPORT_CLOSE.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered not doing that, but it seemed more helpful to highlight the distinction here given that I wasn't repeating the format stuff in the description of APPLICATION_CLOSE.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would at least parenthesize it then.

already in the "half-closed (local)" or "closed" states, a RST_STREAM frame MAY
still be sent in order to cancel retransmission of previously-sent STREAM
frames.
(remote)" states, an endpoint MUST send a RST_STREAM frame. If the STOP_SENDING
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the proposed error code in the RST_STREAM now? If there's a requirement for QUIC to generate and send this frame, the error code has to be specified.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the point here is that the application sends the RST_STREAM. The transport can't send RST_STREAM. The transport can't do that without risking putting the application into a bad state. I will change this to make it clearer.

Actually, I'm going to raise this on the list. It seems like STOP_SENDING could be made an application-layer construct and we could avoid that complexity at the transport layer entirely. I'm trying to see how visibility at the transport layer would be of value, but I'm drawing a blank.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, but there's an issue here. If the transport receives a STOP_SENDING and MUST send a RST_STREAM, there's nothing in here that requires the application to do something in between the receiving of the STOP_SENDING and sending of the RST_STREAM. If it's not a transport requirement to send a RST_STREAM, then the "MUST" doesn't belong here. So I would remove any mention of RST_STREAM being sent here; that's an application's prerogative. The transport does not need to ensure this.

That said, I think I'm with you on STOP_SENDING. It's an HTTP-ism, so it probably belongs in the application. I'll chime in on the list.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I think that I'd rather do the move of STOP_SENDING, but I'll tweak the text here in case someone objects to the other PR.

APPLICATION_CLOSE risks a peer missing the first such packet. The only
mechanism available to an endpoint that continues to receive data for a
terminated connection is to use the stateless reset process
({{stateless-reset}}).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

editorial nit: I would merge this para with the one above, since this one follows from the earlier.

An endpoint that receives an invalid CONNECTION_CLOSE frame MUST NOT signal the
existence of the error to its peer.
An endpoint that chooses not to retransmit packets containing TRANSPORT_CLOSE or
APPLICATION_CLOSE risks a peer missing the first such packet. The only
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not just the first packet, but any packets lost in the network.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove this sentence, reword the remaining sentence, and merge.

Error codes are 32 bits long, with the first two bits indicating the source of
the error code:
An endpoint that detects a stream error MAY choose to treat the error as a
connection error and send an APPLICATION_CLOSE frame in place of RST_STREAM.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is confusing. What is the "endpoint"? QUIC, or HTTP/QUIC? If QUIC is no longer generating RST_STREAM, then which stream errors are detected by QUIC? I suspect the answer is that the endpoint is HTTP/QUIC, in which case I would remove this sentence from here entirely.

error codes are used for the RST_STREAM ({{frame-rst-stream}}) and
APPLICATION_CLOSE ({{frame-application-close}}) frames.

Application protocols SHOULD define an error codes for indicating no error and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: remove "an"


## Application Protocol Error Codes {#app-error-codes}

Application protocol error codes are 32-bits long, but the management of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

16-bits

@@ -1596,7 +1622,7 @@ The RST_STREAM frame is as follows:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Stream ID (32) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Error Code (32) |
| App Protocol Error Code (16) |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still 32 bits in this PR, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I must have split this from the other PR poorly.


## Error Codes
A stream error is always an application-layer construct. A RST_STREAM MUST NOT
be generated by the QUIC transport layer. Resetting a stream at the transport
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drop "transport layer".

Also, it's odd to say QUIC MUST not generate a RST_STREAM, since it is a QUIC frame and QUIC does in fact generate it. How about "RST_STREAM MUST be instigated by the application and MUST carry an application error code."?

Copy link
Member Author

@martinthomson martinthomson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't seen a review for #721 yet. I apologize for stacking like this, but #721 is pretty self-contained, so the amount of leakage from that is small. That said, I tried removing it from this changeset and found that a lot of the text is intertwined. I've started stacking PRs differently now, so the reviewing should be easier.

@@ -1596,7 +1622,7 @@ The RST_STREAM frame is as follows:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Stream ID (32) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Error Code (32) |
| App Protocol Error Code (16) |
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I must have split this from the other PR poorly.


Error Code:

: A 32-bit error code which indicates the reason for closing this connection.
TRANSPORT_CLOSE uses codes from the space defined in {{error-codes}};
APPLICATION_CLOSE uses codes from the application protocol error code space
({{app-error-codes}}).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered not doing that, but it seemed more helpful to highlight the distinction here given that I wasn't repeating the format stuff in the description of APPLICATION_CLOSE.

already in the "half-closed (local)" or "closed" states, a RST_STREAM frame MAY
still be sent in order to cancel retransmission of previously-sent STREAM
frames.
(remote)" states, an endpoint MUST send a RST_STREAM frame. If the STOP_SENDING
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the point here is that the application sends the RST_STREAM. The transport can't send RST_STREAM. The transport can't do that without risking putting the application into a bad state. I will change this to make it clearer.

Actually, I'm going to raise this on the list. It seems like STOP_SENDING could be made an application-layer construct and we could avoid that complexity at the transport layer entirely. I'm trying to see how visibility at the transport layer would be of value, but I'm drawing a blank.

0xC0000000-0xFFFFFFFF. The following error codes are defined when TLS is used
for the crypto handshake:
This document defines error codes from the error code space used in
{{QUIC-TRANSPORT}}.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The crypto handshake bit is on the very next line. The point of separation is that the codes are consumed even if TLS isn't used.

An endpoint that receives an invalid CONNECTION_CLOSE frame MUST NOT signal the
existence of the error to its peer.
An endpoint that chooses not to retransmit packets containing TRANSPORT_CLOSE or
APPLICATION_CLOSE risks a peer missing the first such packet. The only
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove this sentence, reword the remaining sentence, and merge.

@mnot
Copy link
Member

mnot commented Sep 4, 2017

@martinthomson I think CI needs to be re-run; seems to have been a transient error from GH.

@martinthomson
Copy link
Member Author

Rebased it. I've changed the config so that intermittent error won't cause failures (it's the issue sync, which isn't critical).

Copy link
Contributor

@janaiyengar janaiyengar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments. This needs to be rebased and the resolution on the list will cause this to change further, I expect.

@@ -844,7 +844,8 @@ explained in more detail as they are referenced later in the document.
|:------------|:------------------|:----------------------------|
| 0x00 | PADDING | {{frame-padding}} |
| 0x01 | RST_STREAM | {{frame-rst-stream}} |
| 0x02 | CONNECTION_CLOSE | {{frame-connection-close}} |
| 0x02 | TRANSPORT_CLOSE | {{frame-transport-close}} |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer connection close -- a connection is a transport notion anyways. CONNECTION_CLOSE and APPLICATION_CLOSE ought to be clear enough, IMO.

@@ -1609,7 +1635,7 @@ The RST_STREAM frame is as follows:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Stream ID (32) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Error Code (32) |
| App Protocol Error Code (32) |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: spell out "Application"

@@ -1623,43 +1649,51 @@ Stream ID:

: The 32-bit Stream ID of the stream being terminated.

Error code:
App Protocol Error Code:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto


Error Code:

: A 32-bit error code which indicates the reason for closing this connection.
TRANSPORT_CLOSE uses codes from the space defined in {{error-codes}};
APPLICATION_CLOSE uses codes from the application protocol error code space
({{app-error-codes}}).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would at least parenthesize it then.

already in the "half-closed (local)" or "closed" states, a RST_STREAM frame MAY
still be sent in order to cancel retransmission of previously-sent STREAM
frames.
(remote)" states, an endpoint MUST send a RST_STREAM frame. If the STOP_SENDING
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, but there's an issue here. If the transport receives a STOP_SENDING and MUST send a RST_STREAM, there's nothing in here that requires the application to do something in between the receiving of the STOP_SENDING and sending of the RST_STREAM. If it's not a transport requirement to send a RST_STREAM, then the "MUST" doesn't belong here. So I would remove any mention of RST_STREAM being sent here; that's an application's prerogative. The transport does not need to ensure this.

That said, I think I'm with you on STOP_SENDING. It's an HTTP-ism, so it probably belongs in the application. I'll chime in on the list.

Copy link
Contributor

@MikeBishop MikeBishop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks fine.

@@ -1067,7 +1070,7 @@ The HTTP/2 error codes defined in Section 7 of {{!RFC7540}} map to QUIC error
codes as follows:

NO_ERROR (0x0):
: QUIC_NO_ERROR
: HTTP_NO_ERROR
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"...in {{http-error-codes}}." for consistency with the others.

Copy link
Contributor

@janaiyengar janaiyengar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the only thing I have a concern about is that if the transport is required to send a RST_STREAM in response to a STOP_SENDING, what's it to send without an error code? I would remove that requirement. Of course, this puts the onus on the application, but I think that's a natural outcome. (I think it also clarifies that STOP_SENDING is really an application signal.)

@martinthomson
Copy link
Member Author

Yeah, for that though I would prefer to just merge #759.

@janaiyengar
Copy link
Contributor

I agree, but I don't think we've reached a resolution there yet. In the meanwhile, this PR could remove the requirement within QUIC to send a RST_STREAM in response to a STOP_SENDING, since it doesn't make sense anyways.

@martinthomson
Copy link
Member Author

Busywork like that is low priority, but you are welcome to make a change to the branch if you feel strongly about it.

@janaiyengar
Copy link
Contributor

Then let's wait for #759 to land first.

@mnot mnot removed the design An issue that affects the design of the protocol; resolution requires consensus. label Sep 25, 2017
@martinthomson
Copy link
Member Author

@janaiyengar, #759 actually depends on this, so merging that first would be hard. Though we could wait until we agree on that, it seems like we're not going to agree without some discussion.

In the meantime, I've looked at the text around STOP_SENDING and I think that I have addressed your concerns. Actually, the rebase for this seems to have picked up other changes that did most of the work. The most recent commit just shuffles things around slightly and makes a few editorial corrections.

I think that this is ready, but am happy to defer until Monday to discuss if you have any lingering reservations.

This creates two orthogonal spaces for error codes.  Application error codes
can be used for both connection and stream errors and are under the control of
the application protocol.  Transport error codes are QUIC-controlled, but can
only terminate the connection.

To fix this, I had to add IANA considerations for error codes, plus a few extra
tweaks.  I think that the error code space could be narrowed to 16 bits after
this, if only to keep things sane.

Closes #132.
Copy link
Contributor

@janaiyengar janaiyengar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to finally have this done. Thanks for getting it sorted out!

@janaiyengar janaiyengar merged commit f0034b4 into master Oct 5, 2017
@martinthomson martinthomson deleted the split-errors branch October 12, 2017 23:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Stream cancellation by transport
4 participants