Fix language stating when start tags must not be omitted #10752

Open

sideshowbarker
Contributor

@sideshowbarker sideshowbarker commented Nov 8, 2024

This change fixes #10691 by correcting the following statement:

> However, a start tag must never be omitted if it has any attributes.

The correction in this change makes it clear that the requirement is about whether the start tag’s element has attributes — not about whether the start tag itself does.

Otherwise, without this change, the problem with the current text is the word “it”, which incorrectly refers to the start tag itself.


/syntax.html ( diff )

@annevk
Member

annevk commented Nov 8, 2024

This is interesting, but start tag has this line in its definition:

> Then, the start tag may have a number of attributes, the syntax for which is described below. Attributes must be separated from each other by one or more ASCII whitespace.

I suspect unwinding this all might be quite involved.

@sideshowbarker
Contributor Author

sideshowbarker commented Nov 8, 2024

> This is interesting, but start tag has this line in its definition:
>
> > Then, the start tag may have a number of attributes, the syntax for which is described below. Attributes must be separated from each other by one or more ASCII whitespace.
>
> I suspect unwinding this all might be quite involved.

I’m not sure there’s actually anything more that needs to be unwound.

Specifically: I think what the text cited above is implicitly describing is attribute markup or attribute serialization. Or even more specifically, it’s describing a syntax for how attributes can be represented or expressed in a markup/serialization for an HTML document.

In other words, attributes are something that only actually exist in the DOM and that don’t strictly have syntax. So any other mention of attributes in something that’s not about the DOM but is instead about the syntax of attributes is implicitly talking not about actual attributes but instead some representation of attributes, and the syntax for that representation.

I realize that can be confusing to casual readers of the spec who don’t understand the distinction. But I think the spec wording cited above is definitely not contradictory at least. I don’t think it’s even ambiguous at all, in context.

I mean, given that “start tags” are not something that actually exist in (parsed) HTML documents, then any sentence that’s talking about start tags is talking about the syntax for a particular markup/serialization of an HTML document. It just happens to be the standard syntax/markup that HTML parsers understand.

And so, given all that, I’m not sure there’s actually anything more spec-wise that needs to be unwound here.

I can imagine that ideally we might add a new subsection to the spec that basically restates what I’ve tried to state above — that is, some section written to help readers understand the distinction between (A) actual parsed HTML documents, in the DOM, and (B) markup/serialization/syntax for representing HTML documents. And it’d have, for example, stuff like:

Where this specification talks about the attributes for start tags — and the syntax of such attributes — it’s important to understand that it’s talking about something different than the actual attributes in a parsed HTML document. In that context it’s not talking about actual attributes, but instead about markup for representing actual attributes.

Only elements in the DOM have actual attributes. Thus, a start tag doesn’t have actual attributes but instead has a syntax for marking up a representation of the actual attributes of the element corresponding to that start tag in the HTML document in the DOM produced by parsing that start tag with an HTML parser.

If you or other folks care enough about this to reckon we should add something like that, I could make time myself to add it.
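
As an illustration of the distinction drawn above between attribute markup and the attributes of elements in the DOM, here is a minimal sketch using Python's standard `html.parser` module. That module only reports the tags that literally appear in the markup; it does not implement the HTML parsing algorithm and never creates implied elements. The class name `StartTagCollector` and the helper `collect_start_tags` are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class StartTagCollector(HTMLParser):
    """Records each start tag found in the markup and the attribute markup it carries."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs read from the start tag's markup.
        self.start_tags.append((tag, dict(attrs)))


def collect_start_tags(markup):
    """Return the (tag name, attribute markup) pairs for every start tag in markup."""
    collector = StartTagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags


# The body start tag is present, so its attribute markup is available.
with_tag = collect_start_tags('<body class="home"><p>Hello</p>')

# The body start tag is omitted, so no token exists that could carry attribute markup.
without_tag = collect_start_tags('<p>Hello</p>')
```

In the second case a conforming HTML parser would still produce a `body` element, just one with no attributes, which is why an element that does have attributes needs its start tag written out.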

@zcorpan
Member

zcorpan commented Nov 8, 2024

> In other words, attributes are something that only actually exist in the DOM and that don’t strictly have syntax.

I disagree with this statement, they do exist in the syntax as defined in https://html.spec.whatwg.org/#syntax-attributes

However, the section on optional tags indeed talks about when an element's tags can be omitted, and so the wording here can be made more consistent with the other requirements. I suggest:

An element's start tag must not be omitted if the element has one or more attributes.
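
The suggested wording could be read as a check that a serializer runs before leaving out a start tag. The following is only a sketch under stated assumptions: the function name `may_omit_start_tag` is hypothetical, and the `context_allows_omission` parameter is a stand-in for the per-element conditions in the optional tags section, which are not modeled here.

```python
# Elements whose start tags the HTML syntax allows to be omitted in some contexts.
OPTIONAL_START_TAG_ELEMENTS = {"html", "head", "body", "colgroup", "tbody"}


def may_omit_start_tag(name, attributes, context_allows_omission):
    """Decide whether a serializer may leave out an element's start tag.

    name: the element's local name, in lowercase.
    attributes: mapping of the element's attribute names to values.
    context_allows_omission: hypothetical flag standing in for the
        element-specific conditions, which this sketch does not model.
    """
    if name not in OPTIONAL_START_TAG_ELEMENTS:
        return False
    # The rule under discussion: the element has one or more attributes,
    # so its start tag must not be omitted.
    if attributes:
        return False
    return context_allows_omission
```

For example, `may_omit_start_tag("body", {"class": "home"}, True)` returns `False`, while the same call with an empty attribute mapping returns `True`.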

This change corrects the following statement:

> However, a start tag must never be omitted if it has any attributes.

The correction in this change makes it clear that the requirement is
about whether the start tag’s _element_ has attributes — not about
whether the start tag itself does.

Otherwise, without this change, the problem with the text as written is
the word “it”, which incorrectly refers to the start tag itself.
@sideshowbarker sideshowbarker force-pushed the sideshowbarker/start-tag-omission branch from f15dcff to e089633 Compare November 8, 2024 09:04
@sideshowbarker
Contributor Author

> However, the section on optional tags indeed talks about when an element's tags can be omitted, and so the wording here can be made more consistent with the other requirements. I suggest:
>
> > An element's start tag must not be omitted if the element has one or more attributes.

👍 Made it so: https://whatpr.org/html/10752/0511ae0...e089633/syntax.html#optional-tags:concept-element

@sideshowbarker
Contributor Author

> > In other words, attributes are something that only actually exist in the DOM and that don’t strictly have syntax.
>
> I disagree with this statement, they do exist in the syntax as defined in html.spec.whatwg.org#syntax-attributes

Fair enough. But I’d still argue it’s the case that we’re using the term “attributes” to talk about two strictly-different things.

And so, I think if we wanted to be more clear, then for those “attributes” which exist in that defined syntax, we’d rightly refer to them with some qualified term — say, markup attributes — to distinguish them from attributes as they exist in the DOM, which is what the vast majority of usage of the term attributes in the spec is about.

To make an analogy: We have a standard form of notation for representing music — a music syntax or music markup — with a representation of a musical scale, and with things written/arranged/marked-up on that scale which we refer to as “notes”. But we also use exactly that same term “notes” for the sounds in actual music that we hear.

In the case where we’re talking about music and use that same term “notes” for those two very-different things, it’s fine because it’s almost always the case that it’s going to be very clear from the context which kind of “notes” we mean.

But when talking about HTML, the facts seem to be: while it’s similarly the case that we use the same term “attributes” for two kinds of things that really aren’t strictly the same — it’s further the case that when we do use the term, it’s not always clear at all from the context exactly which kind of “attributes” we mean: markup attributes vs attributes in the DOM. That is, we lack the clarity-from-context that the musical-notation-notes vs actual-sounds-you-can-actually-hear-notes has.

And I’d further argue that #10691 is evidence that those facts can lead to problems.

@sideshowbarker
Contributor Author

> > In other words, attributes are something that only actually exist in the DOM and that don’t strictly have syntax.
>
> I disagree with this statement, they do exist in the syntax as defined in html.spec.whatwg.org#syntax-attributes

You’re right. But I think that’s a mistake that we should fix. So let’s fix it: #10756.

> This is interesting, but start tag has this line in its definition:
>
> > Then, the start tag may have a number of attributes, the syntax for which is described below. Attributes must be separated from each other by one or more ASCII whitespace.
>
> I suspect unwinding this all might be quite involved.

I think the change in #10756 unwinds it.

Development

Successfully merging this pull request may close these issues.

“a start tag must never be omitted if it has any attributes”
3 participants