Commit 6c78200

Ensure opaque paths always roundtrip
In fdaa0e5 we tackled a problem whereby removing the fragment or query from a URL with an opaque path through the API would not make the URL roundtrip due to the opaque path being able to end in non-percent-encoded spaces. However, this failed to address other ways of serializing the URL.

As such this is a new approach whereby opaque paths simply cannot end with non-percent-encoded spaces. Enforcing this in the URL parser allows us to completely revert the aforementioned commit, greatly simplifying the API implementation.

Tests: web-platform-tests/wpt#51129.

Fixes #784. Supersedes and closes #785.
1 parent 076afff commit 6c78200
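
To illustrate the parser-side rule this commit introduces, here is a minimal TypeScript sketch of the opaque path handling. It is not the standard's algorithm verbatim: the names `parseOpaquePath`, `percentEncodeC0`, and `OpaquePathResult` are hypothetical, validation errors are omitted because they do not affect the output, and the input is assumed to be a scalar value string with the scheme and any leading or trailing whitespace already removed.

```typescript
// Hypothetical result type for this sketch; not part of the URL Standard.
interface OpaquePathResult {
  path: string; // opaque path as the parser would store it
  rest: string; // unparsed remainder, starting at "?" or "#", or empty
}

// UTF-8 percent-encode one code point using the C0 control percent-encode set
// (C0 controls and all code points greater than U+007E). Other code points are
// returned unchanged. Assumes `codePoint` holds no lone surrogate.
function percentEncodeC0(codePoint: string): string {
  const first = codePoint.charCodeAt(0);
  if (first >= 0x20 && first <= 0x7e) {
    return codePoint;
  }
  return encodeURIComponent(codePoint);
}

function parseOpaquePath(input: string): OpaquePathResult {
  let path = "";
  let i = 0;
  while (i < input.length) {
    const c = input.charAt(i);
    if (c === "?" || c === "#") {
      break; // the query or fragment state would take over here
    }
    if (c === " ") {
      // A space directly before "?" or "#" is stored as "%20", so the path
      // cannot end in a raw space once the query or fragment is removed.
      const next = input.charAt(i + 1);
      path += (next === "?" || next === "#") ? "%20" : " ";
      i += 1;
      continue;
    }
    // Keep surrogate pairs together so they are encoded as one code point.
    const code = input.charCodeAt(i);
    const nextCode = input.charCodeAt(i + 1);
    const isPair =
      code >= 0xd800 && code <= 0xdbff && nextCode >= 0xdc00 && nextCode <= 0xdfff;
    const codePoint = isPair ? input.slice(i, i + 2) : c;
    path += percentEncodeC0(codePoint);
    i += codePoint.length;
  }
  return { path: path, rest: input.slice(i) };
}

const parsed = parseOpaquePath("space ?test");
console.log(parsed.path, parsed.rest); // space%20 ?test
```

Only the last space in a run is rewritten: for `a   #x` the sketch yields the path `a  %20`, which still satisfies the requirement that the path does not end in a raw space.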

1 file changed, with 19 additions and 61 deletions
url.bs
@@ -2920,17 +2920,26 @@ and then runs these steps:
 to the empty string and <var>state</var> to <a>fragment state</a>.

 <li>
-<p>Otherwise:
+<p>Otherwise, if <a>c</a> is U+0020 SPACE:
+
+<ol>
+<li><p>If <a>remaining</a> starts with U+003F (?) or U+0023 (#), then append
+"<code>%20</code>" to <var>url</var>'s <a for=url>path</a>.
+
+<li><p>Otherwise, append U+0020 SPACE to <var>url</var>'s <a for=url>path</a>.
+</ol>
+
+<li>
+<p>Otherwise, if <a>c</a> is not the <a>EOF code point</a>:

 <ol>
-<li><p>If <a>c</a> is not the <a>EOF code point</a>, not a <a>URL code point</a>, and not
-U+0025 (%), <a>invalid-URL-unit</a> <a>validation error</a>.
+<li><p>If <a>c</a> is not a <a>URL code point</a> and not U+0025 (%), <a>invalid-URL-unit</a>
+<a>validation error</a>.

 <li><p>If <a>c</a> is U+0025 (%) and <a>remaining</a> does not start with two
 <a>ASCII hex digits</a>, <a>invalid-URL-unit</a> <a>validation error</a>.

-<li><p>If <a>c</a> is not the <a>EOF code point</a>,
-<a for="code point">UTF-8 percent-encode</a> <a>c</a> using the
+<li><p><a for="code point">UTF-8 percent-encode</a> <a>c</a> using the
 <a>C0 control percent-encode set</a> and append the result to <var>url</var>'s
 <a for=url>path</a>.
 </ol>
@@ -3437,23 +3446,6 @@ interface URL {
 object.
 </ul>

-<div algorithm>
-<p>To <dfn>potentially strip trailing spaces from an opaque path</dfn> given a {{URL}} object
-<var>url</var>:
-
-<ol>
-<li><p>If <var>url</var>'s <a for=URL>URL</a> does not have an <a for=url>opaque path</a>, then
-return.
-
-<li><p>If <var>url</var>'s <a for=URL>URL</a>'s <a for=url>fragment</a> is non-null, then return.
-
-<li><p>If <var>url</var>'s <a for=URL>URL</a>'s <a for=url>query</a> is non-null, then return.
-
-<li><p>Remove all trailing U+0020 SPACE <a for=/>code points</a> from <var>url</var>'s
-<a for=URL>URL</a>'s <a for=url>path</a>.
-</ol>
-</div>
-
 <div algorithm>
 <p>The <dfn>API URL parser</dfn> takes a <a>scalar value string</a> <var>url</var> and an optional
 null-or-<a>scalar value string</a> <var>base</var> (default null), and then runs these steps:
@@ -3781,19 +3773,9 @@ one might have assumed the setter to always "reset" both.
 <ol>
 <li><p>Let <var>url</var> be <a>this</a>'s <a for=URL>URL</a>.

-<li>
-<p>If the given value is the empty string:
-
-<ol>
-<li><p>Set <var>url</var>'s <a for=url>query</a> to null.
-
-<li><p><a for=list>Empty</a> <a>this</a>'s <a for=URL>query object</a>'s
-<a for=URLSearchParams>list</a>.
-
-<li><p><a>Potentially strip trailing spaces from an opaque path</a> with <a>this</a>.
-
-<li><p>Return.
-</ol>
+<li><p>If the given value is the empty string, then set <var>url</var>'s <a for=url>query</a> to
+null, <a for=list>empty</a> <a>this</a>'s <a for=URL>query object</a>'s
+<a for=URLSearchParams>list</a>, and return.

 <li><p>Let <var>input</var> be the given value with a single leading U+003F (?) removed, if any.

@@ -3806,11 +3788,6 @@ one might have assumed the setter to always "reset" both.
 <li><p>Set <a>this</a>'s <a for=URL>query object</a>'s <a for=URLSearchParams>list</a> to the
 result of <a lt="urlencoded string parser">parsing</a> <var>input</var>.
 </ol>
-
-<p class=note>The {{URL/search}} setter has the potential to remove trailing U+0020 SPACE
-<a for=/>code points</a> from <a>this</a>'s <a for=URL>URL</a>'s <a for=url>path</a>. It does this
-so that running the <a>URL parser</a> on the output of running the <a>URL serializer</a> on
-<a>this</a>'s <a for=URL>URL</a> does not yield a <a for=/>URL</a> that is not <a for=url>equal</a>.
 </div>

 <div algorithm>
@@ -3833,16 +3810,8 @@ so that running the <a>URL parser</a> on the output of running the <a>URL serial
 <p>The <code><a attribute for=URL>hash</a></code> setter steps are:

 <ol>
-<li>
-<p>If the given value is the empty string:
-
-<ol>
-<li><p>Set <a>this</a>'s <a for=URL>URL</a>'s <a for=url>fragment</a> to null.
-
-<li><p><a>Potentially strip trailing spaces from an opaque path</a> with <a>this</a>.
-
-<li><p>Return.
-</ol>
+<li><p>If the given value is the empty string, then set <a>this</a>'s <a for=URL>URL</a>'s
+<a for=url>fragment</a> to null and return.

 <li><p>Let <var>input</var> be the given value with a single leading U+0023 (#) removed, if any.

@@ -3852,9 +3821,6 @@ so that running the <a>URL parser</a> on the output of running the <a>URL serial
 <a for=URL>URL</a> as <a for="basic URL parser"><i>url</i></a> and <a>fragment state</a> as
 <a for="basic URL parser"><i>state override</i></a>.
 </ol>
-
-<p class=note>The {{URL/hash}} setter has the potential to change <a>this</a>'s <a for=URL>URL</a>'s
-<a for=url>path</a> in a manner equivalent to the {{URL/search}} setter.
 </div>


@@ -3925,10 +3891,6 @@ console.log(url.searchParams.get('b')); // "~"</code></pre>
 a {{URL}} object, initially null.
 </ul>

-<p class=note>A {{URLSearchParams}} object with a non-null <a for=URLSearchParams>URL object</a> has
-the potential to change that object's <a for=url>path</a> in a manner equivalent to the {{URL}}
-object's {{URL/search}} and {{URL/hash}} setters.
-
 <div algorithm>
 <p>To <dfn for=URLSearchParams oldids=concept-urlsearchparams-new>initialize</dfn> a
 {{URLSearchParams}} object <var>query</var> with <var>init</var>:
@@ -3977,10 +3939,6 @@ object <var>query</var>:

 <li><p>Set <var>query</var>'s <a for=URLSearchParams>URL object</a>'s <a for=URL>URL</a>'s
 <a for=url>query</a> to <var>serializedQuery</var>.
-
-<li><p>If <var>serializedQuery</var> is null, then
-<a>potentially strip trailing spaces from an opaque path</a> with <var>query</var>'s
-<a for=URLSearchParams>URL object</a>.
 </ol>
 </div>

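
The simplified `search` and `hash` setter steps in this diff can be sketched as follows. This is an illustrative TypeScript model under stated assumptions, not the standard's API: `OpaqueUrlModel`, `serializeModel`, `setSearchSketch`, and `setHashSketch` are hypothetical names, the query object's list is left out, and the non-empty branch stores the value as-is instead of running the basic URL parser with a state override.

```typescript
// Hypothetical stand-in for a URL record with an opaque path.
interface OpaqueUrlModel {
  scheme: string;
  path: string; // opaque path, as produced by the updated parser
  query: string | null;
  fragment: string | null;
}

// Serialize the model: scheme, ":", path, then optional query and fragment.
function serializeModel(u: OpaqueUrlModel): string {
  let out = u.scheme + ":" + u.path;
  if (u.query !== null) {
    out += "?" + u.query;
  }
  if (u.fragment !== null) {
    out += "#" + u.fragment;
  }
  return out;
}

function setSearchSketch(u: OpaqueUrlModel, value: string): void {
  if (value === "") {
    // No trailing-space stripping is needed any more; emptying the query
    // object's list is omitted from this model.
    u.query = null;
    return;
  }
  // Simplification (assumption): the real steps re-run the basic URL parser
  // in the query state, which also percent-encodes the input.
  u.query = value.charAt(0) === "?" ? value.slice(1) : value;
}

function setHashSketch(u: OpaqueUrlModel, value: string): void {
  if (value === "") {
    u.fragment = null;
    return;
  }
  // Simplification (assumption): the real steps re-run the basic URL parser
  // in the fragment state.
  u.fragment = value.charAt(0) === "#" ? value.slice(1) : value;
}

// The updated parser would store "data:space ?test" with the path "space%20".
const model: OpaqueUrlModel = { scheme: "data", path: "space%20", query: "test", fragment: null };
setSearchSketch(model, "");
console.log(serializeModel(model)); // data:space%20
```

Because the stored path already ends in `%20` and not a raw space, the serialization after clearing the query has no trailing whitespace for a later parse to strip, which is why the setters no longer need the removed helper.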