Replies: 4 comments 4 replies
-
I really like the idea of "longest matching static prefix" but I do think it would constitute a breaking change based on the ambiguity/flexibility of today's routes. Our unit tests will give some indication of the severity, but we'll have no way of making that determination for real-world apps. So it might need to be a v7 thing. I've been envisioning something like this trie walk but where we keep some version of the current scoring (for v6). So we would walk all matching paths to completion (while pruning whole sub-sections of the tree) and we just score along the way using the same values as we use today. So it's a single trie walk (probably higher than Even with this approach, we'd still be significantly cheaper from a |
Beta Was this translation helpful? Give feedback.
-
@pcattori something to keep in mind when we implement this: remix-run/react-router#9925 |
Beta Was this translation helpful? Give feedback.
-
@pcattori , I made this in order to have a better understanding of react-router. However, I have a few questions about this proposal. I probably don't have the ability to understand correctly, but what do you mean by What interests me most is the fact that you could add/delete routes at runtime. |
Beta Was this translation helpful? Give feedback.
-
👋 We use react-router on Tumblr. When profiling production traffic, I'm seeing We're currently working around how slow |
Beta Was this translation helpful? Give feedback.
-
tl;dr: Let's just copy what routers like DNS do!
Routes are inherently hierarchical as each
/
represents a subpath.A tree directly models these hierarchies.
Ubiquitous, reliable, and performant routers use trees to match routes.
For example, DNS uses uses trie (a specific type of tree, also known as a prefix tree) to store and search domain names efficiently.
The main benefit of representing routes as a tree is that our route matching will be simpler, leading to more robust and predictable matching. By robustness, I mean that redundant or ambiguous routes will not be representable in a tree like they are now.
And by predictable, I mean that it will be simpler to explain and understand why the route was chosen as the best match for an incoming path.
Additionally, all matching will run at optimal$O(input)$ , so we should see also see perf improvements, especially at larger scales.
Robust matching
Currently,
/one/:two
and/one/:three
can coexist as routes and we arbitrarily (but consistently) choose one of these to have a higher priority.While we could prune these routes manually, a better solution is to never allow redundant routes like these to coexist in the first place.
Encoding routes into a tree would ensure that we never have redundant routes, as redundant routes would occupy the same edge.
We can even warn or error when an ambiguous route is being added to the tree:
WARN: '/one/:three' is ambiguous since '/one/:two' is already a route
.Simple, fast, deterministic matching
Trees natively support$O(input)$ lookups where $input$ is the number of nested segments of the incoming path.
We would then do one tree lookup to match an incoming path to a route.
Note that this is indeed optimal because we are using$O(log(n))$ where $n$ is the number of nodes in the tree if the tree is balanced.
input
to represent the size of the input, not the number of nodes in the tree. So this would beMutually exclusive sibling branches?
One decision we'll have to make is if dynamic segments are automatically mutually exclusive of their sibling paths or not.
For example, if we have routes
/one/two
and/one/:param1/three
, the tree would look like:Can we tell at the
/one
node whether to go down thetwo
branch or the:param1
branch?If
:param1
is modeled to be exclusive of its siblings (e.g.two
), then yes!We'd know to go down the
two
branch.Then we'd conclude that there is no valid match for
/one/two/three
If we don't assume mutual exclusivity of sibling branches, then we'd have to traverse both
two
and:param1
.Like before, the
two
branch would not yield a valid match, but in this case:param1
branch would yield a match for/one/:param1/three
.1In this case, we could have gotten multiple valid matches if we replaced
/one/two
with/one/two/three
.For now, we can pick the "best" match out of these via the existing scoring of routes, though we can improve on this as well.
Changing routes at runtime
If we represent our routes as a tree, we'll be able to use standard tree algorithms to efficiently match paths to a route.$O(input)$ insertions and deletions, so we can efficiently support route modifications at runtime if needed.
Additionally, tries support
Longest common static prefix
By default, lookups in our route tree would prefer static paths over dynamic paths so that
/one
matches/one
over/:param
.This mirrors the "longest common prefix" matching found in DNS and other routers.
However, if we decide not to make dynamic segments mutually exclusive of their siblings, then we'd need to still pick the "best" match from among the valid matches.
A simple method to extend "longest common prefix" is to prefer "longest common static prefix".
For example, consider these two routes:
/one/two/:foo
/one/:bar/three
Which one should
/one/two/three
match?Because the longest common static prefix for each is:
/one/two/:foo
->/one/two
/one/:bar/three
->/one
So
/one/two/:foo
should be chosen.Note that we are measuring "longest" in terms of number of segments, not characters.
For two matching routes with the same number of static segments, we keep going until a tie is broken where one route has a static segment and one route has a dynamic segment.
For example, matching the path
/one/two/baz/three
to the routes:/one/two/:bar1/:bar2'
->/one/two/:/:
/one/two/:foo/three
->/one/two/:/three
where we use
:
to denote a dynamic segment so that its easy to compare the routes.Both match the same up to
/one/two/:
, but then/one/two/:foo/three
matches statically while the other route matches dynamically.So
/one/two/:foo/three
should be chosen.Currently, we use scoring scheme with arbitrary weights to rank valid matches.
Switching to longest common static prefix would be a breaking change.
We can still switch internally to a tree for matching and use the existing scoring to pick between valid matches returned by the tree lookup to vastly simplify and speed up our implementation without any breaking changes.
I think longest common static prefix captures most people's intuitions about how matching should work better than our scoring scheme and is much more easily explained.
I also think that our scoring was trying to capture the same intuition in a less robust way.
So its likely that virtually all use-cases would not be breaking and I would bet that any use-cases that are breaking are probably behaving unpredictably for users today.
So we could philosophically call this a bug fix to our current routing rather than a breaking change.
That's probably what I would do, but I realize this could be contentious.
To make it less contentious, we could try see how closely it matches existing usage in our test suite, examples, and existing Remix projects.
But we can also treat this as a breaking change if we want to be super risk-averse.
Again, with context, I think I can convince you that this really is a bug fix, but let's talk about it 😁
Footnotes
For large paths with many dynamic segments, lookup would take $O(2^dynamic)$ where $dynamic$ is the number of dynamic segments. Compare that to the exponentially faster $O(input)$ when dynamic segments are mutually exclusive to their siblings. Note that $O(2^dynamic)$ is the best we can hope to do in the worst case as there are $2^dynamic$ possible combinations for a path with dynamic segments. But this should be a rare case as we don't expect multiple routes of the form
/:a1/:b2/:c3/:d4/:e5/:f6/some-static-segment-at-the-end
to be a common use-case. ↩Beta Was this translation helpful? Give feedback.
All reactions