-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rust: Implement basic type inference in QL #18632
Conversation
66bea42
to
2d0e953
Compare
c88c9fe
to
a8540b1
Compare
02184af
to
4f723dd
Compare
486f813
to
a254679
Compare
42e5970
to
be2ec0f
Compare
Your understanding is absolutely correct.
Good idea, I'll change it.
I don't know if this could work, but feel free to try it out. Note that we also need type constraints for traits that extend other traits. |
e21ecc3
to
3bb89ea
Compare
@paldepind : Thanks for the review, I have (hopefully) addressed all your comments. I also decided to rebase, so we can see the combined effects with #18228 on DCA. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing my comments :) Looks great to me 🎉 🎉
Declaration decl, DeclarationPosition dpos, Type base, TypePath path, Type t | ||
) { | ||
t = decl.getDeclaredType(dpos, path) and | ||
path.isCons(base.getATypeParameter(), _) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This relies crucially on TypeParameter
s being unique to a given Type
. Should that be checked in a consistency check somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, presumably it would be equivalent to replace path.isCons(base.getATypeParameter(), _)
with base = decl.getDeclaredType(dpos, TypePath::nil()) and path != TypePath::nil()
, and wouldn't that be more efficient in terms of the amount of string manipulation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rewrite will not work if a declaration has multiple declared types at dpos
, which there is no reason to rule out. But I noticed that declarationBaseType
is only used in a context where t
is in fact a TypeParameter
, so we may as well specialize it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This relies crucially on TypeParameters being unique to a given Type. Should that be checked in a consistency check somewhere?
Good idea; not all type parameters need to belong to a type though (e.g. method type parameters in C#), but they should belong to at most one type.
accessBaseType(a, apos, target, base, pathToTypeParam.append(path), t) and | ||
declarationBaseType(target, dpos, base, pathToTypeParam, tp) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To check my understanding: The join of base
here is the point where we introduce covariance, correct? Perhaps leave a comment in this spot about covariance if this is the place that needs updating to potentially support contravariance in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct.
directTypeMatch(a, target, path, t, tp) | ||
or | ||
baseTypeMatch(a, target, path, t, tp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So adjustAccessType
is applied in directTypeMatch
, but not in baseTypeMatch
? Maybe an example to clarify why we have this distinction would be good to add in the qldoc for adjustAccessType
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, only used in directTypeMatch
. I think we would have to add it in accessBaseType
as well.
I've resolved the comments that should be addressed over in #19081. |
Overview
This PR adds basic type inference for Rust, implemented in QL. Type inference and method call resolution are mutually recursive:
In the example above, in order to resolve the call
x.foo()
we need to know the type ofx
, and in order to resolve the type ofx.foo()
we need to know the return type offoo
(meaning we need to be able to resolve the call).The example above involves generics, and we want to be able to conclude that
x.foo()
has typei64
and not simply the typeS
, which is the declared return type offoo
.This means that we need to be able to represent the fact that
x
has typeMyStruct<i64>
, and while it is tempting to represent constructed types like this using anewtype
encoding, such an encoding will become mutually recursive with type inference/call resolution, meaning (at best) poor performance and (at worst) non-monotonic recursion. Another issue with using anewtype
encoding is that we could only construct a constructed generic type once we are able to infer all type arguments (type arguments will be encoded as cons-lists), and in case one of the type arguments cannot be inferred, we will have to give up completely.Instead, we index all type resolution relations with a type path, which is conceptually a (possibly empty) list of type argument indices. For example, the type
MyStruct<i64>
can be represented as""
MyStruct
"0"
i64
Type paths are represented as strings, so there will be no need for additional mutual recursion.
Using type path indexing, we can infer the following types for the program above:
self
""
MyStruct
self
"0"
S
x
""
MyStruct
x
"0"
i64
which means we will be able to match
S
against the typei64
. We can then combine this with the return type offoo
to get the correct return type of the call.Implementation
The implementation is split into a shared language-agnostic type inference library (in a new
typeinference
QL pack) and a Rust-specific implementation on top.The shared library defines the
TypePath
class, and provides the moduleMatching
for matching types of arguments with types of parameters in the declarations that they target (like matchingi64
withS
in the example above). Matching takes into account that type arguments may be supplied explicitly (likeMyStruct<i64>::foo(x)
) and that matching may need to take base types into account in order to match.The entry predicate of the Rust-specific implementation is
which assigns type-path-indexed types to AST nodes. Using the
Matching
module from the shared library, the implementation infers types for record expressionsFoo { bar = baz}
, call expressionsx.foo()
andFoo::bar(x)
, and field expressionsx.field
(in mutual recursion withresolveType
in the latter two cases). For call expressions, we take implicit borrows and implicit dereferences into account, and base types in the context of Rust translates to trait bounds andimpl
blocks.Known limitations
The implementation has a lot of known limitations, for example:
impl
blocks.dyn
trait types.Deref
trait and more generally type coercions.Evaluation
DCA shows that the implementation adds a negligible overhead to the total analysis time (~5 %). Overall, we mostly loose call edges compared to what we get from
rust-analyzer
(except for thediem
project), but this is really not surprising given the known limitations above, as well as the known limitations of our path resolution implementation.Note to the reviewer
Commit-by-commit review is encouraged.