[WIP] Forbid object lifetime changing pointer casts #136776
@bors try |
…, r=<try> [WIP] Forbid object lifetime changing pointer casts Fixes rust-lang#136702 r? `@ghost`
☀️ Try build successful - checks-actions |
@craterbot check |
👌 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
🚧 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
🎉 Experiment
Most of these are on github; in terms of crates.io regressions all we have is:
Overall, 142 regressions are caused by EDIT: Ah, there's also |
We discussed this in the lang triage call today. We wanted to think more about it, so we're leaving it nominated to discuss again. |
@BoxyUwU Do you think it would be possible to implement this as an FCW? We talked about this in lang triage today and would prefer to start with that if we can. If it's not feasible, a hard error can also work (I would say though that we should upstream PRs to any crates we break). Another small thing I noticed is that the error message links to the Nomicon section on variance, but it would be ideal to link to a tracking issue or something describing this issue in particular. |
To add on to what tmandry said, in our discussions we did feel that the approach taken in this PR is generally the right way forward, and we're happy to see this progress so as to help clear the way for cc @rust-lang/lang
@tmandry I do expect it to be possible to FCW this. We can likely do something hacky around to fully emulate the fix (but as a lint), but if that doesn't work out, all the regressions we found were relatively "simple" cases that can probably be taken advantage of (if need be) to lint a subset of the actual cases we'd break with this PR.
Edit: see compiler-errors' comment. I'm not so convinced this will be possible to FCW anymore and will likely investigate improving the diagnostics here. I've already filed PRs to the affected crates to migrate them over to a transmute to avoid the breakage if this lands.
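As a rough illustration of the migration mentioned in this comment: going by the PR title, the affected code uses `as` to change the lifetime bound of a trait object behind a raw pointer, and the fix is to spell the same conversion as a transmute. The sketch below is an assumption about what that looks like, not code taken from any affected crate; `Greet`, `Borrowed`, `extend_object_lifetime`, and `demo` are hypothetical names.

```rust
use std::mem::transmute;

// Hypothetical trait and implementor, for illustration only.
trait Greet {
    fn greet(&self) -> String;
}

struct Borrowed<'a>(&'a str);

impl Greet for Borrowed<'_> {
    fn greet(&self) -> String {
        format!("hello, {}", self.0)
    }
}

/// Changes the trait object's lifetime bound from `'a` to `'static`.
///
/// With a plain cast this would be written
/// `ptr as *const (dyn Greet + 'static)`, which is the pattern the PR
/// title describes as forbidden. The transmute performs the same
/// conversion but is visibly unsafe.
///
/// # Safety
/// The returned pointer must not be used after `'a` has ended.
unsafe fn extend_object_lifetime<'a>(
    ptr: *const (dyn Greet + 'a),
) -> *const (dyn Greet + 'static) {
    // SAFETY: both types are wide pointers to the same trait object type;
    // only the lifetime bound differs. The caller upholds the contract above.
    unsafe { transmute::<*const (dyn Greet + 'a), *const (dyn Greet + 'static)>(ptr) }
}

fn demo() -> String {
    let name = String::from("crater");
    let value = Borrowed(&name);
    let object: &dyn Greet = &value;
    let short: *const dyn Greet = object;
    // SAFETY: `value` and `name` are still alive while `long` is used.
    unsafe {
        let long = extend_object_lifetime(short);
        let back: &dyn Greet = &*long;
        back.greet()
    }
}
```

Calling `demo()` exercises the round trip while the borrowed data is still alive. The point of the transmute form is that the lifetime change is now an explicit, greppable unsafe operation rather than something hidden in an ordinary-looking cast.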
I was thinking earlier that it may be possible to implement a lint to detect, but it seems to me that MIR borrowck is not equipped to implement such a lint. Specifically, it seems near impossible to answer whether a region outlives constraint (like, To fix this would require some significant engineering effort to refactor how NLL processes its region graph to make it easier to clone and reprocess with new constraints. |
…uto_to_object-hard-error, r=oli-obk Make `ptr_cast_add_auto_to_object` lint into hard error In Rust 1.81, we added a FCW lint (including linting in dependencies) against pointer casts that add an auto trait to dyn bounds. This was part of work making casts of pointers involving trait objects stricter, and was part of the work needed to restabilize trait upcasting. We considered just making this a hard error, but opted against it at that time due to breakage found by crater. This breakage was mostly due to the `anymap` crate which has been a persistent problem for us. It's now a year later, and the fact that this is not yet a hard error is giving us pause about stabilizing arbitrary self types and `derive(CoercePointee)`. So let's see about making a hard error of this. r? ghost cc `@adetaylor` `@Darksonn` `@BoxyUwU` `@RalfJung` `@compiler-errors` `@oli-obk` `@WaffleLapkin` Related: - rust-lang#135881 - rust-lang#136702 - rust-lang#136776 Tracking: - rust-lang#127323 - rust-lang#44874 - rust-lang#123430
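For readers unfamiliar with the cast that `ptr_cast_add_auto_to_object` targets, here is a small sketch. It assumes the lint fires on `as` casts that add an auto trait such as `Send` to the `dyn` bounds, as the commit message describes; the function names are hypothetical, and the rejected cast is shown only in a comment so the block compiles either way.

```rust
use std::fmt::Debug;
use std::mem::transmute;

static VALUE: i32 = 7;

/// Dropping an auto trait from the `dyn` bounds is an ordinary cast.
fn drop_send(ptr: *const (dyn Debug + Send)) -> *const dyn Debug {
    ptr as *const dyn Debug
}

/// Adding an auto trait is the direction the lint is about:
///
///     ptr as *const (dyn Debug + Send)
///
/// A transmute keeps the same pointer but makes the claim explicit.
///
/// # Safety
/// The pointee must actually be `Send`.
unsafe fn add_send(ptr: *const dyn Debug) -> *const (dyn Debug + Send) {
    // SAFETY: same wide-pointer layout; the caller guarantees `Send`.
    unsafe { transmute::<*const dyn Debug, *const (dyn Debug + Send)>(ptr) }
}

fn round_trip() -> String {
    let object: &'static (dyn Debug + Send) = &VALUE;
    let with_send: *const (dyn Debug + Send) = object;
    let without_send = drop_send(with_send);
    // SAFETY: the pointee is an `i32` in a static, which is `Send` and
    // lives for the whole program.
    let restored = unsafe { add_send(without_send) };
    let shown: &(dyn Debug + Send) = unsafe { &*restored };
    format!("{shown:?}")
}
```

`round_trip()` removes `Send` with a cast and restores it with a transmute, then formats the value through the restored pointer.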
From the perspective of an So far Currently, what is the actual unsafe operation is a major and tricky task when reviewing unsafe code. Fortunately, it mostly boils down to "look at the function calls" and "look at the I definitely feel heard by @RalfJung's comment here:
I think that's not a good precedent to set at all. And by @compiler-errors's comment:
Yep, this feels rather drastic to me as an unsafe reviewer. I like @nikomatsakis' axioms here, and I'll note that making this UB is breakage from an unsafe-writer's POV too, even if it doesn't cause compilation failures, and I'd argue that that's worse since there's no way to detect it. Minimizing the position on @nikomatsakis' breakage ladder should account for this type of breakage too; it shouldn't just be about "which code still compiles".
It feels like talking about the semver issue feels premature unless we know that the When it comes to these kinds of breakages I feel like there's a lot of value in negotiating with our users. Some users have legitimate needs and will say no to stuff like this, but quite often the answer can be "...yeah, okay, not ideal but we can work with that". As a crate maintainer I've had to do that often enough, from both sides of the equation. I understand that this doesn't cover private crates, which is a risk, but this is a problem that can be attacked from multiple angles, including the FCW.
I do feel like this is a tradeoff that one can make a call on: either choice at least gives people something to work with. To me it feels like some pain now is worth avoiding perpetual pain in the long run. Also, perhaps I'm missing something: but the FCW is being talked about for the situation where this becomes a hard error, yes? If it's possible to hard error, it should be possible to FCW for that case with no false negatives, no? One thing I'll add, looking holistically at this: personally I don't particularly enjoy replacing
I don't understand what you mean by this. I can transmute Transmute cannot change the size of the pointer, e.g.
This property is preserved. A bad The difference to before is that so far |
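A small sketch of the size property discussed here, under the assumption (true on current targets, though wide-pointer layout is not formally specified) that a `*const dyn Trait` is larger than a thin pointer. The function names are hypothetical.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Sizes in bytes of a thin raw pointer and a wide (trait object) raw pointer.
fn pointer_sizes() -> (usize, usize) {
    (size_of::<*const u8>(), size_of::<*const dyn Debug>())
}

/// An `as` cast may discard the vtable metadata and yield a thin pointer.
fn data_address(ptr: *const dyn Debug) -> *const u8 {
    ptr as *const u8
}

// The equivalent transmute is rejected at compile time because the source
// and target types have different sizes (error E0512):
//
//     unsafe fn thin(ptr: *const dyn Debug) -> *const u8 {
//         unsafe { std::mem::transmute(ptr) }
//     }
```

So the two operations are not ordered by power: `as` can change the pointer's size while transmute cannot, and transmute can relabel lifetimes and trait bounds without any check.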
I'll try not to re-cover what other people have already commented, e.g. feelings about teachability here, or pointer casts not being strictly more powerful than (or equal to) transmutes.

First, reading the lang meeting minutes and the summarizing comment here, I get the impression that lang is making decisions under the belief that writing

In item signatures we have the dyn type lifetime default rules that do give this behaviour, e.g.

fn foo<T: Trait>(ptr: *const T) {
    let a: *const dyn Trait = ptr;
}

This currently compiles on stable and will continue to do so under this PR. If the type annotation on the let statement meant

I am somewhat confused by the proposal to start linting on code. The exact details of what we're supposed to lint on are unclear to me, especially given the prior context of having already ruled out being able to do a FCW. Having read the meeting notes, my understanding is that lang is considering a lint that forbids eliding dyn type lifetimes altogether in

For example, that the following would emit a lint:

fn foo<T: Trait>(ptr: *const T) {
    ptr as *const dyn Trait;
}

I would expect this to have a lot of false positives. I also don't believe this really helps alleviate the footguns involved with these pointer casts (which seemed to be a big point of focus in the meeting notes, that this removes a footgun). Even when explicitly writing out the lifetimes involved, you can just write lifetimes that make it seem like no real lifetime changing has occurred. Taking the example from the lang meeting notes:

fn foo(output: &dyn Write) {
    /* ... */
    unsafe { output as *const (dyn Write + 'static) as *mut (dyn Write + 'static) };
    /* ... */
}

This example still seems like a footgun; the explicitly written lifetime bounds almost make it worse, as you could easily believe that the pointee of In order to avoid a footgun here the type of

I am also unclear on where lang stands in regards to breaking such unsafe code over an edition. The summary comment here was not particularly clear, and reading the meeting notes I have been given the impression that no real consensus was arrived at and that the hypothetical lint obviates any need to break this unsafe code. I am not sure if things have changed for lang now given my previous statements about footguns and dyn-type elision rules.

Before going through with this I would like for lang to fully commit to either supporting such casts as an actual language feature, or only as a migration hack with a commitment to break it going forward across the 2025 edition (with the understanding that it may not be possible to FCW or auto-fix). Generally I would like to avoid going forward with this with a vague handwaving of "we could potentially break this over an edition" to assuage concerns of teachability/etc, and then potentially wind up with lang deciding not to do so.

On the general consistency of pointer casts here. Lang has already forbidden casting If lang wishes to extend the power of

On the other hand, if this is only intended as a migration strategy then this inconsistency seems fine to me, as we expect all code to either be legacy (i.e. no longer maintained) or updated to a newer edition where there is no inconsistency.

Finally, reading the meeting notes I see that there were parallels drawn between raw pointer derefs being the same syntax as derefing references and how one is safe and the other not. Admittedly I do see the parallel here, but I think it's worth pointing out that it is significantly easier to determine whether one is dereferencing a pointer or reference than to determine whether a lifetime has been extended.

I think lang is aware of this, hence the discussion of the lint and avoiding footguns? I think this is worth revisiting in light of previous statements about footguns/the lint.

I would also like to note that r-a currently has the ability to highlight unsafe operations in a specific colour to make it more clear when safety invariants are introduced in unsafe code. I don't know how well that can be supported with this change where all Do we expect r-a to be able to figure this out or just conservatively consider all of these

I generally find it hard to go along with the idea that it's "just" allowing a new operation to be performed inside of I think the fact that there is ~no world in which the compiler impl will align with this mental model of "it's a new operation allowed in unsafe" should be a good signal that it's not the right mental model.
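The first snippet in this comment can be fleshed out into something runnable. The sketch below assumes, as the comment states, that the `let` annotation compiles for a `T` that is not `'static`, which suggests the elided object lifetime inside a function body is inferred rather than defaulted to `'static`. `Borrowed`, `describe`, and `demo` are hypothetical additions, and `foo` is marked `unsafe` here only because this version dereferences the pointer.

```rust
trait Trait {
    fn describe(&self) -> String;
}

// A type that borrows local data, so it is deliberately not `'static`.
struct Borrowed<'a>(&'a str);

impl Trait for Borrowed<'_> {
    fn describe(&self) -> String {
        format!("borrowed: {}", self.0)
    }
}

/// # Safety
/// `ptr` must point to a live, valid `T`.
unsafe fn foo<T: Trait>(ptr: *const T) -> String {
    // Unsizing coercion from `*const T` to a trait object pointer. There is
    // no `T: 'static` bound, yet the annotation below is accepted.
    let a: *const dyn Trait = ptr;
    // SAFETY: the caller guarantees `ptr` is valid for reads.
    let object: &dyn Trait = unsafe { &*a };
    object.describe()
}

fn demo() -> String {
    let owned = String::from("hello");
    let value = Borrowed(&owned);
    // SAFETY: `value` outlives the call.
    unsafe { foo(&value as *const Borrowed<'_>) }
}
```

No lifetime is extended anywhere in this example, which is why a lint keyed purely on "elided dyn lifetime in a pointer cast" would flag code like it without cause.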
Separately from my previous comment I want to just say that the The (If you look at the crater regressions there are maybe a dozen due to |
@Veykril Could you weigh in on the technical feasibility of both detecting this new class of |
Thanks for the feedback everyone. I was the one primarily pushing to avoid unnecessary breakage in this case. That is conditional on the feasibility and maintainability of a solution, and @compiler-errors makes a compelling case that it would be neither. No one thought we were proposing changes to MIR borrow check! @BoxyUwU may be able to confirm/deny if that is what she had in mind. I think we've heard enough to convince me that this is not a path worth going down now. It sounds like the lint mitigation was based on some faulty assumptions. Upon reflection, I agree with @BoxyUwU that we should have consensus within the lang team to walk this safe/unsafe difference back over an edition before releasing it. Deciding this is infeasible or a maintenance burden, or being convinced (as others seem to be) that the breakage is very minimal, would be enough as well. In this particular case I think the breakage is unfortunate but workable, so I am okay to move forward. I don't speak for the rest of the lang team in this comment, but no one spoke up about wanting to avoid the breakage as much as I did. @Manishearth You mentioned an FCW; my understanding has been that an FCW in this case is infeasible. I do agree with your assertion that this would make unsafe reviews harder.
@compiler-errors That wasn't my understanding from the discussion, which included a 6yr old version of diesel. Was that relying on this as an unstable feature until recently, or what am I missing? |
@tmandry: I guess I was a bit misled when I said we only recently started to allow this behavior. What I should have said is that we only recently made this behavior more relaxed, beginning with #113262. That PR made casts where nothing but the lifetime was changing work, i.e. But there are rare cases where this behavior coincidentally worked when there was more than one cast chained together and, between the two casts, more than the lifetime changed in the casted type. That old diesel one is an example. Boxy and I worked out a few examples where this was allowed even pre-1.75, which is when that linked PR landed. This is the moral equivalent of the diesel example:
And this is the moral equivalent of the brainfuck one:
Both of these examples worked before 1.75 because they were emitting something that was a non-trivial pointer cast. But they still are definitely violations of the vtable validity here. I will note that we're still really funny with what we allow when doing raw pointer casts of wide pointers. For example, this errors today:
i.e. it was not fixed by #113262. You could argue that @BoxyUwU's PR here is making the language more consistent by always enforcing the correct lifetimes in wide pointers along these lines :) |
given rust-analyzer is still completely oblivious to lifetimes in our IRs I'd say very difficult. Though either way that feature is only meant to show what would be erroring if the unsafe block was missing, which reading from this, the |
On the substance here, we'll obviously take this back up on Wednesday. My estimate is that we're rather likely to readopt the original plan and to "stay the course" on that. We very nearly went that way last Wednesday. Speaking for myself, I think what's being done here, in terms of making PRs to the affected public projects and working out a diagnostic that guides people in the right direction, is what we need to do, and combined with the soundness arguments we've been making, is sufficient to justify and support this change. I appreciate -- and I know we all on lang appreciate -- the work @BoxyUwU did to make those PRs and is doing to put this all together. It's a nice additional benefit that we apparently only allowed the most likely kinds of this semi-recently and that this probably does make the language more consistent in the way that CE mentioned. |
By "diagnostic" here you refer to the hard error that shows in code this breaks? @BoxyUwU @compiler-errors how practical is it to give a dedicated diagnostic for this case? |
That's right. |
Maybe I'm missing something, and part of the issue with this thread is that it's not always clear what changes specifically are being talked about. But for the proposal on the table that is:
It should be instead possible to forbid those things entirely in all If the FCW for forbidding such casts is impossible to write, then I ask: how are you going to forbid the operation completely in safe Rust as is the lang team's proposal?
Could you explicitly write down the "original plan"? This thread is rather self referential, the linked comment doesn't specify the plan, it refers to a previous comment which also isn't fully clear. |
Because it amounts to literally running all of borrowck twice. It's not impossible to write, but it's prohibitively difficult to do and would require quite a lot of refactoring. Straight up denying it amounts to a few lines of change, as implemented in the PR today. |
It's what I went on to say. We do the breaking change along with a diagnostic on the error that tries to point people in the right direction and we make PRs to affected projects. |
@traviscross there are multiple possible breaking changes in this thread, and multiple lints. I don't actually think anyone has precisely defined what is happening so far all in one place, instead referencing previous comments with wording like "the breaking change". |
The breaking change is that casting |
Hmm, from what I recall that specific example is possible to lint against in a late lint pass. There's a lot of talk of MIR borrowck when talking about feasibility; maybe I'm missing something, but from clippy experience that seems implementable? |
It's not just about that specific example, for the general case you need borrowck to check things like |
I mean, yes, I understand it's not a simple AST match, but ultimately those are both An FCW with a lot of false positives or false negatives is potentially still worth trying, though, if they don't end up triggering often. |
@Manishearth The only part of the compiler that actually knows lifetimes is borrowck. So nothing outside borrowck can implement this lint as there's no way to actually get hold of the two lifetimes that have to outlive each other. Inside borrowck, the way the hard error is implemented is to add this new constraint to the big sea of constraints that is gathered up during borrowck. Borrowck never explicitly materializes concrete lifetimes, all it does is gather constraints and then check if the constraint system is satisfiable. That makes it impossible to lint since when borrowck fails, it is entirely unclear whether it would have succeeded without that extra constraint. |
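To make the "gather constraints, then check satisfiability" argument concrete, here is a deliberately tiny model. It is not how rustc's NLL is implemented; every name in it is hypothetical, and regions are reduced to integer ids. It only illustrates why deciding whether one extra constraint is the cause of a failure means solving the system twice, once with and once without that constraint.

```rust
use std::collections::HashSet;

/// A constraint `sup: sub` ("sup outlives sub") between region ids.
type Outlives = (usize, usize);

/// Transitive closure, computed by joining edges until nothing new appears.
fn closure(edges: &[Outlives]) -> HashSet<Outlives> {
    let mut set: HashSet<Outlives> = edges.iter().copied().collect();
    loop {
        let mut added = Vec::new();
        for &(a, b) in &set {
            for &(c, d) in &set {
                if b == c && !set.contains(&(a, d)) {
                    added.push((a, d));
                }
            }
        }
        if added.is_empty() {
            return set;
        }
        set.extend(added);
    }
}

/// Toy check: ids below `n_named` stand for lifetimes named in the signature,
/// all other ids are inference variables. The system is accepted only if every
/// relation it implies between two named lifetimes was declared up front.
fn satisfiable(n_named: usize, declared: &[Outlives], constraints: &[Outlives]) -> bool {
    let implied = closure(constraints);
    let allowed = closure(declared);
    implied.iter().all(|&(sup, sub)| {
        sup >= n_named || sub >= n_named || sup == sub || allowed.contains(&(sup, sub))
    })
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Ok,
    WouldLint,
    AlreadyBroken,
}

/// What a lint would need to know: does the system fail *because of* `extra`?
/// Answering that takes a second full solve without the extra constraint.
fn classify(n_named: usize, declared: &[Outlives], base: &[Outlives], extra: Outlives) -> Verdict {
    let mut with_extra = base.to_vec();
    with_extra.push(extra);
    if satisfiable(n_named, declared, &with_extra) {
        Verdict::Ok
    } else if satisfiable(n_named, declared, base) {
        Verdict::WouldLint
    } else {
        Verdict::AlreadyBroken
    }
}
```

In this model, with id 0 standing for `'a`, id 1 for `'static`, and id 2 for an inference variable, adding the constraint `?2: 'static` on top of `'a: ?2` implies `'a: 'static`, which was never declared. The hard error only needs the first solve; the lint needs both, which is the "running all of borrowck twice" cost described earlier in the thread.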
Ah, I see. I was under the impression some of this information was available after MIR borrowck, but looking through clippy we don't actually do that. Fair enough. (I may be operating on a model of things from before MIR borrowck) |
We talked about this on the lang call today, and we'd like to unblock this in favor of proceeding with the original plan, which is that we plan to accept the breaking change as originally proposed in this PR subject to PRs being made to affected projects (already done) and an attempt being made to provide a diagnostic along with the hard error that will point people in the right direction on this as best as possible. When this PR is ready, along those lines, please renominate it for us, and we'll propose lang + types FCP. @rustbot labels -I-lang-nominated +I-lang-radar |
Fixes #136702
r? @ghost