Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop tracing when we start unrolling. #1603

Merged
merged 1 commit into from
Feb 13, 2025
Merged

Conversation

ltratt
Copy link
Contributor

@ltratt ltratt commented Feb 13, 2025

Though it may not be obvious that this is what it's doing, this commit stops us tracing undesirable things, notably when we start unrolling inner loops.

Consider this example program:

for x in range(...):
  if x > ...:
    for y in range(...):

The outer loop can get hot before the inner loop, so we start tracing the outer loop -- which is fine. But when we trace, we may go around the inner loop an unbounded number of times, endlessly unrolling the inner loop. This can lead to gigantic, and "unstable" traces -- because we've unrolled a specific number of iterations, there's a very high chance that we'll see a different number of iterations next time and encounter a guard on the first "iteration" of the gigantic-loop. This then leads to us creating large numbers of essentially identical side-traces, hugely bloating the system. Using havlak as an example this (combined with a couple of other things) leads to ~10x more traces being created than we would expect.

To avoid this happening, we have to track what Locations we've seen while tracing. There are some potentially cunning things we could do here if/when performance becomes an issue, but for now we force every Location we see when tracing to become a HotTracing. That gives us a stable ID (the HotLocation's Mutex) which we can use as a reliable proxy for "have we seen this Location before?"

Though it may not be obvious that this is what it's doing, this commit
stops us tracing undesirable things, notably when we start unrolling
inner loops.

Consider this example program:

```python
for x in range(...):
  if x > ...:
    for y in range(...):
```

The outer loop can get hot before the inner loop, so we start tracing
the outer loop -- which is fine. But when we trace, we may go around the
inner loop an unbounded number of times, endlessly unrolling the inner
loop. This can lead to gigantic, and "unstable" traces -- because we've
unrolled a specific number of iterations, there's a very high chance
that we'll see a different number of iterations next time and encounter
a guard on the first "iteration" of the gigantic-loop. This then leads
to us creating large numbers of essentially identical side-traces,
hugely bloating the system. Using havlak as an example this (combined
with a couple of other things) leads to ~10x more traces being created
than we would expect.

To avoid this happening, we have to track what `Location`s we've seen
while tracing. There are some potentially cunning things we could do
here if/when performance becomes an issue, but for now we force every
`Location` we see when tracing to become a `HotTracing`. That gives us a
stable ID (the `HotLocation`'s `Mutex`) which we can use as a reliable
proxy for "have we seen this `Location` before?"
4 => unsafe { ptr::write::<u32>(temp as *mut u32, jitval as u32) },
8 => unsafe { ptr::write::<u64>(temp as *mut u64, jitval) },
16 => {
// FIXME: This case is clearly not safe in general: it just so
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this isn't really what we are doing here in this PR, but a casual reader will think "if the largest object we can handle is 64-bits, then why is this arm reachable". Did we get to the bottom of why this happens?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We didn't -- but, regrettably, the different paths we now trace hit this case. It sucks, and we need work out why we're being told to sometimes use a 16 byte register.

Suggestion: we let this go for now (as much as I dislike it) and I raise an issue as soon as the PR is merged to remind us to work out what's going on here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree. Perhaps raise an issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I suggest I do so if/when this merges, as I can then reference the commit and line number.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's extra weird is that the jit value reports a size of 64-bit. I wonder how that mismatch happened.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's something we should look at pretty pronto: it only crops up in a small number of traces, but it does indeed crop up.

@vext01
Copy link
Contributor

vext01 commented Feb 13, 2025

Trying to understand the high-level goal here. So you keep track of which locations you've already seen while tracing and if you see a one a second time, but which isn't the location that started tracing, we have unrolled an inner loop, and we abort? Is that the gist of this?

@ltratt
Copy link
Contributor Author

ltratt commented Feb 13, 2025

Correct!

@ptersilie
Copy link
Contributor

Do we not outline functions with loops in it already? Does this mean we observe this only on the main interpreter loop?

@ltratt
Copy link
Contributor Author

ltratt commented Feb 13, 2025

"Don't outline functions with loops" is an LLVM-level thing (i.e. "don't outline C functions with loops in"). This PR is working at the upper-language (e.g. Lua) level.

// having failed to trace properly.
return TransitionControlPoint::AbortTracing;
}
if !seen_hls.insert(hl_ptr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two places you insert into the "seen" set. What is the difference between them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's due to the two representations of Counting: one without a HotLocation, one with.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've forgotten how that part of the system works, so I'll have to trust you on this one!

@ptersilie
Copy link
Contributor

Oh right. Got you.

else {
panic!()
};
if frameaddr != *tracing_frameaddr {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this bit necessary, or something unrelated we are sneaking in?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point: it's snuck in, because it was missing before, and I happened to notice it while working in this code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[i.e. it should always have been present]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to keep it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should always have been in, and it directly affects this code, so yes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, we raced. All ok.

@vext01
Copy link
Contributor

vext01 commented Feb 13, 2025

This is at the edge of my understanding of mt.rs, but seems ok to me.

Perhaps add some more info to the commit message+PR description about why we are tracking what locations we've seen, as that wasn't immediately obvious to me (it allows us to detect back edges that aren't for the loop we are tracing).

@ltratt
Copy link
Contributor Author

ltratt commented Feb 13, 2025

Maybe I'm too close to this, but the commit message feels to me like it explains this pretty thoroughly.

@vext01
Copy link
Contributor

vext01 commented Feb 13, 2025

I leave it up to you.

@ltratt
Copy link
Contributor Author

ltratt commented Feb 13, 2025

I think it's good to go myself. It could always be improved, I suppose, but I feel it accurately describes the problem and the fix. And this is the first of (probably) a number of fixes we'll be making in this regard, so now we've all got some context as to the general problem, I think future PRs will be easier to grok.

@vext01 vext01 added this pull request to the merge queue Feb 13, 2025
Merged via the queue into ykjit:master with commit 690567e Feb 13, 2025
2 checks passed
@ltratt ltratt deleted the spot_inner_loops branch February 14, 2025 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants