More floating point codegen #1599

Merged
merged 2 commits into from
Feb 11, 2025

Conversation

ltratt
Contributor

@ltratt ltratt commented Feb 10, 2025

We previously didn't handle select with floating point values -- which I've now seen in the wild! Please check the codegen properly for this: it made my head hurt.

Conditional moves for xmm registers confused me, and since this is not
yet something we commonly see (I've seen it once in real code), I'm not
too worried about making it perfectly fast yet.

The first of these caught a genuine bug (we didn't handle `select`
with floats properly), so it's worth running them more often.
match inst.trueval(self.m).bitw(self.m) {
    32 => {
        dynasm!(self.asm
            ; bt Rd(cond_reg.code()), 0
Contributor

Would the fcmov instruction help here? That would bring this into parity with the integer version.

Contributor Author

You tell me. I ended up baffled by this; the documentation on this stuff is much sparser -- and frankly, much worse -- than for general purpose registers. Personally I'd be happy to get something we're confident is correct in and optimise it later.

Contributor

Now that I look at it, fcmov operates on the old-fashioned x87 floating point stack and not the xmm registers, so we can't (and shouldn't) use it.

Your codegen looks correct to me. AI tells me there is no xmm equivalent to fcmov and gave me a technique similar to what you have done here.
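
For readers following along, here is a minimal sketch of two common ways to select between floating point values when no conditional move exists for xmm registers. It is not the code in this PR. It assumes the condition lives in bit 0 of an integer value, as the `bt ..., 0` in the diff suggests, and the function names `select_f64_branch` and `select_f64_mask` are hypothetical.

```rust
/// Hypothetical helper: pick `trueval` when bit 0 of `cond` is set, otherwise
/// `falseval`, using a branch. This mirrors the "test a bit, then jump over a
/// move" shape that a code generator can emit for floats.
fn select_f64_branch(cond: u64, trueval: f64, falseval: f64) -> f64 {
    if cond & 1 == 1 {
        trueval
    } else {
        falseval
    }
}

/// Hypothetical helper: the same selection without a branch. Bit 0 of `cond`
/// is widened into an all-ones or all-zeros mask, which then blends the raw
/// bit patterns of the two floats (the and / and-not / or idiom).
fn select_f64_mask(cond: u64, trueval: f64, falseval: f64) -> f64 {
    // 1 becomes 0xFFFF_FFFF_FFFF_FFFF, 0 stays 0.
    let mask = (cond & 1).wrapping_neg();
    f64::from_bits((trueval.to_bits() & mask) | (falseval.to_bits() & !mask))
}
```

Both helpers look only at bit 0, so `select_f64_mask(2, a, b)` returns `b`. The mask version works on raw bit patterns, so it passes NaNs and signed zeros through unchanged.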

Contributor Author

Sounds like we can merge this then?

@vext01 vext01 added this pull request to the merge queue Feb 11, 2025
Merged via the queue into ykjit:master with commit 9ede83a Feb 11, 2025
2 checks passed
@ltratt ltratt deleted the more_floating_point_codegen branch February 14, 2025 17:02