Channel3.getSample computes non-integer typed array index, leading to recompilation loop in Firefox #216
Comments
Oh wow! Super rad that the article / project reached you guys at Mozilla! 😄 Ahhh! So to be honest, the TypeScript compiled version did have an issue with Sound, and I knew it was because I wasn't sanitizing numbers (I think that's the right term?) correctly, but the JS version wasn't meant to be playable as a user, just to run the games, so I figured it would be fine for now. But I'll definitely go through and fix this! I'll also be sure to add this issue to the article! I knew something was funny when I saw Firefox performance, and I was careful not to call out any engine being "better" because of stuff like this. Thanks for the bug! I'll try to fix this and I'll update the article now. Thanks! Edit: article updated in multiple places to link the bug, and left an "Edit log" at the bottom of the article 😄
Not sure if this is a 100% correct fix, but coercing the load/store offsets to ints significantly improves the benchmark results in Firefox version 64 for me (running the tobu tobu girl game).
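A minimal sketch of what "coercing the load/store offsets to ints" could look like in plain JS. The `memory` array and the bodies of `load` and `store` here are assumptions for illustration, not the actual code from wasmBoy's `wasmMock.js`:

```javascript
// Hypothetical stand-in for the emulator's byte memory (size is an assumption).
const memory = new Uint8Array(0x10000);

// `| 0` converts the offset to an int32, so the typed array is always
// indexed with an integer even if the caller passes something like 3.5.
function load(offset) {
  return memory[offset | 0];
}

function store(offset, value) {
  memory[offset | 0] = value;
}

store(7 / 2, 42); // 3.5 is truncated to 3 before the write
const viaFraction = load(3.5); // reads index 3
const viaInteger = load(3); // reads the same byte
```

Coercing inside `load`/`store` hides the symptom for every caller; the division that produces the fractional value still looks worth fixing at the call site as well.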
@cosinusoidally Yeah totally! Thanks for looking into this! Currently, I'm not at my home laptop, so I'll probably open up a PR later tonight to do all the coercing and stuff using the "portable" functions I have to handle all of the type differences between running with AssemblyScript and running with JS. |
@torch2424 well all credit to @anba for tracking down the issue. I just spotted his comment here https://bugzilla.mozilla.org/show_bug.cgi?id=1515620#c3 and thought "hmm, I wonder if I can just tweak the generated JS to avoid this performance cliff" |
Btw, the recommended portable way of writing this with AS is let positionIndexToAdd = i32(Channel3.waveTablePosition / 2); which is a nop in WASM and a |
So I went ahead and opened #218, which as you can see definitely fixes the Firefox JS performance! But it still seems like something is off after we closure compile it. In the meantime, I'll go ahead and push up the fix for anyone who happens to run the benchmark. 😄
relates to #216 This fixes the sound inconsistencies between Wasm and JS, as well as fixes the JS performance of Firefox. Also, snuck in some changes for faster travis and deploying, for less redundant building of the lib. **Example of new firefox results for JS core** <img width="827" alt="screen shot 2018-12-21 at 10 48 00 pm" src="https://user-images.githubusercontent.com/1448289/50371655-0503ef80-0574-11e9-8932-efb0bed39fc6.png"> **Old Travis Build Time** <img width="961" alt="screen shot 2018-12-21 at 11 37 43 pm" src="https://user-images.githubusercontent.com/1448289/50371952-8742e280-0579-11e9-9fae-1ed8e7a57cc3.png"> **New Travis Build Time** <img width="960" alt="screen shot 2018-12-21 at 11 37 37 pm" src="https://user-images.githubusercontent.com/1448289/50371953-8f028700-0579-11e9-8756-a89de78e7eaf.png">
\o/
Hmm, that's strange. The Closure-compiled version is definitely faster than the non-compiled one for me. (Tested under Windows and Linux, with Release/Beta/Nightly versions of Firefox.) I did a bit of additional profiling (with the Firefox profiler addon) to see if there are any other easy optimization targets for the JS/TypeScript version. And the profile showed that The profile also showed that In the TypeScript code At one point during the benchmark run, So, it could be interesting to see if properly converting the value back to an int32 has any effects in Safari or Chrome, because when I ran a simple standalone version of that code in the shell, the negative effects of using double instead of int32 were even worse in JavaScriptCore and V8. I guess these two results were the most interesting ones from the profile. Oh, and there also seems to be an off-by-one error in https://github.com/torch2424/wasmBoy/blob/806cd8236e16f4a8bbb51c158218ec75a8a0c8de/demo/benchmark/loadrom.js#L60, which I noticed when checking whether there are any frequent bail-outs from the optimizing compiler (which I didn't see).
for (let i = 0; i <= coreObject.core.byteMemory.length; i++) {
  coreObject.core.byteMemory[i] = 0;
}
It would be better to replace this with: coreObject.core.byteMemory.fill(0); and yeah
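To illustrate the off-by-one and the suggested replacement, here is a small standalone sketch. The 4-byte arrays are hypothetical stand-ins for `coreObject.core.byteMemory`:

```javascript
// Hypothetical stand-in for coreObject.core.byteMemory.
const byteMemory = new Uint8Array(4).fill(7);

// With `<=` the loop also visits index 4, one past the last element.
// On a typed array that out-of-bounds store is silently ignored.
for (let i = 0; i <= byteMemory.length; i++) {
  byteMemory[i] = 0;
}
const lengthAfterLoop = byteMemory.length;
const pastEnd = byteMemory[byteMemory.length];

// fill(0) clears exactly the valid range and needs no index arithmetic.
const cleared = new Uint8Array(4).fill(7);
cleared.fill(0);
const allZero = cleared.every((b) => b === 0);
```

The extra iteration does not corrupt anything here, but it is still an out-of-bounds access on every run, and `fill(0)` sidesteps the question entirely.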
Huh, that is strange! I ran the benchmark, but maybe forgot to recompile the closure version that night? I'll try again once I get the chance 😄
Wow! That's such a deep dive thank you so much! I've been fighting with performance on the emulator itself for a while, so I'm super stoked to see this 🎉 I'll definitely go ahead and implement all of these things once I get the chance. Thank you again! I'm sure that took a lot of digging to find, and I very much appreciate it! Thanks for the tip! I'll definitely go through and do a bunch of refactoring to use |
@anba My apologies for falling so far behind on this. If you noticed I've had this issue pinned all this time, and left a lot of references to it in the article. Just a quick status update, Regarding the closure compiler thing, yeah it was definitely my fault haha! 🙃 And I'm currently taking a break from my roadmap: #197 , so I'll try and go through and implement the lingering issues here 😄 Thanks again for all the help, it is very much appreciated! |
So everything here is implemented now. Going to keep this open a little while longer in case there is any more feedback, and also as a reminder to update the article.
https://bugzilla.mozilla.org/show_bug.cgi?id=1515620 was filed yesterday against SpiderMonkey, based on the benchmark results at https://medium.com/@torch2424/webassembly-is-fast-a-real-world-benchmark-of-webassembly-vs-es6-d85a23f8e193. When investigating why SpiderMonkey was so bad at this particular benchmark, it turned out that the abysmal performance is caused by a recompilation loop in SpiderMonkey's optimizing compiler when inlining typed array accesses. This `load` function https://github.com/torch2424/wasmBoy/blob/693b160ecd9ae084b4459a08a930b2ef9108b255/core/portable/wasmMock.js#L18-L20 is currently called with non-integer inputs, like `59853.5`, where the fractional part `.5` is caused by this division (*): https://github.com/torch2424/wasmBoy/blob/693b160ecd9ae084b4459a08a930b2ef9108b255/core/sound/channel3.ts#L185
While the recompilation loop is certainly an issue which should be fixed in SpiderMonkey, the miscomputed typed array index looks like a bug in wasmBoy to me, too.
(*) I haven't checked if there are additional callers to `load` which also pass in non-integer indices.
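Independent of the engine's performance behaviour, a fractional typed array index is a correctness problem in its own right. A rough sketch follows; the base address `59850` and the odd `waveTablePosition` are made-up values chosen only to reproduce an index of `59853.5`, not numbers taken from wasmBoy:

```javascript
// Hypothetical memory, pre-filled with 1 so that real reads are visible.
const memory = new Uint8Array(0x10000).fill(1);

// Hypothetical values: dividing an odd position by 2 yields a .5 fraction.
const waveTablePosition = 7;
const index = 59850 + waveTablePosition / 2; // 59853.5

// A non-integer key is not a valid typed array element:
// the read yields undefined and the write is silently dropped.
const loaded = memory[index];
memory[index] = 99;

// Truncating to an integer first addresses a real byte.
const truncated = memory[Math.trunc(index)];
```

So a caller that expects a sample byte gets `undefined` instead, which is consistent with the sound differences between the Wasm and JS builds mentioned in the thread.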