Large inline images use a lot of memory #10075
This does seem to have to do with inline (base64-encoded) images specifically. I tried the same file but with a linked image, and it only used 30 MB.
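As a back-of-the-envelope check (assuming the image is embedded as a standard base64 `data:` URI), base64 inflates the payload by a factor of 4/3 plus padding, which is roughly why a 7.6 MB JPEG yields a ~10 MB HTML file:

```haskell
-- Length of the base64 encoding of an n-byte payload: every 3 input bytes
-- become 4 output characters, padded up to a multiple of 4.
base64Len :: Int -> Int
base64Len n = 4 * ((n + 2) `div` 3)

main :: IO ()
main = print (base64Len 7600000)  -- 7.6 MB of JPEG -> 10133336 characters
```

All of those characters end up inside a single attribute value that the reader must then process as one giant string.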
Odd that even
If we do I'd need to do some profiling to track this down further.
Profiling, first three entries for
It seems that We could also think about whether we could avoid calling it.
Yeah, here's the problem. The URI parser parses multiple segments:

```haskell
segment :: URIParser String
segment =
    do { ps <- many pchar
       ; return $ concat ps
       }
```

This parses many small strings, one for each `pchar`, and then concatenates them. This should be pretty easy to optimize. I'm surprised nobody has run into this before, as this is a widely used package! For reference, here is `pchar`:
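To see where the allocation goes, here is a simplified model of that shape (not the real parser; `isAlphaNum` stands in for the actual `pchar` predicate): `many pchar` produces one single-character `String` per accepted character, and `concat` then copies them all into a fresh list. For a multi-megabyte `data:` URI that means millions of tiny intermediate strings; taking the span directly avoids them entirely:

```haskell
import Data.Char (isAlphaNum)

-- Model of `many pchar`: one single-character String per accepted character.
manyPchar :: String -> [String]
manyPchar = map (: []) . takeWhile isAlphaNum

-- What `segment` does today: build the pieces, then concatenate them.
segmentConcat :: String -> String
segmentConcat = concat . manyPchar

-- A cheaper shape: take the matching span directly, no per-char Strings.
segmentSpan :: String -> String
segmentSpan = takeWhile isAlphaNum

main :: IO ()
main = print (segmentConcat "abc123/rest" == segmentSpan "abc123/rest")
```

The same idea carries over to the real parser: running a `Char`-level parser under `many` yields the `String` directly, with no per-chunk `concat` step.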
I've made the patch to parseURI, so you'll notice a difference once a new version of network-uri is released; but it's not going to make a HUGE difference, because that function is still fairly inefficient. We could think about trying a different URI library, maybe uri-bytestring.
Thanks @jgm, really nice explanation.
Here is an example HTML file (10Mb) with one embedded JPEG image (7.6Mb).

Memory usage:

- html to md: uses 2985M
- md to docx: uses 3435M
- html to docx: uses 4350M

Test examples:

OS: macOS 14.14.1, m3/arm