Synchronous WritableStream Writer to avoid heap allocations from Promises returned by WritableStreamDefaultWriter's write #1276
I don't think it's the promises that are causing you to run out of memory, but the large number of pending writes holding onto buffers. So awaiting the results of your writes (or waiting […])

Synchronous blocking operations are harmful in a browser context, so we would prefer not to add them to the standard.
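The suggestion to await the results of writes, so that pending writes don't pile up and hold onto buffers, can be sketched as follows. This is a minimal sketch, not from the thread: it assumes an environment where `WritableStream` is a global (browsers, or Node.js 18+), and `writeWithBackpressure` is an illustrative name, not part of any API.

```typescript
// Awaiting each write bounds memory: at most one chunk is in flight,
// and the sink drains before the next chunk is handed over.
async function writeWithBackpressure(
	stream: WritableStream<Uint8Array>,
	chunks: Iterable<Uint8Array>,
): Promise<number> {
	const writer = stream.getWriter()
	let total = 0
	for (const chunk of chunks) {
		await writer.ready        // wait until the sink signals it can accept more
		await writer.write(chunk) // resolves once the sink has processed the chunk
		total += chunk.length
	}
	await writer.close()
	return total
}
```

The trade-off is exactly the one debated in this thread: each awaited `write` allocates a Promise, but it prevents an unbounded backlog of buffered chunks.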
I can confirm that the promises have an effect. I went ahead and implemented a synchronous buffered write using […]. Here is an article that supports the claim that Promises cause excessive heap allocations.

Edit: I was able to achieve the same performance as the Node.js writer's `write` using the synchronous buffered write. When I first wrote this post, I was excessively calling `Uint8Array.slice`.
I agree that synchronous blocking operations are harmful on the main thread in a Web UI. However, blocking synchronous operations are the most time-efficient approach for a CPU-bound export process on a server/dev machine or in a Web Worker. I'm looking for a fire-and-forget approach, and it seems this has been achieved with the synchronous buffered write function. In case you are interested, here is the source code:

```ts
import { queue_ } from '@ctx-core/queue'
import { createWriteStream } from 'fs'
import { Writable } from 'stream'

export const stream__promise_ = web__stream__promise_

async function web__stream__promise_(
	path:string,
	fn:(write:(content:string)=>void)=>any
):Promise<void> {
	console.info(performance.now(), path, 'start!')
	const writer = Writable.toWeb(
		createWriteStream(path)
	).getWriter()
	const buffer_length = 512_000
	let buffer = new ArrayBuffer(buffer_length)
	let view = new Uint8Array(buffer)
	const encoder = new TextEncoder()
	let byte_idx = 0
	let total__byte_idx = 0
	const queue = queue_(1)
	function write(content:string) {
		if (!content) return
		const encoded = encoder.encode(content)
		const encoded_length = encoded.length
		// flush the buffer to the queue when the next chunk would overflow it
		if (byte_idx + encoded_length > buffer_length) {
			queue__add(view, byte_idx, total__byte_idx)
			buffer = new ArrayBuffer(buffer_length)
			view = new Uint8Array(buffer)
			byte_idx = 0
		}
		view.set(encoded, byte_idx)
		byte_idx += encoded_length
		total__byte_idx += encoded_length
	}
	fn(write)
	// write the remaining bytes in buffer
	queue__add(view, byte_idx, total__byte_idx)
	await queue.close()
	console.info(performance.now(), path, 'done!')
	function queue__add(
		view:Uint8Array,
		byte_idx:number,
		total__byte_idx:number
	) {
		queue.add(()=>
			writer.write(view.slice(0, byte_idx))
				.then(()=>
					console.info(performance.now(), path, total__byte_idx, 'bytes written'))
		).then()
	}
}

async function node__stream__promise_(
	path:string,
	fn:(writer:(content:string)=>void)=>any
):Promise<void> {
	console.info(performance.now(), path, 'start!')
	const writer = createWriteStream(path)
	function write(content:string) {
		if (!content) return
		writer.write(content)
	}
	fn(write)
	await new Promise(res=>
		writer.end(()=>{
			console.info(performance.now(), path, 'done!')
			res(null)
		}))
}
```
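As a side note on the `Uint8Array.slice` cost mentioned earlier: the same fixed-buffer technique can also avoid the intermediate array that `encoder.encode()` allocates on every call by using `TextEncoder.encodeInto`, which encodes directly into the preallocated view. A minimal standalone sketch; the `buffered_writer_` name and `flush` callback are illustrative, and it assumes every chunk is smaller than the buffer (a UTF-16 code unit expands to at most 3 UTF-8 bytes):

```typescript
// Buffers string writes into a fixed Uint8Array; flush() receives each
// filled region. No per-write Uint8Array is allocated.
function buffered_writer_(
	buffer_length: number,
	flush: (bytes: Uint8Array) => void,
) {
	const encoder = new TextEncoder()
	let view = new Uint8Array(buffer_length)
	let byte_idx = 0
	function write(content: string) {
		if (!content) return
		// worst-case UTF-8 size of the chunk is 3 bytes per UTF-16 code unit
		if (byte_idx > 0 && byte_idx + content.length * 3 > buffer_length) {
			flush(view.subarray(0, byte_idx))
			view = new Uint8Array(buffer_length) // fresh buffer; flushed bytes stay valid
			byte_idx = 0
		}
		const { written } = encoder.encodeInto(content, view.subarray(byte_idx))
		byte_idx += written
	}
	function close() {
		if (byte_idx > 0) flush(view.subarray(0, byte_idx))
	}
	return { write, close }
}
```

The design choice mirrors the code above: encoding work stays synchronous, and only the (much rarer) flushes need to touch an async sink.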
I have a synchronous export process for a large xlsx file where many writes occur. The export previously used Node.js's `Writable`, whose `write` method is synchronous. I converted the streaming over to `WritableStream`. This resulted in a `FATAL ERROR: Reached heap limit Allocation failed`.

Since Promises create heap allocations and synchronous programs do not benefit from Promises, it would be more memory efficient to have a `WritableStreamSynchronousWriter` or a `WritableStreamBufferedWriter`.

I believe a `WritableStreamBufferedWriter` could be a library, which would presumably delay heap limit errors with synchronous programs. `WritableStreamSynchronousWriter` does not seem possible without adding support to the Web Streams API. A synchronous function could also run into memory limit issues, or need to be drained at a later time than an async function, if the reads are faster than the writes.

Is this use case something to be considered for the Web Streams API?
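A userland `WritableStreamBufferedWriter` along the lines proposed could look like the following sketch. The class name and API are hypothetical (nothing like this exists in the standard); it assumes `WritableStream` is a global (browsers, Node.js 18+) and that each `write` is smaller than the buffer size:

```typescript
// write() is synchronous and allocates no Promise per call; only the
// periodic flushes touch the Promise-based underlying writer, chained
// so at most one flush sequence is pending.
class WritableStreamBufferedWriter {
	private writer: WritableStreamDefaultWriter<Uint8Array>
	private view: Uint8Array
	private byte_idx = 0
	private pending: Promise<void> = Promise.resolve()
	constructor(stream: WritableStream<Uint8Array>, private size = 512_000) {
		this.writer = stream.getWriter()
		this.view = new Uint8Array(size)
	}
	write(bytes: Uint8Array): void { // synchronous fire-and-forget
		if (this.byte_idx + bytes.length > this.size) this.flush()
		this.view.set(bytes, this.byte_idx)
		this.byte_idx += bytes.length
	}
	private flush(): void {
		if (this.byte_idx === 0) return
		const chunk = this.view.slice(0, this.byte_idx) // copy: buffer is reused
		this.byte_idx = 0
		this.pending = this.pending.then(() => this.writer.write(chunk))
	}
	async close(): Promise<void> {
		this.flush()
		await this.pending
		await this.writer.close()
	}
}
```

As the issue notes, this only delays the heap-limit problem rather than eliminating it: if the synchronous producer outruns the sink, the chained flush writes still accumulate buffered chunks until `close()` is awaited.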