
Readable stream from chunked transfer encoding seems to miss some chunks #20

Open
bmatthieu3 opened this issue Apr 26, 2023 · 5 comments

@bmatthieu3

Hi @MattiasBuelens ,

We are going through a proxy to fetch files from a server that does not set CORS headers. Requesting the file through our proxy gives these headers:

> GET /cgi/JSONProxy?url=https%3A%2F%2Fcdsarc.cds.unistra.fr%2Fsaadavizier%2Fdownload%3Foid%3D864974746620526595 HTTP/1.1
> Host: alasky.cds.unistra.fr
> Accept-Encoding: deflate, gzip
> Accept: */*
> Accept-Language: fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7
> Connection: keep-alive
> DNT: 1
> Origin: http://localhost:8080
> Referer: http://localhost:8080/
> Sec-Fetch-Dest: empty
> Sec-Fetch-Mode: cors
> Sec-Fetch-Site: cross-site
> User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36
> sec-ch-ua: "Chromium";v="112", "Google Chrome";v="112", "Not:A-Brand";v="99"
> sec-ch-ua-mobile: ?0
> sec-ch-ua-platform: "macOS"
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Wed, 26 Apr 2023 09:09:50 GMT
< Server: Apache/2.4.25 (Debian)
< Vary: Accept-Encoding
< Content-Encoding: gzip
< Access-Control-Allow-Headers: *
< Access-Control-Allow-Methods: GET, OPTIONS
< Access-Control-Allow-Origin: *
< Keep-Alive: timeout=5, max=100
< Connection: Keep-Alive
< Transfer-Encoding: chunked
< Content-Type: application/fits

I see that the transfer encoding is "chunked"; maybe that is the reason why I am seeing this:
[screenshot: corrupted image render]

If I download the file manually and then drag-and-drop/open it in my application, I get the correct image:
[screenshot: correctly rendered image]

It is as if some chunks are missing while others are correctly received and parsed.

The code behind the two methods is similar:

  • I fetch the URL: in the first case it is the one using the proxy; in the other, it is a URL computed from the locally opened File object.
  • Then the code is the same: I get the ReadableStream from the response body and convert it with wasm-streams into an async reader.
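The read-side logic of the second step can be sketched with a plain `std::io::Read` stand-in (a `Cursor` over in-memory bytes instead of the wasm-streams async reader; `drain_reader` is an illustrative name, not part of the application). The loop shape is the same as the async version, and it makes the symptom concrete: if chunks were silently dropped upstream, the reassembled payload would come up short.

```rust
use std::io::{Cursor, Read};

// Stand-in for the reader obtained from wasm-streams: read fixed-size
// slices until EOF and reassemble the full payload.
fn drain_reader<R: Read>(mut reader: R) -> std::io::Result<Vec<u8>> {
    let mut out = Vec::new();
    let mut buf = [0u8; 8]; // small buffer to force several reads
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // EOF
        }
        out.extend_from_slice(&buf[..n]);
    }
    Ok(out)
}

fn main() {
    let payload = b"chunked transfer encoding test payload".to_vec();
    let reassembled = drain_reader(Cursor::new(&payload[..])).unwrap();
    // With nothing lost in transit, the reassembled bytes match exactly.
    assert_eq!(reassembled, payload);
}
```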

Thank you very much; maybe you have an idea of what may cause this issue.

@bmatthieu3 bmatthieu3 changed the title Readable stream from chunked transfer encoding seem to miss some chunks Readable stream from chunked transfer encoding seems to miss some chunks Apr 26, 2023
@MattiasBuelens
Owner

Without seeing any code, there's very little I can do to investigate. Is there a way for me to reproduce your issue on my end? A repository I can clone, or a code snippet I can run?

Does the issue still occur if you fetch the entire response as an ArrayBuffer, without streaming it? Something like:

use wasm_bindgen::{prelude::*, JsCast};
use wasm_bindgen_futures::JsFuture;
use web_sys::{window, Response};
use js_sys::Uint8Array;
use std::io::Cursor;
use futures_util::io::AsyncReadExt;

async fn run() -> Result<(), JsValue> {
    // Make a fetch request
    let url = "https://rustwasm.github.io/assets/wasm-ferris.png";
    let window = window().unwrap_throw();
    let response_value = JsFuture::from(window.fetch_with_str(url)).await?;
    let response: Response = response_value.dyn_into().unwrap_throw();

    // Read the response into a JS ArrayBuffer
    let buffer_value = JsFuture::from(response.array_buffer()?).await?;
    // Copy the ArrayBuffer into a Rust Vec
    let vec = Uint8Array::new(&buffer_value).to_vec();

    // Consume the Vec as an AsyncRead
    let mut cursor = Cursor::new(&vec[..]);
    let mut buf = [0u8; 100];
    loop {
        let n = cursor.read(&mut buf).await.unwrap_throw();
        if n == 0 {
            break;
        }
        let bytes = &buf[..n];
        // Do something with bytes
    }
    Ok(())
}

That way, we can figure out if the issue is really caused by wasm-streams, or if something else is going on.

@bmatthieu3
Author

bmatthieu3 commented Apr 26, 2023

Here is the code converting the ReadableStream to the async reader:

use wasm_streams::ReadableStream;
use js_sys::Uint8Array;
use web_sys::{window, Response};
use web_sys::{Request, RequestInit, RequestMode, Headers};
use wasm_bindgen::JsCast;
use wasm_bindgen_futures::JsFuture;
use futures::TryStreamExt;
use futures::future::Either;
use futures::io::BufReader;
use crate::renderable::image::Image;

let mut opts = RequestInit::new();
opts.method("GET");
//opts.mode(RequestMode::Cors);

let window = window().unwrap();
let request = Request::new_with_str_and_init(&url, &opts)?;

let resp_value = JsFuture::from(window.fetch_with_request(&request)).await?;
let resp: Response = resp_value.dyn_into()?;

// Get the response's body as a JS ReadableStream
let raw_body = resp.body().unwrap();
let body = ReadableStream::from_raw(raw_body.dyn_into()?);

// Convert the JS ReadableStream to a Rust stream
let bytes_reader = match body.try_into_async_read() {
    Ok(async_read) => Either::Left(async_read),
    Err((_err, body)) => Either::Right(
        body.into_stream()
            .map_ok(|js_value| js_value.dyn_into::<Uint8Array>().unwrap_throw().to_vec())
            .map_err(|_js_error| std::io::Error::new(std::io::ErrorKind::Other, "failed to read"))
            .into_async_read(),
    ),
};

let mut reader = BufReader::new(bytes_reader);

> Does the issue still occur if you fetch the entire response as an ArrayBuffer, without streaming it?

It even works with streaming when not passing through the proxy. But when streaming from the proxy, it produces the above problem. Streaming from a local file (getting a File object and creating a URL to fetch from it) produces the correct result as well (2nd pic of my last post).

@MattiasBuelens
Owner

> It even works with streaming when not passing through the proxy. But when streaming from the proxy, it produces the above problem.

This sounds like an issue with your proxy, rather than with wasm-streams. I recommend you take a closer look at the received HTTP response with a tool such as Wireshark, and check if the response might have been corrupted.

I'm afraid I can't help you much further though. 🤷‍♂️

@bmatthieu3
Author

bmatthieu3 commented Apr 26, 2023

I suspect it is the "Transfer-Encoding: chunked" part that makes things go wrong, because it seems I do not receive all the chunks: roughly one out of every two.
One thing you could do to reproduce the problem is to query the ferris image with streaming through the proxy and plot it (e.g. draw it on a canvas) to see what happens.

The URL to query through the proxy is: https://alasky.cds.unistra.fr/cgi/JSONProxy?url=https://rustwasm.github.io/assets/wasm-ferris.png

It seems to be a streaming problem, because it works when fetching via an ArrayBuffer, and it also simply works in the browser: by following the URL, I can download the ferris image with no corruption in the bytes. That is why I opened this issue.
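The "one every two chunks" hypothesis is easy to check before reaching for a packet capture: compare what the streaming path reassembles against the known-good buffered download. A minimal sketch of that comparison (all names and chunk sizes here are made up for illustration):

```rust
// Concatenate received transfer chunks back into one body.
fn reassemble(chunks: &[&[u8]]) -> Vec<u8> {
    chunks.iter().flat_map(|c| c.iter().copied()).collect()
}

fn main() {
    // Known-good body, as obtained from the ArrayBuffer fetch.
    let full: Vec<u8> = (0u8..=255).cycle().take(4096).collect();
    // Pretend the server split the body into 512-byte transfer chunks.
    let chunks: Vec<&[u8]> = full.chunks(512).collect();

    // All chunks received: byte-for-byte identical to the buffered download.
    assert_eq!(reassemble(&chunks), full);

    // "One every two chunks" lost: the total length already diverges,
    // which is the first thing to compare between the two code paths.
    let every_other: Vec<&[u8]> = chunks.iter().step_by(2).copied().collect();
    let partial = reassemble(&every_other);
    assert_eq!(partial.len(), full.len() / 2);
    assert_ne!(partial, full);
}
```

In the real application, logging the cumulative byte count in the streaming read loop and comparing it with the `ArrayBuffer` length (or a Content-Length, when present) would confirm or rule out dropped chunks.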

@MattiasBuelens
Owner

It looks like the proxy only uses chunked transfer if the original response also uses chunked transfer. Which is not the case for the Ferris image...
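For reference, chunked transfer encoding (RFC 9112 §7.1) frames the body as a series of `<hex size>\r\n<data>\r\n` records terminated by a zero-size chunk, and it is the receiving HTTP stack, not wasm-streams, that must strip this framing. A minimal decoder sketch (the `decode_chunked` helper is illustrative, not a real API; trailers and chunk extensions are ignored):

```rust
// Minimal decoder for an HTTP/1.1 chunked transfer-encoded body.
// Returns None on malformed framing.
fn decode_chunked(body: &[u8]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut pos = 0;
    loop {
        // Find the CRLF terminating the chunk-size line.
        let line_end = body[pos..].windows(2).position(|w| w == b"\r\n")? + pos;
        let size_line = std::str::from_utf8(&body[pos..line_end]).ok()?;
        // Chunk extensions (";...") may follow the size; ignore them.
        let size_hex = size_line.split(';').next()?.trim();
        let size = usize::from_str_radix(size_hex, 16).ok()?;
        pos = line_end + 2;
        if size == 0 {
            return Some(out); // last-chunk; trailers ignored in this sketch
        }
        out.extend_from_slice(body.get(pos..pos + size)?);
        pos += size + 2; // skip the data and its trailing CRLF
    }
}

fn main() {
    let wire = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n";
    let decoded = decode_chunked(wire).unwrap();
    assert_eq!(decoded, b"Wikipedia"); // framing stripped, body reassembled
}
```

If a proxy re-frames or truncates these records incorrectly, the browser's fetch would surface exactly the kind of missing-bytes corruption described above, regardless of how the ReadableStream is consumed afterwards.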

bmatthieu3 pushed a commit to cds-astro/aladin-lite that referenced this issue May 3, 2023
…ens/wasm-streams#20 is solved). Tell the user that fetching may fail because of CORS headers not set and that he can open its fits file by first manually download it and then open it in aladin lite