
When an SOC is overwritten, the old version is served from cache? #4983

Open
Cafe137 opened this issue Feb 4, 2025 · 8 comments
@Cafe137

Cafe137 commented Feb 4, 2025

Context

Bee 2.4.0

Summary

My understanding is that SOCs can be overwritten. However, when updating an existing SOC, old versions keep being served, likely from cache. I suspect the cache, but I am not 100% sure. Getting the latest version requires workarounds such as disabling the cache[0], or waiting a long time.

Additionally, there is an intermediate period after an update during which new and old versions are both served, seemingly at random, depending on which nodes are queried. This observation assumes that the local cache is disabled.

[0] While disabling the local cache has worked in some cases, remote nodes can still serve previously cached versions.

Expected behavior

After updating an SOC, its new version should quickly propagate in the network and replace previous versions.

Actual behavior

Please see summary.

Steps to reproduce

  • Write an SOC.
  • Update the same SOC.
  • Fetch the SOC => the first version is returned.
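The failure mode can be sketched with a toy model (illustrative only, not Bee's actual code): a SOC's address is derived from its identifier and owner rather than its payload, so an update reuses the same address, and an address-keyed cache keeps returning whatever payload it saw first.

```python
# Toy model of a node with an address-keyed retrieval cache. Because a
# SOC update reuses the same address, the cache never notices the change.

class Node:
    def __init__(self):
        self.store = {}  # authoritative storage, keyed by address
        self.cache = {}  # retrieval cache, keyed by address

    def put(self, addr, payload):
        self.store[addr] = payload  # note: the cache is NOT invalidated

    def get(self, addr):
        if addr in self.cache:
            return self.cache[addr]  # cache hit: possibly stale
        payload = self.store[addr]
        self.cache[addr] = payload
        return payload

node = Node()
addr = "soc(owner, id)"    # stays constant across updates

node.put(addr, b"v1")
print(node.get(addr))      # b'v1' -- and now cached

node.put(addr, b"v2")
print(node.get(addr))      # still b'v1': served from the cache
```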

Possible solution

When a new version of an SOC is stored, Bee should clear the previous version from the cache, if one exists.
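Continuing the toy model above (again illustrative, not Bee's storage code), the fix would be to evict the cache entry for an address whenever a new version is stored under it:

```python
# Toy sketch of the proposed fix: storing a new version of a SOC evicts
# any cached copy under the same address, so the next read falls through
# to the updated store.

class Node:
    def __init__(self):
        self.store = {}  # authoritative storage, keyed by address
        self.cache = {}  # retrieval cache, keyed by address

    def put(self, addr, payload):
        self.store[addr] = payload
        self.cache.pop(addr, None)  # clear the stale cache entry, if one exists

    def get(self, addr):
        if addr not in self.cache:
            self.cache[addr] = self.store[addr]
        return self.cache[addr]

node = Node()
node.put("soc(owner, id)", b"v1")
print(node.get("soc(owner, id)"))  # b'v1'
node.put("soc(owner, id)", b"v2")
print(node.get("soc(owner, id)"))  # b'v2': the cache was invalidated on put
```

Note that this only fixes the node that stored the update; caches on remote nodes along the retrieval path are unaffected, as footnote [0] observes.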

@Cafe137 Cafe137 added the needs-triaging new issues that need triaging label Feb 4, 2025
@istae
Member

istae commented Feb 4, 2025

You are right. The cache, as it is currently implemented, will not update itself with the latest SOC; this is a known problem.

@NoahMaizels
Contributor

NoahMaizels commented Feb 5, 2025

My understanding is that SOCs can be overwritten. However, when updating an existing SOC, old versions are served, likely from cache. I suspect cache, but not 100% on it. Getting the latest version requires workarounds such as disabling cache[0], or waiting a long time.

If I understand correctly, based on an earlier call with @istae, every node along the path the chunk hops through would need to disable its cache, not only the receiving node itself.

@NoahMaizels
Contributor

This would also mean increased latency, but I think that's not an issue when it comes to single-chunk retrievals.

@Cafe137
Author

Cafe137 commented Feb 5, 2025

These are the options I naively see, sharing just for brainstorming purposes:

  • Never cache SOCs: simple, but very inefficient.
  • Have a "cache busting" protocol that signals to the whole network that a new version of an SOC is available.
  • Have a request flag that asks for a fresh version of an SOC, bypassing the cache on all hop nodes.
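The third option could look roughly like this; a hypothetical sketch, not Bee's actual retrieval protocol: each forwarding node propagates the flag, so the cache is bypassed on every hop along the path.

```python
# Sketch of a per-request "fresh" flag that every forwarding node honors,
# bypassing its cache along the whole retrieval path. The protocol and
# names here are made up for illustration.

class HopNode:
    def __init__(self, origin=None):
        self.cache = {}
        self.origin = origin  # next node toward the storer, or None if storer

    def retrieve(self, addr, fresh=False):
        if not fresh and addr in self.cache:
            return self.cache[addr]  # normal request: cache may be stale
        if self.origin is None:
            payload = STORE[addr]    # this node holds the chunk itself
        else:
            payload = self.origin.retrieve(addr, fresh)  # forward the flag
        self.cache[addr] = payload   # refresh this hop's cache on the way back
        return payload

STORE = {"soc": b"v1"}
storer = HopNode()
hop = HopNode(origin=storer)
entry = HopNode(origin=hop)

print(entry.retrieve("soc"))              # b'v1' -- now cached at every hop
STORE["soc"] = b"v2"
print(entry.retrieve("soc"))              # b'v1' -- stale, from the cache
print(entry.retrieve("soc", fresh=True))  # b'v2' -- the flag bypasses all hops
```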

@mfw78
Collaborator

mfw78 commented Feb 10, 2025

Another option is to expand the SOC so that it has a cache_ttl field in its chunk definition. Nodes could then simply decline to cache any SOC that doesn't have a cache_ttl.
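A sketch of what a caching node might do with such a hint (the cache_ttl field is this proposal, not part of Bee's SOC format; the layout and names here are made up):

```python
# TTL-hinted cache: the uploader attaches a best-effort cache_ttl, and a
# caching node either honors it or skips caching the SOC entirely.

import time

class TTLCache:
    def __init__(self):
        self.entries = {}  # addr -> (payload, expiry time)

    def put(self, addr, payload, cache_ttl=None):
        if cache_ttl is None:
            return  # no cache_ttl in the SOC: don't cache it at all
        self.entries[addr] = (payload, time.monotonic() + cache_ttl)

    def get(self, addr):
        entry = self.entries.get(addr)
        if entry is None:
            return None  # miss: caller must fetch from the network
        payload, expires_at = entry
        if time.monotonic() >= expires_at:
            del self.entries[addr]  # expired: treat as a miss
            return None
        return payload

cache = TTLCache()
cache.put("soc-a", b"v1", cache_ttl=60)  # cached for up to 60 seconds
cache.put("soc-b", b"v1")                # no hint: never cached
print(cache.get("soc-a"))                # b'v1'
print(cache.get("soc-b"))                # None
```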

@acha-bill acha-bill self-assigned this Mar 17, 2025
@acha-bill
Contributor

Have a request flag that requests a fresh version of an SOC, by-passing cache for all hopped nodes.

This option looks good to me. The client controls when they want consistency (the latest chunk) vs. performance (maybe the latest). It's also a pragmatic approach.

Have a "cache busting" protocol

Inefficient, as you have to notify everyone on the network starting from the node that stored the update. Also, in the case of network partitioning, you still end up with nodes that didn't invalidate their cached entry. It's also harder to implement. lol

Another option is to expand the SOC so that it has a cache_ttl

Since the TTL is set by the uploader, but the cache problem is experienced by downloaders who may be unrelated, there's a misalignment of control and needs.

@mfw78
Collaborator

mfw78 commented Mar 18, 2025

Since the TTL is set by the uploader but the cache problem is experienced by downloaders who may be unrelated, there's a misalignment of control and needs

The intent would be not to make it binding, but to treat it on a best-effort basis. More specifically, this provides hints to intermediate nodes about how to manage their cache most effectively, which would maximise their potential revenue from bandwidth incentives.

@NoahMaizels
Contributor

NoahMaizels commented Mar 18, 2025

Maybe the nodes which cache the SOC could record when the last update was, and requesting nodes can specify how fresh of an update they are looking for?

Like, they could specify 5 minutes in the request; if the intermediate node has a SOC update that is less than 5 minutes old, it will serve that one, and if it's older than 5 minutes, it will attempt to fetch a newer one and, if one exists, update its cache with it?
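This freshness-bounded lookup could be sketched as follows (the names and API are illustrative, not an existing Bee interface): cached entries remember when they were stored, the request carries a max_age, and anything older is refetched.

```python
# Freshness-bounded cache: the requester specifies how stale an answer
# it will tolerate, and the node refetches anything older than that.

import time

class FreshnessCache:
    def __init__(self, fetch):
        self.fetch = fetch  # function that retrieves the chunk remotely
        self.entries = {}   # addr -> (payload, time it was stored)

    def get(self, addr, max_age):
        entry = self.entries.get(addr)
        if entry is not None and time.monotonic() - entry[1] <= max_age:
            return entry[0]  # young enough for this requester: serve it
        payload = self.fetch(addr)  # too old or missing: fetch a newer one
        self.entries[addr] = (payload, time.monotonic())
        return payload

network = {"soc": b"v1"}
cache = FreshnessCache(fetch=network.get)

print(cache.get("soc", max_age=300))  # b'v1', fetched and cached
network["soc"] = b"v2"
print(cache.get("soc", max_age=300))  # b'v1': the cached copy is fresh enough
print(cache.get("soc", max_age=0))    # b'v2': the requester demanded freshness
```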
