NioDirectBufferPool does not manage the off-heap memory well enough #18676

Open
liuaer opened this issue Aug 28, 2024 · 0 comments
Labels
type-bug This issue is about a bug

Comments

liuaer commented Aug 28, 2024

Alluxio Version:
2.10.0-SNAPSHOT and master branch

Describe the bug

Problem 1.

  1. When the acquire and release methods of BUF_POOL are called frequently, a large number of ByteBuffers are created. They occupy direct memory and are not released.

  2. Assume that only 2 units of direct memory are left, and that BUF_POOL currently contains only the following entries:

    10 -> {}
    20 -> {ByteBuffer,ByteBuffer}

  3. When a request for acquire(8) then arrives, BUF_POOL.ceilingEntry(8) returns the entry for key 10, whose list is empty. This satisfies the condition entry == null || entry.getValue().size() == 0, so a new ByteBuffer is allocated with ByteBuffer.allocateDirect(8), even though two idle buffers of capacity 20 are sitting in the pool. Since only 2 units of direct memory are available, the allocation fails with an OutOfMemoryError (OOM).

  public static synchronized ByteBuffer acquire(int length) {
    Map.Entry<Integer, LinkedList<ByteBuffer>> entry = BUF_POOL.ceilingEntry(length);
    if (entry == null || entry.getValue().size() == 0) {
      return ByteBuffer.allocateDirect(length);
    }
    ......
    return buffer;
  }
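
The lookup behavior described in step 3 can be checked in isolation. The following is a minimal sketch, not the Alluxio source: the class name CeilingEntryDemo, the POOL field, and the helpers ceilingKey and wouldAllocate are hypothetical names made up for illustration, and the pool is assumed to be a TreeMap keyed by buffer capacity, as the acquire snippet suggests.

```java
import java.nio.ByteBuffer;
import java.util.LinkedList;
import java.util.Map;
import java.util.TreeMap;

public class CeilingEntryDemo {
  // Hypothetical stand-in for BUF_POOL: buffer capacity -> idle buffers.
  private static final TreeMap<Integer, LinkedList<ByteBuffer>> POOL = new TreeMap<>();

  static {
    // Reproduce the state from the report: a drained entry for 10
    // and two idle buffers under 20.
    POOL.put(10, new LinkedList<>());
    LinkedList<ByteBuffer> idle = new LinkedList<>();
    idle.add(ByteBuffer.allocateDirect(20));
    idle.add(ByteBuffer.allocateDirect(20));
    POOL.put(20, idle);
  }

  // Key of the entry that ceilingEntry(length) selects.
  static int ceilingKey(int length) {
    return POOL.ceilingEntry(length).getKey();
  }

  // True when the check in acquire(length) would fall through to allocateDirect.
  static boolean wouldAllocate(int length) {
    Map.Entry<Integer, LinkedList<ByteBuffer>> entry = POOL.ceilingEntry(length);
    return entry == null || entry.getValue().size() == 0;
  }

  public static void main(String[] args) {
    System.out.println(ceilingKey(8));     // prints 10: the empty entry is selected
    System.out.println(wouldAllocate(8));  // prints true: the idle 20-byte buffers are ignored
  }
}
```

The empty list under key 10 shadows every larger size class for requests of length 10 or less, which is why the pool keeps allocating new direct memory instead of reusing idle buffers.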

To Reproduce

  1. Use Alluxio version 2.10.0-SNAPSHOT.
  2. Run the command alluxio fs load xxxxx --submit, where xxxxx is a large directory.

Expected behavior

For Problem 1

After executing entry.getValue().pop(), if entry.getValue().size() == 0, the entry should be removed from BUF_POOL.

  public static synchronized ByteBuffer acquire(int length) {
    Map.Entry<Integer, LinkedList<ByteBuffer>> entry = BUF_POOL.ceilingEntry(length);
    if (entry == null || entry.getValue().size() == 0) {
      return ByteBuffer.allocateDirect(length);
    }
    ByteBuffer buffer = entry.getValue().pop();

    // change start: drop the entry once its list is drained, so that later
    // ceilingEntry lookups skip it and reach the next non-empty size
    if (entry.getValue().size() == 0) {
      BUF_POOL.remove(entry.getKey());
    }
    // change end
    ......
    return buffer;
  }
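
To see the effect of the proposed change end to end, here is a self-contained sketch under stated assumptions. PoolSketch, its pool field, the release method, and entryCount are hypothetical. The issue does not show the real release method or the elided part of acquire, so the computeIfAbsent-based release and the clear/limit calls below are illustrative guesses, not the NioDirectBufferPool implementation.

```java
import java.nio.ByteBuffer;
import java.util.LinkedList;
import java.util.Map;
import java.util.TreeMap;

public class PoolSketch {
  // Buffer capacity -> idle buffers of that capacity.
  private final TreeMap<Integer, LinkedList<ByteBuffer>> pool = new TreeMap<>();

  public synchronized ByteBuffer acquire(int length) {
    Map.Entry<Integer, LinkedList<ByteBuffer>> entry = pool.ceilingEntry(length);
    if (entry == null || entry.getValue().isEmpty()) {
      return ByteBuffer.allocateDirect(length);
    }
    ByteBuffer buffer = entry.getValue().pop();
    // Proposed change: remove the entry as soon as its list is drained.
    if (entry.getValue().isEmpty()) {
      pool.remove(entry.getKey());
    }
    // Assumed reset logic (the original elides this part).
    buffer.clear();
    buffer.limit(length);
    return buffer;
  }

  // Assumed release logic: file the buffer under its capacity.
  public synchronized void release(ByteBuffer buffer) {
    pool.computeIfAbsent(buffer.capacity(), k -> new LinkedList<>()).push(buffer);
  }

  public synchronized int entryCount() {
    return pool.size();
  }

  public static void main(String[] args) {
    PoolSketch p = new PoolSketch();
    p.release(ByteBuffer.allocateDirect(10));
    p.release(ByteBuffer.allocateDirect(20));
    p.release(ByteBuffer.allocateDirect(20));

    p.acquire(10);                          // drains and removes the entry for 10
    System.out.println(p.entryCount());     // prints 1
    ByteBuffer reused = p.acquire(8);       // now reaches the entry for 20
    System.out.println(reused.capacity());  // prints 20
  }
}
```

With the drained entry removed, acquire(8) reuses an idle 20-byte buffer instead of calling allocateDirect. An alternative with the same effect would be to walk pool.tailMap(length, true) and take the first non-empty list, which tolerates empty entries rather than preventing them.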

Urgency
If your cluster has ever run out of off-heap (direct) memory, check whether this problem is the cause.

Are you planning to fix it

Additional context

liuaer added the type-bug label Aug 28, 2024