Attempt 1 failed: BrowserType.launch: ENOSPC: no space left on device, mkdtemp '/tmp/playwright-artifacts-aNE1l9' #901

Open
dejoma opened this issue Jan 20, 2025 · 1 comment
Labels: bug (Something isn't working)

dejoma commented Jan 20, 2025

Describe the bug 🐛

Attempt 1 failed: BrowserType.launch: ENOSPC: no space left on device, mkdtemp '/tmp/playwright-artifacts-aNE1l9'

The browser is not closed after a run. I am running this in a Lambda function that gets invoked multiple times, and eventually /tmp runs out of disk space (the error is ENOSPC, so this is disk space rather than memory).

See the related GitHub issue, which has a fix in the comments:
microsoft/playwright-java#526
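
Since the error comes from `mkdtemp` failing under `/tmp`, one possible stopgap on a reused Lambda container is to remove leftover `playwright-artifacts-*` directories before each run. The sketch below is only an illustration and is not the fix from the linked issue. The function name `remove_stale_artifact_dirs` and its parameters are hypothetical, and it assumes no browser in the same container is still using those directories.

```python
import glob
import os
import shutil


def remove_stale_artifact_dirs(tmp_root="/tmp", prefix="playwright-artifacts-"):
    """Delete leftover directories under tmp_root whose names start with prefix.

    Returns the list of paths that were actually removed.
    """
    removed = []
    # Escape both parts so special characters in the path are matched literally.
    pattern = os.path.join(glob.escape(tmp_root), glob.escape(prefix) + "*")
    for path in glob.glob(pattern):
        # Only touch real directories; skip plain files and symlinks.
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path, ignore_errors=True)
            if not os.path.exists(path):
                removed.append(path)
    return removed
```

Calling `remove_stale_artifact_dirs()` at the start of the handler, before the scraper runs, would be the natural place for it under these assumptions.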

My code 💻 🐊

import os

from scrapegraphai.graphs import SearchGraph

SCRAPE_CONFIG = {
    "llm": {
        "api_key": os.environ["OPENAI_API_KEY"],
        "model": "openai/gpt-4o-mini",
    },
    "search_engine": "serper",
    "serper_api_key": os.environ["SERPER_API_KEY"],
    # "num_results": 5,
    "loader_kwargs": {
        # https://github.com/microsoft/playwright/issues/14023
        "args": ["--single-process", "--disable-gpu", "--disable-dev-shm-usage"],
    },
    "force": True,
    "verbose": True,
    "headless": True,
}


# `prompt` and `ScraperOutput` (the output schema) are defined elsewhere.
scraper = SearchGraph(prompt=prompt, config=SCRAPE_CONFIG, schema=ScraperOutput)  # type: ignore
scrape_results = scraper.run()

Hotfix update 🧯

As a workaround, I now delete any file or directory in /tmp that is larger than 1 GB, and it seems to work for now.

import os
import shutil


def cleanup_temp_files():
    """Clean up large temporary files and directories in /tmp.

    This function checks for files/directories larger than 1GB in /tmp
    and removes them to prevent disk space issues.
    """
    cleaned_paths = []
    ONE_GB = 1024 * 1024 * 1024  # 1GB in bytes

    try:
        # Get all items in /tmp
        tmp_items = os.listdir("/tmp")

        for item in tmp_items:
            full_path = os.path.join("/tmp", item)
            try:
                # Get size of file/directory
                if os.path.isdir(full_path):
                    total_size = sum(
                        os.path.getsize(os.path.join(dirpath, filename))
                        for dirpath, _, filenames in os.walk(full_path)
                        for filename in filenames
                    )
                else:
                    total_size = os.path.getsize(full_path)

                # Remove if larger than 1GB
                if total_size > ONE_GB:
                    if os.path.isdir(full_path):
                        shutil.rmtree(full_path)
                    else:
                        os.remove(full_path)
                    cleaned_paths.append(f"{full_path} ({total_size / ONE_GB:.2f}GB)")

            except Exception as e:
                print(f"Failed to process {full_path}: {e}")

    except Exception as e:
        print(f"Error accessing /tmp directory: {e}")

    if cleaned_paths:
        print(f"Cleaned up {len(cleaned_paths)} large files/directories: {cleaned_paths}")
    return cleaned_paths
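
A per-item size threshold can miss many smaller leftovers, so it may also help to check how much space `/tmp` has left before deciding to clean up. This is a minimal sketch using `shutil.disk_usage`; the helper names `tmp_free_bytes` and `needs_cleanup` and the 256 MB default threshold are assumptions for illustration, not values taken from this issue.

```python
import shutil


def tmp_free_bytes(path="/tmp"):
    """Return the number of free bytes on the filesystem that holds path."""
    return shutil.disk_usage(path).free


def needs_cleanup(path="/tmp", min_free_bytes=256 * 1024 * 1024):
    """Return True when free space on path is below min_free_bytes."""
    return tmp_free_bytes(path) < min_free_bytes
```

A handler could then call `cleanup_temp_files()` only when `needs_cleanup()` returns True, and log `tmp_free_bytes()` before and after to see whether the cleanup actually helped.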

My log file:

Cleaned up 2 large files/directories: ['/tmp/core.headless_shell.5405 (1.01GB)', '/tmp/core.headless_shell.5910 (1.01GB)']
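
The file names in this log (`core.headless_shell.<pid>`) look like core dumps written by a crashing headless browser process. That is an inference from the names, not something confirmed in this issue. If it holds, a sketch like the following could stop the dumps from being written by setting the soft core file size limit to zero before the browser is launched, since child processes inherit the limit. `disable_core_dumps` is a hypothetical helper name, and the `resource` module is only available on Unix.

```python
import resource


def disable_core_dumps():
    """Set the soft core file size limit to 0 so crashes do not write core files.

    The hard limit is left unchanged. Returns the (soft, hard) limits in effect.
    """
    _, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (0, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)
```

This would be called once at module import or at the top of the handler, before `scraper.run()`. It does not address why the browser process crashes in the first place; it only keeps the crash from filling `/tmp`.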

dosubot (bot) added the bug label on Jan 20, 2025

VinciGit00 (Collaborator) commented

OK @dejoma, can you open a pull request, please?
