Sandboxes are short-lived, isolated environments that you can spin up instantly for code execution. They can run inside Buildfunctions (via CPU or GPU Functions) or be created from app code anywhere else (e.g., local scripts, Next.js apps, external workers).

Core Concepts

  • Simple-to-use: Sandboxes are created, used, and destroyed seamlessly.
  • Secure: They provide a safe boundary for running untrusted AI actions, like executing AI-generated code.
  • Nested: You can run a Sandbox inside a data processing pipeline or an AI agent workflow.

Supported Runtimes

Runtime    Supported Sandboxes
Python     CPU, GPU
Go         CPU Only
Node.js    CPU Only
Deno       CPU Only
Bash       CPU Only

CPU Sandboxes

CPUSandbox is ideal for running lightweight code, data processing, or executing user-submitted scripts securely.

Create Hardware-Isolated Sandbox and Run Code

import { CPUSandbox } from 'buildfunctions';

// Create a CPU Sandbox
const cpuSandbox = await CPUSandbox.create({
    name: "text-analyzer",
    runtime: "node",
    code: "console.log('Hello from Sandbox!');",
    memory: "512MB",
    timeout: 120
});

try {
    const result = await cpuSandbox.run();
    console.log(result.stdout);
} finally {
    // Manually clean up
    await cpuSandbox.delete();
}
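If you assemble the create() options dynamically (for example, from user-submitted scripts), it can help to keep the defaults in one place. The helper below is a hypothetical sketch, not part of the SDK; it uses the dict-style options of the Python SDK shown later in this document, and its field names and defaults simply mirror the example above.

```python
def make_cpu_config(name: str, code: str, runtime: str = "node",
                    memory: str = "512MB", timeout: int = 120) -> dict:
    """Assemble the options passed to CPUSandbox.create().

    All defaults here are illustrative, copied from the example above.
    """
    return {
        "name": name,
        "runtime": runtime,
        "code": code,
        "memory": memory,
        "timeout": timeout,
    }
```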

GPU Sandboxes

GPUSandbox provides instant access to secure, hardware-isolated VMs with GPUs. They include automatic storage for self-hosted models (perfect for agents) and support concurrent requests on the same GPU for significant cost savings.

Run Inference

You can execute scripts directly on the GPU by providing a code file or script in the create method.
...
// Create a GPU Sandbox
const sandbox = await GPUSandbox.create({
  name: 'secure-agent-action',
  memory: 10000,
  timeout: 300,
  vcpus: 6,
  language: 'python',
  requirements: ['transformers', 'torch', 'accelerate'],
  model: '/path/to/models/Qwen/Qwen3-8B',
  code: `python inference_script.py "${prompt}"`,
})

// Run script in a hardware-isolated virtual machine with full GPU access
const result = await sandbox.run()
...
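The `code` command above invokes a script inside the Sandbox, but the document does not show `inference_script.py` itself. The sketch below is one hypothetical shape it could take, assuming the packages listed in `requirements` are installed and the directory passed via `model` is available at the same path inside the VM.

```python
# inference_script.py -- a hypothetical sketch of the script the `code` command runs.
import sys

MODEL_PATH = "/path/to/models/Qwen/Qwen3-8B"  # matches the `model` option above

def read_prompt(argv: list) -> str:
    """Return the prompt passed on the command line (see the `code` option above)."""
    return argv[1] if len(argv) > 1 else "Hello"

def run_inference(prompt: str) -> str:
    """Generate text with the self-hosted model stored in the Sandbox."""
    # Imported lazily so the module loads even without a GPU or transformers installed.
    from transformers import pipeline
    generator = pipeline("text-generation", model=MODEL_PATH, device_map="auto")
    return generator(prompt, max_new_tokens=128)[0]["generated_text"]

# Invoked inside the Sandbox as: python inference_script.py "<prompt>"
# print(run_inference(read_prompt(sys.argv)))
```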

Providing Code and Models

There are three ways to provide the code and models you want the Sandbox to use:

Code

1. Inline Code: Pass the code directly as a string. Best for short, dynamic scripts.
const sandbox = await CPUSandbox.create({ 
    code: `console.log("Hello World")`,
    ... 
});
await sandbox.run();
2. Relative Path: Reference a file relative to your current working directory.
// Looks for ./inference.py in your current project folder
const sandbox = await CPUSandbox.create({ 
    code: './inference.py',
    ... 
});
await sandbox.run();
3. Absolute Path: Reference a file using a full system path.
// Uses a specific absolute path
const sandbox = await CPUSandbox.create({ 
    code: '/path/to/my/scripts/inference.py',
    ... 
});
await sandbox.run();
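The three options imply that the SDK distinguishes inline source from file references by inspecting the string. The exact rule is not documented here; the helper below is a hypothetical heuristic, not the SDK's actual logic.

```python
def looks_like_path(code: str) -> bool:
    """Guess whether a `code` value is a file reference rather than inline source.

    Hypothetical heuristic: relative ('./...') and absolute ('/...') values are
    treated as file paths, matching options 2 and 3 above; anything else is
    treated as inline code (option 1).
    """
    return code.startswith("./") or code.startswith("/")
```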

Models

Models can also be referenced by path when creating a GPU Sandbox.

1. Relative Path
// Looks for ./models/Qwen in your current project folder
const sandbox = await GPUSandbox.create({ 
  model: './models/Qwen',
  ... 
});
2. Absolute Path
// Uses a specific absolute path on the host system
const sandbox = await GPUSandbox.create({ 
  model: '/path/to/models/Qwen',
  ... 
});

Sandbox Management

Delete and Timeouts

You can manually call delete() to clean up a Sandbox when you're done. If you don't call delete(), the Sandbox is cleaned up automatically after the period you set with the timeout argument.
  • Default Timeout: If you don’t set a timeout argument, the default is 1 minute.
  • Auto-Cleanup: The sandbox is destroyed automatically after the timeout expires.
JavaScript
...
} finally {
    await gpuSandbox.delete();
}

Sandbox Configuration

You can customize the resources and environment for your sandboxes.

Parameters

GPU Sandbox (Python SDK)
  • language: python (more coming soon).
  • memory: RAM allocation (e.g., "65536MB").
  • gpu: GPU Type (e.g., T4).
  • requirements: List of Python packages (e.g., ['transformers']).
  • model: Path to the model; can be local or remote (e.g., the Hugging Face model Qwen/Qwen3-8B).
CPU Sandbox (Node.js SDK)
  • runtime: (e.g., node, python).
  • memory: RAM allocation.
  • timeout: Max execution time in seconds.

Runtime Specifics

Python Requirements: You can specify dependencies in your code or via a requirements.txt:
transformers==4.47.1
accelerate
Deno Permissions: For Deno, you can pass run flags in your command:
deno run --allow-ffi my_script.ts

Nested Sandboxes

One of the most powerful features of Buildfunctions is Nested Orchestration. You can deploy a top-level Function (e.g., a Node.js API) that spins up child Sandboxes (e.g., Python GPU workers) to handle requests.

Example Architecture

  1. Top-Level Function: Receives an HTTP request.
  2. Child Sandbox: The function spins up a GPUSandbox to run a customized model.
  3. Result: The sandbox returns the inference result to the function, which responds to the user.
  4. Cleanup: The sandbox is destroyed, ensuring clean resource usage.
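The flow above can be sketched with the Python SDK shapes used elsewhere in this document. Everything here is illustrative: the handler name and request shape are assumptions, the child Sandbox options mirror the GPU example earlier, and the `stdout` field follows the CPU example.

```python
import time

def build_child_config(prompt: str) -> dict:
    """Options for the child GPUSandbox (values mirror the GPU example above)."""
    return {
        "name": f"child-inference-{int(time.time())}",
        "language": "python",
        "requirements": ["transformers", "torch", "accelerate"],
        "model": "/path/to/models/Qwen/Qwen3-8B",
        "code": f'python inference_script.py "{prompt}"',
    }

async def handle_request(prompt: str) -> str:
    """1) receive request, 2) spin up child Sandbox, 3) return result, 4) clean up."""
    from buildfunctions import GPUSandbox  # imported lazily for this sketch
    sandbox = await GPUSandbox.create(build_child_config(prompt))
    try:
        result = await sandbox.run()
        return result.stdout  # `stdout` as in the CPU example; the field name is an assumption
    finally:
        await sandbox.delete()  # step 4: the child Sandbox is destroyed
```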

Advanced Example: Python Agent

This example demonstrates an advanced agentic workflow: Code Generation with Reward Scoring. The agent uses Claude to generate a Python function, then immediately spins up a secure CPUSandbox to test the code against a set of unit tests (the reward function). This allows the agent to verify the correctness of its output before proceeding.
This example requires an ANTHROPIC_API_KEY to be available in your environment.
import os
import re
import time
from pathlib import Path

import anthropic
import pytest
from dotenv import load_dotenv

from buildfunctions import Buildfunctions, CPUSandbox

load_dotenv()

API_TOKEN = os.environ.get("BUILDFUNCTIONS_API_TOKEN", "")
HANDLER_TEMPLATE = (Path(__file__).parent / "reward_handler.py").read_text()


def strip_markdown_fences(text: str) -> str:
    return re.sub(r"^```[\w]*\n|```$", "", text.strip(), flags=re.MULTILINE).strip()


@pytest.mark.asyncio
async def test_code_generation_with_reward():
    if not API_TOKEN:
        pytest.skip("Set BUILDFUNCTIONS_API_TOKEN in .env file")

    print("Testing Code Generation with Reward Scoring...\n")

    sandbox = None

    try:
        # Step 1: Authenticate
        print("1. Authenticating...")
        client = await Buildfunctions({"apiToken": API_TOKEN})
        print(f"   Authenticated as: {client.user.username}")

        # Step 2: Generate code with Claude
        print("\n2. Generating sorting function with Claude...")
        claude = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

        response = claude.messages.create(
            model="claude-opus-4-6",
            max_tokens=512,
            messages=[{
                "role": "user",
                "content": (
                    "Write a Python function called `sort_list` that takes a list and "
                    "returns it sorted. Handle edge cases like empty lists, single elements, "
                    "and mixed types. Return ONLY the code, no markdown, no explanations."
                ),
            }],
        )

        generated_code = strip_markdown_fences(response.content[0].text)
        print(f"   Generated code:\n{generated_code}\n")

        # Step 3: Create CPU Sandbox with reward function
        print("3. Creating CPU Sandbox with reward function...")
        handler_code = HANDLER_TEMPLATE.format(generated_code=generated_code)

        sandbox = await CPUSandbox.create({
            "name": f"reward-eval-{int(time.time())}",
            "language": "python",
            "code": handler_code,
            "memory": "512MB",
            "timeout": 30,
        })
        print(f"   CPU Sandbox created: {sandbox.name}")

        # Step 4: Run the reward function
        print("\n4. Running reward function...")
        result = await sandbox.run()
        print(f"   Result: {result.response}")

        # Step 5: Clean up
        print("\n5. Deleting CPU Sandbox...")
        await sandbox.delete()
        print("   CPU Sandbox deleted")

        print("\nCode generation with reward scoring test completed!")

    except Exception:
        if sandbox is not None:
            try:
                await sandbox.delete()
            except Exception as e:
                print(f"Cleanup failed: {e}")
        raise


if __name__ == "__main__":
    import asyncio
    asyncio.run(test_code_generation_with_reward())
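The example reads reward_handler.py as a template with a {generated_code} placeholder, but the file itself is not shown. Below is a hypothetical sketch of what it might contain: the generated sort_list is spliced in, then scored against a few unit tests (the reward function). The test cases and scoring scheme are assumptions.

```python
# Hypothetical sketch of reward_handler.py -- the real template is not shown here.
REWARD_HANDLER_TEMPLATE = '''\
{generated_code}

def reward() -> float:
    """Score the generated sort_list against a few unit tests (the reward function)."""
    cases = [
        ([3, 1, 2], [1, 2, 3]),
        ([], []),
        ([7], [7]),
    ]
    passed = sum(1 for given, expected in cases if sort_list(given) == expected)
    return passed / len(cases)

print(reward())
'''

def render_handler(generated_code: str) -> str:
    """Mirrors HANDLER_TEMPLATE.format(generated_code=...) from the example above."""
    return REWARD_HANDLER_TEMPLATE.format(generated_code=generated_code)
```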