Automify documentation

Build AI computer use for a browser, a shell, a container, or a desktop.

Start with a provider, create the surface you need, pass task data, and get back status, steps, screenshots, text, or structured JSON.

Start

Install

Most projects only need three decisions.

Pick a provider, choose a surface, and decide which guardrails should apply before the model can act.

npm install automify

Automify requires Node.js 20.12.2 or newer. Playwright is a dependency because browser automation is a built-in surface, and the package postinstall step installs Chromium automatically.

# Ubuntu 26.04 only, if Playwright blocks the Chromium install
PLAYWRIGHT_HOST_PLATFORM_OVERRIDE=ubuntu24.04-x64 npm install automify

# Skip the automatic Chromium install
AUTOMIFY_SKIP_BROWSER_INSTALL=1 npm install automify

Set `AUTOMIFY_SKIP_BROWSER_INSTALL=1` when your CI image or deployment environment manages browsers separately.

Zod support is optional. Install Zod only when you want to define structured output with Zod schemas:

npm install zod

The main `automify` package works without Zod. The Zod helper lives behind the separate `automify/zod` import and uses Zod as an optional peer dependency.

Optional Docker setup

Docker is required only for `automify.dockerCli()` and `automify.dockerComputer()`. On macOS and Windows, install Docker Desktop for Mac or Docker Desktop for Windows from the official Docker website.

On Ubuntu, install Docker from the Ubuntu repositories:

sudo apt-get update
sudo apt-get install -y docker.io

Use `docker.io`, not the `docker` package. In Ubuntu packages, `docker.io` provides the Docker Engine/runtime and CLI.

Start Docker on Ubuntu and enable it after reboot:

sudo systemctl enable --now docker
sudo docker run hello-world

To run Docker commands without `sudo`, add your user to the `docker` group, then log out and back in:

sudo usermod -aG docker $USER

Core idea

Initialize once, then create the surface you need.

import { initAutomify } from "automify";

const automify = initAutomify({
  provider: {
    type: "openai",
    apiKey: process.env.OPENAI_API_KEY,
    model: "gpt-5.5"
  }
});

const browser = await automify.browser({
  startUrl: "https://example.com"
});

try {
  const result = await browser.do("Summarize what you see on the page.");
  console.log(result.text);
} finally {
  await browser.close();
}

`initAutomify()` creates shared provider configuration and returns factory methods for browser, CLI, desktop, and custom computer runs. Every surface exposes `.do(instruction, run?)`, where run data lives under `data` and output/options live beside it in the same object.

Instruction: The first `.do()` argument. Tell the model what outcome you want, not a brittle click-by-click script.
`data`: Structured input for the run: records, form values, command names, search queries, customer IDs, or other app context.
`output`: Optional response shape. Use `jsonOutput()` when application code should consume the result.
Guardrails: Domain rules, command rules, safety callbacks, screenshot redaction, and max-step limits.

Arguments

Use grouped options for new code.

`.do()` accepts the instruction followed by a single run object. Prefer grouped options so input, evaluation files, screenshots, safety, hooks, limits, and CLI command policy stay easy to scan.

await automify.cli({
  command: {
    cwd: process.cwd(),
    allow: ["npm test", "npm run build"],
    block: [/^rm\b/]
  }
}).do("Run tests and summarize failures", {
  data: { packageName: "automify" },
  evaluate: [{ path: "/tmp/screenshot.png", detail: "high" }],
  output: jsonOutput("test_report", { passed: "boolean", summary: "string" }),
  limits: { steps: 20 },
  screenshots: {
    final: "/tmp/automify-final.png",
    actions: "/tmp/automify-actions"
  },
  screenshot: {
    detail: "high",
    maxWidth: 1600
  },
  safety: {
    domains: ["example.com"],
    onCheck: async ({ checks }) => checks.every((check) => check.code !== "external_form_submit")
  },
  hooks: {
    step: (event) => console.log(event.phase),
    complete: ({ result }) => console.log(result.status)
  }
});

The grouped names map to the flat names: `evaluate` maps to `filesToEvaluate`, `limits.steps` to `maxSteps`, `screenshots.final` to `finalScreenshot`, `command.allow` to `allowedCommands`, and `safety.domains` to `allowedDomains`.
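As a rough illustration of the mapping above, the sketch below flattens a grouped run object into the flat names. It covers only the five pairs listed in this section, and `toFlatOptions` is a hypothetical helper, not an Automify export.

```javascript
// Hypothetical helper, not part of Automify. It covers only the five
// grouped-to-flat pairs named in this section.
const GROUPED_TO_FLAT = [
  [["evaluate"], "filesToEvaluate"],
  [["limits", "steps"], "maxSteps"],
  [["screenshots", "final"], "finalScreenshot"],
  [["command", "allow"], "allowedCommands"],
  [["safety", "domains"], "allowedDomains"]
];

function toFlatOptions(grouped) {
  const flat = {};
  for (const [path, flatName] of GROUPED_TO_FLAT) {
    // Walk the grouped path; a missing level yields undefined.
    let value = grouped;
    for (const key of path) {
      value = value == null ? undefined : value[key];
    }
    if (value !== undefined) flat[flatName] = value;
  }
  return flat;
}

console.log(toFlatOptions({
  limits: { steps: 20 },
  safety: { domains: ["example.com"] }
}));
```

Keeping the pairs in one table makes it easy to see which grouped keys have a flat counterpart when migrating older code.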

Adapter factories use the same grouped shape. Browser uses `startUrl`, `viewport`, `launch`, `context`, and `navigation`; CLI uses `command`; Docker-backed adapters use `container`, `desktop`, `shared`, and `sharedFiles`; local desktop uses `viewport`, `mouse`, `keyboard`, `calibration`, and `virtualDisplay`. Linux local desktop requires X11/Xorg or Xvfb; Wayland is not supported. Automify validates adapter option names at runtime and suggests likely fixes. Docker-backed APIs are named explicitly; older virtual aliases are still exported for existing code. The generated Markdown reference is `docs/argument-reference.md`.

Browser: `preset`, `startUrl`, `browser`, `viewport`, `launch`, `context`, `navigation`, `actionDelayMs`
CLI: `preset`, `command: { cwd, env, shell, timeoutMs, approval, allow, block, confirm }`
Docker CLI: `preset`, `container`, `workdir`, `shared`, `sharedFiles`, `command`; requires Docker to be running
Docker computer: `preset`, `container`, `viewport`, `desktop`, `shared`, `sharedFiles`; requires Docker to be running
Local desktop: `viewport`, `mouse`, `keyboard`, `calibration`, `virtualDisplay`
Docker desktop: `preset`, `container`, `viewport`, `desktop`, `shared`, `sharedFiles`; requires Docker to be running

Presets are optional shortcuts: `repo` for CLI/Docker CLI, `browser-review` for browser inspection, and `desktop-review` for a review-friendly Docker desktop viewport and settle timing.

Status

Handle automation errors with the Promise.

A resolved run means the requested model tool loop finished without provider, adapter, safety, policy, or max-step errors. If one of those fails, `.do()` rejects, so a regular `try`/`catch` is the primary error boundary.

try {
  const run = await browser.do("Submit the lead form");

  console.log("Actions executed:", run.steps.length);
} catch (error) {
  console.error("Automation failed:", error);
}

Treat returned structured data as the business result you asked the model to produce. If you need to prove a page or system actually changed, read back the saved data and compare it with your expected values.
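The read-back check described above could look roughly like the following sketch. `findMismatches` and `savedLead` are hypothetical names; how you fetch the saved record depends on your own API or database.

```javascript
// Hypothetical helper: compare the values you expected to be saved with
// what the system of record returns when you read it back.
function findMismatches(expected, actual) {
  const mismatches = [];
  for (const [field, want] of Object.entries(expected)) {
    const got = actual?.[field];
    // JSON.stringify keeps the comparison simple for plain JSON values.
    // Note that it is sensitive to key order in nested objects.
    if (JSON.stringify(got) !== JSON.stringify(want)) {
      mismatches.push({ field, expected: want, actual: got });
    }
  }
  return mismatches;
}

// `savedLead` stands in for a record fetched from your own backend.
const savedLead = { email: "ada@example.com", plan: "trial" };
const problems = findMismatches(
  { email: "ada@example.com", plan: "pro" },
  savedLead
);
console.log(problems);
```

An empty array means every expected field was found with the expected value; anything else is a signal to fail the workflow or retry.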

Environment

Environment describes the surface the model is operating.

Browser runs default to `environment: "browser"`. Desktop adapters usually provide values such as `mac`, `windows`, or `linux`. The environment is passed through the computer tool payload so the model knows what kind of system it is controlling.

Local desktop computer use is optional and needs native desktop dependencies that are not installed by default. When you use the local desktop adapter, your OS may ask for permission to control the desktop.

Before running `npx automify-install-desktop`, install the native build tools for your OS.

On Windows, install Visual Studio 2022 C++ Build Tools with `winget install --id Microsoft.VisualStudio.2022.BuildTools --exact --override "--passive --add Microsoft.VisualStudio.Workload.VCTools --includeRecommended"`, and install CMake on PATH with `winget install --id Kitware.CMake --exact --source winget`; the VS Code CMake Tools extension is not enough by itself, and Visual Studio 2026 is not currently recognized by the native build chain used by nut.js.

On macOS, install Homebrew first with `/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"` if `brew` is not available, then install Xcode Command Line Tools with `xcode-select --install` and CMake with `brew install cmake`.

On Debian/Ubuntu Linux, install `git`, `build-essential`, `cmake`, `pkg-config`, `libx11-dev`, `libxtst-dev`, and `libpng++-dev`. Install the full Linux package list before running the desktop installer; it checks for command-line build tools but does not verify every native library package.

On Fedora Linux, install `gcc-c++`, `make`, `cmake`, `libXtst-devel`, and `libpng-devel`.

On Arch Linux, install `base-devel`, `cmake`, `libxtst`, and `libpng`.

Linux local desktop capture is X11-based: use Xorg/X11 or Xvfb, not Wayland. On headless Linux hosts, also install `xvfb` unless you manage `DISPLAY` yourself.

const browser = await automify.browser({
  startUrl: "https://example.com",
  environment: "browser"
});

// Reminder: local desktop support requires `npx automify-install-desktop` once for this project.
const desktop = await automify.localComputer();

You can use any string if your adapter targets a different surface, but the built-in browser and desktop flows are designed around browser, macOS, Windows, and Linux X11/Xorg-style environments. Linux Wayland is not supported by the built-in local desktop adapter.
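Because Wayland sessions are not supported, a small preflight check before creating the local desktop adapter can give a clearer message than a native X11 error. This is a minimal sketch: `describeLinuxDisplay` is a hypothetical helper, not an Automify API, and it relies only on the standard `XDG_SESSION_TYPE`, `WAYLAND_DISPLAY`, and `DISPLAY` environment variables.

```javascript
// Hypothetical preflight check, not an Automify API. It inspects common
// session variables to guess whether a Linux host is running Wayland.
function describeLinuxDisplay(env) {
  // Check Wayland first: XWayland sessions also set DISPLAY.
  if (env.XDG_SESSION_TYPE === "wayland" || env.WAYLAND_DISPLAY) {
    return { ok: false, reason: "Wayland session detected; use Xorg/X11 or Xvfb." };
  }
  if (!env.DISPLAY) {
    return { ok: true, reason: "DISPLAY is not set; a virtual display such as Xvfb is needed." };
  }
  return { ok: true, reason: `X11 display ${env.DISPLAY}` };
}

if (process.platform === "linux") {
  console.log(describeLinuxDisplay(process.env));
}
```

The check is a heuristic: it reports what the environment variables suggest and does not probe the display server itself.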

Defaults

Runtime defaults are optimized for practical automation.

`provider` / `model`: No default. `provider` is required in `initAutomify()`.
Browser `headless`: `true`
Browser `safety.domains`: All domains unless you set an allowlist.
Browser/Desktop `limits.steps`: `1000`
CLI `limits.steps`: `1000`
CLI `approval`: `"never"`
CLI command policy: All commands allowed unless you configure `command.allow` or `command.block`.

Production checklist

Before you let an automation touch real accounts.

Scope the workspace: Use dedicated accounts, test tenants, narrow browser domains, and minimal filesystem or Docker access.
Prefer structured data: Use `data` for inputs and `jsonOutput()` for outputs so your application is not parsing prose.
Record what happened: Attach `onStep`, `onComplete`, final screenshots, and `trace: true` for workflows that need auditability.
Add human gates: Use CLI approval, safety callbacks, or application-level review before destructive or externally visible actions.

Providers

Choose a provider and keep the same surface API.

OpenAI

const automify = initAutomify({
  provider: {
    type: "openai",
    apiKey: process.env.OPENAI_API_KEY,
    model: "gpt-5.5"
  }
});

Anthropic

const automify = initAutomify({
  provider: {
    type: "anthropic",
    apiKey: process.env.ANTHROPIC_API_KEY,
    model: "claude-sonnet-4-20250514"
  }
});

Use `provider.type: "custom"` for another vendor, a model gateway, internal routing, mocks, or a provider release that Automify does not support directly yet.

`provider` is required in `initAutomify()`. Keep `apiKey` and `model` inside the provider object so there is no ambiguity about which provider owns the model string. Use direct factories such as `createAutomify()` only when you need the lower-level client API.

Structured output

Use JSON for application data.

`.do()` returns run metadata plus the raw provider response. That is useful for tracing, steps, and callbacks, but your actual business data should usually come from structured model output.

Ask for the exact shape you want back. `jsonOutput()` expands the compact field map into a strict JSON schema-style output configuration.

const run = await browser.do(
  "Extract the support email.",
  {
    output: jsonOutput("support_contact", { email: "string" })
  }
);
console.log(run.parsed.email);

Compact shape

The compact form is designed for the common case: a flat object where every listed field is required.

jsonOutput("created_lead", {
  recordId: "string",
  portalUrl: "string"
})

That expands to a strict object schema with `recordId` and `portalUrl` in `required`, and `additionalProperties: false`.
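To make the expansion concrete, the sketch below builds the schema described above from a compact field map. It mirrors the documented result only; `expandCompactFields` is a hypothetical name and this is not Automify's source.

```javascript
// Sketch only: reproduces the expansion described above for a flat field map.
function expandCompactFields(fields) {
  const properties = {};
  for (const [name, type] of Object.entries(fields)) {
    properties[name] = { type };
  }
  return {
    type: "object",
    properties,
    // Every listed field is required in the compact form.
    required: Object.keys(fields),
    additionalProperties: false
  };
}

const schema = expandCompactFields({ recordId: "string", portalUrl: "string" });
console.log(JSON.stringify(schema, null, 2));
```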

Optional fields

Compact fields are required by default. For optional fields and richer validation, pass full JSON Schema.

const run = await browser.do(
  "Create the lead.",
  {
    output: jsonOutput("created_lead", {
      type: "object",
      properties: {
        recordId: { type: "string" },
        note: { type: "string" }
      },
      required: ["recordId"],
      additionalProperties: false
    })
  }
);

console.log(run.parsed.recordId, run.parsed.note);

Zod schemas

Projects using Zod 4 can use the optional adapter from `automify/zod`. This is separate from the main `automify` import so applications that only use `jsonOutput()` do not need to install Zod.

npm install zod

import { z } from "zod";
import { zodOutput } from "automify/zod";

const Lead = z.object({
  id: z.string(),
  firstName: z.string(),
  lastName: z.string()
});

const run = await browser.do("Create the lead.", {
  output: zodOutput("lead", Lead)
});

console.log(run.parsed.id);

`zodOutput()` converts the schema to JSON Schema with Zod 4's `z.toJSONSchema()` and sends that schema to the model. After the model returns JSON, Automify parses the response and validates it with the original Zod schema's `schema.parse()` before setting `run.parsed`.

Pass `{ parse: false }` to request the Zod-derived JSON Schema while skipping automatic JSON parsing and Zod validation of the final response.

Richer schemas

Full JSON Schema lets you use enums, arrays, nested objects, number constraints, string formats, and provider-specific schema features.

For an array of objects, prefer returning an object with a named array property. That keeps `jsonOutput()` in its object-shaped path and gives your application a stable place to read from `run.parsed`.

const run = await browser.do(
  "Analyze the order.",
  {
    output: jsonOutput("order_summary", {
      type: "object",
      properties: {
        priority: {
          type: "string",
          enum: ["low", "normal", "high"]
        },
        items: {
          type: "array",
          items: {
            type: "object",
            properties: {
              sku: { type: "string" },
              quantity: { type: "integer", minimum: 1 }
            },
            required: ["sku", "quantity"],
            additionalProperties: false
          }
        }
      },
      required: ["priority", "items"],
      additionalProperties: false
    })
  }
);
console.log(run.parsed.items);

Top-level arrays

If your application needs `run.parsed` itself to be an array, pass the `json_schema` output format directly. `jsonOutput()` intentionally accepts object-shaped schemas, while `output` can receive the lower-level structured output configuration.

const run = await browser.do(
  "Extract the products.",
  {
    output: {
      type: "json_schema",
      name: "products",
      strict: true,
      schema: {
        type: "array",
        items: {
          type: "object",
          properties: {
            sku: { type: "string" },
            title: { type: "string" },
            price: { type: "number" }
          },
          required: ["sku", "title", "price"],
          additionalProperties: false
        }
      }
    }
  }
);

console.log(run.parsed[0].sku);

Zod arrays

Zod schemas can describe arrays too. Use an object with a named array field when you want the same application-friendly shape as `jsonOutput()`.

import { z } from "zod";
import { zodOutput } from "automify/zod";

const ProductList = z.object({
  products: z.array(
    z.object({
      sku: z.string(),
      title: z.string(),
      price: z.number()
    })
  )
});

const run = await browser.do("Extract the products.", {
  output: zodOutput("product_list", ProductList)
});

console.log(run.parsed.products);

Provider compatibility

OpenAI: Automify sends `jsonOutput()` as the Responses API `text.format` configuration, so the provider receives the schema directly.
Anthropic: The built-in Anthropic adapter converts the same schema into JSON-only instructions appended to the user message. `run.parsed` still works when Claude returns valid JSON.
Custom adapters: Automify passes the requested format in `payload.text`. Your adapter can send it to a provider-native structured-output API, convert it into prompt text, or ignore it intentionally.

The practical rule: with OpenAI, structured output is provider-enforced. With Anthropic, Automify makes the request explicit and parses the JSON response, but final enforcement depends on the model following the instruction.

If you want plain text instead, use `run.text` or `getOutputText(run.response)`.

import { parseOutputJson } from "automify";

// Helper-based fallback
const run = await browser.do(
  "Extract the support contact as JSON."
);

const support = parseOutputJson(run.response);
console.log(support.email);
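When enforcement depends on the model following instructions, it can help to validate the text yourself before trusting it. The following is a minimal sketch under that assumption; `parseModelJson` is a hypothetical helper, separate from Automify's `parseOutputJson()`, and it only checks for required keys rather than a full schema.

```javascript
// Hypothetical helper: pull JSON out of model text and check required keys.
// It is not a substitute for full schema validation.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE);

function parseModelJson(text, requiredKeys = []) {
  // Models sometimes wrap JSON in a Markdown fence; unwrap it if present.
  const fenced = text.match(FENCED_BLOCK);
  const candidate = (fenced ? fenced[1] : text).trim();

  let value;
  try {
    value = JSON.parse(candidate);
  } catch (error) {
    throw new Error(`Model output is not valid JSON: ${error.message}`);
  }

  const missing = requiredKeys.filter((key) => !(key in Object(value)));
  if (missing.length > 0) {
    throw new Error(`Model output is missing keys: ${missing.join(", ")}`);
  }
  return value;
}

const reply = FENCE + 'json\n{"email": "support@example.com"}\n' + FENCE;
console.log(parseModelJson(reply, ["email"]).email);
```

Throwing on malformed or incomplete output keeps the failure inside the same `try`/`catch` boundary you already use around `.do()`.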

Anthropic details

Anthropic computer-use settings stay explicit and overridable.

With the built-in Anthropic provider, a model such as `claude-sonnet-4-20250514` uses Automify's current Anthropic computer-use defaults unless you override them.

const automify = initAutomify({
  provider: {
    type: "anthropic",
    apiKey: process.env.ANTHROPIC_API_KEY,
    model: "claude-sonnet-4-20250514",
    betas: ["computer-use-2025-01-24"],
    computerToolType: "computer_20250124"
  }
});

If Anthropic updates the beta header or computer tool schema for the model you want, set both fields together so the request stays internally consistent.

Browser

Use Playwright-backed browser automation.

const browser = await automify.browser({
  startUrl: "https://example.com",
  viewport: { width: 1280, height: 800 },
  headless: true,
  safety: {
    domains: ["example.com"]
  }
});

try {
  const result = await browser.do(
    "Summarize what you see on the page.",
    { data: { audience: "new visitor" } }
  );
} finally {
  await browser.close();
}

Browser runs take screenshots, feed them back to the model, and execute computer actions such as click, double click, move, drag, scroll, keypress, type, wait, and screenshot.

CLI

Run model-requested commands with explicit policies.

const cli = automify.cli({
  logFile: "/tmp/automify-cli.jsonl",
  command: {
    cwd: process.cwd(),
    allow: ["npm test", "npm run build", "ls", "pwd"]
  }
});

await cli.do("Run tests and summarize any failures");

CLI approval is `"never"` by default. Add `approval: "always"` or a confirmation callback (`confirmCommand`, or `command.confirm` in grouped options) when you want human approval or policy logic before a command executes.

const cli = automify.cli({
  command: {
    approval: "always",
    confirm: async ({ command }) => {
      return command.command.startsWith("npm test");
    }
  }
});
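For intuition about how an allow/block policy can be evaluated, here is a minimal sketch. The matching semantics are assumptions, not confirmed Automify behavior: string rules match the whole command, `RegExp` rules use `.test()`, and a block rule always wins. `isCommandAllowed` is a hypothetical name.

```javascript
// Sketch of an allow/block check. Assumptions (not confirmed by Automify):
// string rules match the whole command, RegExp rules use .test(), and a
// block rule always takes precedence over an allow rule.
function matchesRule(rule, command) {
  return rule instanceof RegExp ? rule.test(command) : rule === command.trim();
}

function isCommandAllowed(command, { allow = [], block = [] } = {}) {
  if (block.some((rule) => matchesRule(rule, command))) return false;
  // With no allowlist configured, anything that is not blocked may run.
  if (allow.length === 0) return true;
  return allow.some((rule) => matchesRule(rule, command));
}

const policy = { allow: ["npm test", "npm run build"], block: [/^rm\b/] };
console.log(isCommandAllowed("npm test", policy)); // true
console.log(isCommandAllowed("rm -rf build", policy)); // false
```

A check like this can also be reused inside a `confirm` callback when you want the same rules applied in your own policy logic.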

Use `automify.dockerCli()` when commands should run in an isolated Docker container instead of the host shell. Docker must be installed and running before you create the adapter. It supports the same command policies and can mount a shared folder into the container. See Optional Docker setup if Docker is not installed yet.

import { mkdir, mkdtemp, readFile, writeFile } from "node:fs/promises";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { initAutomify, jsonOutput } from "automify";

const sharedDir = await mkdtemp(join(tmpdir(), "automify-docker-cli-"));
const dataDir = join(sharedDir, "data");
const reportPath = join(dataDir, "report.csv");
const summaryPath = join(dataDir, "summary.json");

await mkdir(dataDir, { recursive: true });
await writeFile(
  reportPath,
  "region,customer,revenue\n" +
    "North,Ada Corp,1250\n" +
    "South,Byron Ltd,980\n" +
    "North,Lovelace Labs,2230\n"
);
await writeFile(summaryPath, "{}\n");

const automify = initAutomify({
  provider: {
    type: "openai",
    apiKey: process.env.OPENAI_API_KEY,
    model: "gpt-5.5"
  }
});

const cli = automify.dockerCli({
  logFile: "/tmp/automify-docker-cli.jsonl",
  additionalAptPackages: ["coreutils", "nodejs"],
  shared: { hostPath: sharedDir, containerPath: "/workspace" }
});

try {
  const run = await cli.do("Read data/report.csv, use a Node.js script to calculate revenue by region, update data/summary.json with the result, and return the top region.", {
    output: jsonOutput("report_summary", {
      topRegion: "string",
      totalRevenue: "number",
      outputFile: "string",
      summary: "string"
    })
  });

  const summaryFile = JSON.parse(await readFile(summaryPath, "utf8"));
  console.log(run.parsed, summaryFile);
  console.log("Shared output file:", summaryPath);
} finally {
  await cli.close();
}

Desktop

Desktop automation supports local host sessions and sandboxed Linux sessions.

There are two desktop modes. `localComputer()` controls the native desktop on the machine running your Node.js process, so it targets macOS, Windows, or Linux depending on the host. On Linux, this means X11/Xorg or Xvfb; Wayland is not supported for local desktop control. `dockerComputer()` requires Docker to be installed and running, and can be started from any host that can run Docker Linux containers, but the desktop it controls is always a Linux desktop inside the container. See Optional Docker setup if Docker is not installed yet.

`localComputer()`: Controls the current host desktop: macOS, Windows, or Linux X11/Xorg. Linux Wayland is not supported.
`dockerComputer()`: Controls an isolated Linux desktop in Docker, using Xvfb, a window manager, and `xdotool`. Requires Docker to be running.

import { initAutomify, jsonOutput } from "automify";

const automify = initAutomify({
  provider: {
    type: "openai",
    apiKey: process.env.OPENAI_API_KEY,
    model: "gpt-5.5"
  }
});

// Reminder: local desktop support requires `npx automify-install-desktop` once for this project.
const desktop = await automify.localComputer();

try {
  const run = await desktop.do("Open the Calendar app installed on this computer, find the next event, and return its title and start time. Do not create or edit events.", {
    screenshots: {
      initial: "/tmp/automify-local-desktop-initial.png",
      final: "/tmp/automify-local-desktop-final.png"
    },
    output: jsonOutput("next_event", {
      title: "string",
      startsAt: "string"
    })
  });

  console.log(run.parsed);
} finally {
  await desktop.close();
}

The desktop adapter is installed manually with `npx automify-install-desktop`; it is not installed by postinstall. The command builds nut.js, libnut-core, and macOS permission helpers from source, so it can take some time. If the command is run again and the cached runtime still matches the current platform, CPU architecture, Node ABI, and pinned native refs, Automify prints a skip message and exits without rebuilding. Use `npx automify-install-desktop --force` to rebuild a compatible cache anyway. Local desktop control may require OS permissions. The adapter provides `execute(action, context)` and `screenshot(context)` for native desktop control.

Before running the installer, install the native build tools for your OS.

On Windows, install Visual Studio 2022 C++ Build Tools with `winget install --id Microsoft.VisualStudio.2022.BuildTools --exact --override "--passive --add Microsoft.VisualStudio.Workload.VCTools --includeRecommended"`, and install CMake on PATH with `winget install --id Kitware.CMake --exact --source winget`; the VS Code CMake Tools extension is not enough by itself, and Visual Studio 2026 is not currently recognized by the native build chain used by nut.js.

On macOS, install Xcode Command Line Tools with `xcode-select --install` and CMake with `brew install cmake`.

On Debian/Ubuntu Linux, install `git`, `build-essential`, `cmake`, `pkg-config`, `libx11-dev`, `libxtst-dev`, and `libpng++-dev`. Install the full Linux package list before running the desktop installer; it checks for command-line build tools but does not verify every native library package.

On Fedora Linux, install `gcc-c++`, `make`, `cmake`, `libXtst-devel`, and `libpng-devel`.

On Arch Linux, install `base-devel`, `cmake`, `libxtst`, and `libpng`.

Linux local desktop capture is X11-based: use Xorg/X11 or Xvfb, not Wayland. On headless Linux hosts, also install `xvfb` unless you manage `DISPLAY` yourself.

`createLocalDesktopComputer()` is the local desktop mode. On Linux servers without `DISPLAY`, it starts an in-memory Xvfb display automatically before loading nut.js. Install `xvfb` on headless hosts, or pass `virtualDisplay: false` if you manage an X11 `DISPLAY` outside Automify. Wayland `DISPLAY`/XWayland sessions are not supported for local desktop capture and can fail with native X11 errors such as `BadMatch` / `X_GetImage`. On macOS and Windows, it controls the host desktop directly through the native nut.js backend.

`createLocalDesktopComputer()` takes an exclusive cross-process lock while the adapter is alive. This prevents two scripts, including scripts launched from separate terminals, from controlling the same real mouse, keyboard, and screen at the same time. Always call `computer.close()` in a `finally` block so Automify can release the lock and stop any virtual display it started. If a process crashes, Automify stores lock ownership under the system temp directory at `automify-locks`, checks whether the recorded PID is still alive, and removes stale locks automatically when the owning process has exited. If the owner still appears to be running, a second adapter fails with an error that includes the lock path.
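The stale-lock decision described above can be sketched with a PID liveness check. This covers only the decision step; the `{ pid }` record shape and the `lockState` helper are assumptions, since Automify's lock file format is not documented here.

```javascript
// Sketch of the stale-lock decision only. The lock record shape ({ pid }) is
// an assumption; it is not Automify's documented lock file format.
function isProcessAlive(pid) {
  try {
    // Signal 0 checks that the process exists without delivering a signal.
    process.kill(pid, 0);
    return true;
  } catch (error) {
    // EPERM means the process exists but is owned by another user.
    return error.code === "EPERM";
  }
}

function lockState(record) {
  // Treat missing or malformed records as stale so they can be replaced.
  if (!record || !Number.isInteger(record.pid) || record.pid <= 0) return "stale";
  return isProcessAlive(record.pid) ? "held" : "stale";
}

console.log(lockState({ pid: process.pid })); // "held"
```

PID checks can be fooled by PID reuse after a crash, which is one reason to still call `close()` in a `finally` block rather than rely on stale-lock cleanup.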

import { initAutomify, jsonOutput } from "automify";

const automify = initAutomify({
  provider: {
    type: "openai",
    apiKey: process.env.OPENAI_API_KEY,
    model: "gpt-5.5"
  }
});

const desktop = await automify.dockerComputer({
  desktop: { startupCommand: "xterm" }
});

try {
  const run = await desktop.do("Use the open terminal to run 'uname -a', then return the kernel name and machine architecture shown on screen.", {
    screenshots: {
      initial: "/tmp/automify-docker-desktop-initial.png",
      final: "/tmp/automify-docker-desktop-final.png"
    },
    output: jsonOutput("system_info", {
      kernelName: "string",
      machine: "string",
      summary: "string"
    })
  });

  console.log(run.parsed);
} finally {
  await desktop.close();
}

`createDockerDesktopComputer()` creates a Docker-backed Linux desktop. By default it creates a fresh container from the public `debian:bookworm-slim` image, keeps Docker networking enabled, installs a minimal desktop package set with apt on startup, and runs Xvfb, a lightweight window manager, `xdotool` input, and PNG screenshots. There is no `automify-docker-desktop:latest` image to pull by default: Automify starts from Debian and prepares the desktop inside that container unless you pass a custom `image`. Docker desktop does not control the host operating system desktop and does not use `automify-install-desktop`; it needs a running Docker daemon and a startup command instead.

Docker desktop requires an initial app command. Pass `startupCommand` or `desktop.startupCommand`, for example `desktop.startupCommand: "xterm"`.

The default Debian setup installs `xvfb`, `openbox`, `xterm`, `x11-utils`, `xdotool`, `imagemagick`, `scrot`, and `ca-certificates`. Browser packages are opt-in; pass `desktop.additionalAptPackages: ["chromium"]` when you want Chromium installed during container startup. Pass `container.image` to use another apt-based distro or a prebuilt image. When `desktop.installDependencies: false`, the image must already include the required desktop tools, plus Chromium only if you plan to launch it.

Pass `container.network: false` or `container.network: "none"` when you want to disable Docker networking. Use `desktop.packages` to replace the base package list, or `desktop.additionalAptPackages` to append packages to the default list. Use `desktop.startupCommand: "xterm"` to start with a terminal, or pass `desktop.additionalAptPackages: ["chromium"]` with `desktop.startupCommand: "chromium --no-sandbox"` if you want Chromium open at startup.

Docker desktop locks are scoped to the container name. Default Docker desktop adapters use a random container name, so separate calls create separate containers and can run in parallel. If you pass `containerName` or `container.name`, Automify takes a cross-process lock for that exact name until `desktop.close()` or `computer.close()` releases it. Two adapters with different names can run at the same time; two adapters with the same name fail fast instead of racing Docker or driving the same display. The same rule applies when using `existingContainer`: only one adapter should control a given existing container at a time.

Docker desktop and Docker CLI sessions both accept `shared` and `sharedFiles`. Pass `shared` to mount a host directory, or pass `sharedFiles` to copy specific host files into a temporary shared directory before the run starts.

Pass `screenshots.initial` and/or `screenshots.final` on a `do()` call to save screenshots at the beginning and end of a run. Pass `screenshots.actions: "/tmp/automify-actions"` to save a before and after PNG for every browser, desktop, or custom computer action. Action screenshots are named by step and action index, and each step includes the saved paths under `step.actionScreenshots`.

node examples/desktop-docker.js

On macOS, `createLocalDesktopComputer()` calibrates screenshot pixels to native mouse coordinates. For unusual monitor scaling, pass `pixelScale`, `mouseScaleX`, `mouseScaleY`, `mouseOffsetX`, or `mouseOffsetY`.

Pass `debug: true` to log useful progress to stderr: model request phases, step/action execution, full CLI command results, browser input, desktop screenshots, and Docker container setup. Pass `debug: (message, details) => { ... }` to collect the same events yourself. On `automify.cli()` and `automify.dockerCli()`, pass `logFile: "/tmp/automify-cli.jsonl"` to append debug events as JSON Lines without enabling stderr logs.

For desktop speed, the nut.js adapter disables nut.js mouse and keyboard auto-delays by default. You can override this with `mouseAutoDelayMs` or `keyboardAutoDelayMs` if a specific application needs slower input. Screenshot uploads default to `detail: "auto"` and are resized to fit within 1440x1440 before upload when the optional image pipeline is available; Automify maps model coordinates back to the original screenshot space before executing actions. Pass `screenshotDetail`, `screenshotMaxWidth`, or `screenshotMaxHeight` to `initAutomify()`, `createAutomify()`, or a single `do()` call when you want to tune precision, image cost, and latency. Pass `trace: true` to include response/action/screenshot timings in the run result.
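The resize-and-map-back behavior can be illustrated with a little arithmetic. This is a sketch under stated assumptions: the function names are hypothetical, the aspect ratio is preserved, and images smaller than the bound are not upscaled. Only the 1440x1440 default comes from the text above.

```javascript
// Sketch of fit-within scaling and the reverse coordinate mapping.
// Function names are illustrative; 1440 is the default bound stated above.
function fitWithin(width, height, maxWidth = 1440, maxHeight = 1440) {
  // Assumption: never upscale, so the factor is capped at 1.
  const scale = Math.min(1, maxWidth / width, maxHeight / height);
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
    scale
  };
}

// Convert a point in the resized image back to original screenshot pixels.
function toOriginalPoint(point, scale) {
  return { x: Math.round(point.x / scale), y: Math.round(point.y / scale) };
}

const resized = fitWithin(2880, 1800);
console.log(resized); // { width: 1440, height: 900, scale: 0.5 }
console.log(toOriginalPoint({ x: 720, y: 450 }, resized.scale)); // { x: 1440, y: 900 }
```

A smaller bound lowers image cost and latency but makes each model pixel cover more of the real screen, which is the precision trade-off the screenshot options control.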

Desktop source builds are pinned to known working nut.js, libnut-core, and node-mac-permissions commits. Override with `AUTOMIFY_DESKTOP_NUT_REF`, `AUTOMIFY_DESKTOP_LIBNUT_CORE_REF`, or `AUTOMIFY_DESKTOP_MAC_PERMISSIONS_REF` when testing a newer upstream revision. Changing those refs changes the desktop runtime compatibility key, so the next `npx automify-install-desktop` build will not be skipped.

Files

Pass files as data or mount them into virtual environments.

Use `fileToData()` or `filesToData()` when you want file metadata, text, base64, or data URLs inside `data`. Use `evaluate` on `.do()` when the model should evaluate files directly; images are sent as image input and text-like files are sent as text input. Use `shared` or `sharedFiles` when the Docker CLI or Docker desktop also needs filesystem access to those files.

`data`: Structured JSON values the model should follow or copy into a workflow.
`evaluate`: Images and text files the model should inspect directly as model input.
`shared` / `sharedFiles`: Host folders or files that a Docker CLI or Docker desktop must access through the container filesystem.

Factory options define runner defaults. Per-run `.do()` options override those defaults for one run, so a CLI can have a default command policy and still narrow or change it for a specific task.

await automify.cli().do("Evaluate this screenshot and list visible issues.", {
  evaluate: [
    { path: "/tmp/screenshot.png", detail: "high" }
  ]
});

import { mkdir, mkdtemp, readFile, writeFile } from "node:fs/promises";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { filesToData, initAutomify } from "automify";

const sharedDir = await mkdtemp(join(tmpdir(), "automify-docker-cli-"));
const dataDir = join(sharedDir, "data");
const reportPath = join(dataDir, "report.csv");
const summaryPath = join(dataDir, "summary.json");

await mkdir(dataDir, { recursive: true });
await writeFile(
  reportPath,
  "region,customer,revenue\n" +
    "North,Ada Corp,1250\n" +
    "South,Byron Ltd,980\n" +
    "North,Lovelace Labs,2230\n"
);
await writeFile(summaryPath, "{}\n");

const automify = initAutomify({
  provider: {
    type: "openai",
    apiKey: process.env.OPENAI_API_KEY,
    model: "gpt-5.5"
  }
});

const cli = automify.dockerCli({
  additionalAptPackages: ["coreutils", "nodejs"],
  shared: { hostPath: sharedDir, containerPath: "/workspace" }
});

try {
  await cli.do("Read data/report.csv, use a Node.js script to calculate revenue by region, update data/summary.json with the result, and summarize the result.", {
    data: {
      files: await filesToData(reportPath, { format: "metadata" })
    }
  });
  console.log(JSON.parse(await readFile(summaryPath, "utf8")));
  console.log("Shared output file:", summaryPath);
} finally {
  await cli.close();
}

`shared` mounts a host folder into the container, so files the command creates or edits under `/workspace` are visible on the host after the run. `sharedFiles` is still useful when you only need to copy specific inputs into a temporary shared directory before the first request.
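
Because the mount maps one folder onto another, any host path under `hostPath` has a predictable path inside the container. The sketch below shows that translation; `toContainerPath` is a hypothetical helper, not an Automify export, and the printed result assumes a POSIX host.

```javascript
import path from "node:path";

// Hypothetical helper: translate a host path under the shared folder
// to the path the container sees.
function toContainerPath(hostFile, shared) {
  const relative = path.relative(shared.hostPath, hostFile);
  const escapes =
    relative === ".." ||
    relative.startsWith(`..${path.sep}`) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error(`${hostFile} is outside ${shared.hostPath}`);
  }
  // Containers are Linux, so always join with POSIX separators.
  return path.posix.join(shared.containerPath, ...relative.split(path.sep));
}

const shared = { hostPath: "/tmp/automify-work", containerPath: "/workspace" };
console.log(toContainerPath("/tmp/automify-work/data/report.csv", shared));
// /workspace/data/report.csv
```

This is why an instruction can refer to `data/report.csv` when the container's working directory is the mounted folder: the relative part of the path is the same on both sides.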

import { createDockerDesktopComputer } from "automify";

const computer = await createDockerDesktopComputer({
  desktop: { startupCommand: "xterm" },
  shared: {
    hostPath: "/tmp/automify-work",
    containerPath: "/workspace"
  }
});

console.log(computer.sharedFolder.containerPath);

Input

Pass task input the same way on every adapter.

The first `.do()` argument is the instruction. Structured values go under `data`, and adapter setup goes in the factory options. The model receives both the instruction and the JSON data.

import { mkdir, mkdtemp, readFile, writeFile } from "node:fs/promises";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { initAutomify } from "automify";

const automify = initAutomify({
  provider: {
    type: "openai",
    apiKey: process.env.OPENAI_API_KEY,
    model: "gpt-5.5"
  }
});

const sharedDir = await mkdtemp(join(tmpdir(), "automify-docker-cli-"));
const dataDir = join(sharedDir, "data");
const reportPath = join(dataDir, "report.csv");
const summaryPath = join(dataDir, "summary.json");

await mkdir(dataDir, { recursive: true });
await writeFile(
  reportPath,
  "region,customer,revenue\n" +
    "North,Ada Corp,1250\n" +
    "South,Byron Ltd,980\n" +
    "North,Lovelace Labs,2230\n"
);
await writeFile(summaryPath, "{}\n");

// Browser
const browser = await automify.browser({ startUrl: "https://aldovincenti.github.io/automify/demo.html" });
await browser.do("Create the lead from data.", {
  data: { firstName: "Ada", lastName: "Lovelace", email: "ada@example.com" }
});

// CLI
const cli = automify.cli({ command: { cwd: process.cwd(), allow: ["npm test"] } });
await cli.do("Run the requested script and summarize the result.", {
  data: { script: "npm test" }
});

// Docker CLI
const dockerCli = automify.dockerCli({
  additionalAptPackages: ["coreutils", "nodejs"],
  shared: { hostPath: sharedDir, containerPath: "/workspace" }
});
try {
  await dockerCli.do("Read data/report.csv, use a Node.js script to calculate revenue by region, update data/summary.json with the result, and report the top region.", {
    data: { report: "data/report.csv" }
  });
  console.log(JSON.parse(await readFile(summaryPath, "utf8")));
  console.log("Shared output file:", summaryPath);
} finally {
  await dockerCli.close();
}


// Local desktop
// Reminder: local desktop support requires `npx automify-install-desktop` once for this project.
const localDesktop = await automify.localComputer();
try {
  await localDesktop.do("Open the Calendar app installed on this computer, find the next event, and summarize it. Do not create or edit events.", {
    screenshots: { final: "/tmp/local-final.png" }
  });
} finally {
  await localDesktop.close();
}

// Docker desktop
const dockerDesktop = await automify.dockerComputer({
  desktop: {
    startupCommand: "xterm"
  }
});
try {
  await dockerDesktop.do("Use the open terminal to run 'uname -a' and summarize the system information shown on screen.", {
    screenshots: { final: "/tmp/docker-final.png" }
  });
} finally {
  await dockerDesktop.close();
}

Completion

Use the Promise result, `onComplete`, or both.

Every `.do()` returns a Promise. It resolves when the requested tool loop succeeds and rejects when the provider or adapter fails, a safety or policy check blocks the run, or the step limit is reached. Read `run.status`, `run.steps`, `run.text`, and the optional `run.parsed` from the resolved result. Register `onComplete` globally, per surface, or per run when your app should react to completion together with the original input data.

const automify = initAutomify({
  provider: {
    type: "openai",
    apiKey: process.env.OPENAI_API_KEY,
    model: "gpt-5.5"
  },
  onComplete: async ({ surface, data, result, response }) => {
    await auditLog.write({
      surface,
      customerId: data?.customer?.id,
      status: result.status,
      responseId: response.id
    });
  }
});

const result = await automify.browser({ startUrl: "https://example.com" }).do(
  "Summarize the visible page for this customer record.",
  { data: { customer: { id: "cus_123", email: "ada@example.com" } } }
);
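
The resolve and reject paths described above can be handled with an ordinary `try`/`catch`. The sketch below uses a stub runner so it runs offline; the `"completed"` status value, the numeric `steps`, the `failWith` option, and the error message are made up for illustration.

```javascript
// Stub runner standing in for a real surface so the control flow runs offline.
// Result field names come from the text above; their values here are assumptions.
const stubRunner = {
  async do(instruction, options = {}) {
    if (options.failWith) throw new Error(options.failWith);
    return { status: "completed", steps: 2, text: `Done: ${instruction}` };
  }
};

async function runTask(runner, instruction, options) {
  try {
    const run = await runner.do(instruction, options);
    return { ok: true, status: run.status, text: run.text };
  } catch (error) {
    // Provider, adapter, safety, policy, and step-limit failures all land here.
    return { ok: false, reason: error.message };
  }
}

runTask(stubRunner, "Summarize the page.").then(console.log);
runTask(stubRunner, "Summarize the page.", { failWith: "step limit reached" }).then(console.log);
```

Returning a plain `{ ok, ... }` object keeps callers free of their own `try`/`catch`, while `onComplete` stays the place for side effects such as audit logging.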

Adapters

Custom adapters keep provider integration small and ergonomic.

import {
  initAutomify,
  response,
  message,
  computerCall,
  runCommandCall,
  getInputText,
  getComputerTool,
  testModelAdapter
} from "automify";

const automify = initAutomify({
  provider: {
    type: "custom",
    model: "my-model",
    adapter: {
      async respond(payload, context) {
        const text = getInputText(payload);

        if (context.surface === "cli") {
          return response({
            id: "cli_step_1",
            output: [runCommandCall("npm test")]
          });
        }

        if (getComputerTool(payload)) {
          return response({
            id: "computer_step_1",
            output: [computerCall({ type: "click", x: 120, y: 180 })]
          });
        }

        return response({
          id: "done",
          output: [message(`Handled: ${text}`)]
        });
      }
    }
  },
  requestOptions: {
    temperature: 0.2,
    metadata: { app: "ops-agent" }
  }
});

await testModelAdapter(automify.client);

Adapter context includes `surface`, `phase`, `step`, and `requestOptions`. `requestOptions` is the place for provider-specific request parameters like temperature, tracing IDs, routing hints, or gateway metadata.
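
Inside `respond(payload, context)`, those context fields can be folded into whatever request your provider or gateway expects. The sketch below is an assumption-heavy illustration: `buildGatewayRequest`, the outgoing request shape, and the `"run"` phase value are hypothetical; only the context field names come from the text above.

```javascript
// Hypothetical helper for use inside respond(payload, context).
// The outgoing request shape is made up for illustration.
function buildGatewayRequest(payload, context) {
  const { surface, phase, step, requestOptions = {} } = context;
  const { metadata = {}, ...providerParams } = requestOptions;
  return {
    ...providerParams, // for example temperature
    metadata: { ...metadata, surface, phase, step },
    input: payload
  };
}

const request = buildGatewayRequest(
  { instruction: "Run the tests." },
  {
    surface: "cli",
    phase: "run",
    step: 1,
    requestOptions: { temperature: 0.2, metadata: { app: "ops-agent" } }
  }
);

console.log(request.temperature); // 0.2
console.log(request.metadata); // { app: 'ops-agent', surface: 'cli', phase: 'run', step: 1 }
```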

Safety

Use explicit guardrails in production.

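const browser = await automify.browser({
  startUrl: "https://example.com",
  safety: {
    // Keep the run inside these domains.
    domains: ["example.com"]
  },
  screenshot: {
    // Placeholder: return a redacted copy of the screenshot here.
    redact: async (screenshot) => screenshot
  }
});

await browser.do("Summarize the page without leaving the allowed domain.", {
  safety: {
    // Placeholder that approves everything; inspect `checks` and `action` before returning true.
    onCheck: async ({ checks, action }) => true
  }
});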

Recommended production controls include domain allowlists, command allowlists, screenshot redaction, bounded `limits.steps`, isolated accounts, and human approval for sensitive operations.
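
If you also want to enforce allowlists in your own code, for example inside an `onCheck` callback or before passing an instruction through, checks along these lines are one option. Both helpers are hypothetical, and Automify's own matching rules for `safety.domains` and command allowlists may differ (subdomain matching here is an assumption).

```javascript
// Hypothetical checks; Automify's own matching rules may differ.

// Allow the listed host and its subdomains, nothing else.
function isAllowedUrl(url, domains) {
  const { hostname } = new URL(url);
  return domains.some(
    (domain) => hostname === domain || hostname.endsWith(`.${domain}`)
  );
}

// Allow only exact command strings from the list.
function isAllowedCommand(command, allow) {
  return allow.includes(command.trim());
}

console.log(isAllowedUrl("https://docs.example.com/start", ["example.com"])); // true
console.log(isAllowedUrl("https://example.com.evil.test/", ["example.com"])); // false
console.log(isAllowedCommand("npm test", ["npm test"])); // true
console.log(isAllowedCommand("npm test && rm -rf .", ["npm test"])); // false
```

Matching on the parsed hostname rather than on a substring of the URL is what rejects look-alike hosts such as `example.com.evil.test`.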

Treat desktop and CLI runs as privileged automation surfaces. Keep credentials scoped, prefer short-lived sessions, and save screenshots only where your data policy allows it.

Observability

Inspect requests, responses, steps, and completion.

const automify = initAutomify({
  provider: {
    type: "openai",
    apiKey: process.env.OPENAI_API_KEY,
    model: "gpt-5.5"
  },
  onRequest: (payload, meta) => logRequest(payload, meta),
  onResponse: (response, meta) => logResponse(response, meta),
  onStep: (event) => logStep(event),
  onComplete: (event) => logCompletion(event),
  debug: true
});

Use hooks for traces, audit logs, policy decisions, action screenshots, full CLI logs, and provider debugging.
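
The `logRequest`, `logResponse`, `logStep`, and `logCompletion` helpers in the example above are left for you to supply. A minimal in-memory sketch follows; the `meta` and event shapes passed in the simulation are assumptions, and the array stands in for your tracing or audit backend.

```javascript
// Hypothetical in-memory versions of the helpers referenced above.
const events = [];

function record(kind, detail) {
  events.push({ kind, at: Date.now(), detail });
}

// The request payload itself is not stored, to avoid logging sensitive input.
const logRequest = (payload, meta) => record("request", { meta });
const logResponse = (response, meta) => record("response", { id: response?.id, meta });
const logStep = (event) => record("step", event);
const logCompletion = (event) => record("completion", { status: event?.result?.status });

// Simulate one run's worth of hook calls with made-up arguments.
logRequest({ input: "..." }, { step: 1 });
logResponse({ id: "resp_1" }, { step: 1 });
logStep({ step: 1 });
logCompletion({ result: { status: "completed" } });

console.log(events.map((event) => event.kind));
// [ 'request', 'response', 'step', 'completion' ]
```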

Development

Running tests

npm test
npm run test:e2e
OPENAI_API_KEY=... npm run test:live

`npm run test:live` runs `test/e2e/live-openai.e2e.test.js` with `RUN_OPENAI_E2E=1`. By default, it runs the live OpenAI CLI and Docker CLI checks and skips the browser and Docker desktop checks.

| Variable | Required | Purpose |
| --- | --- | --- |
| `OPENAI_API_KEY` | Yes | OpenAI API key used by the live Responses API tests. |
| `OPENAI_MODEL` | No | Model for live tests. Defaults to `OPENAI_TEXT_MODEL`, then `gpt-5.5`. |
| `OPENAI_TEXT_MODEL` | No | Fallback model when `OPENAI_MODEL` is not set. |
| `RUN_OPENAI_E2E=1` | Yes for direct `node --test` runs | Enables the base live OpenAI tests. The `test:live` npm script sets this for you. |
| `RUN_OPENAI_BROWSER_E2E=1` | No | Also runs the live browser demo test. Requires the Playwright Chromium install from `npm install`. |
| `RUN_OPENAI_VIRTUAL_DESKTOP_E2E=1` | No | Also runs the live Docker desktop tests. Requires Docker to be running. |
| `AUTOMIFY_VIRTUAL_DESKTOP_IMAGE` | No | Optional Docker desktop base image override. Defaults to `debian:bookworm-slim`. |
| `AUTOMIFY_LIVE_LOGS=1` | No | Shows live adapter logs while the tests run. |
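
The precedence in the table can be summarized as a small resolver. This sketch is not the test file's actual code: `resolveLiveTestConfig` and its field names are hypothetical, and treating an empty variable the same as an unset one is an assumption.

```javascript
// Sketch of the precedence described in the table; not the test file's actual code.
function resolveLiveTestConfig(env) {
  return {
    enabled: env.RUN_OPENAI_E2E === "1",
    // OPENAI_MODEL wins, then OPENAI_TEXT_MODEL, then the documented default.
    model: env.OPENAI_MODEL || env.OPENAI_TEXT_MODEL || "gpt-5.5",
    browser: env.RUN_OPENAI_BROWSER_E2E === "1",
    dockerDesktop: env.RUN_OPENAI_VIRTUAL_DESKTOP_E2E === "1",
    liveLogs: env.AUTOMIFY_LIVE_LOGS === "1"
  };
}

console.log(resolveLiveTestConfig({ RUN_OPENAI_E2E: "1", OPENAI_TEXT_MODEL: "my-text-model" }));
// { enabled: true, model: 'my-text-model', browser: false, dockerDesktop: false, liveLogs: false }
```

In a real script you would pass `process.env`; taking the environment as an argument keeps the function easy to test.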

Run every live test through the npm script:

OPENAI_API_KEY=... \
RUN_OPENAI_BROWSER_E2E=1 \
RUN_OPENAI_VIRTUAL_DESKTOP_E2E=1 \
npm run test:live

The equivalent direct command is:

OPENAI_API_KEY=... \
RUN_OPENAI_E2E=1 \
RUN_OPENAI_BROWSER_E2E=1 \
RUN_OPENAI_VIRTUAL_DESKTOP_E2E=1 \
node --test test/e2e/live-openai.e2e.test.js

API map

Primary exports

`initAutomify(options)`: Create shared provider configuration and surface factories.
`automify.browser(options)`: Create a Playwright-backed browser automation runner.
`automify.withBrowser(options, run)`: Open a browser for the duration of a callback and close it automatically.
`automify.cli(options)`: Create a command-line automation runner.
`automify.dockerCli(options)`: Create a Docker-backed command-line automation runner with optional shared files. Requires Docker to be running.
`automify.dockerComputer(options)`: Create a Docker-backed Linux desktop runner with a required `startupCommand`, then close its container with `close()`. The host can be macOS, Windows, or Linux as long as Docker is running and can run Linux containers.
`automify.localComputer(options)`: Create a local macOS, Windows, or Linux X11/Xorg desktop runner and release the desktop lock with `close()`. Linux Wayland is not supported.
`automify.computer(options)`: Create a runner for a custom computer adapter.
`createLocalDesktopComputer(options)`: Create a local desktop adapter backed by locally built nut.js for macOS, Windows, or Linux X11/Xorg. Linux Wayland is not supported. On Linux without `DISPLAY`, it can start Xvfb automatically.
`createDockerDesktopComputer(options)`: Create a Docker-backed Linux desktop sandbox with Xvfb, a window manager, `xdotool` input, and PNG screenshots. Requires Docker to be running. `startupCommand` or `desktop.startupCommand` is required.
`createBrowserAutomify(options)`: Construct a browser runner directly.
`createCliAutomify(options)`: Construct a CLI runner directly.
`createDockerCliAutomify(options)`: Construct a Docker CLI runner directly. Requires Docker to be running.
`createDockerComputerAutomify(options)`: Construct a Docker desktop runner directly. Requires Docker to be running.
`fileToData(file, options)` / `filesToData(files, options)`: Convert host files into data objects for `.do()` input.
`fileToEvaluate(file, options)` / `filesToEvaluate(files, options)`: Convert host files into model content items for evaluation.
`createComputerAutomify(options)`: Construct a custom computer runner directly.
`createModelAdapter(adapter)`: Normalize adapter objects, factories, or `respond()` implementations.
`testModelAdapter(adapter)`: Validate adapter response shape with common scenarios.
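
The "open for the duration of a callback and close automatically" behavior listed for `automify.withBrowser(options, run)` follows a common pattern. The sketch below shows that pattern in general terms with a stub resource; `withResource` and `openStub` are hypothetical and say nothing about how Automify implements it.

```javascript
// Generic sketch of "open, run the callback, always close",
// with a stub in place of a browser.
async function withResource(open, run) {
  const resource = await open();
  try {
    return await run(resource);
  } finally {
    await resource.close(); // runs on success and on failure
  }
}

let closed = false;
const openStub = async () => ({
  title: "stub page",
  close: async () => {
    closed = true;
  }
});

withResource(openStub, async (page) => page.title).then((title) => {
  console.log(title, closed); // stub page true
});
```

The same guarantee is what the `try`/`finally` blocks around `close()` provide in the examples for the other surfaces.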