
How We Built the Completion Engine

A deep dive into the architecture behind Behalf's task completion engine — from browser automation to intelligent retry logic.

By Marcus Rivera

The problem

When we set out to build Behalf, we knew the hardest part wouldn't be understanding what users want — it would be reliably doing it. The gap between "draft an email" and "send the email, handle bounces, and follow up in three days" is enormous.

We call the system that bridges this gap the Completion Engine.

Architecture overview

The Completion Engine is built around three core concepts:

1. Task decomposition

Every user request gets broken down into a directed acyclic graph (DAG) of subtasks. A request like "schedule a meeting with Alice and Bob next week" becomes:

  • Check Alice's availability
  • Check Bob's availability
  • Find overlapping slots
  • Propose time to both parties
  • Handle responses and reschedule if needed
  • Send calendar invites

Each subtask is independently executable, retryable, and observable.
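
As a minimal sketch of how such a graph could be represented and ordered, each subtask can list the subtasks it depends on, and a topological sort then yields a valid execution order. The `Subtask` shape, the task ids, and the `executionOrder` helper are illustrative assumptions, not Behalf's actual types:

```typescript
// Hypothetical subtask shape: names and fields are assumptions for illustration.
interface Subtask {
  id: string;
  dependsOn: string[]; // ids of subtasks that must finish first
}

// Kahn's algorithm: returns an order in which every subtask follows its dependencies.
// Throws if some subtasks can never become ready (a cycle or an unknown dependency id).
function executionOrder(tasks: Subtask[]): string[] {
  const remaining = new Map<string, number>(); // id -> unmet dependency count
  const dependents = new Map<string, string[]>(); // id -> subtasks waiting on it
  for (const t of tasks) {
    remaining.set(t.id, t.dependsOn.length);
    for (const dep of t.dependsOn) {
      dependents.set(dep, [...(dependents.get(dep) ?? []), t.id]);
    }
  }
  const ready = tasks.filter((t) => t.dependsOn.length === 0).map((t) => t.id);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift()!;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const left = remaining.get(next)! - 1;
      remaining.set(next, left);
      if (left === 0) ready.push(next);
    }
  }
  if (order.length !== tasks.length) {
    throw new Error("not a valid DAG: cycle or unknown dependency");
  }
  return order;
}

// The meeting request above, expressed as a graph.
const meeting: Subtask[] = [
  { id: "check-alice", dependsOn: [] },
  { id: "check-bob", dependsOn: [] },
  { id: "find-overlap", dependsOn: ["check-alice", "check-bob"] },
  { id: "propose-time", dependsOn: ["find-overlap"] },
  { id: "handle-responses", dependsOn: ["propose-time"] },
  { id: "send-invites", dependsOn: ["handle-responses"] },
];
```

In this sketch the two availability checks have no dependencies, so a scheduler would be free to run them in parallel before `find-overlap` becomes ready.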

2. Browser automation

Many tasks require interacting with web applications that don't have APIs. Our browser automation layer uses a headless browser with intelligent element detection:

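// Simplified example of our action executor
async function executeAction(action: BrowserAction) {
  // Get a page bound to this task's browsing context
  const page = await acquirePage(action.context);
  const element = await findElement(page, action.selector);
  // Start waiting for navigation before clicking, so a fast page load is not missed
  await Promise.all([waitForNavigation(page), element.click()]);
  // Snapshot the resulting page state for the next decision
  return captureState(page);
}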

The key innovation is our state capture system. After every action, we take a structured snapshot of the page state, allowing the LLM to reason about what happened and decide the next step.
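
To make the idea concrete, here is one way a structured snapshot and a before/after summary for the model might look. This is a sketch under assumptions: the post does not describe the real snapshot fields, so `PageSnapshot`, `ElementSummary`, and `describeTransition` are hypothetical names.

```typescript
// Hypothetical snapshot shape; the real fields are not described in this post.
interface ElementSummary {
  role: string; // e.g. "button", "link", "textbox"
  label: string; // visible text or accessible name
}

interface PageSnapshot {
  url: string;
  title: string;
  interactive: ElementSummary[];
}

// Summarize what changed between two snapshots as plain text for a prompt.
function describeTransition(before: PageSnapshot, after: PageSnapshot): string {
  const lines: string[] = [];
  lines.push(
    before.url === after.url
      ? `URL unchanged: ${after.url}`
      : `URL changed: ${before.url} -> ${after.url}`
  );
  lines.push(`Title: ${after.title}`);
  // Report only interactive elements that were not present before the action.
  const key = (e: ElementSummary) => `${e.role}:${e.label}`;
  const seen = new Set(before.interactive.map(key));
  for (const e of after.interactive) {
    if (!seen.has(key(e))) lines.push(`New ${e.role}: "${e.label}"`);
  }
  return lines.join("\n");
}
```

Diffing against the previous snapshot keeps the summary short, which matters when every action produces a new one.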

3. Intelligent retry logic

Things fail. Pages load slowly, CAPTCHAs appear, sessions expire. Our retry system handles these gracefully:

  • Transient failures (timeouts, rate limits) get exponential backoff
  • State failures (logged out, unexpected page) trigger recovery flows
  • Permanent failures (account locked, service unavailable) escalate to the user
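
The three failure classes above map naturally onto a small decision function. The sketch below is illustrative only: the `FailureKind` values mirror the examples in the list, while the delay constants, the attempt limit, and the function names are assumptions rather than our production values.

```typescript
// Failure kinds taken from the list above; grouping them this way is an assumption.
type FailureKind =
  | "timeout"
  | "rate_limit" // transient
  | "logged_out"
  | "unexpected_page" // state
  | "account_locked"
  | "service_unavailable"; // permanent

type NextStep =
  | { action: "retry"; delayMs: number }
  | { action: "recover" }
  | { action: "escalate"; reason: string };

const BASE_DELAY_MS = 500; // assumed starting delay
const MAX_DELAY_MS = 30_000; // assumed cap
const MAX_ATTEMPTS = 5; // assumed retry limit

// attempt is 0-based: the first retry waits BASE_DELAY_MS, then the delay doubles.
function backoffDelayMs(attempt: number): number {
  return Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
}

function decideNextStep(kind: FailureKind, attempt: number): NextStep {
  switch (kind) {
    case "timeout":
    case "rate_limit":
      // Transient: back off exponentially, but give up eventually.
      return attempt < MAX_ATTEMPTS
        ? { action: "retry", delayMs: backoffDelayMs(attempt) }
        : { action: "escalate", reason: `gave up after ${MAX_ATTEMPTS} attempts` };
    case "logged_out":
    case "unexpected_page":
      // State: hand off to a recovery flow (e.g. log back in) before continuing.
      return { action: "recover" };
    case "account_locked":
    case "service_unavailable":
      // Permanent: retrying will not help, so surface it to the user.
      return { action: "escalate", reason: kind };
  }
}
```

For example, `decideNextStep("timeout", 2)` asks for a retry after 2000 ms. A real implementation would likely add random jitter to the delay so that many tasks hitting the same rate limit do not all retry at once.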

We track success rates per domain and adjust our strategies accordingly.
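
A per-domain tracker can be very small. The following is a hypothetical sketch: the `DomainStats` class, the 0.8 threshold, and the `"normal"` / `"cautious"` strategy names are assumptions chosen for illustration.

```typescript
// Hypothetical per-domain tracker; threshold and strategy names are assumptions.
class DomainStats {
  private counts = new Map<string, { ok: number; total: number }>();

  record(domain: string, success: boolean): void {
    const c = this.counts.get(domain) ?? { ok: 0, total: 0 };
    c.total += 1;
    if (success) c.ok += 1;
    this.counts.set(domain, c);
  }

  // Returns undefined until at least one attempt has been recorded.
  successRate(domain: string): number | undefined {
    const c = this.counts.get(domain);
    return c ? c.ok / c.total : undefined;
  }

  // Example policy: be more conservative on domains that fail often.
  strategy(domain: string): "normal" | "cautious" {
    const rate = this.successRate(domain);
    return rate !== undefined && rate < 0.8 ? "cautious" : "normal";
  }
}
```

A `"cautious"` strategy might mean longer waits between actions or fewer retries; which adjustments make sense depends on the domain.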

Lessons learned

Building the Completion Engine taught us several things:

  1. Determinism is a spectrum. Web interactions are inherently non-deterministic. We embraced probabilistic completion rather than trying to guarantee exact outcomes.

  2. Observability is everything. Every action, decision, and state transition is logged. When something goes wrong, we can replay the exact sequence of events.

  3. Users want control, not automation. The most successful feature isn't full autonomy — it's the ability to pause, inspect, and redirect mid-task.
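
The replayable log from the second lesson can be pictured as an append-only list of ordered events. This is a simplified sketch, not our actual schema: `EngineEvent`, `EventLog`, and their fields are hypothetical.

```typescript
// Hypothetical event shape; field names are assumptions.
interface EngineEvent {
  seq: number; // increases by one with every appended event
  taskId: string;
  kind: "action" | "decision" | "state_transition";
  detail: string;
}

class EventLog {
  private events: EngineEvent[] = [];
  private nextSeq = 0;

  // Events are only ever appended, never edited, so the history stays trustworthy.
  append(taskId: string, kind: EngineEvent["kind"], detail: string): EngineEvent {
    const event: EngineEvent = { seq: this.nextSeq++, taskId, kind, detail };
    this.events.push(event);
    return event;
  }

  // Replay: the events for one task, in the order they were recorded.
  replay(taskId: string): EngineEvent[] {
    return this.events.filter((e) => e.taskId === taskId);
  }
}
```

Because every event carries a global sequence number, the events of one task can be replayed in order even when several tasks are interleaved in the same log.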

What's next

We're working on expanding the Completion Engine to handle multi-day tasks with complex dependency chains. Stay tuned for more technical deep dives.