How to Add a Secure JavaScript Execution Tool to Microsoft Agent Framework

There is a recurring moment in agent design where a team realizes the model does not just need to reason. It needs to compute. It needs to transform JSON, run a formula, post-process extracted fields, normalize dates, build a dynamic object, or apply domain logic that is simply easier to express in JavaScript than in prompt text.

That is where most teams make a dangerous move. They reach for eval, the Function constructor, or Node’s vm module and tell themselves it is “sandboxed enough.”

It is not.

Node’s own documentation is explicit that node:vm is not a security mechanism and should not be used to run untrusted code. Worker threads are also not the right boundary for hostile code because they are designed for parallelism and can share memory. At the same time, Microsoft Agent Framework is built to let agents call external tools through function tools, so the clean pattern is not “run JavaScript inside the agent host.” The clean pattern is “make JavaScript execution a remote tool with a hardened execution boundary.” (Node.js)

That is the architecture this post covers:

  • Microsoft Agent Framework in .NET
  • A custom function tool exposed to the agent
  • A tRPC call from the tool to a separate Node.js execution service
  • Execution inside a locked-down isolate, not vm
  • Explicit whitelisting of namespaces and packages
  • Validation, time limits, memory limits, and auditable policy controls

The key design principle is simple: treat JavaScript execution as a privileged capability, not a convenience API.

The architecture

At a high level, the flow looks like this:

  1. The agent decides it needs computation.
  2. Microsoft Agent Framework calls a function tool.
  3. The function tool sends a request over HTTP to a tRPC endpoint in Node.js.
  4. The Node service validates the request with Zod.
  5. The Node service creates an isolated execution environment.
  6. Only approved globals and wrapped package facades are injected.
  7. The code runs with strict limits for time, memory, and output shape.
  8. The result is returned to the agent as tool output.

Microsoft Agent Framework supports function tools as first-class extensions to an agent, and tRPC gives you a type-safe RPC layer with input and output validation. That combination is ideal here because the .NET side stays thin and deterministic, while the execution policy lives in one place on the Node side. (Microsoft Learn)

First principle: “secure eval” is really “isolated execution”

It is important to be direct here. There is no magic secureEval() in Node.js. If you are executing model-authored or user-authored JavaScript, the safest practical pattern is:

  • out-of-process execution boundary
  • fresh isolate per run or per tenant pool
  • no ambient filesystem or network access
  • no raw require
  • whitelisted host-provided capabilities only
  • timeouts, memory ceilings, and payload size limits
  • container and OS-level restrictions around the service

Why not use node:vm? Because the Node docs explicitly say not to use it as a security boundary. Why not just use worker threads? Because workers are concurrency primitives, not isolation primitives. A better starting point for JavaScript isolation in Node is isolated-vm, which exposes V8 isolates and is designed for running code in fresh environments with no default Node runtime capabilities. Node’s permission model can also further restrict the Node process itself. (Node.js)

The important nuance is this: even isolated-vm should be one layer, not the only layer. The strongest production posture is to run the execution service in its own locked-down container or workload boundary and assume defense in depth.

Tool contract design

Do not let the model send arbitrary source code and a free-form module list with no governance. Give it a constrained contract.

A good request shape looks like this:

import { z } from "zod";

export const ExecuteJsInput = z.object({
  code: z.string().max(10_000),
  input: z.unknown().optional(),
  allowedNamespaces: z.array(z.string()).default([]),
  allowedPackages: z.array(z.string()).default([]),
  expectedResultSchema: z
    .object({
      type: z.enum(["json", "string", "number", "boolean", "array", "object"]),
    })
    .optional(),
  timeoutMs: z.number().int().min(50).max(3000).default(1000),
});

This matters for two reasons.

First, tRPC is designed around typed procedures, and Zod-driven validation makes the boundary explicit. Second, you now have a place to enforce policy before any code gets near an isolate. (trpc.io)
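
The optional expectedResultSchema hint also gives the service a post-execution check: if the model promised a number and the script returned an object, fail the run instead of passing surprise shapes back to the agent. This helper is an assumption of this design, not a Zod or tRPC feature; a minimal sketch:

```typescript
type ExpectedType = "json" | "string" | "number" | "boolean" | "array" | "object";

// Post-execution check for the optional expectedResultSchema hint.
// "json" accepts any JSON-serializable value; the rest check the concrete shape.
export function matchesExpectedType(result: unknown, expected: ExpectedType): boolean {
  switch (expected) {
    case "json":
      try {
        JSON.stringify(result);
        return true;
      } catch {
        return false; // circular structures are not serializable
      }
    case "array":
      return Array.isArray(result);
    case "object":
      return typeof result === "object" && result !== null && !Array.isArray(result);
    default:
      return typeof result === expected;
  }
}
```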

The Microsoft Agent Framework side

On the .NET side, the tool should be boring. That is the goal.

Microsoft Agent Framework lets you expose custom logic through function tools, including by creating an AIFunction from a C# method. The agent does not need to know how tRPC works. It just needs a tool description that makes the capability understandable to the model. (Microsoft Learn)

A simplified example:

using System.ComponentModel;
using System.Net.Http.Json;

public class JavaScriptExecutionTool
{
    private readonly HttpClient _httpClient;

    public JavaScriptExecutionTool(HttpClient httpClient)
    {
        _httpClient = httpClient;
    }

    [Description("Executes tightly sandboxed JavaScript for deterministic data transformation and calculation.")]
    public async Task<string> ExecuteSandboxedJavaScript(
        [Description("The JavaScript source to execute. Must return a serializable result.")] string code,
        [Description("Optional JSON input payload for the script.")] string? inputJson = null,
        [Description("Approved namespaces the script may access.")] string[]? allowedNamespaces = null,
        [Description("Approved package facades the script may access.")] string[]? allowedPackages = null)
    {
        var request = new
        {
            code,
            input = string.IsNullOrWhiteSpace(inputJson)
                ? null
                : System.Text.Json.JsonSerializer.Deserialize<object>(inputJson),
            allowedNamespaces = allowedNamespaces ?? Array.Empty<string>(),
            allowedPackages = allowedPackages ?? Array.Empty<string>(),
            timeoutMs = 1000
        };

        var response = await _httpClient.PostAsJsonAsync("/trpc/js.execute", request);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}

Then you register it as a function tool with your agent. The architectural point is more important than the exact setup syntax: the agent host never evaluates code locally. It delegates execution to the hardened service. (Microsoft Learn)

The tRPC boundary

tRPC is a strong fit because it gives you typed procedures, validation, and a clean contract between the .NET caller and Node service. Even though .NET is not consuming generated TypeScript types directly, the Node service still benefits from strict schemas and a maintainable procedure surface. (trpc.io)

Example router:

import { initTRPC } from "@trpc/server";
import { ExecuteJsInput } from "./schemas";
import { runSandboxedScript } from "./sandbox";
import type { PolicyStore } from "./policy";

// The context type carries the per-request policy store, so ctx.policyStore
// is typed instead of `any`.
const t = initTRPC.context<{ policyStore: PolicyStore }>().create();

export const appRouter = t.router({
  js: t.router({
    execute: t.procedure
      .input(ExecuteJsInput)
      .mutation(async ({ input, ctx }) => {
        return await runSandboxedScript(input, ctx.policyStore);
      }),
  }),
});

export type AppRouter = typeof appRouter;

This is where you can also add authentication, tenant context, rate limiting, audit metadata, and policy lookup.
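
Rate limiting can live in a tRPC middleware, but the core policy is framework-agnostic. A minimal fixed-window limiter sketch; the window size and request cap here are illustrative defaults, not values from any framework:

```typescript
// Fixed-window rate limiter keyed by caller identity.
// windowMs and maxRequests are illustrative defaults.
export class FixedWindowRateLimiter {
  private counts = new Map<string, { windowStart: number; count: number }>();

  constructor(
    private readonly windowMs: number = 60_000,
    private readonly maxRequests: number = 30
  ) {}

  allow(callerId: string, now: number = Date.now()): boolean {
    const entry = this.counts.get(callerId);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // Start a fresh window for this caller.
      this.counts.set(callerId, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= this.maxRequests) return false;
    entry.count += 1;
    return true;
  }
}
```

Inside a tRPC middleware this check runs before the procedure body, rejecting the call when allow returns false.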

The secure execution service

This is the heart of the design.

The mistake many teams make is trying to whitelist modules by exposing require. Do not do that. If you expose require, you are recreating Node inside the sandbox and dramatically expanding the attack surface.

Instead, preload and wrap approved capabilities in the host, then inject only those facades into the isolate.

That means your whitelist is not “the sandbox may import lodash.” It is “the sandbox may access a safe facade called packages.lodash that exposes only get, pick, and omit.”

That is a much better boundary.

Example policy registry

import { get, pick, omit } from "lodash";
import Decimal from "decimal.js";

type NamespaceFactory = () => Record<string, unknown>;
type PackageFactory = () => Record<string, unknown>;

const namespaceRegistry: Record<string, NamespaceFactory> = {
  math: () => ({
    round: Math.round,
    floor: Math.floor,
    ceil: Math.ceil,
    max: Math.max,
    min: Math.min,
  }),
  dates: () => ({
    nowIso: () => new Date().toISOString(),
  }),
};

// Facades expose only functions that accept and return plain, transferable
// values. Classes such as Decimal stay on the host side; the sandbox sees
// string-in, string-out helpers instead of the raw package surface.
const packageRegistry: Record<string, PackageFactory> = {
  lodash: () => ({ get, pick, omit }),
  decimal: () => ({
    add: (a: string, b: string) => new Decimal(a).plus(b).toString(),
    mul: (a: string, b: string) => new Decimal(a).times(b).toString(),
  }),
};

export { namespaceRegistry, packageRegistry };

Notice what is missing: no arbitrary imports, no filesystem, no fetch, no process access, no environment access.
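
The lookup against that registry should fail closed: any requested name that is not registered is an error, not a silent skip. A hedged sketch of that resolution step (the registry shape matches the example above; the error wording is an assumption):

```typescript
type Factory = () => Record<string, unknown>;

// Resolve requested capability names against a registry, failing closed on
// anything unknown. hasOwnProperty guards against prototype keys such as
// "constructor" being treated as registered capabilities.
export function resolveFacades(
  requested: string[],
  registry: Record<string, Factory>
): Record<string, Record<string, unknown>> {
  const resolved: Record<string, Record<string, unknown>> = {};
  for (const name of requested) {
    const factory = Object.prototype.hasOwnProperty.call(registry, name)
      ? registry[name]
      : undefined;
    if (!factory) {
      throw new Error(`Capability "${name}" is not in the approved registry`);
    }
    resolved[name] = factory(); // fresh facade per run, never a shared instance
  }
  return resolved;
}
```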

Example isolate runner

import ivm from "isolated-vm";
import { z } from "zod";
import { ExecuteJsInput } from "./schemas";
import { namespaceRegistry, packageRegistry } from "./registry";
import type { PolicyStore } from "./policy";
import { sanitizeError } from "./errors";

// Functions cannot be structured-cloned into an isolate, so each facade
// function is bridged individually with ivm.Callback and assembled into an
// object inside the isolate.
async function injectFacades(
  context: ivm.Context,
  globalName: string,
  facades: Record<string, Record<string, unknown>>
) {
  await context.eval(`globalThis[${JSON.stringify(globalName)}] = {};`);
  for (const [group, api] of Object.entries(facades)) {
    await context.eval(
      `globalThis[${JSON.stringify(globalName)}][${JSON.stringify(group)}] = {};`
    );
    for (const [name, member] of Object.entries(api)) {
      const value =
        typeof member === "function"
          ? new ivm.Callback(member as (...args: unknown[]) => unknown)
          : new ivm.ExternalCopy(member).copyInto();
      await context.global.set("__hostValue", value);
      await context.eval(
        `globalThis[${JSON.stringify(globalName)}][${JSON.stringify(group)}][${JSON.stringify(name)}] = __hostValue;`
      );
    }
  }
  await context.eval(`delete globalThis.__hostValue;`);
}

export async function runSandboxedScript(
  request: z.infer<typeof ExecuteJsInput>,
  policyStore: PolicyStore
) {
  const isolate = new ivm.Isolate({ memoryLimit: 64 });
  try {
    const policy = await policyStore.resolve({
      namespaces: request.allowedNamespaces,
      packages: request.allowedPackages,
    });

    const context = await isolate.createContext();
    const jail = context.global;
    await jail.set("global", jail.derefInto());

    const safeNamespaces = Object.fromEntries(
      policy.namespaces.map((name) => [name, namespaceRegistry[name]!()])
    );
    const safePackages = Object.fromEntries(
      policy.packages.map((name) => [name, packageRegistry[name]!()])
    );

    await jail.set("input", new ivm.ExternalCopy(request.input ?? null).copyInto());
    await injectFacades(context, "namespaces", safeNamespaces);
    await injectFacades(context, "packages", safePackages);

    const wrapped = `
      "use strict";
      (async function () {
        const console = undefined;
        const process = undefined;
        const require = undefined;
        const module = undefined;
        const exports = undefined;
        const Buffer = undefined;
        const setTimeout = undefined;
        const setInterval = undefined;
        const userFn = async ({ input, namespaces, packages }) => {
          ${request.code}
        };
        return await userFn({ input, namespaces, packages });
      })()
    `;

    const script = await isolate.compileScript(wrapped);
    // promise: wait for the async IIFE to settle; copy: clone the result out
    // of the isolate rather than handing the host a live reference.
    const result = await script.run(context, {
      timeout: request.timeoutMs,
      promise: true,
      copy: true,
    });
    return { ok: true, result };
  } catch (error) {
    return { ok: false, error: sanitizeError(error) };
  } finally {
    isolate.dispose();
  }
}
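
The runner calls a sanitizeError helper that the snippet does not define. One sketch, assuming the goal is a short message with no host stack traces or file paths leaking back to the model:

```typescript
// Strip stack traces and host details before returning errors to the agent.
export function sanitizeError(error: unknown): { name: string; message: string } {
  if (error instanceof Error) {
    return {
      name: error.name,
      // Keep only the first line and cap the length so multi-line messages
      // containing host paths never reach the model.
      message: error.message.split("\n")[0].slice(0, 500),
    };
  }
  return { name: "Error", message: "Script execution failed" };
}
```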

This is intentionally opinionated.

  • The sandbox gets input
  • The sandbox gets namespaces
  • The sandbox gets packages
  • The sandbox does not get Node
  • The sandbox does not get require
  • The sandbox does not get the environment

That is the right posture.

The isolated-vm project describes these isolates as separate JavaScript environments free of the extra capabilities that Node normally exposes. That is why it is a better primitive here than vm. (GitHub)

How whitelisting should really work

A lot of teams hear “whitelist packages” and think they should allow date-fns or lodash directly. That is still too coarse.

You want three policy levels.

1. Namespace whitelist

These are internal capability groups you define, such as:

  • math
  • dates
  • currency
  • tax
  • normalizers

These are ideal for domain logic because they let you present stable semantic surfaces to the model.

2. Package facade whitelist

This is not raw NPM package access. It is a curated wrapper over a package.

Example:

import { addDays, formatISO, parseISO } from "date-fns";

const packageRegistry = {
  dateFns: () => ({ addDays, formatISO, parseISO }),
};

3. Tenant or tool policy whitelist

Even if a package exists in the registry, a given agent or tenant may not be allowed to use it.

That means final access should be the intersection of:

  • globally supported capabilities
  • tenant policy
  • current agent policy
  • current tool invocation request

That keeps the model from escalating its own power simply by naming more packages.
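
The intersection rule is easy to state in code. A sketch; the layer names come from this post, not from any framework API:

```typescript
// Final capability set = intersection of every policy layer.
// A name must be globally supported AND allowed by tenant, agent, and request.
export function effectiveCapabilities(
  globallySupported: string[],
  tenantAllowed: string[],
  agentAllowed: string[],
  requested: string[]
): string[] {
  const layers = [new Set(tenantAllowed), new Set(agentAllowed), new Set(requested)];
  return globallySupported.filter((name) => layers.every((layer) => layer.has(name)));
}
```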

What “most secure method” means in practice

Here is the honest version.

If the code is untrusted, the strongest production pattern is not “just use a safer JavaScript library.” The strongest pattern is:

  • dedicated Node execution service
  • running in a separate process or container from the agent host
  • Node permission model enabled where possible
  • no filesystem permission unless explicitly required
  • no network permission unless explicitly required
  • no child process permission
  • no raw module loading
  • isolate-based execution inside the service
  • per-request timeout
  • per-request memory cap
  • rate limiting and audit logging
  • kill-and-recycle strategy for suspicious runs

Node’s permission model is now stable and is specifically intended to restrict access to resources during execution. That makes it a useful outer control around the execution worker process. (Node.js)
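
As a launch-time outer layer, the execution worker can be started with the permission model enabled and almost nothing granted. The flag names below follow current Node releases, and the service path is a placeholder:

```shell
# Start the execution worker with the permission model on. Only reads of the
# service's own files are granted; child processes and additional workers are
# denied by default because no --allow-child-process or --allow-worker is set.
node --permission \
  --allow-fs-read=/srv/js-exec-service \
  /srv/js-exec-service/dist/server.js
```

Note that the permission model focuses on filesystem, child process, worker, and addon access; network egress is better restricted at the container or network-policy level.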

So the recommendation is:

Do not run JavaScript evaluation in the Microsoft Agent Framework process. Run it in a separate hardened execution service, and inside that service use an isolate with only host-injected safe facades.

Prompting the agent correctly

One subtle mistake is giving the model too much freedom in how it uses the tool. Your tool description should bias toward deterministic use cases.

Good use cases:

  • schema normalization
  • mathematical calculations
  • JSON reshaping
  • derived field generation
  • deterministic validation helpers
  • short business-rule transforms

Bad use cases:

  • arbitrary web requests
  • importing unknown libraries
  • long-running workflows
  • anything requiring secret access
  • anything that should really be a reviewed backend feature

You want the tool to feel more like “dynamic formula execution” than “tiny remote code runner.”

Observability and governance

Once you add this capability, you need a paper trail.

Log:

  • agent name
  • conversation or run id
  • caller identity
  • code hash
  • requested namespaces
  • requested packages
  • approved namespaces
  • approved packages
  • execution duration
  • memory tier
  • success or failure
  • sanitized error output

Do not log secrets in payloads. Do log enough to reconstruct who ran what and under which policy.
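
Hashing the submitted source gives you a stable “who ran what” identifier without storing the payload itself. A sketch using Node’s built-in crypto module; the record fields are this post’s vocabulary, not a standard:

```typescript
import { createHash } from "node:crypto";

// Audit record for one execution request; the code itself is not stored,
// only a stable SHA-256 fingerprint of it.
export interface ExecutionAuditRecord {
  runId: string;
  callerId: string;
  codeHash: string;
  requestedNamespaces: string[];
  approvedNamespaces: string[];
  durationMs: number;
  ok: boolean;
}

export function hashCode(code: string): string {
  return createHash("sha256").update(code, "utf8").digest("hex");
}
```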

This matters because the risk is no longer just technical. It is operational. A dynamic execution tool without auditability becomes impossible to govern at scale.

Where this pattern is worth it

This pattern is especially valuable when building agents that need deterministic computation without shipping a new backend endpoint for every micro-use-case.

Examples:

  • tax calculation helpers
  • document extraction post-processing
  • migration mapping rules
  • payroll normalization
  • dynamic scoring or threshold logic
  • transforming AI output into strict structured shapes

In all of those cases, JavaScript is the execution language, but policy is the product.

Final opinion

The wrong way to add JavaScript to an agent is to think of it as a convenience feature.

The right way is to think of it as a controlled runtime.

Microsoft Agent Framework gives you the right extension point through function tools. tRPC gives you a clean typed boundary. Node can host the execution service. But the part that separates a toy from a production design is this: never let the model execute inside your primary trust boundary, and never equate “sandboxed” with “safe” unless you can explain the exact layers doing the isolation. (Microsoft Learn)

That is the architecture I use.

Review: Build a Network Application with Node

I had the opportunity to review the PACKT video series “Build a Network Application with Node” by Joe Stanco. This video series walks you through, at a high level, how to create different types of web apps using Node.js.

This video series targets the JavaScript developer with a basic understanding of Node.js. Joe Stanco does an excellent job of guiding the viewer through the creation of a series of web apps designed to highlight common development use cases. The examples start with a barebones “hello world” type app and gradually move the viewer to a more complex Socket.IO and Bootstrap app.

Joe Stanco’s presentation skills are impressive; from introduction to conclusion, his delivery was clear, easy to understand, and in sync with his examples. At a length of over 2 hours, the pace and clarity of presentation made “Build a Network Application with Node” easy to watch in a single sitting.

I do wish this was divided into a “Fundamentals” and an “Advanced” course, which would allow Joe to spend more time on the advanced topics.

You can check out a sample section of “Build a Network Application with Node” on YouTube here.