How to Add a Secure JavaScript Execution Tool to Microsoft Agent Framework

There is a recurring moment in agent design where a team realizes the model does not just need to reason. It needs to compute. It needs to transform JSON, run a formula, post-process extracted fields, normalize dates, build a dynamic object, or apply domain logic that is simply easier to express in JavaScript than in prompt text.

That is where most teams make a dangerous move. They reach for eval, the Function constructor, or Node’s vm module and tell themselves it is “sandboxed enough.”

It is not.

Node’s own documentation is explicit that node:vm is not a security mechanism and should not be used to run untrusted code. Worker threads are also not the right boundary for hostile code because they are designed for parallelism and can share memory. At the same time, Microsoft Agent Framework is built to let agents call external tools through function tools, so the clean pattern is not “run JavaScript inside the agent host.” The clean pattern is “make JavaScript execution a remote tool with a hardened execution boundary.” (Node.js)

That is the architecture this post covers:

  • Microsoft Agent Framework in .NET
  • A custom function tool exposed to the agent
  • A tRPC call from the tool to a separate Node.js execution service
  • Execution inside a locked-down isolate, not vm
  • Explicit whitelisting of namespaces and packages
  • Validation, time limits, memory limits, and auditable policy controls

The key design principle is simple: treat JavaScript execution as a privileged capability, not a convenience API.

The architecture

At a high level, the flow looks like this:

  1. The agent decides it needs computation.
  2. Microsoft Agent Framework calls a function tool.
  3. The function tool sends a request over HTTP to a tRPC endpoint in Node.js.
  4. The Node service validates the request with Zod.
  5. The Node service creates an isolated execution environment.
  6. Only approved globals and wrapped package facades are injected.
  7. The code runs with strict limits for time, memory, and output shape.
  8. The result is returned to the agent as tool output.

Microsoft Agent Framework supports function tools as first-class extensions to an agent, and tRPC gives you a type-safe RPC layer with input and output validation. That combination is ideal here because the .NET side stays thin and deterministic, while the execution policy lives in one place on the Node side. (Microsoft Learn)

First principle: “secure eval” is really “isolated execution”

It is important to be direct here. There is no magic secureEval() in Node.js. If you are executing model-authored or user-authored JavaScript, the safest practical pattern is:

  • out-of-process execution boundary
  • fresh isolate per run or per tenant pool
  • no ambient filesystem or network access
  • no raw require
  • whitelisted host-provided capabilities only
  • timeouts, memory ceilings, and payload size limits
  • container and OS-level restrictions around the service

Why not use node:vm? Because the Node docs explicitly say not to use it as a security boundary. Why not just use worker threads? Because workers are concurrency primitives, not isolation primitives. A better starting point for JavaScript isolation in Node is isolated-vm, which exposes V8 isolates and is designed for running code in fresh environments with no default Node runtime capabilities. Node’s permission model can also further restrict the Node process itself. (Node.js)

The important nuance is this: even isolated-vm should be one layer, not the only layer. The strongest production posture is to run the execution service in its own locked-down container or workload boundary and assume defense in depth.

Tool contract design

Do not let the model send arbitrary source code and a free-form module list with no governance. Give it a constrained contract.

A good request shape looks like this:

import { z } from "zod";

export const ExecuteJsInput = z.object({
  code: z.string().max(10_000),
  input: z.unknown().optional(),
  allowedNamespaces: z.array(z.string()).default([]),
  allowedPackages: z.array(z.string()).default([]),
  expectedResultSchema: z
    .object({
      type: z.enum(["json", "string", "number", "boolean", "array", "object"]),
    })
    .optional(),
  timeoutMs: z.number().int().min(50).max(3000).default(1000),
});

This matters for two reasons.

First, tRPC is designed around typed procedures, and Zod-driven validation makes the boundary explicit. Second, you now have a place to enforce policy before any code gets near an isolate. (trpc.io)
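
The optional expectedResultSchema hint also gives the service a post-execution check: if the model promised a number and the script returned an object, fail the run instead of passing surprise shapes back to the agent. This helper is an assumption of this design, not a Zod or tRPC feature; a minimal sketch:

```typescript
type ExpectedType = "json" | "string" | "number" | "boolean" | "array" | "object";

// Post-execution check for the optional expectedResultSchema hint.
// "json" accepts any JSON-serializable value; the rest check the concrete shape.
export function matchesExpectedType(result: unknown, expected: ExpectedType): boolean {
  switch (expected) {
    case "json":
      try {
        JSON.stringify(result);
        return true;
      } catch {
        return false; // circular structures are not serializable
      }
    case "array":
      return Array.isArray(result);
    case "object":
      return typeof result === "object" && result !== null && !Array.isArray(result);
    default:
      return typeof result === expected;
  }
}
```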

The Microsoft Agent Framework side

On the .NET side, the tool should be boring. That is the goal.

Microsoft Agent Framework lets you expose custom logic through function tools, including by creating an AIFunction from a C# method. The agent does not need to know how tRPC works. It just needs a tool description that makes the capability understandable to the model. (Microsoft Learn)

A simplified example:

using System.ComponentModel;
using System.Net.Http.Json;

public class JavaScriptExecutionTool
{
    private readonly HttpClient _httpClient;

    public JavaScriptExecutionTool(HttpClient httpClient)
    {
        _httpClient = httpClient;
    }

    [Description("Executes tightly sandboxed JavaScript for deterministic data transformation and calculation.")]
    public async Task<string> ExecuteSandboxedJavaScript(
        [Description("The JavaScript source to execute. Must return a serializable result.")] string code,
        [Description("Optional JSON input payload for the script.")] string? inputJson = null,
        [Description("Approved namespaces the script may access.")] string[]? allowedNamespaces = null,
        [Description("Approved package facades the script may access.")] string[]? allowedPackages = null)
    {
        var request = new
        {
            code,
            input = string.IsNullOrWhiteSpace(inputJson)
                ? null
                : System.Text.Json.JsonSerializer.Deserialize<object>(inputJson),
            allowedNamespaces = allowedNamespaces ?? Array.Empty<string>(),
            allowedPackages = allowedPackages ?? Array.Empty<string>(),
            timeoutMs = 1000
        };

        var response = await _httpClient.PostAsJsonAsync("/trpc/js.execute", request);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}

Then you register it as a function tool with your agent. The architectural point is more important than the exact setup syntax: the agent host never evaluates code locally. It delegates execution to the hardened service. (Microsoft Learn)

The tRPC boundary

tRPC is a strong fit because it gives you typed procedures, validation, and a clean contract between the .NET caller and Node service. Even though .NET is not consuming generated TypeScript types directly, the Node service still benefits from strict schemas and a maintainable procedure surface. (trpc.io)

Example router:

import { initTRPC } from "@trpc/server";
import { ExecuteJsInput } from "./schemas";
import { runSandboxedScript } from "./sandbox";
import type { PolicyStore } from "./policy";

// The context type carries the per-request policy store, so ctx.policyStore
// is typed instead of `any`.
const t = initTRPC.context<{ policyStore: PolicyStore }>().create();

export const appRouter = t.router({
  js: t.router({
    execute: t.procedure
      .input(ExecuteJsInput)
      .mutation(async ({ input, ctx }) => {
        return await runSandboxedScript(input, ctx.policyStore);
      }),
  }),
});

export type AppRouter = typeof appRouter;

This is where you can also add authentication, tenant context, rate limiting, audit metadata, and policy lookup.
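
Rate limiting can live in a tRPC middleware, but the core policy is framework-agnostic. A minimal fixed-window limiter sketch; the window size and request cap here are illustrative defaults, not values from any framework:

```typescript
// Fixed-window rate limiter keyed by caller identity.
// windowMs and maxRequests are illustrative defaults.
export class FixedWindowRateLimiter {
  private counts = new Map<string, { windowStart: number; count: number }>();

  constructor(
    private readonly windowMs: number = 60_000,
    private readonly maxRequests: number = 30
  ) {}

  allow(callerId: string, now: number = Date.now()): boolean {
    const entry = this.counts.get(callerId);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // Start a fresh window for this caller.
      this.counts.set(callerId, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= this.maxRequests) return false;
    entry.count += 1;
    return true;
  }
}
```

Inside a tRPC middleware this check runs before the procedure body, rejecting the call when allow returns false.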

The secure execution service

This is the heart of the design.

The mistake many teams make is trying to whitelist modules by exposing require. Do not do that. If you expose require, you are recreating Node inside the sandbox and dramatically expanding the attack surface.

Instead, preload and wrap approved capabilities in the host, then inject only those facades into the isolate.

That means your whitelist is not “the sandbox may import lodash.” It is “the sandbox may access a safe facade called packages.lodash that exposes only get, pick, and omit.”

That is a much better boundary.

Example policy registry

import { get, pick, omit } from "lodash";
import Decimal from "decimal.js";

type NamespaceFactory = () => Record<string, unknown>;
type PackageFactory = () => Record<string, unknown>;

const namespaceRegistry: Record<string, NamespaceFactory> = {
  math: () => ({
    round: Math.round,
    floor: Math.floor,
    ceil: Math.ceil,
    max: Math.max,
    min: Math.min,
  }),
  dates: () => ({
    nowIso: () => new Date().toISOString(),
  }),
};

// Facades expose only functions that accept and return plain, transferable
// values. Classes such as Decimal stay on the host side; the sandbox sees
// string-in, string-out helpers instead of the raw package surface.
const packageRegistry: Record<string, PackageFactory> = {
  lodash: () => ({ get, pick, omit }),
  decimal: () => ({
    add: (a: string, b: string) => new Decimal(a).plus(b).toString(),
    mul: (a: string, b: string) => new Decimal(a).times(b).toString(),
  }),
};

export { namespaceRegistry, packageRegistry };

Notice what is missing: no arbitrary imports, no filesystem, no fetch, no process access, no environment access.
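
The lookup against that registry should fail closed: any requested name that is not registered is an error, not a silent skip. A hedged sketch of that resolution step (the registry shape matches the example above; the error wording is an assumption):

```typescript
type Factory = () => Record<string, unknown>;

// Resolve requested capability names against a registry, failing closed on
// anything unknown. hasOwnProperty guards against prototype keys such as
// "constructor" being treated as registered capabilities.
export function resolveFacades(
  requested: string[],
  registry: Record<string, Factory>
): Record<string, Record<string, unknown>> {
  const resolved: Record<string, Record<string, unknown>> = {};
  for (const name of requested) {
    const factory = Object.prototype.hasOwnProperty.call(registry, name)
      ? registry[name]
      : undefined;
    if (!factory) {
      throw new Error(`Capability "${name}" is not in the approved registry`);
    }
    resolved[name] = factory(); // fresh facade per run, never a shared instance
  }
  return resolved;
}
```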

Example isolate runner

import ivm from "isolated-vm";
import { z } from "zod";
import { ExecuteJsInput } from "./schemas";
import { namespaceRegistry, packageRegistry } from "./registry";
import type { PolicyStore } from "./policy";
import { sanitizeError } from "./errors";

// Functions cannot be structured-cloned into an isolate, so each facade
// function is bridged individually with ivm.Callback and assembled into an
// object inside the isolate.
async function injectFacades(
  context: ivm.Context,
  globalName: string,
  facades: Record<string, Record<string, unknown>>
) {
  await context.eval(`globalThis[${JSON.stringify(globalName)}] = {};`);
  for (const [group, api] of Object.entries(facades)) {
    await context.eval(
      `globalThis[${JSON.stringify(globalName)}][${JSON.stringify(group)}] = {};`
    );
    for (const [name, member] of Object.entries(api)) {
      const value =
        typeof member === "function"
          ? new ivm.Callback(member as (...args: unknown[]) => unknown)
          : new ivm.ExternalCopy(member).copyInto();
      await context.global.set("__hostValue", value);
      await context.eval(
        `globalThis[${JSON.stringify(globalName)}][${JSON.stringify(group)}][${JSON.stringify(name)}] = __hostValue;`
      );
    }
  }
  await context.eval(`delete globalThis.__hostValue;`);
}

export async function runSandboxedScript(
  request: z.infer<typeof ExecuteJsInput>,
  policyStore: PolicyStore
) {
  const isolate = new ivm.Isolate({ memoryLimit: 64 });
  try {
    const policy = await policyStore.resolve({
      namespaces: request.allowedNamespaces,
      packages: request.allowedPackages,
    });

    const context = await isolate.createContext();
    const jail = context.global;
    await jail.set("global", jail.derefInto());

    const safeNamespaces = Object.fromEntries(
      policy.namespaces.map((name) => [name, namespaceRegistry[name]!()])
    );
    const safePackages = Object.fromEntries(
      policy.packages.map((name) => [name, packageRegistry[name]!()])
    );

    await jail.set("input", new ivm.ExternalCopy(request.input ?? null).copyInto());
    await injectFacades(context, "namespaces", safeNamespaces);
    await injectFacades(context, "packages", safePackages);

    const wrapped = `
      "use strict";
      (async function () {
        const console = undefined;
        const process = undefined;
        const require = undefined;
        const module = undefined;
        const exports = undefined;
        const Buffer = undefined;
        const setTimeout = undefined;
        const setInterval = undefined;
        const userFn = async ({ input, namespaces, packages }) => {
          ${request.code}
        };
        return await userFn({ input, namespaces, packages });
      })()
    `;

    const script = await isolate.compileScript(wrapped);
    // promise: wait for the async IIFE to settle; copy: clone the result out
    // of the isolate rather than handing the host a live reference.
    const result = await script.run(context, {
      timeout: request.timeoutMs,
      promise: true,
      copy: true,
    });
    return { ok: true, result };
  } catch (error) {
    return { ok: false, error: sanitizeError(error) };
  } finally {
    isolate.dispose();
  }
}
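
The runner calls a sanitizeError helper that the snippet does not define. One sketch, assuming the goal is a short message with no host stack traces or file paths leaking back to the model:

```typescript
// Strip stack traces and host details before returning errors to the agent.
export function sanitizeError(error: unknown): { name: string; message: string } {
  if (error instanceof Error) {
    return {
      name: error.name,
      // Keep only the first line and cap the length so multi-line messages
      // containing host paths never reach the model.
      message: error.message.split("\n")[0].slice(0, 500),
    };
  }
  return { name: "Error", message: "Script execution failed" };
}
```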

This is intentionally opinionated.

  • The sandbox gets input
  • The sandbox gets namespaces
  • The sandbox gets packages
  • The sandbox does not get Node
  • The sandbox does not get require
  • The sandbox does not get the environment

That is the right posture.

The isolated-vm project describes these isolates as separate JavaScript environments free of the extra capabilities that Node normally exposes. That is why it is a better primitive here than vm. (GitHub)

How whitelisting should really work

A lot of teams hear “whitelist packages” and think they should allow date-fns or lodash directly. That is still too coarse.

You want three policy levels.

1. Namespace whitelist

These are internal capability groups you define, such as:

  • math
  • dates
  • currency
  • tax
  • normalizers

These are ideal for domain logic because they let you present stable semantic surfaces to the model.

2. Package facade whitelist

This is not raw NPM package access. It is a curated wrapper over a package.

Example:

import { addDays, formatISO, parseISO } from "date-fns";

const packageRegistry = {
  dateFns: () => ({ addDays, formatISO, parseISO }),
};

3. Tenant or tool policy whitelist

Even if a package exists in the registry, a given agent or tenant may not be allowed to use it.

That means final access should be the intersection of:

  • globally supported capabilities
  • tenant policy
  • current agent policy
  • current tool invocation request

That keeps the model from escalating its own power simply by naming more packages.
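
The intersection rule is easy to state in code. A sketch; the layer names come from this post, not from any framework API:

```typescript
// Final capability set = intersection of every policy layer.
// A name must be globally supported AND allowed by tenant, agent, and request.
export function effectiveCapabilities(
  globallySupported: string[],
  tenantAllowed: string[],
  agentAllowed: string[],
  requested: string[]
): string[] {
  const layers = [new Set(tenantAllowed), new Set(agentAllowed), new Set(requested)];
  return globallySupported.filter((name) => layers.every((layer) => layer.has(name)));
}
```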

What “most secure method” means in practice

Here is the honest version.

If the code is untrusted, the strongest production pattern is not “just use a safer JavaScript library.” The strongest pattern is:

  • dedicated Node execution service
  • running in a separate process or container from the agent host
  • Node permission model enabled where possible
  • no filesystem permission unless explicitly required
  • no network permission unless explicitly required
  • no child process permission
  • no raw module loading
  • isolate-based execution inside the service
  • per-request timeout
  • per-request memory cap
  • rate limiting and audit logging
  • kill-and-recycle strategy for suspicious runs

Node’s permission model is now stable and is specifically intended to restrict access to resources during execution. That makes it a useful outer control around the execution worker process. (Node.js)
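
As a launch-time outer layer, the execution worker can be started with the permission model enabled and almost nothing granted. The flag names below follow current Node releases, and the service path is a placeholder:

```shell
# Start the execution worker with the permission model on. Only reads of the
# service's own files are granted; child processes and additional workers are
# denied by default because no --allow-child-process or --allow-worker is set.
node --permission \
  --allow-fs-read=/srv/js-exec-service \
  /srv/js-exec-service/dist/server.js
```

Note that the permission model focuses on filesystem, child process, worker, and addon access; network egress is better restricted at the container or network-policy level.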

So the recommendation is:

Do not run JavaScript evaluation in the Microsoft Agent Framework process. Run it in a separate hardened execution service, and inside that service use an isolate with only host-injected safe facades.

Prompting the agent correctly

One subtle mistake is giving the model too much freedom in how it uses the tool. Your tool description should bias toward deterministic use cases.

Good use cases:

  • schema normalization
  • mathematical calculations
  • JSON reshaping
  • derived field generation
  • deterministic validation helpers
  • short business-rule transforms

Bad use cases:

  • arbitrary web requests
  • importing unknown libraries
  • long-running workflows
  • anything requiring secret access
  • anything that should really be a reviewed backend feature

You want the tool to feel more like “dynamic formula execution” than “tiny remote code runner.”

Observability and governance

Once you add this capability, you need a paper trail.

Log:

  • agent name
  • conversation or run id
  • caller identity
  • code hash
  • requested namespaces
  • requested packages
  • approved namespaces
  • approved packages
  • execution duration
  • memory tier
  • success or failure
  • sanitized error output

Do not log secrets in payloads. Do log enough to reconstruct who ran what and under which policy.
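
Hashing the submitted source gives you a stable “who ran what” identifier without storing the payload itself. A sketch using Node’s built-in crypto module; the record fields are this post’s vocabulary, not a standard:

```typescript
import { createHash } from "node:crypto";

// Audit record for one execution request; the code itself is not stored,
// only a stable SHA-256 fingerprint of it.
export interface ExecutionAuditRecord {
  runId: string;
  callerId: string;
  codeHash: string;
  requestedNamespaces: string[];
  approvedNamespaces: string[];
  durationMs: number;
  ok: boolean;
}

export function hashCode(code: string): string {
  return createHash("sha256").update(code, "utf8").digest("hex");
}
```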

This matters because the risk is no longer just technical. It is operational. A dynamic execution tool without auditability becomes impossible to govern at scale.

Where this pattern is worth it

This pattern is especially valuable when building agents that need deterministic computation without shipping a new backend endpoint for every micro-use-case.

Examples:

  • tax calculation helpers
  • document extraction post-processing
  • migration mapping rules
  • payroll normalization
  • dynamic scoring or threshold logic
  • transforming AI output into strict structured shapes

In all of those cases, JavaScript is the execution language, but policy is the product.

Final opinion

The wrong way to add JavaScript to an agent is to think of it as a convenience feature.

The right way is to think of it as a controlled runtime.

Microsoft Agent Framework gives you the right extension point through function tools. tRPC gives you a clean typed boundary. Node can host the execution service. But the part that separates a toy from a production design is this: never let the model execute inside your primary trust boundary, and never equate “sandboxed” with “safe” unless you can explain the exact layers doing the isolation. (Microsoft Learn)

That is the architecture I use.

Review: Build a Network Application with Node

I had the opportunity to review the PACKT video series “Build a Network Application with Node” by Joe Stanco. This video series walks you through, at a high level, how to create different types of web apps using Node.js.

This video series targets the JavaScript developer with a basic understanding of Node.js. Joe Stanco does an excellent job of guiding the viewer through the creation of a series of web apps designed to highlight common development use cases. The examples start with a barebones “hello world” type app and gradually move the viewer to a more complex Socket.IO and Bootstrap app.

Joe Stanco’s presentation skills are impressive; from introduction to conclusion, his delivery was clear, easy to understand, and in sync with his examples. At a length of over 2 hours, the pace and clarity of presentation made “Build a Network Application with Node” easy to watch in a single sitting.

I do wish this was divided into a “Fundamentals” and an “Advanced” course, which would allow Joe to spend more time on the advanced topics.

You can check out a sample section of “Build a Network Application with Node” on YouTube here.