GAI Insights - Paul Baier

Another Company Drops ChatGPT For Claude

Paul Baier — Tue, 21 Apr 2026 16:13:05 GMT

Last week I wrote about OpenAI's credibility problem. The pressure continues. The full customer impact remains unclear, but product and credibility concerns are eroding some customers' commitment to OpenAI as a long-term digital intelligence vendor.

What are you seeing?

Join us for our Claude Demo JAM session May 4. Join 125 others who have already registered here.

10 great Claude demos (3 min demo, 3 min Q&A)

How to use Claude to create editable, powerpoint slides
How to use Claude to schedule a personal daily brief (to do, schedule, top leads from Hubspot, etc.)
How to use Claude with Excel to create a 5 year financial plan with sensitivity analysis
How to use Claude as your personal AI assistant, similar to OpenClaw
etc.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

GAI Insights helps alternative asset managers and companies increase revenue per employee. Our Navigator platform, AI analysts, AI Adoption Specialists aid companies in speeding AI adoption.

Zero Percent. Zero Alignment. The Sam Altman Question Enterprise Buyers Must Ask.

Paul Baier — Sat, 18 Apr 2026 12:32:50 GMT

Digital intelligence suppliers are the single most important competitive input for large enterprises today in this new Age of AI. CEOs and CIOs need absolute confidence in their long-term AI suppliers. That confidence is now eroding around OpenAI and its CEO.

Front page article, “above the fold” in Wall Street Journal, April 18, 2026

Private equity firms, CEOs and CIOs are asking GAI Insights a direct question: Is Sam Altman trustworthy, and is OpenAI a reliable long-term technology partner?

Two major investigative reports dropped this month. The Wall Street Journal reported on that Altman’s personal investment portfolio creates persistent conflicts of interest or self dealing as OpenAI approaches its IPO, valued at roughly $850 billion.

Days earlier, The New Yorker published a 16,000-word investigation which raised serious questions about a pattern of deception and self-dealing throughout Altman’s career.

Here is what makes Altman’s position unusual. He holds zero equity in OpenAI. His $3.3 billion net worth (per Forbes, March 2026) comes from a portfolio of more than 400 personal investments, some of which intersect with companies doing business with OpenAI.

Why Founder-CEO Trust Matters

In early-stage enterprise software, customers place enormous bets on the company roadmap. The founder-CEO is the roadmap. Their credibility, stability and alignment with customer interests determines whether a CIO signs a 3-5-year commitment or hedges with a competitor.

Consider the track record of founder-CEOs who built trusted enterprise platforms: Marc Benioff at Salesforce. Hasso Plattner at SAP. Aneel Bhusri at Workday. Scott Cook at Intuit. Jay Chaudhry at Zscaler. Aaron Levie at Box. Stewart Butterfield at Slack. Bill Gates at Microsoft. Larry Ellison at Oracle.

Each of these leaders held significant equity. Their personal financial outcomes were directly tied to customer success. Their incentives were aligned with the enterprises that depended on them.

Altman fails this test. He holds no OpenAI equity. His personal wealth grows through outside ventures, some of which compete with OpenAI for resources and attention. He told the Big Technology podcast he is “zero percent” excited to be a public company CEO.

The Competitive Picture Has Shifted

OpenAI’s position in the AI market has weakened. As the WSJ noted, the company’s lead is slipping. Anthropic’s annualized revenue reached $30 billion in April 2026, surpassing OpenAI’s $24 to $25 billion run rate. A year ago, Anthropic was at roughly $1 billion.

We work with multiple asset management firms, including private equity, hedge funds and alternative asset managers, delivering 201 and 301 AI training. Every one of these firms has rolled out Claude alongside ChatGPT. Their feedback is consistent: Claude is the superior product for professional knowledge work.

The Questions Enterprise Buyers Are Asking

The questions we hear from CIOs and investment professionals are pointed:

Does Altman have OpenAI’s best interests at heart, or his own?
Are the WSJ and New Yorker reports competitive hit pieces, or legitimate investigative journalism that should concern enterprise buyers?
How does OpenAI execute an IPO with a CEO who holds no equity, maintains opaque personal investments and says he does not want the job?

These are reasonable questions. Enterprise technology buyers spend millions on AI infrastructure. They deserve clear answers about the leadership and incentive alignment of their most important suppliers.

Can you trust OpenAI? Are you reconsidering them as a strategic IT supplier to your company?

Join us for our Claude Demo JAM session May 4. Join 85 others who have already registered here.

10 great Claude demos (4 min demo, 4 min Q&A)

How to use Claude to create editable, powerpoint slides
How to use Claude to schedule a personal daily brief (to do, schedule, top leads from Hubspot, etc.)
How to use Claude with Excel to create a 5 year financial plan with sensitivity analysis
How to use Claude as your personal AI assistant, similar to OpenClaw
etc.

"Trust takes years to build, seconds to break, and forever to repair." -author unknown

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

GAI Insights helps alternative asset managers and companies increase revenue per employee. Our Navigator platform, AI analysts, AI Adoption Specialists aid companies in speeding AI adoption.

FAQ

Q: Why should enterprise buyers care about Sam Altman’s personal investment portfolio?

A: Altman holds zero equity in OpenAI. His $3.3 billion net worth comes from more than 400 personal investments, some of which intersect with companies doing business with OpenAI. This creates a misalignment rare among founder-CEOs of major enterprise platforms. Every trusted enterprise software leader, from Benioff to Gates to Ellison, held significant equity in their company. Their wealth grew when customers succeeded. Altman’s wealth grows through outside ventures. Enterprise buyers spending millions on AI infrastructure should understand this incentive structure before signing long-term commitments.

Q: Has OpenAI’s competitive position changed?

A: Yes. Anthropic’s annualized revenue reached $30 billion in April 2026, surpassing OpenAI’s $24 to $25 billion run rate. A year ago, Anthropic was at roughly $1 billion. In our work with private equity firms, hedge funds and alternative asset managers, every firm has deployed Claude alongside ChatGPT. The consistent feedback: Claude is the superior product for professional knowledge work. OpenAI still holds significant market share, but the competitive gap has closed and, in some segments, reversed.

Q: Are the Wall Street Journal and New Yorker investigations legitimate concerns or competitive noise?

A: Both publications have rigorous editorial standards and long track records of investigative reporting. The WSJ reported on persistent conflicts of interest tied to Altman’s personal portfolio as OpenAI approaches an IPO valued at roughly $850 billion. The New Yorker published a 16,000-word investigation raising questions about a pattern of deception across Altman’s career. Enterprise buyers should treat these reports as material inputs to their vendor risk assessment, not dismiss them as hit pieces. The questions they raise about leadership stability, incentive alignment and governance are standard due diligence for any strategic technology supplier.

FYI: Here Is My Current List of Skills in Claude

Paul Baier — Fri, 17 Apr 2026 14:02:26 GMT

Tens of thousands of nontechnical and technical users are switching from ChatGPT to Claude. 97% of attendees at the MIT AI Summit last Friday use Claude as their preferred AI productivity tool according to my survey.

Claude “Skills” are custom instruction sets you save inside Claude so it performs specific tasks the same way every time. Instead of explaining how you want something done at the start of each conversation, you write the instructions once. Claude follows them automatically when the right situation comes up. Skills are very similar to Custtom GPT in ChatGPT.

You will find introductory information on Skills at Anthropic, on YouTube, and many other places. When I need to create a Skill, I ask Claude to build it. For instance, I asked Claude to summarize the proofreading feedback I received from Forbes on my column and turn it into a Skill. After you create a Skill, it appears in your Skills Library in Claude.

To use a Skill, start a chat in Claude and reference the Skill. Example: “Proofread the article below using Forbes proofreading Skill.”

Skills work best for specific tasks you perform repeatedly. I had 7-10 Custom GPTs in ChatGPT and have been converting them to Skills in Claude.

Here is my current list of Skills.

What Skills have you built?

Join us for our Claude Demo JAM session May 4. Register here.

10 great Claude demos (4 min demo, 4 min Q&A)

how to use Claude to create editable, powerpoint slides
how to use Claude to schedule a personal daily brief (to do, schedule, top leads from Hubspot, etc.)
how to use Claude with Excel to create a 5 year financial plan with sensitivity analysis
how to use Claude as your personal AI assistant, similar to OpenClaw
etc.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

GAI Insights helps alternative asset managers and companies increase revenue per employee. Our Navigator platform, AI analysts, AI Adoption Specialists aid companies in speeding AI adoption.

Anthropic's Mythos Too Powerful to Release | Essential AI News for Apr 6-10

Paul Baier — Mon, 13 Apr 2026 01:28:44 GMT

Hello everyone,

GAI Insights helps alternative asset managers and companies increase revenue per employee. Our Navigator platform, AI analysts, and AI Adoption Specialists help companies speed AI adoption.

Essential AI News Apr 6-10

Last week, our AI analysts rated 11 articles as “Essential” reads

The AI Transformation Manifesto (McKinsey)
Rationale: This McKinsey article lays out 12 themes for companies trying to become truly AI-native, arguing that competitive advantage comes less from access to AI tools and more from building enduring capabilities, focusing on economic leverage points, earning trust, and mastering agentic engineering. Our analysts highlighted it as a rare executive guide that is both concise and practical, especially for leaders thinking about implementation speed, adoption at scale, and how to align business strategy with Generative AI transformation.

AI Adoption By the Numbers (a16z)
Rationale: This article from a16z analyzes enterprise AI adoption trends based on enterprise startup penetration data, showing significant real-world deployment of generative AI across large organizations with measurable productivity gains. Our analysts emphasized the importance of the data-driven perspective, noting that adoption rates — particularly among Fortune 500 companies — demonstrate that AI is moving into production at scale, countering narratives that AI initiatives are largely failing.

97% of Attendees at MIT AI Summit Prefer Claude Over ChatGPT
Rationale: This survey if 125 technical and nontechnical attendees at the large Imagination in Action AI Summit at MIT Media Lab April 10 shows the massive shift away from ChatGPT to Claude by early adopters.

New York Times: Economists Once Dismissed the A.I. Job Threat, but Not Anymore
Rationale: This NYT article examines a shift in mainstream economic thinking, as economists increasingly acknowledge that AI could materially disrupt white-collar employment and reshape labor markets in the near term. Our analysts stressed that this matters less for its policy prescriptions than for the fact that a broader business audience is finally being forced to take AI-driven job displacement seriously, making it a key signal for enterprise leaders planning workforce, operating model, and competitive strategy.

Anthropic’s Project Glasswing - Security Concerns with Mythos
Rationale: Project Glasswing introduces Anthropic’s gated Claude Mythos Preview for defensive cybersecurity, with capabilities aimed at finding and helping remediate serious software vulnerabilities in critical infrastructure and widely used open-source systems. Our analysts highlighted that it signals a new level of AI capability, pairs that capability with a tightly controlled coalition model, and shows how frontier AI may now be strong enough to force organizations to rethink how they harden software and internet infrastructure.

Auto-Research for Legal Agents
Rationale: This article shows how AI self-improving agent loops can materially improve legal agent performance, turning a traditionally labor-intensive drafting workflow into a highly automated one. Our analysts emphasized that this is one of the first clearest operational examples yet of self-improving agent loops moving from theory into practice, with a jump from very low initial performance to near-complete task success on complex complaint drafting.

Claude Managed Agents: Get To Production 10x Faster
Rationale: Anthropic introduces Claude Managed Agents, a fully hosted platform for building, deploying, and managing AI agents at scale with integrated memory, orchestration, and analytics. Our analysts noted this as a major step toward enterprise-grade AI automation, highlighting its ability to simplify production deployment and create platform stickiness, while also signaling Anthropic's strategic move into infrastructure and increasing competition with hyperscalers and OpenAI specifically.

Meta-Harness: End-to-End Optimization of Model Harnesses
Rationale: This paper introduces Meta-Harness, a system for automatically improving the code, prompts, retrieval, and context logic surrounding large language models rather than just improving the models themselves. Our analysts highlighted this as a major architectural shift because it suggests that more of the practical intelligence in AI systems may move into the harness layer, with the paper showing gains such as a 7.7-point improvement in text classification using 4x fewer context tokens and better performance on math reasoning and agentic coding benchmarks.

Mapping AI into Production: A Field Experiment on Firm Performance
Rationale: This research studies the “mapping problem” in AI adoption, or how firms identify where AI actually creates value inside real production workflows rather than only on isolated tasks. Our analysts emphasized that this is one of the more useful organizational AI studies to date because the experiment across 515 startups found treated firms discovered 44% more AI use cases, completed 12% more tasks, were 18% more likely to acquire paying customers, generated 1.9x higher revenue, and reduced demand for outside capital, reinforcing that AI transformation depends on redesigning the company around the technology, not just adding tools.

The Art of Building Verifiers for Computer Use Agents
Rationale: This research paper tackles one of the biggest bottlenecks in agentic AI: how to reliably verify whether computer-use agents actually completed tasks correctly. Our analysts emphasized that verification is becoming foundational for trustworthy AI automation because agents still hallucinate, go off the rails, and need stronger checks on process, outcomes, and context before enterprises can scale them with confidence.

Introducing Meta’s Muse Spark: Scaling Towards Personal Superintelligence
Rationale: Meta introduces Muse Spark, a new high-performance large language model aimed at advancing toward personal superintelligence with strong multimodal, reasoning, and agentic capabilities. Our analysts highlighted its impressive benchmark performance, innovations like iterative long-thinking with thought compression to reduce token usage, and its potential impact given Meta’s massive distribution and free access, making it a significant development in the competitive AI model landscape.

We completed our 554th episode of our Daily AI News Show. Each week day, our AI Analysts review and rate 5 enterprise AI articles as “Essential”, “Important”, or “Optional” for you, the time-starved AI leader. Watch the debate on our Daily AI news show on YouTube, Spotify or Apple Podcasts receive a daily email, and search reviewed articles here.

OpenAI and Anthropic Watch

OpenAI

S-1 not filed yet for IPO. 2026 revenue projected to be $30B, up from $20B last year
Closed $122B round
The ads in ChatGPT already generating $100 million in annualized revenue, expected to be $100B revenue by 2030

Anthropic

S-1 not filed yet for IPO. Anthropic’s revenue explodes - surpasses OpenAI
Claude Mythos: too dangerous to release
Anthropic is rumored to be considering building its own AI chips, mirroring moves by Google (TPUs) and Amazon (Trainium).
Continues to position itself as a “safety first” company

Would you value a email customized for your company or AI Center of Excellence? As us about our Navigator product.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

GAI Insights helps alternative asset managers and companies increase revenue per employee. Our Navigator platform, AI analysts, AI Adoption Specialists aid companies in speeding AI adoption.

FAQ

Question: What separates companies that get real business value from AI from those that only run pilots?

Answer: The difference is not access to better tools. It is the ability to redesign work around AI, focus on the highest-value use cases, and build the internal systems needed to scale adoption. Several of the articles in this roundup point to the same conclusion: companies create stronger results when they map AI into real production workflows, improve trust and oversight, and treat AI as an operating model shift rather than a side experiment.

Question: Is enterprise AI adoption now large enough that executives should treat it as an immediate priority?

Answer: Yes. The data highlighted here shows that generative AI is already moving into production across large organizations, including Fortune 500 companies, with measurable productivity gains. For executive teams, the issue is no longer whether AI will matter. The more important question is where it can create the fastest financial impact and how quickly the company can adopt it without creating unnecessary risk.

Question: What are the biggest risks leaders need to manage as AI becomes more powerful across the enterprise?

Answer: Three risks stand out: workforce disruption, security exposure, and unreliable outputs. As AI takes on more knowledge work, leaders need a clear plan for role redesign and reskilling. As models become more capable in areas like cybersecurity and automation, companies need tighter controls, stronger review processes, and clear limits on access. And as AI agents handle more tasks, verification becomes critical so teams can trust results before acting on them.

If You Are NOT Using Claude, You Are Officially BEHIND

Paul Baier — Thu, 09 Apr 2026 20:09:55 GMT

I’m calling it. You are behind if you are not using Claude.

If you are one of the many companies that have already invested in ChatGPT and employee training around it, we recommend keeping that investment (several clients asked us whether they should drop ChatGPT, and our view is no). At the same time, we encourage you to evaluate the ROI of giving all or some of your employees access to Claude.

Join me at this large AI Conference tomorrow Friday, Apr 10 at MIT curated by Imagination in Action. I’m moderating 2 panels and participating on a third.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

GAI Insights helps alternative asset managers and companies increase revenue per employee. Our Navigator platform, AI analysts, AI Adoption Specialists aid companies in speeding AI adoption.

Wow. Anthropic Now Expected To Have MORE Revenue Than OpenAI This Year

Paul Baier — Tue, 07 Apr 2026 19:32:07 GMT

Anthropic has closed the gap with OpenAI.

Just a year ago, OpenAI had a material lead over Anthropic. That is no longer true.

Anthropic’s run-rate revenue surpassed $30 billion in April 2026, up from $9 billion at the end of 2025. Claude Code, Cowork, and Claude.ai are driving adoption across technical and enterprise audiences alike. Monthly visits to claude.ai surged from 16 million in January 2025 to 220 million in January 2026, a 13-fold increase in twelve months. One in five businesses on the corporate payments platform Ramp now pays for Anthropic, up from one in 25 a year ago.

OpenAI’s estimated revenue run rate for this year is $25B, which is now lower than Anthropic’s.

OpenAI is not standing still. It closed a $122 billion funding round in April 2026 at an $852 billion post-money valuation, the largest private capital raise in technology history.

But OpenAI is managing a compounding set of headwinds.

Investors are concerned about the massive losses OpenAI is projecting (source Wall Street Journal)
Three C-suite roles were vacated or restructured within a single week. Fidji Simo, CEO of AGI Deployment, is taking medical leave. CMO Kate Rouch is stepping down after breast cancer treatment. COO Brad Lightcap moved to a special projects role. Losing three senior leaders simultaneously, ahead of a potential IPO, is not a routine transition.
The New Yorker published a sweeping investigation this week, built on internal memos from former OpenAI co-founders Ilya Sutskever and Anthropic CEO Dario Amodei, concluding that OpenAI systematically abandoned its safety-first founding mission as it scaled. One former board member called Altman “unconstrained by truth.” The story will complicate enterprise procurement conversations.
There is near zero interest for OpenAI shares in the secondary market, always a concern sign.
The OpenAI-Microsoft divorce is nearly final, with Microsoft investing heavily in its own AI models and partnering with Anthropic for Cowork.
On the product side, OpenAI manages a wide and expanding portfolio: ChatGPT, Codex, image generation, voice, and an in-development super app aimed at combining them. Anthropic has concentrated resources on code, agents, and enterprise API access. That focus reflects a deliberate strategic choice, and the revenue numbers show which approach is gaining ground faster.

The competitive question for enterprise buyers is direct. Your AI vendor strategy, built 12 to 18 months ago when OpenAI was the default choice, deserves a fresh look. Claude is now the only frontier AI model available across all three major cloud platforms: Amazon Web Services, Google Cloud, and Microsoft Azure. The number of enterprise customers spending over $1 million annually on Anthropic doubled from 500 to more than 1,000 in less than two months.

OpenAI retains capital, brand recognition, and a massive consumer base.

Are you re-considering your company’s strategy with OpenAI? Let me know.

Join me at this large AI Conference this Friday, Apr 10 at MIT curated by Imagination in Action. I’m moderating 2 panels and participating on a third.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

GAI Insights helps alternative asset managers and companies increase revenue per employee. Our Navigator platform, AI analysts, AI Adoption Specialists aid companies in speeding AI adoption.

FAQ

Question: Should we reconsider our AI vendor strategy now that Anthropic has closed the gap with OpenAI?

Answer: Yes. Vendor selection made 12–18 months ago likely assumed OpenAI as the default leader. Anthropic’s rapid revenue growth, enterprise adoption, and multi-cloud availability change that assumption. A structured reassessment comparing performance, cost, integration flexibility, and risk across both providers is now warranted.

Question: How does Anthropic’s focused product strategy impact enterprise value compared to OpenAI’s broader platform approach?

Answer: Anthropic’s concentration on code generation, agents, and enterprise APIs can translate into faster improvements and clearer ROI in specific business workflows. OpenAI’s broader platform offers more versatility but can introduce complexity. The better choice depends on whether your priority is depth in key use cases or a unified, multi-modal ecosystem.

Question: What risks should executives weigh when evaluating OpenAI given recent leadership changes and scrutiny?

Answer: Leadership turnover and public criticism around governance and safety can affect long-term stability, procurement confidence, and regulatory perception. While OpenAI remains well-funded and widely adopted, enterprises should factor in vendor continuity, transparency, and alignment with internal risk standards before committing to large-scale deployments.

3 Types of AI Fluency at Zapier | Essential AI News for Mar 30-Apr 3

Paul Baier — Sun, 05 Apr 2026 13:55:03 GMT

Hello everyone,

GAI Insights helps alternative asset managers and companies increase revenue per employee. Our Navigator platform, AI analysts, and AI Adoption Specialists aid companies in speeding AI adoption.

Essential AI News Mar 30-Apr 3

Inside Shopify’s AI-first engineering playbook (Bessemer Venture Partners)

Rationale: This article lays out how Shopify built an AI-first engineering model around shared infrastructure, broad tool experimentation, weekly demos, and guardrails against “comprehension debt,” with a 20% productivity gain. Our analysts emphasized that this is a practical blueprint for AI leaders because it shows how executive sponsorship, fast experimentation, human review, and real operating metrics can turn Generative AI from a coding aid into an organizational operating model.

One Year Later: Raising the AI Fluency Bar for Every Zapier Hire

Rationale: This article outlines how Zapier operationalizes AI fluency across its workforce, defining levels like capable, adoptive, and transformative to guide hiring and development in the era of Generative AI. Our analysts emphasized its highly actionable framework, highlighting how it provides concrete benchmarks across functions (e.g., marketing, legal, ops) and positions AI talent strategy as core to business strategy, making it a practical blueprint for enterprise AI adoption and workforce transformation.

Redpoint AI Market Update Presentation

Rationale: This 69-page market update lays out a data-heavy view of the AI economy, covering model economics, infrastructure demand, vertical versus horizontal SaaS disruption, and why the current cycle looks different from the dot-com era. The panel treated it as essential because its fact-based slides helped frame where value is accruing now, especially around infrastructure utilization and unusually high ARR-per-employee benchmarks for leading AI companies, even as one analyst flagged cash-flow risk beneath the boom.

Case Study: Rocket Close transforms mortgage document processing with Amazon Bedrock and Amazon Textract

Rationale: Rocket Close used Amazon Bedrock tools to automate mortgage abstract-package processing, handling files that average 75 pages, cutting processing from about 30 minutes per package to under 2 minutes, and reaching about 90% overall accuracy with a system designed to scale past 500,000 documents annually. Our analysts highlighted this as one of the clearest enterprise AI automation case studies of the day because it combines measurable business impact, strong implementation detail, and a document-heavy workflow pattern that can transfer to many other industries.

AI Agent Traps (Research Paper)

Rationale: This paper introduces a taxonomy of emerging AI security risks tied to agentic systems, including prompt injection, semantic manipulation, and human-in-the-loop exploits in AI automation workflows. Our analysts stressed that as AI agents gain autonomy (e.g., browser and system control), they create entirely new attack surfaces, with real-world exploit success rates already significant, making this a critical framework for understanding and mitigating next-generation AI security threats.

Reimagine Marketing at Volkswagen Group with Generative AI

Rationale: Volkswagen Group showcases how generative AI is transforming marketing through large-scale AI-driven content creation, particularly in generating brand-consistent images across multiple global brands. Our analysts highlighted this as a strong enterprise case study demonstrating real-world deployment, including the use of LLMs for evaluation and quality control, offering valuable insights into how large organizations operationalize generative AI despite lacking explicit ROI metrics.

Arcee’s Trinity-Large-Thinking: Scaling an Open Source Frontier Agent

Rationale: Arcee introduces Trinity, a new open-weights frontier AI agent model designed to compete with leading large language models while enabling organizations to fully own and control their AI systems. Our analysts emphasized this as a critical market signal, highlighting the emergence of a U.S.-based open model that fills a major gap for enterprises—especially in regulated industries—seeking sovereignty over their AI infrastructure and reducing reliance on closed or foreign models.

Introducing Multi-Model Intelligence in Researcher (Microsoft Product)

Rationale: Microsoft introduces multi-model intelligence in its Researcher product, combining multiple large language models with critique loops to improve research quality, accuracy, and depth. Our analysts emphasized the importance of using LLM-as-a-judge patterns to validate outputs, noting that this approach significantly enhances factual accuracy and will likely become a standard in AI-driven research workflows, especially given Microsoft's enterprise distribution and integration with organizational data.

Sycophantic AI decreases prosocial intentions and promotes dependence

Rationale: This paper examines how leading AI models respond sycophantically to users and shows that flattering, affirming outputs can reduce prosocial behavior, worsen judgment, and increase dependence. Our analysts saw it as essential because the risk is not just bad model tone but downstream effects on workplace culture, coaching, HR, and mental health, especially when users actually prefer the more sycophantic responses.

On our Daily AI Show each week day, our AI Analysts review and rate 30 enterprise AI articles as “Essential”, “Important”, or “Optional” for you, the time-starved AI leader. Watch the debate on our Daily AI news show on YouTube, Spotify or Apple Podcasts receive a daily email, and search reviewed articles here.

OpenAI and Anthropic Watch

OpenAI

S-1 not filed yet for IPO. 2026 revenue projected to be $30B, up from $20B last year
Closed $122B round
Acquires media show TBPN
Discontinues Sora

Anthropic

S-1 not filed yet for IPO. 2026 revenue project to be $20B, up from $9M last year
Buys 8-month old AI Pharma startup for $400M
Some predict that Anthropic’s growth rate will surpass OpenAI
Claude Code source code accidentally leaked

Would you value a email customized for your company or AI Center of Excellence? As us about our Navigator product.

Join us Monday April 6, at 7p ET / 4p PT to see a demo OpenClaw (a Personal AI Agents) and to discuss the implications for companies. Claude is already matching some of these features and ChatGPT is expected to as well. Register here (215 already have registered).

Join me at this large AI Conference this Friday, Apr 10 at MIT curated by Imagination in Action. I’m moderating 2 panels and participating on a third.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

GAI Insights helps alternative asset managers and companies increase revenue per employee. Our Navigator platform, AI analysts, AI Adoption Specialists aid companies in speeding AI adoption.

GAI Insights Secures $500,000 Pre-Seed Round to Scale Its Navigator Platform for Financial and Enterprise Markets

Paul Baier — Thu, 02 Apr 2026 14:26:05 GMT

All,

We closed a pre-seed funding round for GAI Insights. Details below. Thank you to everyone in this Substack community who has shared in our collective learning through this amazing Age of AI.

Onward,

Paul

GAI Insights Secures $500,000 Pre-Seed Round to Scale Its AI Navigator Platform for Financial and Enterprise Markets

Analyst firm experiences rapid growth as private equity firms, asset managers and enterprises shift from AI experimentation to scaled deployment.

BOSTON, April 2, 2026 – GAI Insights, a leading AI research and advisory firm, today announced it closed a $500,000 pre-seed funding round to accelerate the development of its proprietary GAI Insights AI Navigator™ platform. The investment will scale the firm’s specialized services for private equity firms, hedge funds, asset management firms and enterprises that view the speed of AI adoption as a critical competitive advantage.

A syndicate of three CEOs, four Harvard Business School professors and other experienced angel investors backed the round, signaling strong market confidence in the firm’s trajectory and validation of GAI Insights’ unique methodology, developed by co-founder Dr. John Sviokla.

The capital injection follows six months of rapid customer acquisition. Financial institutions, government agencies and enterprise leaders are increasingly selecting GAI Insights to transition from AI experimentation to full-scale deployment. Recent firm milestones include:

Read the full release on Linkedin

Apply here to speak at our annual conference, GAI World 2026 in Boston Sep 28-30.

.Our AI Adoption Specialists are conducting a lot of GenAI 201 and 301 instructor-lead training for ChatGPT, Copilot, and Claude. Workshops are department level specific (equity research, investment, due diligence, portfolio support, investor relations, customer success, sales, etc.). Contact us if you have this need.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

FAQ

Q: What is GAI Insights, and what did they announce?

A: GAI Insights is an AI analyst, research and advisory firm based in Boston. They announced the closing of a $500,000 pre-seed funding round to scale their proprietary AI Navigator platform.

Q: Who is the target market for GAI Insights’ services?

A: GAI Insights specializes in serving private equity firms, hedge funds, asset management firms and large enterprises that view the speed of AI adoption as a primary competitive differentiator.

Q: What is GAI Insights AI Navigator platform?

A: AI Navigator is a proprietary software platform developed by GAI Insights. It uses dozens of AI agents to convert enterprise AI signals into executive-grade intelligence, powering the firm’s research, biweekly trend reports, and the Daily AI News Show.

Meta's AI Agent Breaks All Security Checks | Essential AI News for Mar 23-27

Paul Baier — Sun, 29 Mar 2026 17:12:42 GMT

Hello everyone,

You are one of 12,000 readers from Fidelity, UBS, Bank of America, Merck, Pfizer, Roche, Bayer, AIG, Prudential, Nationwide, Travelers, HIG, Harvard, MIT, Stanford, Microsoft, OpenAI, Seekr, Mistral, Google, Amazon, Liquid AI, Harvey.AI, Databricks, Cloudflare, Intel, McKinsey, PwC, Accenture, BCG, Bain, IBM, Deloitte and others.

Essential AI News Mar 23-27

Last week, our AI analysts rated 11 articles as “Essential” reads (I believe this is a record).

AI Security Issue: Meta's rogue AI agent passed every identity check — four gaps in enterprise IAM explain why

Ratings Rationale: This article examines an AI security failure pattern in which an agent with valid credentials operated inside authorized boundaries yet still exposed sensitive information, highlighting four gaps in enterprise identity governance: missing agent inventory, static credentials, no post-authentication intent validation, and unverified agent-to-agent delegation. Our analysts treated it as a critical warning for AI leaders because it shows that traditional IAM and guardrails are not enough once agentic systems are given real permissions and autonomy.

What the Best AI Users Do Differently — and How to Level Up All of Your Employees (Harvard Business Review)

Ratings Rationale: This HBR article looks at how 2,500 employees at KPMG used large language models over an eight-month period and finds that the strongest users are more ambitious, treat AI as a reasoning partner, delegate complex tasks with clear objectives, and use it as a general cognitive tool rather than just a shortcut. Our analysts emphasized that this is essential for leaders focused on capability building because it shifts the question from simple adoption metrics to what sophisticated AI use actually looks like across the workforce.

You've Finally Figured Out AI at Work — Now Comes the Bill (WSJ)

Ratings Rationale: This WSJ article explains how token usage is becoming a real operating cost for enterprises as AI adoption moves from experimentation to scaled deployment, with companies starting to track tokens as a proxy for compute spend, productivity, and governance. Our analysts emphasized that this is no longer just an IT line item: leaders need new budgeting and measurement frameworks because AI is one of the first major enterprise software categories where successful usage can materially increase the bill over time.

Agentic Scenarios Every Marketer Must Prepare For (Boston Consulting Group)

Ratings Rationale: BCG lays out four possible agentic-commerce futures—an open bazaar, brand resurgence through data ecosystems, super apps, and creator-led authenticity—so marketers can plan for multiple AI-driven market structures instead of betting on a single outcome. The key takeaway from the discussion was that every scenario still collapses to the same two requirements: machine discoverability and brand desirability, meaning brands need to shift from SEO to answer-engine optimization and make themselves easy for AI agents to find, trust, and recommend.

Anthropic just shipped an OpenClaw killer called Claude Code Channels, letting you message it over Telegram and Discord

Ratings Rationale: Anthropic introduces Claude Code Channels, which lets developers connect Claude Code to Telegram and Discord so they can message a live coding agent remotely, with messages injected into the active session and replies sent back through the messaging platform. Our analysts highlighted why this matters for enterprise adoption: it brings always-on agent workflows closer to mainstream use while offering a more accessible and potentially safer alternative to more complex do-it-yourself autonomous agent setups like OpenClaw.

Harness Design for Long-Running Application Development (Anthropic)

Ratings Rationale: This post from Anthropic explores advanced design patterns for building reliable, long-running AI agents using structured harnesses, including state management, evaluation loops, and orchestration of generative AI systems. Our analysts emphasized how this represents a next-generation shift toward agent orchestration introducing evaluator agents, subjective quality scoring (e.g., design taste), and iterative improvement loops that fundamentally reshape how AI-driven software development is executed.

The Software Factory: Why Your Team Will Never Work the Same Again

Ratings Rationale: This article argues that AI coding agents and orchestration tools are turning software delivery into a “software factory,” where builders delegate implementation to agents and focus on design and review. Our analysts highlighted this as essential because it captures the shift to agentic development, collapsing cycles from weeks to hours, and shows that everyone, not just engineers, can become builders, using AI to streamline workflows, automate processes, and extend teams with AI teammates.

Mark Zuckerberg Is Building an AI Agent to Help Him Be CEO (WSJ)

Ratings Rationale: This article describes Meta’s push to give Mark Zuckerberg a personal AI agent that speeds up information retrieval and supports a flatter, more AI-native operating model. Our analysts emphasized that even though the article is light on detail, it is strategically important because it signals that top leadership is directly using generative AI for decision support, organizational flattening, and personal productivity rather than treating AI as a tool only for technical teams.

Scaling Karpathy's Autoresearch: What Happens When the Agent Gets a GPU Cluster

Ratings Rationale: This article shows how giving an autonomous coding agent access to 16 GPUs let it run about 910 experiments over 8 hours, reach the same best validation loss roughly 9x faster than a sequential baseline, and improve bits-per-byte metric from 1.003 to 0.974 by exploring parameter interactions in parallel. Our analysts framed this as more than a narrow experiment: it points toward recursive self-improvement and parallel AI research workflows, meta level intelligence, that could become a new performance lever for leading model builders and advanced enterprise AI teams.

Intercom's new post-trained Fin Apex 1.0 beats GPT-5.4 and Claude Sonnet 4.6 at customer service resolutions

Ratings Rationale: Intercom’s Fin Apex 1.0 is presented as a smaller post-trained customer-service model that outperforms larger frontier models on support-resolution tasks, showing how domain-specific AI systems can beat general-purpose LLMs in narrow workflows. Our analysts viewed this as a signal that vertical AI vendors can build durable moats through proprietary data, evals, and post-training, with software differentiation shifting from feature engineering toward intelligence engineering as companies decide whether to tune, own, or vertically specialize their models.

The Great Reorg: A Human’s Guide (Joann Chen)

Ratings Rationale: This piece argues that enterprises are moving beyond individual AI productivity gains into full organizational redesign, with smaller teams, collapsing job boundaries, and four durable human roles such as system architects, relationship experts, accountability officers, and validators. The most valuable insight from the panel was the “validator” problem: if generative AI automates away junior roles, companies may cut costs today but weaken the pipeline that develops tomorrow’s human reviewers, decision-makers, and domain expertise.

Would you value a email customized for your company or AI Center of Excellence? As us about our Navigator product.

.Our AI Adoption Specialists are conducting a lot of GenAI 201 and 301 instructor-lead training for ChatGPT, Copilot, and Claude. Contact us if you have this need.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

ChatGPT Is the New Google for Your Buyers

Paul Baier — Sat, 28 Mar 2026 20:39:44 GMT

Some firms report a 40-50% drop in click-through rate from Google.com. The cause: users are shifting from traditional search engines to AI chatbots like ChatGPT, Claude, and Perplexity and Google has changed is algorithms.

We see this at GAI Insights. Over the past three quarters, AI chatbots drove 243 clicks to www.GAIInsights.com. ChatGPT drives the most traffic, but Claude grew from 1 click in Q3 to 11 in Q1. The numbers are small for a company our size, but the trend is clear: AI chatbots are becoming a real source of web traffic. (We have at least one confirmed sales lead from someone using ChatGPT).

This shift has direct revenue implications for any business that relies on SEO for lead generation and product discovery. The Google search box is no longer the only front door to your website.

A new industry has emerged around this shift. Answer Engine Optimization (AEO) or Generative AI Engine Optimization (GEO) aim to influence how AI chatbots reference and rank your content. The space is early. Vendors promote dozens of tactics, but no single approach has proven reliable at scale.

Our recent 29 page GAI Insights report, “AI Answer Engines: The New Digital Front Door,” is a primer for executives to understand this critical issue and list specific recommendations. Email me if you would like a copy.

Here are the numbers for www.GAI Insights.com

Our AI Adoption Specialists are conducting a lot of GenAI 201 and 301 instructor-lead training for ChatGPT, Copilot, Claude, and N8N. Contact us if you have this need.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

FAQ

Question: How will the shift from Google search to AI chatbots affect our current lead generation strategy?

Answer: As more users rely on AI chatbots for answers, fewer people click through traditional search results. This can reduce traffic from Google while increasing referrals from AI tools. Companies should monitor these new sources closely and adjust their content so it is easily cited and recommended by AI systems.

Question: What is the business value of investing in Answer Engine Optimization right now if the space is still emerging?

Answer: Even though the space is early, the shift in user behavior is already happening. Early investment allows companies to test what drives visibility in AI responses, capture new traffic sources, and gain a competitive advantage before best practices become standardized.

Question: How can we tell if AI chatbots are actually driving meaningful revenue, not just small amounts of traffic?

Answer: Track referrals from AI tools the same way you track other channels, focusing on lead quality and conversion rates rather than volume alone. Even a small number of high-intent visitors can generate measurable revenue, making it important to evaluate impact based on outcomes, not just clicks.

CORRECTION: I Incorrectly Stated That Anthropic Booked $6 Billion In Revenue in February

Paul Baier — Fri, 27 Mar 2026 10:24:48 GMT

Correction: In yesterday’s newsletter, I incorrectly stated that Anthropic booked $6 billion in revenue in February. I apologize for the error.

A $6B revenue month implied a $72B+ annual run rate. Bloomberg reports that Anthropic nears a $20B annualized run rate, driven in large part by strong adoption of Claude Code.

I hold myself to a high standard of accuracy and regret the error.

Claude is on a tear

Early adopters on the technical and nontechnical side are flocking to Claude. We know of dozens of asset management firms and companies now rolling Claude out to their employees and updating their 201 and 301 training programs. Three firms told us that escalation requests to approve Claude went directly to the CEO.

We believe the interest is fully warranted.

Claude Code is a standout product
Claude with Excel and PowerPoint currently outperforms ChatGPT’s Excel and PowerPoint capabilities, and far exceeds Copilot
Cowork is a task-oriented AI tool for nontechnical employees. Multiple companies are rethinking their use of tools like N8N, Zapier, and others as a result
Claude has aggressively shipped features similar to those in OpenClaw, the white-hot, open-source, personal AI assistant software

For my personal use and all my colleagues at GAI Insights, Claude now accounts for 95% of our usage and is our preferred tool for personal employee productivity. I shut down my OpenClaw project because of Claude.

Anthropic Shipped 45+ Features in 90 Days

Anthropic has been shipping at a rapid pace Below is a list of features by product, each scored on a 1-10 impact scale for enterprise value. Several features include a link to a short product demo.

CLAUDE CHAT (Web & Mobile)

Opus 4.6 Model (Feb 5) - The most capable Claude model with 1M token context window, stronger multi-step reasoning, and top benchmark scores in legal, financial, and coding tasks. Impact: 9/10 (Video: analyze multiple files) (Video: turn messy notes into neat notes)
Sonnet 4.6 Model (Feb 17) - Full upgrade across coding, computer use, long-context reasoning, agent planning, knowledge work, and design with 1M token context window in beta. Impact: 8/10
Agent Teams (Feb 5) - Deploy multiple specialized AI agents to collaborate on different parts of a large project simultaneously. Impact: 9/10
Persistent Memory for All Users (Mar 2) - Claude retains your name, communication style, writing preferences, and project context across separate conversations. Impact: 7/10
Infinite Chats (Jan) - Eliminates context window limit errors by summarizing earlier messages when a chat approaches its limit. Impact: 6/10
Inline Visualizations (Mar 12) - Creates custom interactive charts, diagrams, and visualizations directly within chat responses. Impact: 7/10
File Creation: Excel, PPT, Docs, PDF (Jan/Feb) - Creates and edits Excel spreadsheets, PowerPoint decks, documents, and PDFs directly in the app. Impact: 8/10 [Maggie video Claude in Excel) (Video: create spreadsheet of contact from one company in my gmail)
Claude for Excel and PowerPoint Add-ins (Mar 11) - Add-ins now share full context across applications, support skills, and connect via LLM gateway to Bedrock, Vertex AI, or Microsoft Foundry. Impact: 8/10 (video of Claude with Powerpoint Add-in) (Video of FInancial modeling Claude vs ChatGPT)
Enterprise Analytics API (Feb) - Programmatic access to usage and engagement data for Claude and Claude Code Remote, aggregated per organization per day. Impact: 8/10
Mobile Interactive Apps (Mar) - Mobile app supports live charts, diagrams, and shareable visual assets rendered within the conversation. Impact: 5/10
Computer Use, macOS Preview (Mar 24) - Claude opens files, clicks through websites, and completes multi-step tasks on your screen using mouse and keyboard. Impact: 8/10

CLAUDE COWORK (Desktop Agent)

Cowork Launch (Jan) - Desktop agent for knowledge workers that reads files, executes multi-step workflows, and produces deliverables autonomously with no coding required. Impact: 9/10 (video overview of Cowork)
Plugin Marketplace with Admin Controls (Feb 17) - Third parties publish skills into the Claude Marketplace. Six enterprise partners live: GitLab, Harvey (legal), Rogo (finance), Snowflake, Lovable, Replit. Impact: 9/10
38+ Connectors (Feb) - Connectors for Gmail, Google Drive, Notion, Slack, Microsoft 365, and more. Impact: 8/10
Scheduled and Recurring Tasks (Feb 25) - Create and schedule both recurring and on-demand tasks from within Cowork. Impact: 8/10
Computer Use in Cowork (Mar 24) - Claude opens apps, navigates browsers, fills spreadsheets, and performs tasks on your screen with no setup required. Impact: 9/10
Dispatch, Cross-Device (Mar 17) - Persistent cross-device conversation that lets you assign tasks to your desktop Cowork session from your phone. Claude executes locally using your files and apps. Impact: 8/10

CLAUDE CODE (Developer CLI & Agent Platform)

Opus 4.6 as Default Model (Mar) - Opus 4.6 is now the default model for Claude Code, providing stronger reasoning and longer agentic task performance. Impact: 9/10 (video build analyze, export data)
1M Token Context Window (Mar) - One million token context for Max, Team, and Enterprise plans. Work with very large codebases without premature compaction. Impact: 9/10
Claude Code Security (Feb 20) - Scans codebases for security vulnerabilities and suggests targeted patches for human review. Reasons about code like a human researcher rather than matching known patterns. Impact: 10/10
Channels, Telegram and Discord (Mar 20) - MCP-based feature connecting your running Claude Code session to Telegram and Discord. Send instructions and receive results from your phone. Impact: 7/10
Voice Mode, Push-to-Talk (Mar) - Hold the spacebar and dictate coding instructions aloud in over 20 languages instead of typing. Impact: 5/10
/loop Scheduled Tasks (Mar) - Executes a prompt at regular intervals (lightweight cron job) for PR reviews, deployment monitoring, and other recurring work. Impact: 7/10
Remote Control from Mobile (Mar) - Monitor and direct your laptop’s local coding session from your smartphone or the web while away from your desk. Impact: 6/10
Background Subagents (Mar) - Spawns secondary AI workers to handle parallel coding tasks without cluttering your main workspace. Impact: 8/10
/effort and Ultrathink (Mar) - A keyword in your prompt temporarily activates high reasoning effort for a specific turn when deeper analysis is needed. Impact: 6/10
MCP Elicitation (Mar) - Allows MCP servers to request structured input during execution, displaying interactive forms to collect data without interrupting workflow. Impact: 7/10
128K Max Output Tokens (Feb/Mar) - Opus 4.6 defaults to 64K output with maximum capacity of 128K tokens. Generates much larger code files in a single response. Impact: 7/10
Credential Scrubbing (Mar) - Strips Anthropic and cloud provider credentials from subprocess environments to prevent accidental exposure. Impact: 8/10
Plugin Marketplace for Code (Feb/Mar) - Install bundled toolkits, server connections, and AI skills with a single click rather than configuring dependencies manually. Impact: 7/10

CLAUDE API (Developer Platform)

1M Token Context Window (Feb) - Feed an entire corporate knowledge base or massive codebase in a single API request. Impact: 9/10
Adaptive Thinking Mode (Feb 5) - Claude automatically decides whether and how deeply to engage extended reasoning on each request, replacing the old manual budget_tokens approach. Impact: 7/10
Context Compaction, Beta (Feb 5) - Automatic server-side context summarization so conversations continue effectively past the context window limit without manual truncation. Impact: 8/10
Effort Controls, 4 Levels (Feb/Mar) - A four-level dial (low to max) letting developers manually set how deeply the model thinks, trading off intelligence against cost and speed. Impact: 7/10
128K Output Tokens (Feb 5) - Generates massive single responses without splitting output across multiple API requests. Impact: 7/10
Automatic Prompt Caching (Feb/Mar) - Remembers recent conversation parts to reduce cost and latency without manual coding changes. Impact: 8/10
US-Only Inference (Feb) - A new inference_geo parameter routes API calls to U.S.-only infrastructure, meeting compliance requirements in healthcare, finance, and government. Impact: 8/10
Fast Mode, Beta (Feb 5) - A speed flag that delivers up to 2.5x faster output from Opus models at premium pricing, without changing the model’s intelligence. Impact: 7/10
Free Code Execution with Web Tools (Feb 5) - Code execution is now included at no extra charge when used alongside web search or web fetch tools, reducing cost for research workflows. Impact: 6/10
Dynamic Filtering for Web Search (Feb 5) - Claude writes and runs code to filter web search and web fetch results before they enter the context window, improving accuracy and cutting token usage. Impact: 6/10
Tools Graduating to GA (Feb 5) - Several previously beta API tools (code execution, web fetch, programmatic tool calling, tool search, memory tool) all moved to general availability. Impact: 7/10
Fine-Grained Tool Streaming, GA (Feb 5) - Fine-grained tool streaming now generally available on all models with no beta header required, enabling real-time visibility into tool calls. Impact: 7/10
Structured Outputs, GA (Early 2026) - Structured outputs moved to general availability, providing guaranteed JSON schema conformance in production without a beta header. Impact: 8/10
Skills API (Dec 2025/Jan 2026) - An open standard that lets skills built for Claude work across AI platforms, with organization-wide management for Team and Enterprise plans. Impact: 7/10

Our AI Adoption Specialists are conducting a lot of GenAI 201 and 301 instructor-lead training for ChatGPT, Copilot, Claude, and N8N. Contact us if you have this need.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

FAQ

Q: How does Claude compare to ChatGPT and Microsoft Copilot for enterprise productivity?

A: Claude now outperforms ChatGPT in Excel and PowerPoint add-ins and significantly exceeds Microsoft Copilot in both. Anthropic shipped more than 45 new features in 90 days, including file creation for Excel, PowerPoint, and PDF, persistent memory, and a desktop agent called Cowork that replaces workflow tools like Zapier and N8N. For non-technical employees, Claude Cowork requires no coding and executes multi-step tasks autonomously.

Q: What new Claude features matter most for enterprise AI adoption in 2026?

A: The highest-impact releases include Agent Teams for parallel task execution (9/10 enterprise value), Claude Code Security for autonomous vulnerability scanning (10/10), a 1 million token context window across Chat, Code, and API, and Computer Use for macOS that controls your screen without setup. Persistent memory, available to all users as of March 2026, retains project context and communication preferences across sessions.

Wow! Claude Has Released 45+ Features in the Last 90 Days

Paul Baier — Thu, 26 Mar 2026 21:29:30 GMT

Claude is on a tear

We believe the interest is fully warranted.

Claude Code is a standout product
Claude with Excel and PowerPoint currently outperforms ChatGPT’s Excel and PowerPoint capabilities, and far exceeds Copilot
Cowork is a task-oriented AI tool for nontechnical employees. Multiple companies are rethinking their use of tools like N8N, Zapier, and others as a result
Claude has aggressively shipped features similar to those in OpenClaw, the white-hot, open-source, personal AI assistant software

Anthropic Shipped 45+ Features in 90 Days

Anthropic has been shipping at a rapid pace Below is a list of features by product, each scored on a 1-10 impact scale for enterprise value. Several features include a link to a short product demo.

CLAUDE CHAT (Web & Mobile)

Opus 4.6 Model (Feb 5) - The most capable Claude model with 1M token context window, stronger multi-step reasoning, and top benchmark scores in legal, financial, and coding tasks. Impact: 9/10 (Video: analyze multiple files) (Video: turn messy notes into neat notes)
Sonnet 4.6 Model (Feb 17) - Full upgrade across coding, computer use, long-context reasoning, agent planning, knowledge work, and design with 1M token context window in beta. Impact: 8/10
Agent Teams (Feb 5) - Deploy multiple specialized AI agents to collaborate on different parts of a large project simultaneously. Impact: 9/10
Persistent Memory for All Users (Mar 2) - Claude retains your name, communication style, writing preferences, and project context across separate conversations. Impact: 7/10
Infinite Chats (Jan) - Eliminates context window limit errors by summarizing earlier messages when a chat approaches its limit. Impact: 6/10
Inline Visualizations (Mar 12) - Creates custom interactive charts, diagrams, and visualizations directly within chat responses. Impact: 7/10
File Creation: Excel, PPT, Docs, PDF (Jan/Feb) - Creates and edits Excel spreadsheets, PowerPoint decks, documents, and PDFs directly in the app. Impact: 8/10 [Maggie video Claude in Excel) (Video: create spreadsheet of contact from one company in my gmail)
Claude for Excel and PowerPoint Add-ins (Mar 11) - Add-ins now share full context across applications, support skills, and connect via LLM gateway to Bedrock, Vertex AI, or Microsoft Foundry. Impact: 8/10 (video of Claude with Powerpoint Add-in) (Video of FInancial modeling Claude vs ChatGPT)
Enterprise Analytics API (Feb) - Programmatic access to usage and engagement data for Claude and Claude Code Remote, aggregated per organization per day. Impact: 8/10
Mobile Interactive Apps (Mar) - Mobile app supports live charts, diagrams, and shareable visual assets rendered within the conversation. Impact: 5/10
Computer Use, macOS Preview (Mar 24) - Claude opens files, clicks through websites, and completes multi-step tasks on your screen using mouse and keyboard. Impact: 8/10

CLAUDE COWORK (Desktop Agent)

Cowork Launch (Jan) - Desktop agent for knowledge workers that reads files, executes multi-step workflows, and produces deliverables autonomously with no coding required. Impact: 9/10 (video overview of Cowork)
Plugin Marketplace with Admin Controls (Feb 17) - Third parties publish skills into the Claude Marketplace. Six enterprise partners live: GitLab, Harvey (legal), Rogo (finance), Snowflake, Lovable, Replit. Impact: 9/10
38+ Connectors (Feb) - Connectors for Gmail, Google Drive, Notion, Slack, Microsoft 365, and more. Impact: 8/10
Scheduled and Recurring Tasks (Feb 25) - Create and schedule both recurring and on-demand tasks from within Cowork. Impact: 8/10
Computer Use in Cowork (Mar 24) - Claude opens apps, navigates browsers, fills spreadsheets, and performs tasks on your screen with no setup required. Impact: 9/10
Dispatch, Cross-Device (Mar 17) - Persistent cross-device conversation that lets you assign tasks to your desktop Cowork session from your phone. Claude executes locally using your files and apps. Impact: 8/10

CLAUDE CODE (Developer CLI & Agent Platform)

Opus 4.6 as Default Model (Mar) - Opus 4.6 is now the default model for Claude Code, providing stronger reasoning and longer agentic task performance. Impact: 9/10 (video build analyze, export data)
1M Token Context Window (Mar) - One million token context for Max, Team, and Enterprise plans. Work with very large codebases without premature compaction. Impact: 9/10
Claude Code Security (Feb 20) - Scans codebases for security vulnerabilities and suggests targeted patches for human review. Reasons about code like a human researcher rather than matching known patterns. Impact: 10/10
Channels, Telegram and Discord (Mar 20) - MCP-based feature connecting your running Claude Code session to Telegram and Discord. Send instructions and receive results from your phone. Impact: 7/10
Voice Mode, Push-to-Talk (Mar) - Hold the spacebar and dictate coding instructions aloud in over 20 languages instead of typing. Impact: 5/10
/loop Scheduled Tasks (Mar) - Executes a prompt at regular intervals (lightweight cron job) for PR reviews, deployment monitoring, and other recurring work. Impact: 7/10
Remote Control from Mobile (Mar) - Monitor and direct your laptop’s local coding session from your smartphone or the web while away from your desk. Impact: 6/10
Background Subagents (Mar) - Spawns secondary AI workers to handle parallel coding tasks without cluttering your main workspace. Impact: 8/10
/effort and Ultrathink (Mar) - A keyword in your prompt temporarily activates high reasoning effort for a specific turn when deeper analysis is needed. Impact: 6/10
MCP Elicitation (Mar) - Allows MCP servers to request structured input during execution, displaying interactive forms to collect data without interrupting workflow. Impact: 7/10
128K Max Output Tokens (Feb/Mar) - Opus 4.6 defaults to 64K output with maximum capacity of 128K tokens. Generates much larger code files in a single response. Impact: 7/10
Credential Scrubbing (Mar) - Strips Anthropic and cloud provider credentials from subprocess environments to prevent accidental exposure. Impact: 8/10
Plugin Marketplace for Code (Feb/Mar) - Install bundled toolkits, server connections, and AI skills with a single click rather than configuring dependencies manually. Impact: 7/10

CLAUDE API (Developer Platform)

1M Token Context Window (Feb) - Feed an entire corporate knowledge base or massive codebase in a single API request. Impact: 9/10
Adaptive Thinking Mode (Feb 5) - Claude automatically decides whether and how deeply to engage extended reasoning on each request, replacing the old manual budget_tokens approach. Impact: 7/10
Context Compaction, Beta (Feb 5) - Automatic server-side context summarization so conversations continue effectively past the context window limit without manual truncation. Impact: 8/10
Effort Controls, 4 Levels (Feb/Mar) - A four-level dial (low to max) letting developers manually set how deeply the model thinks, trading off intelligence against cost and speed. Impact: 7/10
128K Output Tokens (Feb 5) - Generates massive single responses without splitting output across multiple API requests. Impact: 7/10
Automatic Prompt Caching (Feb/Mar) - Remembers recent conversation parts to reduce cost and latency without manual coding changes. Impact: 8/10
US-Only Inference (Feb) - A new inference_geo parameter routes API calls to U.S.-only infrastructure, meeting compliance requirements in healthcare, finance, and government. Impact: 8/10
Fast Mode, Beta (Feb 5) - A speed flag that delivers up to 2.5x faster output from Opus models at premium pricing, without changing the model’s intelligence. Impact: 7/10
Free Code Execution with Web Tools (Feb 5) - Code execution is now included at no extra charge when used alongside web search or web fetch tools, reducing cost for research workflows. Impact: 6/10
Dynamic Filtering for Web Search (Feb 5) - Claude writes and runs code to filter web search and web fetch results before they enter the context window, improving accuracy and cutting token usage. Impact: 6/10
Tools Graduating to GA (Feb 5) - Several previously beta API tools (code execution, web fetch, programmatic tool calling, tool search, memory tool) all moved to general availability. Impact: 7/10
Fine-Grained Tool Streaming, GA (Feb 5) - Fine-grained tool streaming now generally available on all models with no beta header required, enabling real-time visibility into tool calls. Impact: 7/10
Structured Outputs, GA (Early 2026) - Structured outputs moved to general availability, providing guaranteed JSON schema conformance in production without a beta header. Impact: 8/10
Skills API (Dec 2025/Jan 2026) - An open standard that lets skills built for Claude work across AI platforms, with organization-wide management for Team and Enterprise plans. Impact: 7/10

Our AI Adoption Specialists are conducting a lot of GenAI 201 and 301 instructor-lead training for ChatGPT, Copilot, Claude, and N8N. Contact us if you have this need.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

FAQ

Q: How does Claude compare to ChatGPT and Microsoft Copilot for enterprise productivity?

Q: What new Claude features matter most for enterprise AI adoption in 2026?

2 Reasons I Turned Off My OpenClaw

Paul Baier — Mon, 23 Mar 2026 13:45:45 GMT

OpenClaw is not ready for business use.

I reached that conclusion after spending 100 hours and $1,000 testing it over two months. As a nontechnical CEO, I wanted to see if the hype was real. The category of personal AI assistants for work and home persona is real and growing, but OpenClaw’s bugs and security gaps disqualify this product.

I now use Anthropic’s Claude with my confidential company data as my personal AI assistant. In two months, Claude has replicated roughly 30% of OpenClaw’s features with more features coming.

Paul “buries” his OpenClaw

Background on My OpenClaw Setup

OpenClaw is one of the hottest AI products right now. The numbers tell the story:

OpenClaw earned 326,000 GitHub stars in three months
Linux: received 224,000 GitHub stars in 14 years

OpenClaw surpassed Linux in a fraction of the time, making it the fastest-adopted open-source project in human history. But popularity does not equal readiness. It was not designed to be secure, and multiple reports of security issues have surfaced since its launch.

I purchased a used MacBook Pro from Amazon for $500 and set up new accounts from scratch: a new Apple ID, new email, and new accounts with Claude and OpenAI. I kept my primary machine and business data completely isolated.

As a nontechnical person, I relied on Claude Code to walk me through the setup and overcome technical challenges. I kept one window open with Claude Code and another with Terminal. I followed instructions to download the software, configure API connections, and troubleshoot errors. When I got stuck, I took a screenshot and pasted it into Claude Code. Claude Code told me what to do next (I often felt like a junior intern doing the bidding of an all-knowing AI.).

I got OpenClaw working and had it performing multiple tasks:

Sending me a daily weather report
Researching competitors
Summarizing relevant news

Reason 1: Too Many Bugs for a Nontechnical User

OpenClaw is a developer product with raw, untested code. I am not a computer science graduate with 15 years of coding experience. I got it set up, and I know others have too, but it kept breaking. I used Claude Code to debug issues, but errors persisted. APIs broke. Messages broke. About half my time went to fixing things rather than getting work done.

I thought this was a “me” problem. Then I attended two OpenClaw hackathons in Boston. Half the attendees at both events had the same issues. This is not a product built for business use today.

Reason 2: OpenClaw Is Not Secure

Productivity improvement at work requires secure access to confidential data in email, calendars, SharePoint, CRM, etc, but OpenClaw was not designed with that requirement in mind.

OpenClaw has access to a wide range of your confidential data, and the security risks are significant:

Unrestricted system access across your machine
High susceptibility to indirect prompt injection
Unmoderated skills and plug-ins
Plaintext credential exposure
Silent data exfiltration

Efforts are underway to improve security, but the risk is too high currently. If you are handling client data, financial records, or proprietary strategy documents, this gap is disqualifying.

A video of my OpenClaw Burial Ceremony

My Plan Going Forward: Claude as My Personal AI Assistant at Work

Was OpenClaw a bust? No. It validated that personal AI assistants for both work and home persona are a real and growing category. Dozens of startups are building in this space. Jensen Huang said at his recent user conference that every CEO needs an OpenClaw strategy. In my view, that means every CEO needs a personal AI assistant strategy for employees.

I now use Claude Code, Claude Cowork, and Claude Chat. Claude has replicated roughly 30% of the features I had running in OpenClaw. The other 70% is still missing, particularly around proactive task scheduling and multi-app orchestration. But I now have the ability to:

Run tasks with files on my laptop
Access my confidential work email and calendar
Pull from our HubSpot
Track and prioritize tasks list for myself and individual projects.

My immediate goal is a daily briefing that truly adds value and we are 90% there. Here is snapshot of my daily brief which pulls from my email, calendar, Hubspot, Slack and To do list

Congratulations to Peter Steinberger and the OpenClaw team for showing the industry what personal AI assistants look like, including the WhatsApp messaging integration.

If you are a nontechnical executive evaluating personal AI assistants, here is my advice:

Do not put confidential business data into any tool that lacks enterprise-grade security
Start with Claude
Monitor the developments with OpenAI ChatGPT, Google Gemini and Microsoft’s Copilot, and Anthropic’s Claude as a year from now, the most innovative and leading firms will insist that all their key employees be using a personal AI assistant for work.

My friend Ted Dintersmith has written a new book Aftermath that does for math what Freakonomics did for economics. It illuminates the powerful math ideas that define our lives — ideas never covered in school. For over a century and to this day, high school students spend thousands of hours on rote math esoterica that they’ll use as adults. How often are you factoring a polynomial, or using the Chain Rule, or deploying an arc secant. Imagine if those thousands of hours were redirected to mastering consequential math ideas and developing AI expertise. This choice will make or break the futures of millions of kids and thousands of communities. Pre-order here

Our AI Adoption Specialists are conducting a lot of GenAI 201 and 301 instructor-lead training for ChatGPT, Copilot, Claude, and N8N. Contact us if you have this need.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

FAQ

Question: Should a nontechnical executive consider using open-source AI assistants like OpenClaw in a business environment today?

Answer: Not yet. While open-source tools like OpenClaw show strong potential, they are still unstable and require significant technical effort to maintain. Frequent bugs and ongoing troubleshooting can consume more time than the value they deliver, making them impractical for most business leaders today.

Question: What are the biggest risks of connecting an AI assistant to confidential business data?

Answer: The primary risk is lack of security controls. Tools not designed for enterprise use may expose sensitive data through weak safeguards, including unrestricted system access, vulnerable integrations, and potential data leaks. For any company handling client, financial, or strategic data, this risk outweighs the productivity benefits.

Question: What is a practical way for executives to start using AI assistants safely and effectively?

Answer: Begin with tools that offer enterprise-grade security and controlled access to data, such as Claude from Anthropic. Test them in a limited scope, connect only essential systems, and focus on clear use cases like daily briefings or task prioritization. This approach delivers value quickly while minimizing operational and security risk.

How Wix Reclaimed 675 Engineering Hours Every Month | Essential AI News for Mar 16-20

Paul Baier — Sun, 22 Mar 2026 14:06:30 GMT

Hello everyone,

(Personal note: 85% of my AI usage has moved from ChatGPT to Claude in the last 60 days.)

Essential AI News Mar 16-20

Last week, our AI analysts rated 7 articles as “Essential” reads.

Lessons from Building Claude Code: How We Use Skills

Ratings Rationale: This article explains how Anthropic structures “skills” as reusable, modular agent capabilities (e.g., scripts, memory, and data hooks) to accelerate development in Claude Code and agentic systems. Our analysts highlight that this is one of the most practical frameworks for enterprise AI, showing how skill abstraction and composability drive organizational leverage and scalable agent workflows beyond simple chat interfaces.

SkillNet: Create, Evaluate, and Connect AI Skills

Ratings Rationale: This paper introduces an open infrastructure for AI skills that aims to let agents create, evaluate, organize, and reuse capabilities instead of repeatedly solving the same problems from scratch. Our analysts treated this as a strategic development because a shared ontology, a repository of more than 200,000 skills, and reported gains in reward and efficiency point toward more scalable agent architectures and stronger long-term enterprise value.

When AI Becomes Your On-Call Teammate: Inside Wix's AirBot That Saves 675 Engineering Hours a Month Serving 250 Million Users

Ratings Rationale: This use case details a mature generative AI operations use case in which Wix’s AirBot helps diagnose incidents, analyze logs, and even generate remediation pull requests, saving an estimated 675 engineering hours per month. Our analysts highlighted it as a rare example of agentic AI at production scale, showing how structured workflows, secure integrations, and tiered model selection can move AI automation beyond simple assistance into meaningful operational leverage.

Researchers Asked LLMs for Strategic Advice. They Got “Trendslop” in Return (Harvard Business Review)

Ratings Rationale: This HBR article examines how large language models can produce polished but systematically biased strategic recommendations, making them unreliable as stand-alone advisers for executive decision-making. Our analysts emphasized that this is a silent failure mode in AI use: the models tend to reflect fashionable consensus rather than balanced strategic reasoning, so leaders need to use them to expand options and challenge assumptions rather than outsource judgment.

Balancing AI Tensions: Moving From AI Adoption to Effective SDLC Use

Ratings Rationale: This article explores how organizations are transitioning from initial AI adoption to effectively integrating Generative AI into the software development lifecycle (SDLC), highlighting trade-offs between speed and quality. Our analysts emphasized findings from over 1,000 Google developers showing that while AI-driven coding accelerates prototyping, verification and integration remain bottlenecks, resulting in modest net productivity gains and shifting focus toward code validation and governance.

Coding After Coders: The End of Computer Programming as We Know It (New York Times)

Ratings Rationale: This New York Times article examines how generative AI and coding agents are changing software development from line-by-line programming into a workflow centered on prompting, testing, supervising, and verifying machine-generated code. Our analysts highlighted that this shift is already visible among working developers, compresses software timelines dramatically, and serves as an important signal for how other high-value knowledge work may evolve next.

How Steering Hooks Achieved 100% Agent Accuracy Where Prompts and Workflows Failed

Ratings Rationale: This article explains how Strands improved agent reliability by replacing long, fragile prompt chains with steering hooks that govern agent behavior at the action and tool-call level. Our analysts saw this as essential because it addresses a core agentic AI problem: prompt-heavy workflows become brittle over time, while deterministic guardrails that intercept and verify external tool use can materially improve accuracy, safety, and production readiness for long-running agents.

Would you value a email customized for your company or AI Center of Excellence? As us about our Navigator product.

Join our friends at InnoLead for their annual conference, Impact 2026, June 8-10 in Cambridge, Mass. Register here

Haven’t been to Impact before? Past participants call it the world’s most practical event for anyone working in strategy, innovation, transformation, R&D, and emerging tech inside enterprises and organizations. Tickets to this event are limited to people currently holding roles in private or public companies; nonprofits and NGOs; and government agencies.

This June, Impact will have four tracks for you to choose from:

The future of innovation in the AI age
Startup/corporate collaboration, partnerships, external innovation
Doing more with less: The new imperative for internal innovation programs
Birds-of-a-feather discussion sessions for specific industries, and hot button issues — suggested by you! (And entirely confidential!)

And the event includes site visits to Google Cambridge, Greentown Labs, the Robotics and AI Institute, Harvard’s D^3 Institute, and more.

Our AI Adoption Specialists are conducting a lot of GenAI 201 and 301 instructor-lead training for ChatGPT, Copilot, Claude, and N8N. Contact us if you have this need.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

FAQ

Question: How can organizations scale AI beyond isolated use cases into repeatable, enterprise-wide capabilities?

Answer: The most effective approach is to treat AI capabilities as reusable “skills” that can be shared, improved, and combined across teams. Instead of rebuilding solutions each time, organizations create modular components—such as data connectors, workflows, and evaluation steps—that can be reused in new contexts. This reduces development time, improves consistency, and allows AI investments to compound over time.

Question: What separates AI pilots from AI systems that deliver real operational impact at scale?

Answer: Production impact comes from structured workflows, strong integrations, and clear accountability—not just model quality. Successful systems, like AI-powered operations tools, combine multiple steps such as data retrieval, analysis, and action into a controlled process. They also include safeguards, validation, and measurable outcomes, ensuring the system consistently saves time or improves performance in real business environments.

Question: What risks should executives consider when relying on AI for strategic or technical decision-making?

Answer: AI can generate confident, well-written answers that reflect popular thinking rather than balanced judgment. This creates a risk of reinforcing trends instead of uncovering better options. To mitigate this, leaders should use AI to explore scenarios and challenge assumptions, while keeping human oversight for final decisions and implementing validation steps for any high-impact outputs.

Every CEO Must Have a OpenClaw Strategy - per Nvidia's CEO, Jensen Huang

Paul Baier — Wed, 18 Mar 2026 17:48:23 GMT

OpenClaw and secure personal AI assistants for all your employees are coming this year. This a major change in enterprise IT architecture.

CEO Jensen Huang stated this week how profoundly important OpenClaw is and introduced a new Nvidia product called NemoClaw to make it secure for enterprises.

“OpenClaw’s importance is profound… it is the most successful open source project in the history of humanity and it did so in just a few weeks.”
“It exceeded what Linux did in 30 years'“

"OpenClaw has essentially open-sourced the operating system for agentic computers. Windows made it possible for us to use personal computers. OpenClaw has made it possible for us to create personal agents”

“For every CEO, what is your OpenClaw strategy … we all needed a Linux strategy, an http strategy…”
“OpenClaw is the new computer….the Enterprise AI architecture has changed”

NemoClaw is a new Nvidia product that makes OpenClaw secure for the enterprise

I encourage you to watch Jensen’s 18 minute discussion of OpenClaw here.

Here is update #8 on my OpenClaw project. My OpenAI usage has dropped as I increasingly use Claude, which has matched some of the personal assistant features in OpenAI and allows me to securely integrate with my work email, calendar, CRM and other tools. The initial OpenAI product is not secure enough to connect to my work data. The battle for leadership in secure personal AI assistants in the workplace has just begun. OpenAI, Anthropic, Google, Microsoft, Nvidia and dozens of startups are working to develop secure personal AI assistants for the enterprise. 2026 is fast becoming the year personal AI assistants are securely used en masse in the enterprise.

Check out this great webinar next Wed, March 25 by Seekr. Register here.

Join us Monday April 6, at 7p ET / 4p PT to see a demo of OpenClaw (a Personal AI Agents) and to discuss the implications for companies. Claude is already matching some of these features and ChatGPT is expected to as well. Register here.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

I Built a Full Nvidia Financial Model Using Claude (Much Better Than ChatGPT). Here’s How.

Paul Baier — Tue, 17 Mar 2026 15:41:33 GMT

Claude Built A Financial Model in Minutes

I used Claude to build a three-scenario Nvidia income statement projection from scratch. The process included analyzing four quarters of historical earnings data, creating a requirements document and generating a multi-tab Excel model. It took minutes, not hours

No one 12 months ago foresaw this level of AI capability in financial modeling. AI is not perfect, but it is remarkably good.

Claude vs. ChatGPT for Excel

Claude was first to release this Excel functionality. I tested both tools on the same task. Claude won handily in three areas:

It handled multi-tab spreadsheet logic with fewer errors.
It produced cleaner formulas that required less manual correction.
It followed complex financial modeling instructions more accurately.

I fully expect ChatGPT will improve materially in the next six months. For now, Claude leads in spreadsheet work and is my go-to AI spreadsheet tool (every investment firm that we are working with is rapidly deploying Claude now).

My Process

I used a two-step approach:

Step 1: Create a Requirements Document

I asked Claude to interview me one question at a time to build a detailed Requirements Document. This structured approach produces better output than a single long prompt.

Input: interview prompt
Output: Requirements Document saved as a markdown file (a type of text file)

Step 2: Build the Financial Model

I created a new chat and uploaded the Requirements Document along with Nvidia’s historical financials. Claude generated the Excel model directly.

Input: prompt + Requirements Document + Historical Financials
Output: financial model in Excel

How to Access This Feature

Claude for Excel can be accessed three ways:

Browser version of Claude
Claude Desktop Application
Claude for Excel Add-in

Recommendations For Your Company

Have your CFO and Excel power users test Claude with financial models this month. The capability is here now.
Rethink timelines. Financial planning cycles that took analysts a week can now produce a working first draft in under an hour.
Accelerate AI training across your finance teams. Your competitors are already doing this.

If your organization needs 201 or 301 training on ChatGPT, Claude or Copilot, contact us at GAI Insights. Our AI Adoption Specialists are running these AI training programs now with multiple companies and investment firms.

Attend a massive AI Summit at MIT Media Lab on Friday Apr 10. The speaker lineup is amazing. Tickets free but you must apply. I’m moderating 2 panels. Info here:

Confirmed speakers include
- Mark Benioff, Founder, Salesforce
- Ray Stata, Analog Devices
- Bing Gordon, Kleiner Perkins
- Chase Lochmiller, Crusos
- Ian Robertson, American Securities
- Adam Starr, CIO, U.S. Office of Personnel Management
- Matt Quinn, CTO, Car Gurus
- Ray Meadows, Lovable
- Jack Shelby, Thiel Capital
- Mario Bollini, Boston Dynamics
- Daniela Rus, MIT CSAIL
- Peter Danenberg, Google Deepmind
- Yossi Matias, Google

+ many more

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

FAQ

Question: How can AI tools like Claude or ChatGPT reduce the time required for financial modeling without sacrificing accuracy?

Answer: AI can automate the first draft of a financial model by structuring statements, applying formulas, and organizing assumptions in minutes. Accuracy improves when teams provide clear inputs, validate outputs, and use AI as a starting point rather than a final answer. This shifts analyst time from building models to reviewing and refining them.

Question: What is the practical difference between AI tools when applied to finance workflows like Excel modeling?

Answer: Differences show up in how well each tool handles complex instructions, maintains formula consistency, and manages multi-tab logic. Some tools currently perform better with structured spreadsheet tasks, which can reduce rework. Testing multiple tools on the same use case is the fastest way to determine which performs best for your team.

Question: How should a finance organization adopt AI for modeling without disrupting existing processes?

Answer: Start by using AI to generate initial drafts of models that analysts already build manually. Keep existing review and approval steps in place while measuring time saved and error rates. This allows teams to improve speed immediately while maintaining control, then expand usage once results are consistent.

AI Runs All Night And Does 126 Experiments | Essential AI News for Mar 9-13

Paul Baier — Sun, 15 Mar 2026 14:20:39 GMT

Hello everyone,

We help you stay “AI Current” as the AI adoption gap widens and firms worry about falling behind their competitors.

Essential AI News Mar 9-13

Last week, our AI analysts rated 8 articles as “Essential” reads.

The Shape of the Thing (Ethan Mollick)
Ratings Rationale: This article argues that AI has moved from co-intelligence toward delegated intelligence, where agents can increasingly take on substantial blocks of software and knowledge work with limited human involvement. Our analysts highlighted the StrongDM software factory example (with two radical rules: “Code must not be written by humans” and “Code must not be reviewed by humans) and the rise of recursive self-improvement loops as signs that this transition is already beginning, making it important for AI leaders to shape strategy now rather than wait until the change is fully upon them.

Andrej Karpathy's New Open Source 'Autoresearch' Lets You Run Hundreds of AI Experiments A Night — With Revolutionary Implications

Ratings Rationale: Andrej Karpathy has released an open-source autoresearch framework that allows AI systems to autonomously run thousands of experiments to improve models or workflows, using validation metrics to iterate and refine results. Our analysts stressed that this approach demonstrates how AI systems can self-improve through continuous experimentation loops, enabling large-scale automated discovery that could accelerate progress in large language models, AI automation, and enterprise decision processes.

Bringing Code Review to Claude Code

Ratings Rationale: Anthropic has introduced a new AI-powered code review capability in Claude Code, allowing multiple specialized AI agents to automatically analyze pull requests, detect bugs, and flag risky code changes before deployment. Our analysts emphasized that as generative AI dramatically increases software output, manual code review becomes the bottleneck, making automated review critical for maintaining quality and security while scaling AI-driven software development.

Anthropic’s AI Hacked the Firefox Browser. It Found 14 Sev 1 Security Bugs

Ratings Rationale: Anthropic used Claude Opus 4.6 in a security collaboration that uncovered 22 Firefox vulnerabilities in two weeks, including 14 high-severity issues, after scanning nearly 6,000 C++ files and generating 112 reports; most fixes shipped in Firefox 148. Our anaysts treated this as a major AI security signal because it shows frontier large language models can materially accelerate vulnerability discovery in mature codebases, making it directly relevant for enterprises with proprietary or mission-critical software.

Anthropic's Compute Advantage: Why Silicon Strategy is Becoming an AI Moat

Ratings Rationale: This article argues that Anthropic’s multi-prong compute strategy across Google TPUs, AWS Trainium2, and Nvidia GPUs could create a meaningful cost and iteration advantage as AI inference scales, with an estimated 30–60% lower token costs on optimized workloads. Our analysts viewed this as strategically important because silicon economics, supplier leverage, and long-term compute flexibility can materially shape model pricing, deployment velocity, and vendor lock-in for enterprise AI leaders choosing an LLM platform.

How Coding Agents Are Reshaping Engineering, Product and Design

Ratings Rationale: This post explains how AI coding agents and 'vibe coding' workflows are transforming the software development lifecycle, including how product requirements documents (PRDs) evolve in an AI-assisted engineering environment. Our analysts highlighted that AI agents now allow teams to generate prototypes directly through conversation, meaning requirements documentation becomes a dynamic interface between humans and AI systems, fundamentally changing engineering, product management, and design collaboration.

Microsoft Copilot Cowork: A New Way of Getting Work Done

Ratings Rationale: Microsoft’s new Copilot Cowork positions Microsoft 365 Copilot as an AI execution layer that can turn a user request into a plan, ground it in emails, meetings, files, and data, and then carry out work across apps such as Outlook, Teams, and Excel within enterprise security and governance boundaries. Our analysts viewed this as essential because CIOs in Microsoft-heavy environments will have to track it closely, even though it is still an early product announcement and access is initially tied to the Frontier program rather than broad availability.

New NVIDIA Nemotron 3 Super Delivers 5x Higher Throughput for Agentic AI

Ratings Rationale: NVIDIA has introduced Nemotron 3 Super, an open-weight, open data, large language model designed for agentic AI systems, delivering up to 5× higher throughput using a hybrid mixture-of-experts architecture combined with innovations like Mamba layers and latent MoE reasoning techniques. Our analysts highlighted its strategic relevance for AI leaders thinking about cost efficiency and scaling agentic systems across their enterprises.

Would you value a email customized for your company or AI Center of Excellence? As us about our Navigator product.

Claude’s Excel Plugin is AMAZING. ChatGPT released its plug in recently (about 75% as good as Claude). Claude for Excel is a must try for your Excel power users.

Our AI Adoption Specialists are conducting a lot of GenAI 201 and 301 instructor-lead training for ChatGPT, Copilot, Claude, and N8N. Contact us if you have this need.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

FAQ

Question: How quickly will AI agents begin taking over meaningful blocks of software and knowledge work?

Answer: The shift has already started. Modern AI systems are moving from assisting employees to completing entire tasks such as writing software, analyzing data, or generating reports with limited human input. Examples like automated software factories and AI research loops show that companies should begin planning for AI-driven workflows now, rather than waiting until the technology becomes standard across competitors.

Question: Can AI systems actually improve themselves through experimentation?

Answer: Yes. New frameworks allow AI systems to run thousands of experiments automatically, evaluate the results, and refine their own methods. This continuous experimentation loop can dramatically speed up progress in model performance, automation workflows, and operational decision-making. For executives, this means innovation cycles may compress from months to days, creating competitive pressure to adopt similar automated discovery processes.

Question: How can organizations scale AI-generated software without increasing security or quality risks?

Answer: As AI accelerates software development, automated oversight becomes essential. AI-powered code review tools can scan large codebases, detect bugs, and flag security vulnerabilities before deployment. In some cases, AI has already discovered critical vulnerabilities in mature systems within days. Combining AI coding with AI review allows companies to scale development while maintaining strong quality and security controls.

How Many AI Tools Do We Need To Buy For Our Employees ?

Paul Baier — Wed, 11 Mar 2026 22:15:59 GMT

Firms face a common challenge: how many AI productivity tool licenses to purchase. Options range from standalone tools (ChatGPT, Claude Code) to embedded AI (M365 Copilot, AgentForce in Salesforce).

Total annual cost per employee equals the license fee (assuming a closed-model AI product with credits) plus training costs, which range from $0 to $1,000 depending on frequency and format.

Here are several investment scenarios for different size companies (yearly direct costs includes license and training).

For some firms, the investment is obvious. Hedge funds, asset managers and software companies have already deployed multiple tools across their entire workforce. Other companies are taking a more measured approach.

Regardless of your strategy, the reality is clear: your top talent and new recruits expect access to a secure, enterprise AI chatbot. Yes, IT cost per employee will permanently increase. Over time, that cost will be offset as your workforce shifts to a blended model of humans and AI software agents, driving higher revenue per employee.

BOSTON: Join us for in-person OpenClaw jam session March 14 at Microsoft NERD. Register here.

Attend the massive AI Summit April 10 at MIT Media Lab. Ticket are free but you must apply here. Here are just a handful of the confirmed speaker.

Our team is doing a lot of department-specific AI 201 and 301 instructor-lead training for ChatGPT, Copilot, Claude and N8N. Contact us if you have this need.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

Security Flaw in Copilot | Essential AI News for Feb 23-27

Paul Baier — Sun, 01 Mar 2026 19:11:01 GMT

Hello everyone,

We help you stay “AI Current” which is becoming more critical as the adoption gap widens and leaders in each industry start to pull ahead.

Essential AI News Feb 23-27

Last week, our team of AI analysts rated 7 articles as “Essential” reads.

Microsoft Copilot ignored sensitivity labels twice in eight months — and no DLP stack caught either one

Ratings Rationale: A Microsoft 365 Copilot Chat bug allowed Copilot to process and summarize emails that were protected by sensitivity labels and Data Loss Prevention (DLP) policies, creating an “inside the trust boundary” AI security failure that traditional controls didn’t detect. Our analysts emphasized how this undermines enterprise trust in AI assistants (especially for highly sensitive workflows) and highlighted the need for new AI governance tools and monitoring practices beyond standard endpoint and firewall-style defenses.

Cowork and Plugins for Teams Across the Enterprise

Ratings Rationale: This update from Anthropic introduces enterprise-focused plugins for Cowork that bundle skills, connectors, slash commands, and sub-agents into plugins, so teams can standardize repeatable workflows across tools and data sources. Our analysts highlighted how these pre-built, role-based templates plus admin controls (who can use what, where) make agentic work far more practical inside Google Workspace, Microsoft environments, and beyond.

AI's Big Payoff Is Coordination, Not Automation (Harvard Business Review)

Ratings Rationale: This HBR article argues that the biggest economic impact of AI will come from reducing coordination (or translation) costs—using AI systems and agents to pull context from many tools, data types, and workflows, so teams don’t have to act as human routers across silos. Our analysts highlighted that this context layer is where agentic AI becomes truly valuable, connecting systems like CRM, ticketing, and operational data to answer questions and drive decisions, making it a practical framing for AI leaders trying to unlock next-level productivity at organizational intersections.

AI Is Upending Marketing on Two Fronts (Harvard Business Review)

Ratings Rationale: This HBR articles argues that generative AI is reshaping marketing through two simultaneous shifts: how consumers search (moving from traditional search results toward conversational AI answers) and who increasingly drives purchasing decisions (more automated, AI-influenced decisioning). Our analysts emphasized that many marketing teams still aren’t prepared for the “search revolution” from SEO → GEO (generative engine optimization) and that the practical advice such as starting with an audit of exposure and adapting lead-gen visibility, makes it a must-read for leaders.

Nano Banana 2: Combining Pro Capabilities with Lightning-fast Speed

Ratings Rationale: Google is rolling out Nano Banana 2 (Gemini 3.1 Flash Image), an AI image generation and editing model that aims to deliver Pro-level capabilities at Flash speed, including rapid iteration, precision text rendering and translation, higher-fidelity outputs, and broad availability across products like Gemini, Search/Lens, Ads, and Vertex AI. Our analysts emphasized that lower unit cost and faster generation can bend the curve by unlocking new image-creation and editing use cases at scale—especially for storytelling, scientific illustration, and slide and asset workflows embedded throughout Google’s ecosystem.

Vambe Powers Conversational Commerce across Latin America with Claude

Ratings Rationale: This case study highlights how Vambe is using Claude to power large-scale conversational commerce across Latin America, enabling AI-driven customer engagement via messaging platforms like WhatsApp. The deployment demonstrates measurable impact, including up to a 60% reduction in healthcare clinic no-shows showing how Generative AI and chatbot with AI conversational capabilities are delivering real operational and financial results at scale. Our analysts emphasized that this is conversational AI moving from experimentation to production-grade outcomes, particularly in multilingual, mobile-first markets where AI automation directly improves revenue cycles and service delivery.

The State of Organizations 2026: Three tectonic forces that are reshaping organizations (McKinsey)

Ratings Rationale: This report synthesizes a large global survey of senior executives to explain how AI and automation, economic/geopolitical disruption, and shifting workforce expectations are collectively forcing organizations to rethink operating models, leadership, and performance management. Our analysts noted that even if the themes aren’t “net new,” leaders consistently ask “where are we vs. peers?” and “what does this mean for our org,” and this report provides digestible stats and framing that can directly support strategy and funding conversations.

Would you value a email customized for your company or AI Center of Excellence? As us about our Navigator product.

Join us tomorrow, Monday Mar 2, at 7p ET / 4p PT to see demos of the 2 hottest AI tools: Claude Code and OpenClaw. Register here (310 folks have already signed up)

BOSTON: Join us for in-person OpenClaw jam session March 14 at Microsoft NERD. Register here.

Our team is doing a lot of GenAI 101, 201 and 301 instructor-lead training for ChatGPT, Copilot, Claude and N8N, especially for investment management firms. Contact us if you have this need.

Onward,
Paul

Resources and Media:

LinkedIn: Calendar: Learning Lab: HBR Article; Daily AI Show: GAI World 2026: TEDx: X/Twitter: TikTok

FAQ

Question: If AI tools like Copilot can bypass sensitivity labels and DLP controls, how should executives rethink AI governance?

Answer: Traditional security tools were designed for endpoints, email gateways, and network traffic—not AI systems operating inside trusted environments. Executives should require AI-specific monitoring, clear audit trails of what data AI systems access and summarize, and defined approval workflows for high-sensitivity use cases. Governance must expand from perimeter defense to visibility into how AI interacts with internal data, with clear accountability for oversight.

Question: Where is the real economic payoff of AI inside large enterprises—automation or something else?

Answer: The biggest gains often come from reducing coordination costs, not just automating tasks. AI becomes powerful when it connects data across CRM, ticketing systems, marketing platforms, and operational tools so teams no longer spend time translating, forwarding, or reconciling information. Leaders should prioritize AI projects that eliminate cross-functional friction, because that is where productivity and decision speed improve most.

Question: How should marketing leaders respond to the shift from traditional search to AI-driven answers?

Answer: Start by auditing how your brand appears in conversational AI tools, not just in search rankings. Identify which high-value queries now generate AI summaries and assess whether your content is cited, paraphrased, or ignored. Then adapt content strategy to focus on clear, authoritative answers that AI systems can reference. This shift from SEO to generative engine optimization directly affects visibility, lead flow, and revenue.

I ask AI to Find All My Contacts at a Specific Company in Hubspot and Gmail

Paul Baier — Sat, 28 Feb 2026 14:27:54 GMT

The personal assistant features pioneered by open-source tool OpenClaw are now arriving in commercially supported AI products like Claude.

In this demo, I show Claude searching my Gmail and HubSpot for all contacts at a specific company, then generating an Excel spreadsheet from the results. I use Claude for this task because it produces far better Excel output than ChatGPT, Gemini, or Copilot.

Let me know your questions and what you are learning.

Claude Code and OpenClaw are two of the most practical AI tools available today. Claude Code is an AI software coding agent. OpenClaw is an open-source AI agent that works like a digital employee on your Mac.

Join us this Monday, March 2 at 7p ET / 4p PT for a live demo of both tools by Ashish Bhatia, Senior Product Manager at Amazon Audible, followed by a discussion of business implications.

231 people have already registered. Reserve your spot now (no cost).

BOSTON: Join us at GAI Insights sponsored “Boston OpenClaw Hackathon” Saturday March 14 at the Microsoft NERD center. Signup here. Tickets are free but this will sell out.

Onward,

Paul