Safe Intelligence https://safeintelligence.ai Machine Learning Model Verification & Robustification

Building Trustworthy AI: Highlights from AAAI https://safeintelligence.ai/building-trustworthy-ai-highlights-from-aaai/ Thu, 12 Feb 2026 13:32:53 +0000

As AI systems move from research labs into safety-critical deployment in areas such as self-driving cars or autonomous aviation, the question is no longer just “what can this model do?” but “can we trust it to do it reliably?”

This was the main theme for us at this year’s AAAI Conference on Artificial Intelligence. It is always one of the most significant events on the calendar, covering the full spectrum of AI research. For the Safe Intelligence team, it was an incredibly productive week of presenting our latest research, attending tutorials and poster sessions, and exchanging ideas with fellow researchers.

Here is a look at what we presented and the key takeaways that inspired us.

Our Contributions: Certifying Robustness in the Real World

We presented two papers this year, both focusing on a shared challenge: making computer vision models robust against the messy, unpredictable nature of the real world.

Certified Background-Invariant Training

(Presented at the Workshop on AI for Air Transportation)

In object detection tasks, models often cheat by relying on background context rather than the object itself. In safety-critical fields like aviation, this is dangerous: a plane is still a plane regardless of the weather behind it, and a detector should recognise it as such. We proposed a new method combining Variational Autoencoders (VAEs) with certified training to induce provable robustness against background variations. Our experiments on aircraft detection showed that this approach improves generalisation and guides the network to focus on the object, not the background.

Defending Models Against Input Blurring

(Presented at the Workshop on AI for Cyber Security)

Real-world cameras shake, vibrate, and blur. Standard models often fail when faced with this motion blur, yet traditional defenses like adversarial training lack formal guarantees. We introduced a novel Certified Training approach that leverages an efficient encoding of convolutional perturbations. The result? A model that achieves over 80% robust accuracy against motion blur on CIFAR10, outperforming standard adversarial training while maintaining high standard accuracy.

The VNN-COMP Tutorial

One of our highlights was the tutorial: “The Verification of Neural Networks Competition (VNN-COMP): A Lab for Benchmark Proposers, Verification Tool Participants, and the Broader AI Community.”

As neural networks enter automotive, medical, and aerospace domains, verifying them is non-negotiable. This tutorial was about lowering the barrier to entry for researchers to join that mission.

  • Taylor T Johnson kicked off the tutorial by motivating the verification of safety-critical systems.
  • Konstantin Kaulen provided an overview of the competition’s history and the results from the last competition (VNN-COMP ’25).
  • Matthew Daggitt and Edoardo Manino spoke about the current benchmark landscape in the competition and introduced the new VNN-LIB 2.0 standard. This evolution of the original VNN-LIB standard for specifying verification problems significantly improves expressiveness and ease of use.

The session was highly interactive, featuring hands-on demos for running verification tools and creating benchmarks on your own. It also featured a look at the AWS-based evaluation infrastructure. We found the tutorial extremely useful and hope to see many more industry-related benchmarks in this year’s competition. (Link to Tutorial Materials)

AI Robustness Highlights: What Caught Our Eye

The conference was packed with fascinating research, both in the poster sessions and the oral presentations. We noticed three distinct trends: the race to secure LLMs, the expansion of verification into the physical 3D world, and the constant sharpening of theoretical bounds.

1. The Battle for Safe LLMs

As Large Language Models (LLMs) are used in more domains, their robustness to attacks becomes increasingly relevant. While traditional robustness verification methods often struggle to scale to models of this size, several papers stood out for their novel approaches to defending LLMs and their realistic assessment of vulnerabilities.

  • AlignTree proposed a highly efficient defense using a random forest classifier to monitor LLM activations, detecting misaligned behavior without the high computational cost of heavy guard models. (Link to Paper)
  • Adversarial Prompt Disentanglement (APD) took a semantic approach, using graph-based intent classification to isolate and neutralise malicious components of a prompt before they even reach the model. (Link to Paper)
  • CluCERT addressed the difficulty of certifying LLMs against synonym substitutions. By using a clustering-guided denoising method, they achieved tighter certified bounds and better efficiency than previous word-deletion methods. (Link to Paper)
  • However, the defense landscape is still evolving. The STACK paper provided a sobering look at current safeguard pipelines, demonstrating a “staged attack” that achieved a 71% success rate against state-of-the-art defenses. This proves that we still have work to do when it comes to making LLMs safe. (Link to Paper)

2. Verification Meets the Physical World

Despite the recent focus on LLMs, AI continues to be deployed in many other settings as well: it operates robots and drones (as well as the self-driving taxis that Waymo is bringing to the UK this year!).

  • Phantom Menace was a standout study on Vision-Language-Action (VLA) models. The authors developed a “Real-Sim-Real” framework to simulate physical sensor attacks (like attacking a robot’s microphone or camera), revealing critical vulnerabilities in how these models integrate multi-modal data. (Link to Paper)
  • In the realm of 3D perception, FreqCert proposed a way to certify 3D point cloud recognition, a challenging task due to the structure of the input data. By shifting the analysis to the frequency domain (spectral similarity), they created a defense that is robust against the geometric distortions that often fool spatial-domain defenses. (Link to Paper)

3. Pushing the Boundaries of Verification

Finally, we saw excellent work on the fundamental mathematics of verification.

  • Neural network verifiers often consider inputs in their queries that are out of distribution or do not actually occur in the real world. VeriFlow proposes a fix by modeling the data distribution with a flow-based model, allowing verifiers to restrict their search to the data distribution of interest. (Link to Paper)
  • Ghost Certificates served as an important warning: it is possible to “spoof” certificates. The authors focused on models certified using Randomized Smoothing (including state-of-the-art diffusion-based defenses like DensePure), demonstrating that they can be manipulated by specialised, region-based attacks. These adversarial inputs are crafted to remain imperceptible while tricking the certification mechanism into issuing a large robustness radius for a completely incorrect class. This highlights a subtle but critical flaw in how we interpret guarantees from probabilistic certification frameworks: a valid certificate ensures stability around an input, but it does not guarantee the semantic correctness of the prediction itself. (Link to Paper)
  • We also saw specialised advances for specific architectures, including DeepPrism for tighter RNN verification (Link to Paper) and new Parameterised Abstract Interpretation methods for Transformers that can verify instances where existing methods fail. (Link to Paper)
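To ground what a "robustness radius" from probabilistic certification means in practice, the standard randomized smoothing bound (Cohen et al., 2019) certifies an L2 radius that grows with the smoothed classifier's confidence in its top class. A minimal sketch (not tied to any of the papers above; `certified_radius` is our illustrative name, and `p_a` is assumed to be a lower confidence bound estimated by sampling):

```python
from statistics import NormalDist

def certified_radius(p_a: float, sigma: float) -> float:
    """Certified L2 radius from randomized smoothing: R = sigma * Phi^-1(p_a),
    where p_a > 0.5 lower-bounds the probability that the base classifier
    returns the top class under Gaussian noise with std sigma."""
    if p_a <= 0.5:
        return 0.0                              # abstain: no certificate
    return sigma * NormalDist().inv_cdf(p_a)

# a more confident smoothed prediction earns a larger certified radius
print(certified_radius(0.99, sigma=0.25))       # ≈ 0.58
```

Note the caveat highlighted by the Ghost Certificates work: this radius only guarantees that the smoothed prediction is stable within R of the input; it says nothing about whether that prediction is semantically correct.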

Looking Ahead

AAAI showed us that the field of AI verification is maturing rapidly. We are moving from simple image classifiers to complex, multi-modal systems and LLMs, and the tools we use to verify them are becoming more sophisticated.

We are excited to integrate these insights into our work at Safe Intelligence and are looking forward to the next iteration of the VNN-COMP in 2026.

The post Building Trustworthy AI: Highlights from AAAI first appeared on Safe Intelligence.

We’re at NeurIPS 2025! https://safeintelligence.ai/were-at-neurips-2025/ Wed, 03 Dec 2025 21:05:42 +0000

As you are probably aware, this year’s NeurIPS is well underway. With over 5000 accepted papers, it is one of the biggest AI conferences. For the first time, it is being held across multiple locations: San Diego, Mexico City, and Copenhagen all opened their doors to some of the field’s best researchers presenting their newest work.

We are excited to be in San Diego for this event! We are looking forward to fascinating discussions, new insights and inspiring talks. If you are interested in neural network safety in the face of adversarial attacks, get in touch and connect!

We are also presenting our paper Scalable Neural Network Geometric Robustness Validation via Hölder Optimisation, which allows the robustness of networks to be validated at a much larger scale than previous work, making it applicable to real-world use cases.

After the conference, we’ll share a summary of our highlights, key takeaways, and the most interesting ideas we encountered.

Photo by Justin Wolff via Unsplash

The post We’re at NeurIPS 2025! first appeared on Safe Intelligence.

Scalable Neural Network Geometric Robustness Validation via Hölder Optimisation https://safeintelligence.ai/scalable-neural-network-geometric-robustness-validation-via-holder-optimisation/ Tue, 02 Dec 2025 22:18:53 +0000

Zhang, Y., Kouvaros, P., Lomuscio, A. (2025)
39th Conference on Neural Information Processing Systems (NeurIPS 2025)

Outcome Value  

Neural network (NN) verification methods provide local robustness guarantees for a NN in the dense perturbation space of an input. A key difficulty with state-of-the-art methods and tools lies in their lack of scalability to the large, transformer-based NNs used in present-day applications such as vision transformers and language models. In this paper we introduce H2V, a novel optimization-based method for the validation of NNs that scales to models with hundreds of millions of tuneable parameters, thereby enabling the validation of large NNs that were previously out of reach of standard methods.

Summary  

The H2V method targets NN robustness validation against geometric perturbations such as camera rotation, scaling, and translation. It uniquely employs a Hilbert space-filling construction to recast multi-dimensional optimization problems as single-dimensional ones, combined with Hölder optimization, which iteratively refines the estimate of the Hölder constant used to construct the lower bound. In common with other optimization methods, Hölder optimization can in theory converge to a local minimum, resulting in a potentially incorrect robustness result. However, we have identified conditions under which H2V is provably sound, and shown experimentally that even outside these soundness conditions, the risk of incorrect results can be minimized by introducing appropriate heuristics in the global optimization procedure.
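To illustrate the dimensionality-reduction idea (a toy sketch only, not the H2V algorithm: the curve order, the test function, and the crude constant estimate are all hypothetical choices of ours), a 2-D minimisation over [0,1]² can be scanned as a 1-D problem along a Hilbert curve, with a Hölder-type constant estimated from observed neighbour slopes to form a lower bound:

```python
import numpy as np

def hilbert_d2xy(order, d):
    """Map index d along a (2**order x 2**order) Hilbert curve to grid
    coordinates (classic iterative construction)."""
    x = y = 0
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (d // 2)
        ry = 1 & (d ^ rx)
        if ry == 0:                      # rotate the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x, y = x + s * rx, y + s * ry
        d //= 4
        s *= 2
    return x, y

def min_along_hilbert(f, order=7):
    """Scan [0,1]^2 as a 1-D problem along the curve; estimate a Hölder-type
    constant from neighbour differences to obtain a crude lower bound."""
    n = 1 << order
    pts = np.array([hilbert_d2xy(order, d) for d in range(n * n)]) / (n - 1)
    vals = np.array([f(p) for p in pts])
    L = np.max(np.abs(np.diff(vals)))    # crude constant estimate
    return vals.min(), vals.min() - L    # best value, pessimistic lower bound

best, lb = min_along_hilbert(lambda p: (p[0] - 0.3) ** 2 + (p[1] - 0.7) ** 2)
```

H2V's actual construction handles higher-dimensional geometric parameter spaces and refines the Hölder constant iteratively under explicit soundness conditions; the sketch only conveys why a space-filling curve makes one-dimensional global optimisation applicable to a multi-dimensional search.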

Primary contributions  

– We propose H2V, a global optimization method for the validation of neural networks based on space-filling dimensionality reduction and Hölder optimization. We provide theoretical conditions for convergence, guaranteeing soundness, and illustrate that when these conditions are not met, the potential for a robustness error is well contained; indeed, no errors were found in the extensive evaluation reported. 

– We use H2V to validate the local robustness of models of up to 300M tunable parameters – including ResNet152 and vision transformers for image classification tasks – against geometric properties (rotation, scaling, and translation) on the large-scale ImageNet dataset. 

– We use H2V to validate the geometric robustness of 3D ResNet models in video classification tasks for streams of 32 × 3 × 256 × 256 inputs.

What’s next    

H2V opens the way for validation of robustness in present-day NNs with hundreds of millions of tuneable parameters including the latest state-of-the-art object detectors of the Yolo family and vision transformers. Future refinements of the methods will include other perturbations beyond geometric transformations, such as photometric changes, blur and beyond.  

Link to paper

The post Scalable Neural Network Geometric Robustness Validation via Hölder Optimisation first appeared on Safe Intelligence.

AI Engineer Paris 2025 Recap – Through the Lens of Trustworthy AI https://safeintelligence.ai/ai-engineer-paris-2025-recap-through-the-lens-of-trustworthy-ai/ Fri, 24 Oct 2025 15:07:19 +0000

Just last month, on the 23rd-24th of September 2025, the atmosphere at the AI Engineer (AIE) Paris Expo was electric. This was the first edition outside the US, hosted at STATION F and organised with Koyeb. The event carried a very pragmatic energy, with engineers focused on building and deploying the next generation of AI agents, tools, and infrastructure. But beyond the demos and keynotes, as I moved between the sessions I attended, a deeper personal theme emerged: the future isn’t just about building more intelligent AI, it’s about building more trustworthy AI.

My Key Takeaways

→ Open Source as the Foundation of Trust

Lélio Renard Lavaud (Mistral) reminded us that adoption at enterprise scale hinges on openness. By sidestepping vendor lock-in and surfacing transparency, open models create the reliability businesses demand. Laurent Sifre (H Company) echoed this, framing open-source “bricks” as the future scaffolding for AI innovation.

Trustworthy AI starts with open foundations, not closed walls.

→ Security and Safety by Design

Jesús Espino (Ona) and Martin Woodward (GitHub) emphasised that trust cannot exist without security. Isolation, auditability, and reproducibility aren’t optional; they are essential, especially in regulated environments. Safe AI deployment protects both users and businesses, ensuring that agentic workflows remain reliable even under high stakes.

Trustworthy AI requires safety and security as core principles, not afterthoughts.

→ Learning from Failure: Context, State, and Infrastructure

Thomas Schmidt (Metabase) and Rémi Louf (.txt) reminded us that agents still hallucinate metrics, misreport outputs, and occasionally break fundamental interactions. Emil Eifrem (Neo4j) and talks from Spotify and Shopify highlighted the root cause: poor state and context management. Without clear context and robust infrastructure, agents misstep or stall. Yann Leger (Koyeb) emphasised that resilient, heterogeneous, agent-ready systems are essential to turn these failures into trustworthy performance.

Trust begins by confronting failure and building the context and infrastructure that prevent it.

→ Evaluation and Trust in AI Behaviour

If you can’t measure it, you can’t trust it. Pierre Burgy (Strapi), SallyAnn DeLucia (Arize AI), and Srilakshmi Chavali highlighted a hard truth: static benchmarks often fail to reflect real-world reliability. Instead, emerging frameworks focus on session-level monitoring, conversations, and user “vibe” to systematically evaluate AI behaviour. Continuous feedback loops enable agents to learn and improve transparently, turning evaluation into a true measure of trustworthiness.

Trust is earned through continuous, measurable evaluation, not just static metrics.

→ Efficiency, Sustainability, and Scale

From Bertrand Charpentier’s deep dive into compression techniques to Steeve Morin’s sparse attention on CPUs, engineers are proving that trustworthy AI must also be efficient AI. Sustainability isn’t a side goal. It’s part of reliability, cost-effectiveness, and accessibility.

Trust is not just about correctness, it’s about efficiency and responsibility at scale.

→ Agents Moving from Promise to Practice

The event marked the inflexion point where agents stopped being hype and started delivering on their promise.

  • Codebase rewriting at scale (Spotify)
  • Senior-level AI code review (Graphite)
  • Agent swarms for refactoring (All Hands AI)
  • Multi-agent orchestration (Docker)

Each showed a vision where agents aren’t assistants, they’re teammates.

Agents are no longer experiments; they’re workflows.

From the sessions I attended, my strongest impression is that the future of AI won’t be built on hype, but on a foundation of trust. The conference made it clear that trust is the result of a responsible engineering mindset.

Looking Forward

The AI Engineer community is just getting started, as the founders said, and from my own experience, you won’t want to miss what’s next. Mark your calendars and secure your tickets:

  • AI Engineer Code Summit | November 20–22, New York, NY
  • AI Engineer Europe | April 7–10, London, UK
  • AI Engineer World’s Fair 2026 | June 30–July 2, San Francisco, CA

I look forward to seeing you at the next one.

The post AI Engineer Paris 2025 Recap – Through the Lens of Trustworthy AI first appeared on Safe Intelligence.

Safe Intelligence in Zagreb: Our Highlights from the 2025 Symposium on AI Verification (SAIV) https://safeintelligence.ai/safe-intelligence-in-zagreb-our-highlights-from-the-2025-symposium-on-ai-verification-saiv/ Wed, 06 Aug 2025 08:10:03 +0000

Brueckner, B., Kouvaros, P., Highlights from SAIV 2025

How can we truly trust an AI system?

As models learn from data in ways that aren’t always transparent, ensuring they are safe and reliable, especially in critical applications, is one of the most important challenges in technology today.

This is the central question that drives the Symposium on AI Verification (SAIV). This July, the event brought together researchers in formal verification and artificial intelligence in Zagreb, Croatia. Held alongside the prestigious Conference on Computer Aided Verification (CAV), it was the perfect place to discuss the latest breakthroughs in making AI safe. Naturally, the Safe Intelligence team was there, and we came back inspired.

A few presentations stood out to us for their direct relevance to the work we do every day. Here are some of our key takeaways:

Making AI More Efficient, Without Sacrificing Safety

Making AI models smaller and faster is crucial for real-world use. But how do you ensure that a streamlined “pruned” model is just as trustworthy as the original? Samuel Teuber (Personal Site, Bluesky) presented a new method for differential verification. His framework can formally prove that a smaller model behaves identically to its larger counterpart across vast scenarios, offering significant speedups over existing techniques. (Link to Paper)

Verifying AI in the Physical World: A Drone Case Study

Applying AI to physical systems like drones is notoriously difficult because real-world physics are incredibly complex. Colin Kessler (LinkedIn) shared a fascinating case study on creating robust controllers for gliding drones. His work demonstrates how clever simplifications can make it possible to verify these complex systems despite the various nonlinearities which arise when modelling such a system. His work was especially interesting because it pushes today’s verification tools to their absolute limits and shows how better training can lead to safer AI controllers. (Link to Paper)

Testing AI Vision Against Realistic Challenges

For a long time, testing the robustness of computer vision models involved adding simple “white noise” or static to an image. But that’s not what AI encounters in the real world. Jannick Strobel (https://www.sen.uni-konstanz.de/members/research-staff/jannick-strobel/) proposed novel properties for verifying the robustness of computer vision models which rely on image similarity measures. Although the similarity measures introduce nonlinear relations that limit the scalability of the method, the work is a great contribution towards checking the robustness of models against more semantically meaningful input changes. (The work is not yet published, so there is no paper link; see the talk page: https://www.aiverification.org/2025/talks/poster7/)

VERONA: An Open-Source Toolkit for Better Experiments

For anyone in the field, comparing different verification tools can be a complex and time-consuming task. Annelot Bosman (https://www.universiteitleiden.nl/en/staffmembers/annelot-bosman#tab-1) presented VERONA, an open-source “experiment box” that makes it much easier to run and compare multiple verification toolkits at once. This is a great practical tool for accelerating research in the community. (Link to GitHub)

A Highlight: The Verification of Neural Networks Competition (VNN-COMP)

Konstantin Kaulen (https://rwthcontacts.rwth-aachen.de/person/PER-SF4C2YW) presented the results of this year’s Verification of Neural Networks Competition (VNN-COMP), which was certainly a highlight of the symposium and saw the largest attendance.

  • 🥇 1st Place: The champion for several years running, alpha-beta-CROWN, held its top spot with its powerful GPU-accelerated bound propagation approach.
  • 🥈 2nd Place: NeuralSAT took an impressive second place using a completely different strategy for robustness verification based on ideas from mathematical logic and backtracking.
  • 🥉 3rd Place: The abstract interpretation-based PyRAT secured a strong third place.

Seeing such different techniques employed across the field is encouraging: there are clearly many avenues that could lead to even more scalable verifiers in the future. Beyond the top three places, various other tools also achieved excellent results; these are listed in the full results (https://docs.google.com/presentation/d/1ep-hGGotgWQF6SA0JIpQ6nFqs2lXoyuLMM-bORzNvrQ/). Konstantin also presented work on terminating robustness verifiers early to save time when verification is not expected to finish within the given time budget (https://ojs.aaai.org/index.php/AAAI/article/view/34946), as well as a library for the certified training of neural networks (https://openreview.net/forum?id=bWxDF6HosL).

From “Probably” to “Provably”: Guaranteeing Performance on New Data

One of the biggest questions in AI is: “How can we be sure a model will work on new data it has never seen before?” When assessing model generalisation capabilities, existing methods often only provide a statistical guess. Arthur Clavière (LinkedIn) introduced a formal method that provides more rigorous guarantees of a model’s performance on unseen inputs. It works by systematically breaking down the vast, continuous space of inputs into smaller, more manageable sub-regions. For each of those the method then attempts to prove that the model’s error stays within a predefined tolerance and, if unsuccessful, splits the region into smaller regions again. While the challenge of scaling such an analysis to cover large, high-dimensional input spaces remains, the work is a valuable step towards safer AI. (Link to Paper)
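The split-and-verify loop described above can be sketched schematically as follows. Everything here is a hypothetical stand-in of ours: `bound_fn` is a simple Lipschitz-based interval bound playing the role of a real verifier's sound error bound, not the method from the paper.

```python
import numpy as np

def certify_box(bound_fn, lo, hi, tol, max_depth=20):
    """Try to prove that an error bound stays below tol over the input box,
    bisecting the widest dimension whenever the bound is too loose."""
    stack = [(lo, hi, 0)]
    while stack:
        lo, hi, d = stack.pop()
        if bound_fn(lo, hi) <= tol:
            continue                          # this sub-region is certified
        if d >= max_depth:
            return False                      # give up: cannot certify
        i = int(np.argmax(hi - lo))           # split the widest dimension
        mid = (lo[i] + hi[i]) / 2.0
        hi_left, lo_right = hi.copy(), lo.copy()
        hi_left[i], lo_right[i] = mid, mid
        stack += [(lo, hi_left, d + 1), (lo_right, hi, d + 1)]
    return True

# hypothetical error function f(x) = ||x - 0.5||^2 on [0,1]^2, with the sound
# Lipschitz bound f(centre) + sqrt(2) * radius; the true maximum is 0.5
def bound_fn(lo, hi):
    c = (lo + hi) / 2.0
    r = np.linalg.norm(hi - lo) / 2.0
    return float(np.sum((c - 0.5) ** 2)) + np.sqrt(2) * r
```

With a tolerance above the true maximum (e.g. 0.5 vs. a tolerance of 0.6), splitting shrinks the bound's slack until every sub-region certifies; with a tolerance below it, the corner regions can never certify and the procedure gives up, mirroring the scalability limit noted above.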

We had an amazing time at SAIV 2025. The discussions and advancements presented in Zagreb reinforce our mission at Safe Intelligence: building the tools to make AI provably safe and reliable. These developments aren’t just academic—they are the building blocks for the future we are working towards.

Zagreb was a beautiful backdrop for a week of intense learning and networking. We connected with great people and exchanged a lot of ideas with others in the field. We’re already looking forward to SAIV 2026!

The post Safe Intelligence in Zagreb: Our Highlights from the 2025 Symposium on AI Verification (SAIV) first appeared on Safe Intelligence.

Dynamic Back-Substitution in Bound-Propagation-Based Neural Network Verification https://safeintelligence.ai/dynamic-back-substitution-in-bound-propagation-based-neural-network-verification/ Tue, 29 Jul 2025 08:00:00 +0000

Kouvaros, P., Brueckner, B., Henriksen, P., Lomuscio, A. (2025), Proceedings of the 39th AAAI Conference on Artificial Intelligence (AAAI25)

Outcome Value  

The paper advances state-of-the-art neural network verification by accelerating an algorithm used by most verification toolkits. This allows verifiers to scale to even larger neural networks, aiding the certification of the safety of neural networks before their deployment.

Summary  

Bound Propagation is a popular method for obtaining bounds for a given layer of a neural network under a given perturbation. While different bound propagation methods exist, back-substitution is the most popular one due to its superior precision. Unfortunately, this precision comes at a substantial computational cost. We propose a method called Dynamic Back-Substitution which accelerates these computations without any loss in precision.
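For context, a generic sketch of the cheapest bound propagation scheme, plain interval arithmetic, is shown below (this is our toy illustration, not the Venus implementation; back-substitution instead keeps symbolic linear bounds and re-substitutes them through earlier layers, which is tighter but far more expensive):

```python
import numpy as np

def interval_bounds(layers, lo, hi):
    """Propagate an elementwise input box through (W, b) layers, with ReLU
    activations between layers (no ReLU after the last one)."""
    for i, (W, b) in enumerate(layers):
        centre, radius = (lo + hi) / 2.0, (hi - lo) / 2.0
        c, r = W @ centre + b, np.abs(W) @ radius   # sound linear-layer bounds
        lo, hi = c - r, c + r
        if i < len(layers) - 1:                     # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

# toy network: identity hidden layer, then the difference x0 - x1
layers = [(np.eye(2), np.zeros(2)),
          (np.array([[1.0, -1.0]]), np.zeros(1))]
lo_out, hi_out = interval_bounds(layers, np.zeros(2), np.ones(2))
```

Bounds like these are exactly the cheap intermediate results that a dynamic scheme can reuse to decide, per neuron, whether the extra precision of full back-substitution is worth its cost.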

Primary contributions  

The paper extends Venus, a neural network verification toolkit, with the dynamic back-substitution method to accelerate the computations run during bound propagation. The specific contributions are:

– The introduction of the dynamic back-substitution algorithm, which uses intermediate results from cheaper bound propagation methods to dynamically reduce the size of the equations being propagated through the network when one can guarantee that no precision gain is to be expected from running back-substitution for specific neurons.

– The design of a heuristic to govern the use of dynamic back-substitution. The heuristic uses a cheap estimate of the expected gain to decide whether the full back-substitution procedure should be used for a given layer of a neural network.

– A thorough empirical evaluation on a number of neural networks ranging from 92,000 to 21.8M parameters, which demonstrates that our method reduces the time required for running back-substitution by up to 40%.

What’s next   

Although a lot of progress has been made in neural network verification over the past years, scalability is still the main challenge that these methods face. Our aim is to improve the scalability of these methods even more by not only improving the algorithm introduced in this work and its governing heuristic, but by also experimenting with entirely different approaches to verification.

Link to paper

The post Dynamic Back-Substitution in Bound-Propagation-Based Neural Network Verification first appeared on Safe Intelligence.

Verification of Neural Networks Against Convolutional Perturbations via Parameterised Kernels https://safeintelligence.ai/verification-of-neural-networks-against-convolutional-perturbations-via-parameterised-kernels-brueckner-b-lomuscio-a-2025-proceedings-of-the-39th-aaai-conference-on-artificial-intelligence-aaa/ Wed, 07 May 2025 09:47:00 +0000

Brueckner, B., Lomuscio, A. (2025), Proceedings of the 39th AAAI Conference on Artificial Intelligence (AAAI25)

Outcome Value  

Ensuring the robustness of neural networks against real-world perturbations is critical for their safe deployment. Existing verification techniques struggle to efficiently handle convolutional perturbations due to loose bounding techniques and high-dimensional encodings. Our work advances the state-of-the-art by providing a method that offers both precision and scalability, allowing verification toolkits to consider larger networks. By certifying robustness under realistic conditions such as motion blur and sharpening, the approach helps mitigate risks associated with adversarial attacks and hidden weaknesses in practical applications.

Summary  

Neural networks have become integral in safety-critical applications such as autonomous driving and medical diagnosis, making their reliability extremely important. The paper introduces a novel verification method designed to verify the robustness of neural networks against convolutional perturbations, such as motion blurring and sharpening. We achieve this by parameterising perturbation kernels in a way that preserves key kernel properties while allowing for controlled variations in perturbation strength. By integrating the convolution of inputs with these kernels into the verification process, our method ensures tighter bounds and enables robustness certification to scale to large networks where existing methods fail.
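To convey the parameterisation idea (a simplified sketch of ours; the paper's actual construction, kernel sizes, and perturbation families may differ, and `motion_blur_kernel` is an illustrative name): a horizontal motion-blur kernel can be written as a linear function of a strength parameter t, interpolating between the identity kernel (t = 0) and full blur (t = 1), so that the perturbed image varies linearly in a single low-dimensional parameter.

```python
import numpy as np

def motion_blur_kernel(t, size=5):
    """Linear interpolation between the identity kernel (t = 0) and a full
    horizontal motion-blur kernel (t = 1); for every t in [0, 1] the entries
    stay non-negative and sum to 1, preserving key kernel properties."""
    identity = np.zeros(size)
    identity[size // 2] = 1.0
    blur = np.full(size, 1.0 / size)
    return (1.0 - t) * identity + t * blur

def apply_horizontal_blur(image, t):
    """Convolve each row with the parameterised kernel ('same' padding)."""
    k = motion_blur_kernel(t)
    return np.stack([np.convolve(row, k, mode="same") for row in image])
```

Because the convolved input is linear in t, a verifier can split the one-dimensional interval of perturbation strengths rather than a high-dimensional pixel space, which is what makes the input-splitting strategy in the evaluation so cheap.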

Primary contributions  

– We introduce a technique for linearly parameterising kernels that encode perturbations including motion blur, box blur, and sharpening effects. These parameterised kernels allow the strength of the perturbation applied to a given input to be varied in a controlled way.

– We present a theorem showing that an input can be efficiently convolved with a parameterised kernel, and we design layers that can be prepended to a neural network in order to encode the perturbations of interest into the network.

– Our experimental evaluation demonstrates that due to the low dimensionality of our encoding, input splitting can be used to quickly perform robustness verification and scale to architectures and network sizes which are out of reach of an existing baseline.
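To make the first contribution concrete, here is a minimal sketch, under illustrative assumptions (the helper names and the specific 3×3 kernels are not taken from the paper): interpolating linearly between an identity kernel and a box blur with a strength parameter t preserves the kernel's sum, and because the interpolated kernel K(t) is linear in t, the perturbed input is itself a linear function of t, which is what keeps the encoding low-dimensional.

```python
# Illustrative sketch only: the helper names and the specific 3x3 kernels
# are assumptions for exposition, not the paper's implementation.
import numpy as np

def conv2d(x, k):
    """Naive 'same'-padded 2D cross-correlation (equal to convolution for
    the symmetric kernels used here); note it is linear in the kernel k."""
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.empty_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def parameterised_kernel(t):
    """Interpolate between an identity kernel and a 3x3 box blur.
    Both endpoint kernels sum to 1, so K(t) sums to 1 for every t in
    [0, 1] (a preserved kernel property), while t sets blur strength."""
    identity = np.zeros((3, 3))
    identity[1, 1] = 1.0
    box_blur = np.full((3, 3), 1.0 / 9.0)
    return (1.0 - t) * identity + t * box_blur

x = np.random.default_rng(0).random((8, 8))
# Because K(t) is linear in t, so is the perturbed image:
#   conv(x, K(t)) = (1 - t) * x + t * conv(x, box_blur),
# so a verifier can treat t as a single additional low-dimensional input.
y_direct = conv2d(x, parameterised_kernel(0.5))
y_linear = 0.5 * x + 0.5 * conv2d(x, parameterised_kernel(1.0))
assert np.allclose(y_direct, y_linear)
```

In the paper this idea takes the form of extra layers prepended to the network; the sketch above only demonstrates the linearity property that makes such an encoding cheap to bound and split over.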

What’s next    

While this paper represents a significant step forward in verifying neural networks against convolutional perturbations, the challenge of scalability remains. As a next step, we’ll focus on further optimising the introduced method and exploring alternative verification techniques that can handle more complex perturbations in conjunction with those introduced here. Additionally, we’ll investigate ways of integrating other realistic perturbations into verification approaches, enabling a broader range of perturbations to be considered when assessing the safety of AI systems.

Link to paper

The post Verification of Neural Networks Against Convolutional Perturbations via Parameterised Kernels first appeared on Safe Intelligence.

]]>
My First 30 Days as a Developer Evangelist at Safe Intelligence https://safeintelligence.ai/my-first-30-days-as-a-developer-evangelist-at-safe-intelligence/ Tue, 29 Apr 2025 12:56:00 +0000 https://safeintelligence.ai/?p=572 Hello world, I’m Brain (yes 🧠) After six years of technical writing and public speaking in the AI space plus experience working in the financial industry as a data scientist, I have stepped into a new role as a Developer Evangelist at Safe Intelligence. I am here to help you build and deploy better, provably […]

The post My First 30 Days as a Developer Evangelist at Safe Intelligence first appeared on Safe Intelligence.

]]>
Hello world, I’m Brain (yes, 🧠). After six years of technical writing and public speaking in the AI space, plus experience working in the financial industry as a data scientist, I have stepped into a new role as a Developer Evangelist at Safe Intelligence. I am here to help you build and deploy better, provably safe, and fair AI.

Looking back, I can see how the dots connect, bringing me to this role. My goal is to help developers and AI practitioners demystify complex machine learning challenges and to showcase Safe Intelligence’s automatic verification and robustification offerings. This includes creating demos, developing content, public speaking, and organizing events.

Safe Intelligence’s vibrant culture was evident from the moment I joined. My onboarding included meeting the entire team, an extraordinary group who balance serious work with hearty laughter and an uncanny ability to find great lunch spots (we have a random restaurant picker that might be overdue for an agentic upgrade 🤖, if you ask me!). The product and service introductions began with hands-on demos, giving me a firsthand look at our cutting-edge verification and robustification technologies, and I immediately sensed how committed the team is to pushing boundaries in making ML models robust and reliable. We’re excited to soon share these powerful solutions, which enhance the entire ML lifecycle and strengthen model performance and robustness. Over the past few weeks, I’ve been diving deep into our platforms, toolkits, and internal workflows through GitHub and documentation.

I attended my first MLOps meetup in London on 26th March 2025. It was an electrifying experience—Alex Jones from Monzo discussed unifying feature engineering across data and backend systems, Callum Court explored the power of multi-task deep learning in fraud detection, and Eleanor Shatova wowed everyone with how LLMs can supercharge search. Between mouthwatering chocolate cake and networking with professionals, I got to see firsthand how diverse and passionate the AI community here really is.

Creating an internal walkthrough of our offerings has been a highlight so far, providing me with a hands-on look at how everything interconnects. I’ve also had the chance to learn from some brilliant ML experts here in London, and I’ll soon be hopping around to more local meetups—if you’re in town, come say hi and let’s chat about AI, ML, and building models that are safe and reliable. A big part of my role involves content creation—blog posts, short webinars, and videos—all aimed at helping developers level up their ML workflows. Whether it’s covering the basics or diving into advanced topics, my goal is to make every piece engaging, accessible, and genuinely useful.

As AI models grow increasingly complex, understanding how they’ll behave in real-world scenarios becomes trickier. Traditional testing isn’t enough for high-stakes environments like finance or healthcare, which is why Safe Intelligence uses formal verification to mathematically prove a model’s behavior under various conditions. Exciting things are ahead: new content, meetups, and more ways to connect with the AI community. Are you interested in learning more about robust, verifiable AI systems? Say hi or explore our early-access ML validation platform: Safe Intelligence Early Access. Let’s shift AI from guesswork to guarantees.

Thanks for being part of my first-month journey; see you around!

The post My First 30 Days as a Developer Evangelist at Safe Intelligence first appeared on Safe Intelligence.

]]>
AI Quality and Safety at the AI Engineer Summit https://safeintelligence.ai/ai-quality-and-safety-at-the-ai-engineer-summit/ Sat, 01 Mar 2025 11:31:00 +0000 https://safeintelligence.ai/ai-quality-and-safety-at-the-ai-engineer-summit/ Every new wave of technology brings a new wave of conferences, summits, and meetups. AI is no different, and a dizzying array of new events has sprung up. One of the most interesting of these is the AI Engineer Summit series (and its counterpart, the AI Worlds Fair). The events resonate with our work at Safe Intelligence because […]

The post AI Quality and Safety at the AI Engineer Summit first appeared on Safe Intelligence.

]]>
Every new wave of technology brings a new wave of conferences, summits, and meetups. AI is no different, and a dizzying array of new events has sprung up. One of the most interesting of these is the AI Engineer Summit series (and its counterpart, the AI Worlds Fair). The events resonate with our work at Safe Intelligence because they focus on the engineering of products, services, and systems around AI, rather than just AI models themselves. 

Sure, you can train a model, but how do you put it in the right execution context, test the resulting products, and run it at scale? 

The recent event in New York touched on many exciting themes and wasn’t explicitly focused on safety, but production quality and reliability shone through in many talks. Here are just a few: 

  • Lux Capital’s Grace Isford talked about the accumulation of small errors, looking at the AI market and agents in particular. The more we use AI-powered autonomous systems to execute tasks on their own, and the more complex those tasks become, the more the small errors at each step add up. These errors aren’t theoretical; in fact, they are almost guaranteed when solving messy real-world problems, where it becomes hard to know how much “context” to factor in. What’s relevant in flight booking, for example? Just the flight schedules and prices? Seat preferences? Traffic or commute times? Weather? Airline status?
  • Contextual’s Douwe Kiela focused on the need for context in order for models to resolve tasks accurately. The less specific the knowledge an agent has, the less likely it is to get the right answers. Kiela’s talk focused on LLM RAG systems that provide reasoning context, but the same is true for any type of model input relevant to an action. Vision models need to be tested with data from the actual sensor clusters they are attached to, and trading models need to be tested with (and receive signals from) as many market indicators as possible that might affect decisions.
  • Anthropic’s AI team dug deep into the hard topic of explainability for LLMs. Alexander Bricken and Joe Bayley talked through the company’s research roadmap for better understanding the linkage between LLM outputs and the structures represented in model weights. This is a hugely challenging task, and it may never be fully solved. Still, understanding what models are doing, what conditions could cause failures, and how concepts are represented in a model is extremely valuable if it can be teased out.
  • In one of the other highlight talks, Mustafa Ali (Method Financial) and Kyle Corbitt (OpenPipe) talked through how they built early prototypes of an LLM-based application using powerful off-the-shelf LLMs, then produced a smaller, distilled, open-source-based model for large-scale deployment. The engineering mindset here is to focus the fine-tuning of models on specific tasks and make the smallest model that works, thereby gaining efficiency and high performance.

All in all, a great event, and we’ll be back for the 2025 World’s Fair!

The post AI Quality and Safety at the AI Engineer Summit first appeared on Safe Intelligence.

]]>
Scaling AI through more powerful validation and robustness https://safeintelligence.ai/scaling-ai-through-more-powerful-validation-and-robustness/ Thu, 20 Feb 2025 13:06:00 +0000 https://safeintelligence.ai/?p=583 AI is fast becoming one of the building blocks of society and holds huge potential for productivity gains, automation and improvements to life. As techniques advance we make better models and enable new use-cases. However it remains much harder to validate that AI systems will act appropriately when deployed in the real world. Safe Intelligence […]

The post Scaling AI through more powerful validation and robustness first appeared on Safe Intelligence.

]]>
AI is fast becoming one of the building blocks of society and holds huge potential for productivity gains, automation, and improvements to life. As techniques advance, we build better models and enable new use cases. However, it remains much harder to validate that AI systems will act appropriately when deployed in the real world.

Safe Intelligence was founded to provide new technical solutions that bridge this gap in model validation and robustness. For the past two years the team has been extending deep academic research on formal verification of machine learning models and on model robustification through certified learning.

These techniques make it possible not just to check whether individual test cases pass or fail, but to verify that they will always pass under changes in the environment. The same methods also make it possible to make models more robust to change during operation, yielding significant gains in performance.

Today we’re excited to announce our seed funding round led by Amadeus Capital Partners and with participation from VSquared Ventures and OTB.VC. The press release can be found here!

With the new investment, the company will release new tooling and an enterprise-grade platform for ML validation and robust model performance. Join us on the journey!

Thank you to Amadeus, VSquared, OTB, and all the early customer collaborators who’ve helped shape the progress so far!

The post Scaling AI through more powerful validation and robustness first appeared on Safe Intelligence.

]]>