exalsius

Technology, Information and Internet

Berlin, Berlin 278 followers

Providing AI Teams with One-Click GPU Infrastructure

About us

exalsius simplifies access to compute infrastructure for AI teams. Training today’s AI models often demands multiple GPU nodes — either because the models are too large for a single GPU, or because the data volumes require distributed processing to finish training within a reasonable timeframe. Building AI is therefore no longer just about designing smart models and training algorithms. It has become an infrastructure problem — one that forces teams to search across cloud providers for affordable GPUs, configure distributed servers and networking, install and maintain complex dependencies, and struggle to keep jobs running reliably and securely. We believe AI development should focus on innovation — not infrastructure. That’s why we’re building **exalsius**: a Kubernetes-native system that automatically finds affordable GPUs across clouds or data centers, prepares the infrastructure, installs the right tools, and deploys your distributed training jobs — securely, reliably, and without hassle. Whether teams are using Ray, Kubeflow, or custom pipelines, exalsius handles everything beneath — provisioning, setup, and orchestration of the GPU nodes — so they can concentrate on what matters most: building great AI.

Website
https://exalsius.ai
Industry
Technology, Information and Internet
Company size
2-10 employees
Headquarters
Berlin, Berlin
Type
Privately Held
Founded
2025
Specialties
GPU Infrastructure, Model Deployment, AI Training, AI Inference, and Distributed Training

Updates

  • exalsius reposted this

    We are not late to the AI party. We are early to a different, European one. My main takeaway from the Flower Summit 2026 was that we aren’t here to win someone else’s race. We are building an ecosystem that actually fits the way Europe works: our federal structures, our distributed, highly valuable industry data, and our human-centric values. Instead of trying to force-fit a centralized, homogeneous model onto a decentralized continent, we are building a new AI paradigm designed for heterogeneity and federations. Proud to be part of this accelerating ecosystem and this new way of thinking about AI. Thanks for the great event, Flower Labs, Daniel, Nicholas, Taner! 🌼

  • exalsius reposted this

    Getting Germany’s entire deep tech ecosystem into one room. That’s something only SPRIND can pull off. This year’s Venture SPRIND felt different. Compared to last year, there was a new level of urgency. A pressure to move, to build, to race, paired with a deep sense of optimism. I had the chance to sit down with Johannes and Arya to discuss a critical tension: the mismatch between the current AI ecosystem and Europe’s federal, data-protective structure. Building at the forefront of AI, we face a choice: 1. We can try to copy the American or Chinese approaches to AI. 2. Or, we can build AI that actually fits our European structures and values. Both are tremendously hard challenges. Personally, I’m strongly in favor of the second one. That is why we are building the fundamental infrastructure for federated, distributed, and decentralized AI. This means building AI that works for us, not against our principles. Thanks for supporting us on that journey, and thanks for having me, SPRIND - Bundesagentur für Sprunginnovationen, Johannes, Jano, Leonard, Marcia, Sebastian.

  • exalsius reposted this

    Last week we published a technical report on distributed LLM pretraining during renewable curtailment windows, now available on arXiv (link in comments), which demonstrates the awesome potential of GPU resource flexibility and intelligent workload scheduling. The core result: we trained a 561M-parameter language model across GPU clusters in California, Texas, and South Australia, activating training only when regional renewable generation exceeded grid demand. The system elastically switches between local and federated training as sites come online or drop off. Training quality matched conventional single-site runs while producing just 5-12% of the carbon emissions. 🌱 I wrote a companion piece on Medium (link also in comments) exploring an application of this architecture beyond language models, specifically for multi-agent reinforcement learning (MARL) in distributed energy systems. A few key points from that piece:
    - MARL training for energy systems shares properties that make LLM pretraining a good fit for curtailment-aware scheduling: it can be delay-tolerant, checkpointable, and distributable across sites.
    - The federated structure we demonstrated (elastic participation, privacy-preserving weight sharing, geographic distribution) maps directly onto how MARL agents for virtual power plants would need to train.
    - GPU clusters running these workloads can themselves participate in grid flexibility, creating a closed loop where the training infrastructure is also a participant in the system being optimized.
    - For smaller, purpose-built clusters designed around delay-tolerant workloads and strategically located near stranded energy, the economics of grid flexibility can be more attractive than for hyperscale facilities that are optimized for sustained high utilization.
    Open questions remain around whether federated averaging preserves learning quality for interacting agent populations, and how sensitive MARL convergence is to the kind of intermittent training that curtailment windows impose. I'll be working on answering these questions during the rest of my time in the Venture Science Doctorate and beyond, but for now, the systems infrastructure exists and the application domain is well-matched. Special thanks to Philipp Wiesner, Alexander Acker, and the rest of the exalsius team for their expertise in refining and executing on the project's goals. Thanks to WattTime.org for the use of their API and to Flower Labs for their federated training infrastructure. And thanks to Deep Science Ventures for everything else.

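    The elastic switching the post describes (pause when no region has surplus renewables, train locally when one does, federate when several do) can be sketched as a small scheduler. This is an illustrative sketch only, not the report's implementation; the site names and megawatt figures below are made-up assumptions.

```python
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    renewable_mw: float  # current regional renewable generation
    demand_mw: float     # current regional grid demand

def in_curtailment_window(site: Site) -> bool:
    """A site may train only while its renewables exceed local demand."""
    return site.renewable_mw > site.demand_mw

def select_mode(sites: list[Site]) -> tuple[str, list[str]]:
    """Elastically pick a training mode from the currently active sites:
    none active -> checkpoint and pause; one -> local training;
    several -> federated training across all of them."""
    active = [s.name for s in sites if in_curtailment_window(s)]
    if not active:
        return "pause", []
    if len(active) == 1:
        return "local", active
    return "federated", active

# Hypothetical snapshot of the three regions named in the post.
sites = [
    Site("california", renewable_mw=1200.0, demand_mw=900.0),      # surplus
    Site("texas", renewable_mw=800.0, demand_mw=1100.0),           # deficit
    Site("south_australia", renewable_mw=500.0, demand_mw=450.0),  # surplus
]
mode, participants = select_mode(sites)
print(mode, participants)  # federated ['california', 'south_australia']
```

    A real deployment would re-evaluate this decision continuously from grid signals (e.g. a marginal-carbon API) and checkpoint model state before each transition.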
  • exalsius reposted this

    I spent the weekend building `autoimprove` (https://lnkd.in/d6RA-g8g), a package that brings an auto-research improvement loop to any repository. The idea is to create an auto-researchable (improvable) version of any repo, ML and non-ML. How to use autoimprove:
    - clone the repo
    - open Claude Code / opencode or similar
    - tell it: `install autoimprove cli and run it on my repo "<repo_path>"`
    - inspect `.autoimprove/program.md`
    - let it run
    I tried it on my own side project `opentab`: https://lnkd.in/dqSpzbDC On `opentab`, `autoimprove` proposed and tested changes like: vectorizing `DecisionTreeMapping`, pre-norm instead of post-norm, better truncation / test target handling, improved test embedding pooling, and label smoothing. After several iterations (I tried more than 30), it moved the average score on baseline evaluation datasets from 44% to 91% with just 4 minutes of training on a laptop GPU, and fixed a lot of small errors. My take is that in the research space, these loops will most definitely provide a speedup. However, having a clear idea of what to do still requires strong knowledge of fundamentals and first principles. I can't yet speak for general software. If you want to try it, break it, improve it, or contribute directly, I'd really love that: https://lnkd.in/d6RA-g8g Thanks to exalsius for providing GPUs for testing.

  • Resolving the paradox of AI scaling vs. sustainability: Great to see our team member on stage at the Flower AI Summit 2026 in London (congratulations, Philipp!). Large-scale AI training is notorious for its energy demands, but Philipp is proving it doesn't have to stay that way. His upcoming talk, “Toward Carbon-Aware Distributed LLM Pretraining,” dives into a future where training jobs are mobile, shifting across time and geography to sync with periods of cleaner, greener electricity. Join Flower Labs, us, and a huge community of distributed AI folks in London (April 15-16) to discuss a different approach to AI scaling.

  • exalsius reposted this

    The SPRIND Composite Learning Challenge has shifted from "Can we build it?" to "How can Europe win AI?" Most conversations in the EU ecosystem revolve around how to avoid losing. I'd rather think about how we can win AI. We have unique structural advantages:
    -> European industry owns a huge amount of data that far outweighs what current LLMs are trained on.
    -> Our fragmented, diverse SME landscape is actually a much better field for building agentic models. Diverse organizational structures allow models to generalize better than on large, monolithic US corporate structures.
    Europe's GPU infrastructure is fragmented and scarce. No single entity has the scale of a US hyperscaler. Jointly pooling resources is the only viable path forward. All Composite Learning Challenge teams proved that the technology works. The challenge now is to solve Europe's "federated friction." Reaching consensus across a multitude of stakeholders is slow. In a space that moves as fast as AI, this hurts. We are spending a lot of our time figuring out how to make it undeniably beneficial for organizations to pool data and compute, and how to implement a structure that maintains IP ownership. Scaling Composite Learning is a massive challenge by definition, and we need a fundamental shift in how European organizations collaborate. Flower Labs, Daniel J. Beutel, Taner Topal, PanocularAI, Arya Mazaheri, Sören Heß, SPRIND - Bundesagentur für Sprunginnovationen, Jano Costard, Johannes Otterbach, Leonard Schenk, Marcia Holst

  • 🎉 Great News! We are moving into Phase II of the SPRIND Composite Learning Challenge. For the last year, we’ve focused on one mission: building the technological foundation for exalsius. Our goal is to simplify multi-cloud AI infrastructure orchestration. After testing countless designs and architectures, we’ve developed a solution that enables a seamless switch between cloud providers or even custom on-prem servers. The wins? ✅ Infrastructure autonomy ✅ Compute resource control ✅ Total budget transparency Twelve months ago, we started from a hypothesis that has since evolved into a validated system running in production at several customers. Now, with Phase II funding secured, our focus shifts to growth. Excited for the next 9 months! Thanks to SPRIND - Bundesagentur für Sprunginnovationen - Leonard, Johannes, Jano, Sebastian, Marcia, and the whole team - for believing in the vision. Let’s build AI infrastructure made in Europe 🇪🇺

  • 🏆 𝗖𝗼𝗻𝗴𝗿𝗮𝘁𝘂𝗹𝗮𝘁𝗶𝗼𝗻𝘀 𝘁𝗼 𝘁𝗵𝗲 𝗪𝗶𝗻𝗻𝗶𝗻𝗴 𝗧𝗲𝗮𝗺𝘀 𝗼𝗳 𝘁𝗵𝗲 [𝗖𝗼𝗹𝗱 𝗦𝘁𝗮𝗿𝘁:] 𝗗𝗶𝘀𝘁𝗿𝗶𝗯𝘂𝘁𝗲𝗱 𝗔𝗜 𝗛𝗮𝗰𝗸 𝗕𝗲𝗿𝗹𝗶𝗻 - 𝗧𝗿𝗮𝗰𝗸 𝟬𝟮! 🏆 We're excited to highlight the winners of our hackathon's open challenge track 02, where teams explored bold ideas at the intersection of federated AI, systems engineering, physical simulation, and real-world impact. Free to define their own problems and metrics, this track emphasized originality, depth, and proof-of-concept prototypes that drive federated learning adoption, enabling collaboration without sharing raw data while tackling privacy, governance, and interoperability hurdles. Participants prototyped solutions with Flower and deployed data processing workloads with exalsius, demonstrating distributed model training, privacy-preserving data handling, and personalization. Our top teams delivered creative innovations:
    🥇 1st Place: FloraLab (Trinh Nguyen Phuong Tran, Quan Nguyen) - Tackled the bottleneck of running federated learning on university clusters with a multi-language tool: a Python CLI for managing Flower-AI deployments on SLURM, plus Go-based utilities for initializing, running, and stopping stacks under florago.
    🥈 2nd Place: ElevenCube (Zoe Yan, Bharath Kumar, Martin Kaiser) - Addressed satellite constraints like bandwidth and battery life by creating a power-aware federated learning scheduler to optimize data processing and energy use in orbit.
    🥉 3rd Place: SIMD (Bayangmbe Mounmo & Team) - Went beyond the AI hype and built physical simulations, demonstrating a federated fluid dynamics simulator across distributed GPUs for scalable, data-private environments.
    These pitches impressed our jury, which evaluated them on Technical Proficiency, Innovation, Impact, and Presentation quality. Bravo to all for showing what federated systems can be used for! 🙌

  • 🏆 𝗖𝗼𝗻𝗴𝗿𝗮𝘁𝘂𝗹𝗮𝘁𝗶𝗼𝗻𝘀 𝘁𝗼 𝘁𝗵𝗲 𝗪𝗶𝗻𝗻𝗶𝗻𝗴 𝗧𝗲𝗮𝗺𝘀 𝗼𝗳 𝘁𝗵𝗲 [𝗖𝗼𝗹𝗱 𝗦𝘁𝗮𝗿𝘁:] 𝗗𝗶𝘀𝘁𝗿𝗶𝗯𝘂𝘁𝗲𝗱 𝗔𝗜 𝗛𝗮𝗰𝗸 𝗕𝗲𝗿𝗹𝗶𝗻 - 𝗧𝗿𝗮𝗰𝗸 𝟬𝟭! 🏆 We're excited to spotlight the track 01 winners of our hackathon, who worked on a distributed AI challenge in healthcare. Models trained in one hospital often underperform in others due to varying data distributions, from imaging devices to patient demographics. The goal: build a reliable model across hospitals via federated learning with Flower, simulating privacy-preserving training on siloed chest X-ray data with strong non-IID characteristics. Dataset silos:
    𝗛𝗼𝘀𝗽𝗶𝘁𝗮𝗹 𝗔 (Portable Inpatient): Elderly males, AP views, fluid-related issues (Effusion, Edema, Atelectasis).
    𝗛𝗼𝘀𝗽𝗶𝘁𝗮𝗹 𝗕 (Outpatient Clinic): Younger patients, PA views, findings like Nodules, Masses, Pneumothorax.
    𝗛𝗼𝘀𝗽𝗶𝘁𝗮𝗹 𝗖 (Rare Conditions): Mixed demographics, PA views, rare conditions (Hernia, Fibrosis, Emphysema).
    Our winners and their creative solutions:
    🥇 1st Place: TeamShaper (Justus Krebs, Julian Dobler, Emirkan Toplu, Edgar Blumenthal) - Pivoted from transformers to ResNet and finally DenseNet, optimizing batch sizes and learning rates, achieving an AUROC of 0.769!
    🥈 2nd Place: Team 006 (Hacı İsmail Aslan, Jasmin Bogatinovski) - Explored a wild and diverse mix of model architectures and data preprocessing in a complex pipeline, hitting AUROC 0.762.
    🥉 3rd Place: FeedForward (Sarthi Borkar, Hrishikesh Jadhav, Fidel I. Mamani Maquera, Florian Stahr, Handan Özgöcen) - Worked on a CNN architecture, enhancing the EfficientNet-B0 model with parameter tweaks, configs, and optimizer experimentation for AUROC 0.758.
    Alongside the results, all teams delivered impressive presentations to the jury. Kudos to all!

  • 𝗘𝘂𝗿𝗼𝗽𝗲 𝗵𝗮𝘀 𝗻𝗼 𝘀𝗵𝗼𝗿𝘁𝗮𝗴𝗲 𝗼𝗳 𝗔𝗜 𝘁𝗮𝗹𝗲𝗻𝘁. What it needs are the structures and incentives to keep that talent here. We need ecosystems where real innovation happens, and this weekend offered a clear view of what that looks like. Our first 𝗗𝗶𝘀𝘁𝗿𝗶𝗯𝘂𝘁𝗲𝗱 𝗔𝗜 𝗛𝗮𝗰𝗸𝗮𝘁𝗵𝗼𝗻 in Berlin went far beyond our expectations: 424 registrations, two technical tracks, and 15 finalist teams building decentralized, data-sovereign AI models and competing for a prize pool. From privacy-preserving medical diagnostics to federated manufacturing assistants and power-aware satellite AI models, the projects made one thing obvious: Europe has enormous potential for AI innovation. Thanks to our jury of industry leaders for helping evaluate what the teams built, and to Berlin’s AI community for once again proving the depth of the talent pool here. The next step is connecting this talent with industry at scale and ensuring the momentum stays in Europe. Ralf Herbrich, Manfred Hauswirth, Felix Balzer, Alois Krtil, Jens Stapelfeldt, Dr. Florian Kegel, Delphine Mousseau, Katharine Jarmul, Piotr Mazurek. Together with Flower Labs, Einstein Center Digital Future, and Technische Universität Berlin, we created a space where teams could collaborate and build. Thanks to AMD (Jens and Bryce) for providing MI300X Cloud Server GPUs, and to SPRIND - Bundesagentur für Sprunginnovationen, AI NATION, Science & Startups, and Merantix AI Campus for strengthening the AI ecosystems in Berlin and Munich. And of course to every participant working on pushing the boundaries of distributed AI! 🚀
