Our first MLOps World Steering Committee session explored how practitioners evaluate probabilistic and agentic systems when metrics are uncertain, subjective, or incomplete.
The post The Biggest Constraint Facing the MLOps World 2026 Committee, And What It Reveals About Evals (Pt. 1) appeared first on MLOps World.
]]>
Evaluation and testing were the most frequently named constraints in our 2026 MLOps World Steering Committee survey.
If you’re responsible for production ML, platform, infra, applied ML, or the “keep it alive at 2am” layer, we asked the question, and the clearest signal was: evaluations.
So our first committee sessions explored how teams are actually approaching evals when dealing with uncertain metrics, specifically as they pertain to probabilistic agentic systems.
Here’s what we learned.
Across every domain discussed, similar patterns arose. Teams are not working with metrics that give them bad information, they’re working with metrics that give them partial information. And the missing part is usually the part that matters for the decision.
Fraud detection metrics exist, but ground truth on what is not fraud arrives too late to be operationally useful.
Human-in-the-loop metrics capture how often a human overrides the model, but not whether that override was actually better. GPU utilization shows allocation, not productive use.
Call deflection shows that fewer interactions reach a human, but does not indicate whether the customer’s issue was resolved. Recall looks strong on paper, but does not reliably describe what is happening downstream.
The consistent lesson: production metrics tend to measure something adjacent to the actual outcome. They measure activity, not effect. The gap between the two is where bad decisions happen.
Hallucination rates are a moving target. There is no stable definition that holds across teams, domains, or time. The metric shifts with the task, and attempts to pin it down tend to produce numbers that look precise but aren’t reliable.
The same problem applies to any task that involves subjective judgment. When the correct answer depends on tone, inflection, interpretation, or context, even human evaluators cannot agree.
People try to objectify these assessments into quantified scores, but the result is a metric that gives the appearance of rigor without the substance. This is not a gap that better tooling will close. It is a property of the task.
When the ideal metric is not available, teams do not stop making decisions. They find something else to lean on. The more useful question is not “do you have the right metric” but “do you know what your proxy is actually measuring, and what it is missing?”
The proxy strategies that held up best in the discussion shared a few properties. They were deliberate, not accidental. They had known limits. And they were tied to a feedback loop that continuously updated the proxy over time.
Golden datasets combined with adversarial stress tests, prompt injection, corrupted inputs, edge cases, were the most commonly referenced approach. These are not perfect, and they go stale, but they provide a stable reference point when live metrics are noisy.
The important thing is that failures from these tests get looped back into training data, RAG pipelines, and data strategy, closing the loop rather than just flagging problems.
Maintaining a rules-based or decision-tree baseline was raised by multiple participants. If the model cannot beat a simple baseline, that is a signal worth paying attention to.
This prevents a common failure: deploying a more sophisticated model that is actually worse than what came before.
For tasks where multiple valid answers exist, teams are moving toward soft correctness, hierarchical scoring and degrees of right, rather than binary pass/fail.
This is especially relevant for classification at different levels of a hierarchy, where several answers can be technically correct but at different levels of specificity. Binary evaluation on non-binary tasks produces misleading results.
These are 3 of the 10 takeaways noted in our meeting. In the next part, we will discuss;
MLOps World | GenAI Summit is built for MLOps practitioners, infra leads, platform owners, and production ML teams running systems under real operating constraints. Discussions like this shape the program, and they shape the kinds of conversations that are only possible in an undiluted room of operators.
MLOps World | GenAI Summit 2026 · Nov 17–18 · Austin, TX
Have something to share on stage?
If you’re working through these problems and want to bring your experience to the committee or the program, we’d like to hear from you: [email protected]
The post The Biggest Constraint Facing the MLOps World 2026 Committee, And What It Reveals About Evals (Pt. 1) appeared first on MLOps World.
]]>We launched the first five AI in Production Field Notes to document the post-deployment reality.
This post is the short front-door summary: five patterns we keep seeing, and what they imply for how you build.
The post Your “Simple” LLM Feature Isn’t Simple After Launch appeared first on MLOps World.
]]>
If you own the ML platform, run infra, or ship LLM features that have to survive real traffic, you already know the punchline:
Most ML/GenAI systems don’t fail because the model is “bad.”
They fail because everything around the model gets stressed the moment users show up.
That’s why we launched the first five issues of AI in Production Field Notes: long-form writeups grounded in real production architectures, metrics, and decision frameworks. Not thought leadership. Not “AI takes.” Post-deployment notes.
This post is a short front-door summary: the five patterns we keep seeing, and what they imply for how you build.
If you’re still in demo-land (no latency budget, no access control, no incident response), bookmark it for later.
RAG fails because retrieval becomes a systems problem.
In the notebook, RAG looks clean: chunk → embed → retrieve → generate.
In production, it turns into:
The reason this hurts is simple: retrieval is now part of your app’s critical path. You’re not “adding context.” You’re operating a distributed system that decides what the model is allowed to see, under time pressure.
Opinion: If your RAG system doesn’t have a real plan for entitlements, cost tracking, and retrieval eval, it’s not a production system. It’s a demo with a pager attached.
Agents are seductive because they feel like progress. They also make failure modes harder to see.
Here’s the practical rule we keep coming back to:
If you can solve it with a workflow, an agent is usually expensive overengineering.
A workflow is deterministic. You can reason about it. You can test it. You can budget it.
An agent adds loops, tool calls, dynamic step counts, and “it depends” everywhere, which is exactly what you don’t want when reliability and cost predictability matter.
That doesn’t mean “never use agents.” It means you earn agents when simpler approaches stop working:
Opinion: Most agent failures aren’t “the agent isn’t smart enough.” They’re “we added agency when we needed control.”
Traditional monitoring tells you servers are up. It doesn’t tell you your model is quietly becoming wrong.
In production, drift shows up as:
What separates mature teams isn’t “we detect drift.” It’s what happens next.
The Field Notes pattern here is: drift management becomes an ops loop:
Opinion: If your pipeline can’t diagnose → intervene with guardrails, you’re not running ML. You’re running a permanent incident queue.
A lot of LLM products start as: “It’s just one call.”
Then the real world arrives:
So the “one call” becomes a system:
This is what production looks like: not a bigger prompt, but an orchestra of small controls that keep the app reliable under load.
Opinion: If you can’t test, measure, and retry each step independently, you don’t have an LLM app, you have a demo that occasionally works.
In enterprise environments, “move fast” doesn’t fail because the model is slow.
It fails because:
The Field Notes pattern: teams that ship repeatedly don’t treat governance as a blocker to “get through.”
They treat it as part of the system design, embedded early, updated dynamically, and enforced consistently.
Opinion: You don’t “add governance later.” If you try, you’ll rebuild everything under pressure.
If you’re building or inheriting one of these systems, here are five questions worth answering before you scale anything:
If those questions are fuzzy, the model won’t save you.
This post is the short version. The long version (architectures, metrics, decision frameworks) is in the first five issues of:
AI in Production Field Notes (Substack)
The post Your “Simple” LLM Feature Isn’t Simple After Launch appeared first on MLOps World.
]]>It shows up later, after launch, at scale, under real traffic, real latency budgets, and real operational pressure.
As systems become more autonomous, that risk compounds. Agentic workflows introduce longer execution chains, hidden dependencies, and failure modes that don’t surface in demos.
MLOps World | GenAI Summit exists for practitioners operating in that after-launch phase, where reliability, cost, ownership, and control stop being abstract concerns and become daily responsibilities.
The post The Real AI Risk Shows Up After Launch appeared first on MLOps World.
]]>
Most AI risk doesn’t appear during development.
It appears later, when systems are scaled, monitored, handed off, and expected to run continuously.
That risk compounds as systems become more autonomous. Agentic workflows introduce longer execution chains, hidden dependencies, and decisions that unfold over time, often outside the narrow scope of a demo.
When there’s real traffic.
Real latency budgets.
Real cost curves.
Real operational pressure.
That timing mismatch is why production teams often feel blindsided. The demo went well. The launch looked fine. Then the system started drifting, degrading, or quietly accumulating operational debt until it became an incident.
MLOps World | GenAI Summit exists for practitioners operating in that after-launch phase, where reliability, cost, ownership, and control become unavoidable.
Austin, Texas
November 17–18, 2026
Save the dates.
Production AI doesn’t always “break.” It often erodes.
The most consequential problems show up as slow-moving changes that are easy to miss in the early weeks:
This isn’t a theoretical risk. It’s what live systems do when they move from controlled conditions to sustained operation.
A common inflection point comes after the initial build:
The system is shipped. The team shifts to new priorities. The platform or infra group inherits pieces of the stack. The product assumes it’s “stable.” The on-call rotation becomes the real feedback loop.
That’s when assumptions get stress-tested:
In production, these aren’t neutral statements. They become operational debt, and debt collects interest under load.
Across years of practitioner-led curation at MLOps World | GenAI Summit, the same operating realities return, not as trends, but as repeatable failure modes in live systems:
1) Ownership boundaries that don’t hold during incidents
When something degrades, teams discover the boundary isn’t clear enough:
Who owns detection? Who owns rollback? Who can change thresholds? Who approves hotfixes? Who’s accountable for the bill?
2) Monitoring that’s built for dashboards, not decisions
Teams often have observability, but not decision-grade monitoring:
signals that reliably indicate drift early, distinguish data vs. infra issues, and trigger action before impact spreads.
3) Operational debt hidden inside “working” pipelines
Pipelines can appear stable until scale, dependency changes, or partial failures reveal brittleness:
orchestration fragility, tightly coupled steps, slow recovery paths, and failure modes that are hard to reproduce.
4) Cost behavior that shifts as usage becomes real
Cost doesn’t always spike at launch. It escalates with adoption:
data movement, feature compute, retrieval, retries, inference patterns, and “small” inefficiencies that compound at volume.
None of these problems are rare in production. They’re common precisely because the lifecycle timing is predictable: the risks mature after launch.
Many technical events gravitate toward what systems should look like.
MLOps World | GenAI Summit stays focused on what systems actually do once they’re running:
how they behave under load, what breaks after handoff, where teams underestimated complexity, and how operational reality reshapes architecture decisions.
That’s not a preference. It’s a credibility stance.
Because the people we serve aren’t optimizing for prototypes. They’re optimizing for:
If you’ve ever inherited a system “after the demo phase,” you already understand why those stories matter.
When we say MLOps World | GenAI Summit is curated by practitioners, we mean the selection lens is shaped by people who have carried production accountability, through incidents, tradeoffs, constraints, and on-call reality.
Over years, that creates a consistent filter:
This isn’t aspirational content. It’s real community lessons.
Save the dates
MLOps World | GenAI Summit (2026)
Austin, Texas
November 17–18, 2026
The post The Real AI Risk Shows Up After Launch appeared first on MLOps World.
]]>The post RAG demos are easy. Retrieval at scale is where it breaks. appeared first on MLOps World.
]]>
If you’ve shipped a RAG system beyond a proof-of-concept, you’ve probably run into the same pattern:
The demo looks strong.
Then production shows up, real users, real traffic, real permissions, real budgets, and the system starts answering confidently from the wrong context.
That usually isn’t a “model problem.”
It’s a retrieval problem.
This is a recurring production pattern across teams operating RAG in the wild: once you move from “it works” to “we can run it,” retrieval, not generation, becomes the dominant risk.
(These notes are synthesized from deployed systems and practitioners experience operating retrieval at scale, including lessons drawn from Rajiv Shah’s real-world work in production retrieval.)
RAG demos typically include:
Production introduces:
And the user-facing symptom tends to be consistent:
trust erosion, because the system “sounds right” while being wrong.
1) Relevance drift
Over time, retrieval quality can degrade quietly:
The worst part is that the system still retrieves something, so the failure often isn’t obvious until users complain.
2) Latency + cost blowups
Teams often try to “fix” quality by doing more retrieval work:
At real traffic levels, these choices compound quickly, and retrieval becomes the dominant driver of both tail latency and cost.
3) Weak or missing hybrid baselines
A common anti-pattern is jumping straight to vector search without proving baseline strength.
In many org corpuses, strong lexical + metadata filtering is hard to beat. If you can’t measure whether a hybrid improves your query distribution, you don’t have a retrieval strategy, you have a preference.
4) Permission mismatches
Hallucinations are embarrassing. Permission bugs are incidents.
Retrieval can fail “upstream” in ways that no prompt can patch:
5) No retrieval observability
When answer quality drops, teams often can’t answer basic questions:
Without retrieval-level logs and metrics, teams end up prompt-tuning a system whose core failure is upstream.
If you want RAG to behave in production, treat retrieval as its own system with its own contract:
Given this query and this user, can we fetch the right evidence within our latency + cost budget, while enforcing access control correctly?
That contract forces clarity on:
Can you beat a strong baseline?
Pick 50–100 real queries from production (or logs) and compare:
If you’re not reliably outperforming the baseline, don’t scale complexity, fix fundamentals.
Can you explain a bad answer end-to-end?
For a known failure, can you inspect:
If not, you don’t have a debugging loop yet, and quality will remain “mysterious.”
This post stops here on purpose.
The full field note goes deeper on the production mechanics: what hybrid baselines actually look like in practice, the observability signals that matter, and the common “retrieval fixes” that backfire on latency/cost.
Read the full Substack post here
The post RAG demos are easy. Retrieval at scale is where it breaks. appeared first on MLOps World.
]]>The post TMLS Stack Drop Offer: Get 30 Days Free ZenML Pro + 50% Off Agentic Pipeline Platform appeared first on MLOps World.
]]>Accelerate agent development with dynamic pipelines, expert support, and enterprise-ready infrastructure. This incredible offer is limited to the first 25 redemptions.

ZenML is the unified MLOps platform purpose-built for the next wave of agentic AI, enabling teams to build, deploy, and scale multi-agent systems with production-grade workflows and reproducible pipelines.
Through this special Stack Drop offer, TMLS community members can unlock a 30-day free trial of ZenML Pro Cloud (2x the standard length), get 50% off the enterprise platform for 6 months, and access exclusive support resources designed to help teams succeed in deploying agentic pipelines at scale.
Stack Drops are exclusive, limited deals curated by the TMLS community. Please note that TMLS and its events are not responsible for the terms, delivery, or fulfillment of third party offers, including Stack Drops.
This Stack Drop gives teams a chance to test and deploy with production-grade infrastructure, Kubernetes-native orchestration, and seamless integration between traditional ML and next-gen agentic workflows.
Both offers are available on a first-come basis and subject to eligibility. Make sure to review the details below to secure access before the deadline or cap is reached.
Don’t wait to operationalize your agent workflows with the support of the ZenML team:
ZenML is an MLOps platform purpose-built for the new era of agentic AI. It helps teams bridge traditional ML pipelines with dynamic, multi-agent workflows using Kubernetes-native orchestration and reproducible infrastructure.
By consolidating infrastructure complexity into a single interface, TrueFoundry enables faster iteration, smoother production rollouts, and reduced operating costs.
Why teams choose ZenML:
ZenML gives AI teams the tools and infrastructure they need to go from prototype to production—faster, safer, and more reliably.
The 6th annual MLOps World | GenAI Summit is taking place October 8–9, 2025 at the Austin Renaissance Hotel.
For AI practitioners, including AI Engineers, Agent Builders, Solution Architects, Vibe Coders, and infra teams, this is a high-impact, IRL opportunity to optimize and de-risk projects through cutting-edge strategies, real-world case studies, technical deep dives, and hands-on workshops.
Every session is carefully curated by a volunteer committee of top AI practitioners whose primary objective is to help industry colleagues understand where the line of AI in excellence is, right now.
The experience includes a vibrant expo, where attendees shift from focused learning to active problem-solving by engaging in Brain Dates, Community Square, Startup Zone, and interactive demos with leading AI solution providers, including Weights & Biases, Outerbounds, and DataBricks.
MLOps World | GenAI Summit is a compact and focused way to elevate skills, accelerate projects, and advance AI-centric careers.
Early Bird tickets are on sale now and offer 15% savings when you register in advance. Team discounts also available.
The post TMLS Stack Drop Offer: Get 30 Days Free ZenML Pro + 50% Off Agentic Pipeline Platform appeared first on MLOps World.
]]>The post TMLS Stack Drop Offer: Get 3 Months Free Access to TrueFoundry SaaS or Claim $40K On-Prem Offer appeared first on MLOps World.
]]>Cut Inference Costs and Simplify AI Deployment with a Unified Platform for LLMs, Agents, and ML Workloads

TrueFoundry is the all-in-one AI gateway and deployment platform trusted by enterprise teams building scalable LLM, agentic, and ML workloads.
Through this special Stack Drop offer, TMLS community members can get 3 months free access to their SaaS platform or unlock a $40K value on the on-premise enterprise package, available only to the first eligible redeemers.
Stack Drops are exclusive, limited deals curated for the TMLS community. Please note that TMLS and its events are not responsible for the terms, delivery, or fulfillment of third party offers, including Stack Drops.
TrueFoundry is offering this Stack Drop in two tracks, depending on organization size:
Enterprise Package ($40,000 Value):
SaaS Access (Free for 3 Months):
Both offers are available on a first-come basis and subject to eligibility. Make sure to review the details below to secure access before the deadline or cap is reached.
This Stack Drop is ready for you, just follow these quick steps.
TrueFoundry is an end-to-end AI deployment platform that helps teams run LLMs, agents, and ML models faster and more efficiently. It provides a unified control plane to manage multi-model deployments, failover routing, semantic caching, performance tracing, and cost governance, whether on cloud or on premise.
By consolidating infrastructure complexity into a single interface, TrueFoundry enables faster iteration, smoother production rollouts, and reduced operating costs.
Why teams choose TrueFoundry:
These capabilities make TrueFoundry a reliable and scalable choice for teams moving from prototype to production.
The 6th annual MLOps World | GenAI Summit is taking place October 8–9, 2025 at the Austin Renaissance Hotel.
For AI practitioners, including AI Engineers, Agent Builders, Solution Architects, Vibe Coders, and infra teams, this is a high-impact, IRL opportunity to optimize and de-risk projects through cutting-edge strategies, real-world case studies, technical deep dives, and hands-on workshops.
Every session is carefully curated by a volunteer committee of top AI practitioners whose primary objective is to help industry colleagues understand where the line of AI in excellence is, right now.
The experience also includes a vibrant expo, where attendees shift from focused learning to active problem-solving by engaging in Brain Dates, Community Square, Startup Zone, and interactive demos with leading AI solution providers, including Weights & Biases, Outerbounds, and DataBricks.
MLOps World | GenAI Summit is a compact and focused way to elevate skills, accelerate projects, and advance AI-centric careers.
Early Bird tickets are on sale now and offer 15% savings when you register in advance. Team discounts also available.
The post TMLS Stack Drop Offer: Get 3 Months Free Access to TrueFoundry SaaS or Claim $40K On-Prem Offer appeared first on MLOps World.
]]>The post Call for Volunteers Now open for 6th Annual MLOps World | GenAI Summit appeared first on MLOps World.
]]>
MLOps World | GenAI Summit 2025 is the premier, peer-curated event hosted by the Toronto Machine Learning Society (TMLS), designed to help AI practitioners scale systems in production through real-world insights and curated content.
We’re looking for passionate and reliable volunteers to help bring the 6th Annual MLOps World & Generative AI World Conference to life this October 8–9, 2025 in Austin, Texas.
By volunteering, you’ll become part of a global community of AI practitioners working together to share lessons, support one another’s growth, and drive safe and practical AI advancements.
Expect a diverse group of attendees, including AI engineers, agentic builders, solution architects, infra teams, LLM/SLM trainers, full-stack developers, founders, enterprise teams and researchers all bringing together years of expertise and unique perspectives.
Join us and let’s make this a powerful experience for AI practitioners and deepening your industry exposure and contacts.
The post Call for Volunteers Now open for 6th Annual MLOps World | GenAI Summit appeared first on MLOps World.
]]>The post Stack Drop Offer: Get 30% Off UbiAI to Build NLP Products Faster appeared first on MLOps World.
]]>Slash Time-to-Market with AI-Assisted Labeling, Fine-Tuning, and Deployment in One Place

UbiAI is the all-in-one NLP platform trusted by teams building custom LLMs, chatbots, summarization tools, and more. This special Stack Drop offer gives you 30% off any package, exclusively for TMLS community members and only for the first 50 redeemers.
This offer is part of Stack Drops, exclusive time-limited deals curated for the TMLS AI/ML community. Please note that TMLS and its events are not responsible for the terms, delivery, or fulfillment of third-party offers, including Stack Drops.
This exclusive TMLS offer from UbiAI is designed to help NLP teams move faster, with more accuracy and less effort. From labeling to fine-tuning to production, it’s all here:
Use code TMLS30 at checkout to get 30% off.
UbiAI is an end-to-end NLP platform that dramatically accelerates the development of custom language models. It allows teams to collect data, label it with the help of LLMs, fine-tune task-specific models, evaluate performance, and deploy to production, all within a single workflow.
By simplifying and unifying each step of the process, UbiAI reduces the time to deploy from months to days.
Why teams chose UbiAI:
The 6th annual MLOps World | GenAI Summit is taking place October 7–9, 2025 at the Austin Renaissance Hotel.
Don’t miss this chance to accelerate and de-risk your AI/ML, agentic, and infrastructure outcomes through cutting-edge strategies, real-world case studies, technical deep dives, and hands-on workshops. Every presentation is hand-picked by a committee of top AI practitioners whose primary goal is to help their industry colleagues understand where the line of AI in excellence is, right now.
The experience also includes a vibrant expo, where attendees shift from focused learning to active participation by engaging in Brain Dates, Community Stage, Startup Zone, and interactive demos with leading vendors like Weights & Biases, Outerbounds, and DataBricks.
MLOps World | GenAI Summit is a compact, high-impact way to learn, connect, and elevate your team, projects, and career.
Early Bird tickets are on sale now and offer 15% savings when you register in advance.
The post Stack Drop Offer: Get 30% Off UbiAI to Build NLP Products Faster appeared first on MLOps World.
]]>The post Stack Drop offer: Claim $2000 USD in GPU Credits + 30% Off Outerbounds to Launch Production-Grade AI Faster appeared first on MLOps World.
]]>Whether you’re building copilots, autonomous agents, or complex ML pipelines, Outerbounds gives you everything you need to create production-grade AI products, faster. This special Stack Drop offer includes platform access, hands-on onboarding, GPU credits, and a discount to make your path to market smoother and more efficient.
This offer is part of Stack Drops, exclusive time limited deals curated for the TMLS AI/ML community. Please note that TMLS and its events are not responsible for the terms, delivery, or fulfillment of third-party offers, including Stack Drops.
This exclusive TMLS offer from Outerbounds is designed to give serious AI teams a major head start. With GPU credits, discounted platform fees, and expert-led onboarding, you can go from idea to deployment with speed and confidence:
Visit outerbounds.com to get-started and mention the code OBSUMMER25 to redeem this limited-time offer.
To be eligible, your company must be a new Outerbounds customer and either:
Sign up with a valid business email and complete onboarding within 14 days to qualify.
Outerbounds is the production-grade platform built to help teams build standout AI products in their own cloud environment. Whether you’re running in AWS, GCP, or Azure, Outerbounds lets you bring together your data, models, and agents with the software rigor needed to build real AI applications.
Developed by the team that created Metaflow at Netflix, the platform is now trusted by top AI companies to deliver modern infrastructure with deep AI engineering expertise.
The 6th annual MLOps World | GenAI Summit is taking place October 7–9, 2025 at the Austin Renaissance Hotel.
Don’t miss this chance to accelerate and de-risk your AI/ML, agentic, and infrastructure outcomes through cutting-edge strategies, real-world case studies, technical deep dives, and hands-on workshops. Every presentation is hand-picked by a committee of top AI practitioners whose primary goal is to help their industry colleagues understand where the line of AI in excellence is, right now.
The experience also includes a vibrant expo, where attendees shift from focused learning to active participation by engaging in Brain Dates, Community Stage, Startup Zone, and interactive demos with leading vendors like Weights & Biases, Outerbounds, and Databricks.
MLOps World | GenAI Summit is a compact, high-impact way to learn, connect, and elevate your team, projects, and career.
Early Bird tickets are on sale now and offer 15% savings when you register in advance.
The post Stack Drop offer: Claim $2000 USD in GPU Credits + 30% Off Outerbounds to Launch Production-Grade AI Faster appeared first on MLOps World.
]]>The post Video: Unleashing the Algorithm Genie: AI as the Ultimate Inventor (feat. Jepson Taylor, VEOX ex-DataRobot / Dataiku) appeared first on MLOps World.
]]>From snowboarding epiphanies to billion-dollar fabs, Jepson Taylor has had a career defined by risky decisions and hard-earned lessons. In this engaging and unpredictable keynote from MLOps World | GenAI Summit 2024, Jepson explores how adaptation, storytelling, and agent-based systems are reshaping the boundaries of intelligence.
What begins with a chaotic decision-making framework quickly evolves into a profound reflection on how LLMs might outpace PhDs, how generative AI is transforming art and software, and why the next wave of machine learning may come from agents inventing their own algorithms.
This talk was recorded during MLOps Word | GenAI Summit 2024 which took place at the Austin Renaissance Hotel.
This talk is for AI practitioners, researchers, and innovators seeking to understand where intelligence and innovation intersect in the GenAI era:
Jepson Taylor is the CEO of VEOX and former Chief AI Evangelist at DataRobot. Known for his provocative takes and deep experience across hedge funds, high-stakes startups, and enterprise AI, Jepson now explores the edge of AI evolution, including agentic workflows and machine-led research.
The 6th annual MLOps World | GenAI Summit is taking place October 7–9, 2025 at the Austin Renaissance Hotel.
Don’t miss this chance to accelerate and de-risk your AI/ML, agentic, and infrastructure outcomes through cutting-edge strategies, real-world case studies, technical deep dives, and hands-on workshops. Every presentation is hand-picked by a committee of top AI practitioners whose only goal is to help their industry colleagues understand where the line of AI in production excellence is, right now.
The experience also includes a vibrant expo, where attendees shift from focused learning to active participation by engaging in Brain Dates, Community Stage, Startup Zone, and interactive demos with leading vendors like Weights & Measures, Outerbounds, and Data Bricks.
MLOps World | GenAI Summit is a high-impact way to learn, connect, and elevate your team, projects, and career.
Early Bird tickets are on sale now and offer 15% savings when you register in advance.
The post Video: Unleashing the Algorithm Genie: AI as the Ultimate Inventor (feat. Jepson Taylor, VEOX ex-DataRobot / Dataiku) appeared first on MLOps World.
]]>