ICLR 2026

Presenting a Paper is an Art: Self-Improvement Aesthetic Agents for Academic Presentations

Chengzhi Liu¹*, Yuzhe Yang¹*, Kaiwen Zhou², Zhen Zhang¹, Yue Fan², Yanan Xie³, Peng Qi³, Xin Eric Wang¹

* Equal contribution

¹ UC Santa Barbara

² UC Santa Cruz

³ Uniphore


EvoPresent unifies coherent storytelling, aesthetic-aware slide design, and lifelike talking-head delivery. A dedicated Checker Agent, powered by the PresAesth multi-task reinforcement learning model, critiques every draft and guides iterative self-improvement, enabling reliable and engaging academic presentations from only raw paper materials.

Overview

Promoting academic papers has become an important way to increase research visibility. However, existing automated methods suffer from limited storytelling, insufficient aesthetic quality, and constrained self-adjustment. EvoPresent addresses these limitations with a self-improvement agent that unifies coherent narratives, aesthetic-aware designs, and realistic presentation delivery via virtual characters. Central to EvoPresent is PresAesth, a multi-task reinforcement learning aesthetic model that supplies reliable scoring, defect adjustment, and comparative feedback, so the agent can iteratively refine its output even with limited training data. To evaluate end-to-end systems systematically, we also introduce the EvoPresent Benchmark, which has two parts: (i) Presentation Generation Quality, covering 650 top-tier AI conference papers with multimodal resources, and (ii) Aesthetic Awareness, containing 2,000 slide pairs of varying quality to support joint training and evaluation.

High-quality feedback matters

Strong initial capability alone cannot guarantee effective self-correction; the PresAesth critic is essential for meaningful improvements.

Design vs. content trade-off

Automated pipelines often sacrifice layout polish for factual coverage. EvoPresent balances both objectives within a single agent loop.

Multi-task RL generalization

Training PresAesth jointly on scoring, defect adjustment, and pairwise comparison yields better aesthetic awareness than single-task baselines.
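As a rough illustration of how a GRPO-style method can turn per-task rewards into a training signal, the sketch below normalizes rewards within each group of sampled responses. The page does not describe PresAesth's reward design, so the task keys, reward values, and function names here are hypothetical and only show the group-relative normalization idea.

```python
from statistics import mean, pstdev
from typing import Dict, List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def advantages_by_task(groups: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Compute advantages separately for each task's group of sampled responses."""
    return {task: group_relative_advantages(r) for task, r in groups.items()}


# Hypothetical rewards: four sampled responses per task prompt.
example = advantages_by_task({
    "scoring": [1.0, 0.0, 0.0, 1.0],      # two good, two bad responses
    "comparison": [1.0, 1.0, 1.0, 1.0],   # all equal, so no learning signal
})
```

Responses that beat their group mean get a positive advantage, and a group whose responses all earn the same reward contributes nothing, which is why mixing tasks with different reward scales is workable under this kind of normalization.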

💻 EvoPresent Agent Pipeline

Overview of the EvoPresent framework. (a) EvoPresent first performs content extraction and voice generation, then constructs the storyline and script, followed by content enhancement using image generation and knowledge retrieval. Design and rendering are handled next, and the aesthetic checker evaluates the initial slide and provides adjustments. (b) PresAesth is trained on a human-preference aesthetic dataset via multiple tasks (scoring, defect adjustment, and comparison). (c) The PresAesth model guides the agent framework in iterative self-improvement.

The complete agent loop stage-by-stage.
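The stage order described in the figure caption can be sketched as a simple sequential pipeline. This is a minimal illustration only: the stage names are paraphrased from the caption, and `PresentationState`, `make_stage`, and `run_pipeline` are hypothetical names, not part of the EvoPresent codebase.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PresentationState:
    """Accumulates artifacts as the paper moves through the pipeline."""
    paper_text: str
    artifacts: Dict[str, str] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)


Stage = Callable[[PresentationState], PresentationState]

# Stage order paraphrased from panel (a) of the overview figure.
STAGE_NAMES = [
    "content_extraction",
    "voice_generation",
    "storyline_and_script",
    "content_enhancement",   # image generation and knowledge retrieval
    "design_and_rendering",
    "aesthetic_check",
]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage; a real stage would call a model or tool."""
    def run(state: PresentationState) -> PresentationState:
        state.artifacts[name] = f"<{name} output>"
        state.log.append(name)
        return state
    return run


def run_pipeline(paper_text: str) -> PresentationState:
    """Run every stage in order and return the final state."""
    state = PresentationState(paper_text=paper_text)
    for name in STAGE_NAMES:
        state = make_stage(name)(state)
    return state


result = run_pipeline("raw paper materials")
```

Keeping each stage as a function over a shared state object makes it easy to rerun only the design and checking stages during later refinement rounds.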

✨ Aesthetic Judgement for Self-Improvement

Checker Agent powered by PresAesth.

EvoPresent's high-quality output is driven by an iterative "draft → feedback → refinement" cycle supervised by a dedicated Checker Agent. The checker relies on PresAesth, a multi-task aesthetic model trained on human preference data with multi-task Group Relative Policy Optimization (GRPO). A single pass through PresAesth yields an absolute aesthetic score, identifies concrete defects (e.g., layout imbalance or typography issues), and compares competing slide candidates. This compound signal enables self-directed improvement without relying on expensive human-in-the-loop reviews.
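The "draft → feedback → refinement" cycle can be sketched as a small loop that uses all three checker signals: a score to decide when to stop, defects to drive the revision, and a comparison to accept or reject the new candidate. Everything below is an assumption made for illustration: the toy slide representation, the `toy_*` functions, the score scale, and the stopping rule are hypothetical stand-ins, not the actual Checker Agent or PresAesth interface.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple

Slide = FrozenSet[str]  # toy slide: just the set of defects it still has


@dataclass
class Feedback:
    score: float          # absolute aesthetic score (hypothetical 0-10 scale)
    defects: List[str]    # concrete defects found on the slide


def toy_check(slide: Slide) -> Feedback:
    """Stand-in for the checker: each remaining defect costs two points."""
    return Feedback(score=10.0 - 2.0 * len(slide), defects=sorted(slide))


def toy_revise(slide: Slide, defects: List[str]) -> Slide:
    """Stand-in for the designer: fix the first reported defect."""
    return slide - {defects[0]}


def toy_compare(a: Slide, b: Slide) -> bool:
    """Stand-in for pairwise comparison: True if slide a is preferred over b."""
    return toy_check(a).score > toy_check(b).score


def refine(
    draft: Slide,
    check: Callable[[Slide], Feedback],
    revise: Callable[[Slide, List[str]], Slide],
    compare: Callable[[Slide, Slide], bool],
    target: float = 8.0,
    max_rounds: int = 5,
) -> Tuple[Slide, List[float]]:
    """Iterate draft -> feedback -> refinement until the target score is met."""
    best, fb = draft, check(draft)
    history = [fb.score]
    for _ in range(max_rounds):
        if fb.score >= target or not fb.defects:
            break                      # good enough, or nothing left to fix
        candidate = revise(best, fb.defects)
        if not compare(candidate, best):
            break                      # revision did not help; keep current best
        best, fb = candidate, check(candidate)
        history.append(fb.score)
    return best, history


draft = frozenset({"layout imbalance", "low contrast", "typography"})
final, history = refine(draft, toy_check, toy_revise, toy_compare)
```

Gating each revision on the pairwise comparison means a bad revision can never replace a better earlier draft, which is one plausible reason to want comparison feedback alongside the absolute score.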

EvoPresent Benchmark

Evaluation protocol spanning generation quality and aesthetic awareness.

The EvoPresent Benchmark offers a comprehensive suite for evaluating both presentation generation and aesthetic models. Its data sources are twofold: curated materials from top-tier AI conferences (slides, videos, and scripts) and a specialized dataset of paired slides with varying aesthetic quality. Correspondingly, its evaluation metrics assess (i) content fidelity and design quality measured against conference materials, and (ii) the model's capabilities in absolute scoring, defect identification, and pairwise comparison using the paired aesthetic slides. This structure enables rigorous and reproducible evaluation for both content generation and aesthetic judgment.
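For the aesthetic-awareness part, the three capabilities named above (absolute scoring, defect identification, pairwise comparison) map naturally onto standard metrics. The page does not state which metrics the benchmark actually reports, so the choices below (mean absolute error, set-based F1, and accuracy) and all function names are assumptions used only to make the protocol concrete.

```python
from typing import List, Set


def mean_absolute_error(pred: List[float], gold: List[float]) -> float:
    """Average absolute gap between predicted and human aesthetic scores."""
    if not pred or len(pred) != len(gold):
        raise ValueError("pred and gold must be non-empty and equally long")
    return sum(abs(p - g) for p, g in zip(pred, gold)) / len(pred)


def pairwise_accuracy(pred: List[str], gold: List[str]) -> float:
    """Fraction of slide pairs where the model prefers the same slide as humans."""
    if not pred or len(pred) != len(gold):
        raise ValueError("pred and gold must be non-empty and equally long")
    return sum(p == g for p, g in zip(pred, gold)) / len(pred)


def defect_f1(pred: Set[str], gold: Set[str]) -> float:
    """F1 between predicted and annotated defect labels for one slide."""
    true_positives = len(pred & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical predictions and labels, for illustration only.
mae = mean_absolute_error([7.0, 5.0], [8.0, 5.0])
acc = pairwise_accuracy(["A", "B", "A"], ["A", "A", "A"])
f1 = defect_f1({"layout", "typography"}, {"layout"})
```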

🎨 Aesthetic Comparison

Side-by-side comparisons highlight PresAesth guidance on layout balance, typography, and iconography.

Interactive Demos

Nine paired demos showcase slide interactivity and synchronized talking-head delivery. Each video is generated by the EvoPresent pipeline from the same script as the corresponding interactive deck.

Demos 0 through 8 each pair an interactive slide deck (Slides) with a Presenter Video.

BibTeX

@misc{liu2025presentingpaperartselfimprovement,
  title        = {Presenting a Paper is an Art: Self-Improvement Aesthetic Agents for Academic Presentations},
  author       = {Chengzhi Liu and Yuzhe Yang and Kaiwen Zhou and Zhen Zhang and Yue Fan and Yanan Xie and Peng Qi and Xin Eric Wang},
  year         = {2025},
  eprint       = {2510.05571},
  archivePrefix= {arXiv},
  primaryClass = {cs.CL},
  url          = {https://arxiv.org/abs/2510.05571}
}