Publications

🧭 Graph-based LLM Systems

Designing cost-efficient, high-performance graph-based LLM systems spanning Retrieval-Augmented Generation (RAG), structured agent memory, and large-scale social simulation.
  1. Yingli Zhou, Zixuan Wang, Yixiang Fang. HAMMER: An Automatic RAG Tuning System via Hierarchical Memory-Guided Monte Carlo Tree Search. Proceedings of the ACM on Management of Data (SIGMOD), 2026.
  2. Shu Wang, Yixiang Fang, Yingli Zhou, Xilin Liu, Yuchi Ma. ArchRAG: Attributed Community-based Hierarchical Retrieval-Augmented Generation. AAAI Conference on Artificial Intelligence (AAAI Oral), 2026.
  3. Yaodong Su, Yixiang Fang, Yingli Zhou, Chuanhui Yang. Clue-RAG: Towards Accurate and Cost-Efficient Graph-based RAG via Multi-Partite Graph-based Index. IEEE International Conference on Data Engineering (ICDE), 2026.
  4. Yingli Zhou, Yaodong Su, Youran Sun, Shu Wang, Taotao Wang, Runyuan He, Yongwei Zhang, Sicong Liang, Xilin Liu, Yuchi Ma, Yixiang Fang. In-depth Analysis of Graph-based RAG in a Unified Framework. Proceedings of the VLDB Endowment (VLDB), 18(13): 5623-5637, 2025. (High-Star Project)
  5. Yingli Zhou, Shu Wang. Towards the Next Generation of Agent Systems: From RAG to Agentic AI. Proceedings of the VLDB Endowment (VLDB), Graph+LLM Workshop, 2025.
  6. Fangyuan Zhang, Zhengjun Huang, Yingli Zhou*, Qingtian Guo, Wensheng Luo, Xiaofang Zhou. Scalable Graph-based Retrieval-Augmented Generation via Locality-Sensitive Hashing. Proceedings of the VLDB Endowment (VLDB), Graph+LLM Workshop, 2025.

🔨 Large (Language) Models for Data

Advancing database system optimization and data management with pre-trained models and LLMs, covering tasks such as dataset search, cardinality estimation, latency prediction, and automated testing.
  1. Yingli Zhou#, Tianjing Zeng#, Rong Zhu, Yingze Li, Junwei Lan, Zhewei Wei, Yixiang Fang, Bolin Ding, Jingren Zhou. Lamba: A Pretrained Model for Latency Prediction over Distributed Databases. The VLDB Journal (VLDBJ), 2026.
  2. Yujia Chen, Yingli Zhou, Fangyuan Zhang, Cuiyun Gao. LLM-Based Test Case Generation in DBMS through Monte Carlo Tree Search. International Conference on Software Engineering (ICSE), Industry Challenge Track, 2026.
  3. Tianjing Zeng, Junwei Lan, Jiahong Ma, Wenqing Wei, Rong Zhu, Yingli Zhou, Pengfei Li, Bolin Ding, Defu Lian, Zhewei Wei, Jingren Zhou. PRICE: A Pretrained Model for Cross-Database Cardinality Estimation. Proceedings of the VLDB Endowment (VLDB), 18(3): 637-650, 2025.

⚡️ Graph Mining & Algorithms

Developing highly efficient and scalable algorithms for graph mining and graph data management, including densest subgraph discovery, community search, and clique counting/listing.
  1. Yingli Zhou, Youran Sun, Yixiang Fang. Efficient Anchored Densest Subgraph Discovery: Improved Time Complexity and Practical Performance. Proceedings of the ACM on Management of Data (SIGMOD), 2026.
  2. Yingli Zhou, HuiZhong Wang, Chenhao Ma, Yixiang Fang. A Semantics-aware Approach for Graph Edit Distance Estimation over Knowledge Graphs. Proceedings of the VLDB Endowment (VLDB), 19(6): 1226-1239, 2026.
  3. Jingbang Chen#, Weinuo Li#, Yingli Zhou#, Hangrui Zhou, Qiuyang Mang, Can Wang, Yixiang Fang, Chenhao Ma. Scalable Approximate Biclique Counting over Large Bipartite Graphs. Proceedings of the VLDB Endowment (VLDB), 2026.
  4. Luocheng Liang, Yingli Zhou, Yixiang Fang. Accelerated Coordinate Descent for Directed Densest Subgraph Discovery. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2026.
  5. Youran Sun, Yingli Zhou, Yixiang Fang, Cheng Chen, Yongmin Hu, Yingqian Hu. Efficient Influential Community Search over Dynamic Graphs. Proceedings of the ACM on Management of Data (SIGMOD), 2026.
  6. Yingli Zhou#, Yige Jiang#, Yixiang Fang, Wensheng Luo, Yongmin Hu, Yingqian Hu, Cheng Chen. Effective Durable Community Search in Large Temporal Graph. Proceedings of the VLDB Endowment (VLDB), 2026.
  7. Yingli Zhou, Luocheng Liang, Yixiang Fang. Efficient and Scalable Directed Densest Subgraph Discovery. Proceedings of the ACM on Management of Data (SIGMOD), 2026.
  8. Yingli Zhou#, Qingshuo Guo#, Yixiang Fang. Efficient 𝑘-Clique Densest Subgraph Discovery: Towards Bridging Practice and Theory. Proceedings of the VLDB Endowment (VLDB), 18(10): 3490-3503, 2025.
  9. Yue Zhang, Yankai Chen, Yingli Zhou, Yucan Guo, Xiaolin Han, Chenhao Ma. UTCS: Effective Unsupervised Temporal Community Search with Pre-training of Temporal Dynamics and Subgraph Knowledge. International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2025.
  10. Qiuyang Mang, Jingbang Chen, Hangrui Zhou, Yu Gao, Yingli Zhou, Qingyu Shi, Richard Peng, Yixiang Fang, Chenhao Ma. Efficient Historical Butterfly Counting in Large Temporal Bipartite Networks via Graph Structure-aware Index. Proceedings of the VLDB Endowment (VLDB), 18(6): 1607-1620, 2025.
  11. Yingli Zhou, Qingshuo Guo, Yi Yang, Yixiang Fang, Chenhao Ma, Laks V.S. Lakshmanan. In-depth Analysis of Densest Subgraph Discovery in a Unified Framework. Proceedings of the VLDB Endowment (VLDB), 18(4): 1131-1144, 2025.
  12. Yingli Zhou, Yixiang Fang, Chenhao Ma, Tianci Hou, Xin Huang. Efficient Maximal Motif-Clique Enumeration over Large Heterogeneous Information Networks. Proceedings of the VLDB Endowment (VLDB), 17(11): 2946-2959, 2024.
  13. Wensheng Luo, Yixiang Fang, Chunxu Lin, Yingli Zhou. Efficient Parallel D-core Decomposition at Scale. Proceedings of the VLDB Endowment (VLDB), 17(10): 2654-2667, 2024.
  14. Yingli Zhou, Qingshuo Guo, Yixiang Fang, Chenhao Ma. A Counting-based Approach for Efficient 𝑘-Clique Densest Subgraph Discovery. Proceedings of the ACM on Management of Data (SIGMOD), 2(3):156:2-156:27, 2024.
  15. Yingli Zhou, Yixiang Fang, Wensheng Luo, Yunming Ye. Influential Community Search over Large Heterogeneous Information Networks. Proceedings of the VLDB Endowment (VLDB), 16(8): 2047-2059, 2023.

⚙️ Database Systems & Industry Applications

Bridging theoretical research with industrial practice by building scalable, real-time data processing engines and robust database architectures for modern data warehouses.
  1. Fangyuan Zhang, Mengqi Wu, Chunlei Xu, Yunong Bao, Jiyu Qiao, Yingli Zhou, Hua Fan, Caihua Yin, Wenchao Zhou, Feifei Li. Streaming View: An Efficient Data Processing Engine for Modern Real-time Data Warehouse of Alibaba Cloud. Proceedings of the VLDB Endowment (VLDB), 2025.