Publications
🧭 Graph-based LLM Systems
Designing cost-efficient, high-performance graph-based LLM systems spanning Retrieval-Augmented Generation (RAG), structured agent memory, and large-scale social simulation.
- Yingli Zhou, Zixuan Wang, Yixiang Fang. HAMMER: An Automatic RAG Tuning System via Hierarchical Memory-Guided Monte Carlo Tree Search. Proceedings of the ACM on Management of Data (SIGMOD), 2026.
- Shu Wang, Yixiang Fang, Yingli Zhou, Xilin Liu, Yuchi Ma. ArchRAG: Attributed Community-based Hierarchical Retrieval-Augmented Generation. AAAI Conference on Artificial Intelligence (AAAI Oral), 2026.
- Yaodong Su, Yixiang Fang, Yingli Zhou, Chuanhui Yang. Clue-RAG: Towards Accurate and Cost-Efficient Graph-based RAG via Multi-Partite Graph-based Index. IEEE International Conference on Data Engineering (ICDE), 2026.
- Yingli Zhou, Yaodong Su, Youran Sun, Shu Wang, Taotao Wang, Runyuan He, Yongwei Zhang, Sicong Liang, Xilin Liu, Yuchi Ma, Yixiang Fang. In-depth Analysis of Graph-based RAG in a Unified Framework. Proceedings of the VLDB Endowment (VLDB), 18(13): 5623-5637, 2025. (High-Star Project)
- Yingli Zhou, Shu Wang. Towards the Next Generation of Agent Systems: From RAG to Agentic AI. Proceedings of the VLDB Endowment (VLDB), Graph+LLM Workshop, 2025.
- Fangyuan Zhang, Zhengjun Huang, Yingli Zhou*, Qingtian Guo, Wensheng Luo, Xiaofang Zhou. Scalable Graph-based Retrieval-Augmented Generation via Locality-Sensitive Hashing. Proceedings of the VLDB Endowment (VLDB), Graph+LLM Workshop, 2025.
🔨 Large (Language) Models for Data
Advancing database system optimizations and data management through the application of pre-trained models and LLMs, covering pivotal tasks such as dataset search, cardinality estimation, latency prediction, and automated testing.
- Yingli Zhou#, Tianjing Zeng#, Rong Zhu, Yingze Li, Junwei Lan, Zhewei Wei, Yixiang Fang, Bolin Ding, Jingren Zhou. Lamba: A Pretrained Model for Latency Prediction over Distributed Databases. The VLDB Journal (VLDBJ), 2026.
- Yujia Chen, Yingli Zhou, Fangyuan Zhang, Cuiyun Gao. LLM-Based Test Case Generation in DBMS through Monte Carlo Tree Search. International Conference on Software Engineering (ICSE), Industry Challenge Track, 2026.
- Tianjing Zeng, Junwei Lan, Jiahong Ma, Wenqing Wei, Rong Zhu, Yingli Zhou, Pengfei Li, Bolin Ding, Defu Lian, Zhewei Wei, Jingren Zhou. PRICE: A Pretrained Model for Cross-Database Cardinality Estimation. Proceedings of the VLDB Endowment (VLDB), 18(3): 637-650, 2025.
⚡️ Graph Mining & Algorithms
Developing highly efficient and scalable algorithms for graph mining and graph data management, including densest subgraph discovery, community search, and clique counting/listing.
- Yingli Zhou, Youran Sun, Yixiang Fang. Efficient Anchored Densest Subgraph Discovery: Improved Time Complexity and Practical Performance. Proceedings of the ACM on Management of Data (SIGMOD), 2026.
- Yingli Zhou, HuiZhong Wang, Chenhao Ma, Yixiang Fang. A Semantics-aware Approach for Graph Edit Distance Estimation over Knowledge Graphs. Proceedings of the VLDB Endowment (VLDB), 19(6): 1226-1239, 2026.
- Jingbang Chen#, Weinuo Li#, Yingli Zhou#, Hangrui Zhou, Qiuyang Mang, Can Wang, Yixiang Fang, Chenhao Ma. Scalable Approximate Biclique Counting over Large Bipartite Graphs. Proceedings of the VLDB Endowment (VLDB), 2026.
- Luocheng Liang, Yingli Zhou, Yixiang Fang. Accelerated Coordinate Descent for Directed Densest Subgraph Discovery. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2026.
- Youran Sun, Yingli Zhou, Yixiang Fang, Cheng Chen, Yongmin Hu, Yingqian Hu. Efficient Influential Community Search over Dynamic Graphs. Proceedings of the ACM on Management of Data (SIGMOD), 2026.
- Yingli Zhou#, Yige Jiang#, Yixiang Fang, Wensheng Luo, Yongmin Hu, Yingqian Hu, Cheng Chen. Effective Durable Community Search in Large Temporal Graph. Proceedings of the VLDB Endowment (VLDB), 2026.
- Yingli Zhou, Luocheng Liang, Yixiang Fang. Efficient and Scalable Directed Densest Subgraph Discovery. Proceedings of the ACM on Management of Data (SIGMOD), 2026.
- Yingli Zhou#, Qingshuo Guo#, Yixiang Fang. Efficient 𝑘-Clique Densest Subgraph Discovery: Towards Bridging Practice and Theory. Proceedings of the VLDB Endowment (VLDB), 18(10): 3490-3503, 2025.
- Yue Zhang, Yankai Chen, Yingli Zhou, Yucan Guo, Xiaolin Han, Chenhao Ma. UTCS: Effective Unsupervised Temporal Community Search with Pre-training of Temporal Dynamics and Subgraph Knowledge. International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2025.
- Qiuyang Mang, Jingbang Chen, Hangrui Zhou, Yu Gao, Yingli Zhou, Qingyu Shi, Richard Peng, Yixiang Fang, Chenhao Ma. Efficient Historical Butterfly Counting in Large Temporal Bipartite Networks via Graph Structure-aware Index. Proceedings of the VLDB Endowment (VLDB), 18(6): 1607-1620, 2025.
- Yingli Zhou, Qingshuo Guo, Yi Yang, Yixiang Fang, Chenhao Ma, Laks V.S. Lakshmanan. In-depth Analysis of Densest Subgraph Discovery in a Unified Framework. Proceedings of the VLDB Endowment (VLDB), 18(4): 1131-1144, 2025.
- Yingli Zhou, Yixiang Fang, Chenhao Ma, Tianci Hou, Xin Huang. Efficient Maximal Motif-Clique Enumeration over Large Heterogeneous Information Networks. Proceedings of the VLDB Endowment (VLDB), 17(11): 2946-2959, 2024.
- Wensheng Luo, Yixiang Fang, Chunxu Lin, Yingli Zhou. Efficient Parallel D-core Decomposition at Scale. Proceedings of the VLDB Endowment (VLDB), 17(10): 2654-2667, 2024.
- Yingli Zhou, Qingshuo Guo, Yixiang Fang, Chenhao Ma. A Counting-based Approach for Efficient 𝑘-Clique Densest Subgraph Discovery. Proceedings of the ACM on Management of Data (SIGMOD), 2(3): 156:2-156:27, 2024.
- Yingli Zhou, Yixiang Fang, Wensheng Luo, Yunming Ye. Influential Community Search over Large Heterogeneous Information Networks. Proceedings of the VLDB Endowment (VLDB), 16(8): 2047-2059, 2023.
⚙️ Database Systems & Industry Applications
Bridging theoretical research with industrial practice by building scalable, real-time data processing engines and robust database architectures for modern data warehouses.
- Fangyuan Zhang, Mengqi Wu, Chunlei Xu, Yunong Bao, Jiyu Qiao, Yingli Zhou, Hua Fan, Caihua Yin, Wenchao Zhou, Feifei Li. Streaming View: An Efficient Data Processing Engine for Modern Real-time Data Warehouse of Alibaba Cloud. Proceedings of the VLDB Endowment (VLDB), 2025.