Official implementation of Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs.
- Institute: Mohamed bin Zayed University of Artificial Intelligence
- Resources: [Paper] [Project Page] [Web2Code Dataset] [Croissant]
[2024/6/27] The paper and project page are released!
Explore our comprehensive benchmarks for evaluating webpage-related tasks.
Set up your environment, generate webpage screenshots, and run evaluations efficiently. Get started here: Webpage Code Generation Benchmark
Find clear instructions for setting up your environment, generating outputs, and running evaluations. Begin here: Webpage Understanding Benchmark
- LLaVA: the codebase we built upon. Thanks for their wonderful work.
- WebSRC, WebSight, Pix2Code: high-quality datasets of web pages and HTML code.
If you find our work helpful for your research, please consider giving it a star ⭐ and citing it 📝
@article{web2code2024,
title={Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs},
journal={arXiv preprint},
year={2024}
}
Usage and License Notices: The data is intended and licensed for research use only. The dataset is CC BY 4.0 (allowing only non-commercial use), and models trained using the dataset should not be used outside of research purposes.
