Screen2Words is a large-scale screen summarization dataset annotated by human workers. It contains more than 112k language summaries across 22k unique UI screens. The dataset can be used for Mobile User Interface Summarization, a task in which a model generates succinct language descriptions of mobile screens that convey the important content and functionality of each screen.