1 code implementation • 23 Sep 2023 • Bram Willemsen, Livia Qian, Gabriel Skantze
Vision-language models (VLMs) have been shown to be effective at image retrieval based on simple text queries, but text-image retrieval based on conversational input remains a challenge.