no code implementations • 12 Dec 2024 • Artemis Panagopoulou, Honglu Zhou, Silvio Savarese, Caiming Xiong, Chris Callison-Burch, Mark Yatskar, Juan Carlos Niebles
In our framework, a unit test is represented as a novel image and answer pair meant to verify the logical correctness of a program produced for a given query.
1 code implementation • 29 May 2024 • Artemis Panagopoulou, Coby Melkin, Chris Callison-Burch
Bistable images, also known as ambiguous or reversible images, present visual stimuli that can be seen in two distinct interpretations, though not simultaneously by the observer.
2 code implementations • 30 Nov 2023 • Artemis Panagopoulou, Le Xue, Ning Yu, Junnan Li, Dongxu Li, Shafiq Joty, ran Xu, Silvio Savarese, Caiming Xiong, Juan Carlos Niebles
To enable this framework, we devise a scalable pipeline that automatically generates high-quality, instruction-tuning datasets from readily available captioning data across different modalities, and contribute 24K QA data for audio and 250K QA data for 3D.
1 code implementation • 24 May 2023 • Tuhin Chakrabarty, Arkadiy Saakyan, Olivia Winn, Artemis Panagopoulou, Yue Yang, Marianna Apidianaki, Smaranda Muresan
We propose to solve the task through the collaboration between Large Language Models (LLMs) and Diffusion Models: Instruct GPT-3 (davinci-002) with Chain-of-Thought prompting generates text that represents a visual elaboration of the linguistic metaphor containing the implicit meaning and relevant objects, which is then used as input to the diffusion-based text-to-image models. Using a human-AI collaboration framework, where humans interact both with the LLM and the top-performing diffusion model, we create a high-quality dataset containing 6, 476 visual metaphors for 1, 540 linguistic metaphors and their associated visual elaborations.
1 code implementation • CVPR 2024 • Le Xue, Ning Yu, Shu Zhang, Artemis Panagopoulou, Junnan Li, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, ran Xu, Juan Carlos Niebles, Silvio Savarese
It achieves a new SOTA of 50. 6% (top-1) on Objaverse-LVIS and 84. 7% (top-1) on ModelNet40 in zero-shot classification.
Ranked #10 on
3D Point Cloud Classification
on ScanObjectNN
(using extra training data)
2 code implementations • CVPR 2023 • Yue Yang, Artemis Panagopoulou, Shenghao Zhou, Daniel Jin, Chris Callison-Burch, Mark Yatskar
Overall, LaBo demonstrates that inherently interpretable models can be widely applied at similar, or better, performance than black box approaches.
1 code implementation • 24 Oct 2022 • Yue Yang, Artemis Panagopoulou, Marianna Apidianaki, Mark Yatskar, Chris Callison-Burch
We propose to extract these properties from images and use them in an ensemble model, in order to complement the information that is extracted from language models.
no code implementations • 17 Nov 2021 • Yue Yang, Joongwon Kim, Artemis Panagopoulou, Mark Yatskar, Chris Callison-Burch
Schemata are structured representations of complex tasks that can aid artificial intelligence by allowing models to break down complex tasks into intermediate steps.
1 code implementation • EMNLP 2021 • Yue Yang, Artemis Panagopoulou, Qing Lyu, Li Zhang, Mark Yatskar, Chris Callison-Burch
Understanding what sequence of steps are needed to complete a goal can help artificial intelligence systems reason about human activities.
Ranked #1 on
VGSI
on wikiHow-image