no code implementations • 22 May 2025 • Weizhe Lin, Xing Li, Zhiyuan Yang, Xiaojin Fu, Hui-Ling Zhen, Yaoyuan Wang, Xianzhi Yu, Wulong Liu, Xiaosong Li, Mingxuan Yuan
Large Reasoning Models (LRMs) demonstrate exceptional capability in tackling complex mathematical, logical, and coding tasks by leveraging extended Chain-of-Thought (CoT) reasoning.
1 code implementation • 18 Feb 2025 • Jingbiao Mei, Jinghong Chen, Guangyu Yang, Weizhe Lin, Bill Byrne
Hateful memes have become a significant concern on the Internet, necessitating robust automated detection systems.
Ranked #1 on Hateful Meme Classification on Hateful Memes
1 code implementation • 12 Jan 2025 • Xinyi Zheng, Steve Zhang, Weizhe Lin, Aaron Zhang, Walterio W. Mayol-Cuevas, Junxiao Shen
The dataset enables seamless integration with multi-modal data, supporting a range of 3D applications, from architectural reconstruction to virtual tourism.
no code implementations • 12 Jan 2025 • Wenqi Zhou, Kai Cao, Hao Zheng, Xinyi Zheng, Miao Liu, Per Ola Kristensson, Walterio Mayol-Cuevas, Fan Zhang, Weizhe Lin, Junxiao Shen
Leveraging the advanced text processing capabilities of large language models (LLMs), X-LeBench develops a life-logging simulation pipeline that produces realistic, coherent daily plans aligned with real-world video data.
no code implementations • 19 Nov 2024 • Weizhe Lin, Junxiao Shen
The rapid evolution of artificial intelligence, especially through multi-modal large language models, has redefined user interactions, enabling responses that are contextually rich and human-like.
no code implementations • 1 Nov 2024 • Zihong He, Weizhe Lin, Hao Zheng, Fan Zhang, Matt W. Jones, Laurence Aitchison, Xuhai Xu, Miao Liu, Per Ola Kristensson, Junxiao Shen
With the rapid advancement of AI systems, their abilities to store, retrieve, and utilize information over the long term - referred to as long-term memory - have become increasingly significant.
no code implementations • 25 Sep 2024 • Jinghong Chen, Guangyu Yang, Weizhe Lin, Jingbiao Mei, Bill Byrne
We derive and investigate two DPO variants that explicitly model the possibility of declaring a tie in pair-wise comparisons.
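One standard way to extend Bradley-Terry-style pairwise comparison to allow ties is the Rao-Kupper model. A minimal numeric sketch of that probability model (shown as a generic illustration of tie-aware preference modeling; the parameter `nu` and these function names are not from the paper, and the paper's actual DPO variants may parametrize ties differently):

```python
import math

def sigmoid(x: float) -> float:
    return 1 / (1 + math.exp(-x))

def rao_kupper_probs(r_w: float, r_l: float, nu: float):
    """Rao-Kupper extension of Bradley-Terry: nu >= 0 opens a tie band
    around equal rewards. Returns (P(win), P(loss), P(tie)) for reward
    values r_w (preferred response) and r_l (dispreferred response)."""
    p_win = sigmoid(r_w - r_l - nu)   # margin must exceed nu to count as a clear win
    p_loss = sigmoid(r_l - r_w - nu)
    p_tie = 1 - p_win - p_loss        # remaining mass is assigned to a declared tie
    return p_win, p_loss, p_tie
```

With `nu = 0` this reduces to standard Bradley-Terry (the tie probability vanishes); increasing `nu` moves probability mass from wins and losses into ties.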
1 code implementation • 10 Apr 2024 • Jinghong Chen, Weizhe Lin, Jingbiao Mei, Bill Byrne
The Directed Acyclic Transformer is a fast non-autoregressive (NAR) model that performs well in Neural Machine Translation.
no code implementations • 17 Mar 2024 • Igor Sterner, Weizhe Lin, Jinghong Chen, Bill Byrne
Two approaches have emerged to input images into large language models (LLMs).
1 code implementation • 13 Feb 2024 • Weizhe Lin, Jingbiao Mei, Jinghong Chen, Bill Byrne
Large Multimodal Models (LMMs) excel in natural language and visual understanding but are challenged by exacting tasks such as Knowledge-based Visual Question Answering (KB-VQA), which involves retrieving relevant information from document collections to shape answers to questions.
Ranked #1 on Retrieval on InfoSeek (using extra training data)
1 code implementation • 14 Nov 2023 • Guangyu Yang, Jinghong Chen, Weizhe Lin, Bill Byrne
Minimum Bayes Risk (MBR) decoding can significantly improve translation performance of Multilingual Large Language Models (MLLMs).
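MBR decoding selects, from a pool of sampled candidates, the translation with the highest expected utility against the rest of the pool. A minimal sketch (the unigram-overlap utility and function names are illustrative stand-ins; in practice the utility is a metric such as BLEU or COMET):

```python
def unigram_overlap(hyp: str, ref: str) -> float:
    """Toy utility: fraction of hypothesis unigrams appearing in the reference."""
    h, r = hyp.split(), set(ref.split())
    return sum(w in r for w in h) / max(len(h), 1)

def mbr_decode(candidates: list[str]) -> str:
    """Return the candidate with the highest average utility against the
    other candidates, treating the pool as samples from the model."""
    def expected_utility(hyp: str) -> float:
        others = [c for c in candidates if c is not hyp]
        return sum(unigram_overlap(hyp, ref) for ref in others) / max(len(others), 1)
    return max(candidates, key=expected_utility)

candidates = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a dog ran in the park",
]
best = mbr_decode(candidates)
```

The outlier candidate scores poorly against the other two, so the consensus translation wins even though the model assigned no explicit probability to it here.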
1 code implementation • 14 Nov 2023 • Jingbiao Mei, Jinghong Chen, Weizhe Lin, Bill Byrne, Marcus Tomalin
Hateful memes have emerged as a significant concern on the Internet.
Ranked #2 on Meme Classification on MultiOFF
1 code implementation • NeurIPS 2023 • Weizhe Lin, Jinghong Chen, Jingbiao Mei, Alexandru Coca, Bill Byrne
FLMR addresses two major limitations in RA-VQA's retriever: (1) the image representations obtained via image-to-text transforms can be incomplete and inaccurate and (2) relevance scores between queries and documents are computed with one-dimensional embeddings, which can be insensitive to finer-grained relevance.
Ranked #1 on Retrieval on OK-VQA
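The second fix, replacing one-dimensional embeddings with token-level multi-dimensional ones, follows the late-interaction pattern: every query token embedding is matched against its best-matching document token embedding and the maxima are summed. A minimal sketch with random vectors standing in for learned embeddings (shapes and names are illustrative, not FLMR's exact architecture):

```python
import numpy as np

def late_interaction_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """MaxSim scoring: for each query token vector, take the maximum dot
    product over all document token vectors, then sum over query tokens.
    query_emb: (num_query_tokens, dim); doc_emb: (num_doc_tokens, dim)."""
    sim = query_emb @ doc_emb.T          # (q_tokens, d_tokens) token-level similarities
    return float(sim.max(axis=1).sum())  # best document match per query token, summed

rng = np.random.default_rng(0)
query = rng.standard_normal((8, 128))                     # 8 query-token embeddings
docs = [rng.standard_normal((40, 128)) for _ in range(3)]  # 3 candidate documents
scores = [late_interaction_score(query, d) for d in docs]
best_doc = int(np.argmax(scores))
```

Because each query token can match a different document token, the score stays sensitive to fine-grained relevance that a single pooled vector would average away.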
no code implementations • 23 Sep 2023 • Alexandru Coca, Bo-Hsiang Tseng, Jinghong Chen, Weizhe Lin, Weixuan Zhang, Tisha Anders, Bill Byrne
Schema-guided dialogue state trackers can generalise to new domains without further training, yet they are sensitive to the writing style of the schemata.
no code implementations • 19 Mar 2023 • Weizhe Lin, Zhilin Wang, Bill Byrne
The widely used Fact-based Visual Question Answering (FVQA) dataset contains visually-grounded questions that require information retrieval using common sense knowledge graphs to answer.
1 code implementation • 29 Jan 2023 • Jinghong Chen, Weizhe Lin, Bill Byrne
We show that SGSAcc can be applied to evaluate utterances generated from a wide range of dialogue actions in the Schema Guided Dialogue (SGD) dataset with good agreement with human judgment.
1 code implementation • 7 Oct 2022 • Weizhe Lin, Bill Byrne
The strong retrieval ability of our model significantly reduces the number of retrieved documents needed during training, yielding significant benefits in both answer quality and the computation required for training.
Ranked #2 on Retrieval on OK-VQA
no code implementations • 2 Apr 2022 • Weizhe Lin, Linjun Shou, Ming Gong, Pei Jian, Zhilin Wang, Bill Byrne, Daxin Jiang
Knowledge graph (KG) based Collaborative Filtering is an effective approach to personalizing recommendation systems for relatively static domains such as movies and books, by leveraging structured information from KG to enrich both item and user representations.
1 code implementation • EMNLP 2021 • Weizhe Lin, Bo-Hsiang Tseng, Bill Byrne
Dialogue State Tracking is central to multi-domain task-oriented dialogue systems, responsible for extracting information from user utterances.
Ranked #1 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.0
1 code implementation • 26 Nov 2020 • QingBiao Li, Weizhe Lin, Zhe Liu, Amanda Prorok
Our Message-Aware Graph Attention neTwork (MAGAT) is based on a key-query-like mechanism that determines the relative importance of features in the messages received from various neighboring robots.
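The key-query-like mechanism can be sketched as dot-product attention over the messages a robot receives from its neighbors: the robot's own feature acts as the query and each incoming message as a key. A simplified illustration, not MAGAT's exact parametrization (no learned projection matrices here):

```python
import numpy as np

def message_attention(own_feat: np.ndarray, neighbor_msgs: np.ndarray) -> np.ndarray:
    """Weight neighbor messages by a key-query score and aggregate them.
    own_feat: (dim,); neighbor_msgs: (num_neighbors, dim)."""
    dim = own_feat.shape[0]
    scores = neighbor_msgs @ own_feat / np.sqrt(dim)  # relative importance of each message
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                          # softmax over neighbors
    return weights @ neighbor_msgs                    # attention-weighted aggregation

rng = np.random.default_rng(1)
own = rng.standard_normal(16)
msgs = rng.standard_normal((4, 16))   # messages from 4 neighboring robots
aggregated = message_attention(own, msgs)
```

The softmax lets each robot emphasize the neighbors whose messages are most relevant to its own state instead of averaging them uniformly.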
no code implementations • NAACL (NUSE) 2021 • Zhilin Wang, Weizhe Lin, Xiaodong Wu
While many different aspects of human experiences have been studied by the NLP community, none has captured their full richness.
1 code implementation • 31 Jul 2020 • Weizhe Lin, Indigo Orton, QingBiao Li, Gabriela Pavarini, Marwa Mahmoud
Compared to modalities such as the face, head, and voice, research investigating the use of the body modality for these tasks is relatively sparse.
Ranked #1 on Anxiety Detection on Well-being Dataset
no code implementations • 17 Mar 2020 • Xiaodong Wu, Weizhe Lin, Zhilin Wang, Elena Rastorgueva
Online forums and social media platforms provide noisy but valuable data every day.
no code implementations • WS 2019 • Zhilin Wang, Elena Rastorgueva, Weizhe Lin, Xiaodong Wu
This model is built upon the BERT Next Sentence Prediction model and reduces the time complexity for clustering all posts in a corpus from O(n^2) to O(n) with respect to the number of posts.
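The linear-time behavior can be sketched as single-pass clustering: each post is compared only against one representative per existing cluster rather than against every other post, which is O(n) comparisons when the number of clusters is bounded. A toy illustration in which word-overlap similarity stands in for the BERT next-sentence score (the threshold, helper names, and representative choice are all illustrative assumptions):

```python
def jaccard(a: str, b: str) -> float:
    """Stand-in for the BERT next-sentence score: word-set Jaccard similarity."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def cluster_posts(posts: list[str], threshold: float = 0.4) -> list[list[str]]:
    """Single pass: attach each post to the first cluster whose representative
    (its first post) is similar enough, otherwise start a new cluster."""
    clusters: list[list[str]] = []
    for post in posts:
        for cluster in clusters:
            if jaccard(post, cluster[0]) >= threshold:
                cluster.append(post)
                break
        else:  # no cluster matched
            clusters.append([post])
    return clusters

posts = [
    "how to train a dog",
    "train a dog fast",
    "best pizza recipe",
]
clusters = cluster_posts(posts)
```

The two dog-training posts merge into one cluster and the recipe post starts its own, after only one comparison per existing cluster instead of all pairwise comparisons.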