no code implementations • EMNLP 2021 • Jing Lu, Gustavo Hernandez Abrego, Ji Ma, Jianmo Ni, Yinfei Yang
In the context of neural passage retrieval, we study three promising techniques: synthetic data generation, negative sampling, and fusion.
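The paper releases no code, but the fusion idea is easy to sketch. Below is a minimal reciprocal-rank-fusion example for merging the outputs of two passage retrievers; the function name, the constant k=60, and the toy rankings are illustrative assumptions, not the authors' exact method.

```python
# Minimal reciprocal-rank fusion (RRF) sketch: merge ranked passage lists
# from two retrievers into a single ranking. The constant k=60 is a
# common default, not a value taken from the paper.
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """rankings: list of ranked lists of passage ids (best first)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, passage_id in enumerate(ranking, start=1):
            scores[passage_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Example: fuse a sparse (BM25) and a dense retriever's top results.
bm25_top = ["p3", "p1", "p7"]
dense_top = ["p1", "p9", "p3"]
print(reciprocal_rank_fusion([bm25_top, dense_top]))  # p1 and p3 rise to the top
```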
no code implementations • 16 Dec 2024 • Honglin Yang, Ji Ma, Xiao Yu
The optimization-based meta-learning approach is gaining traction because of its unique ability to adapt quickly to a new task using only small amounts of data.
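As a hedged illustration of what optimization-based meta-learning means in practice, here is a toy, first-order MAML-style inner/outer loop on one-dimensional regression tasks; the model, learning rates, and task distribution are assumptions chosen for exposition, not the paper's setup.

```python
# Toy MAML-style loop on 1-D linear regression tasks: the inner loop
# adapts a copy of the parameters to each task with a gradient step;
# the outer loop updates the shared initialization.
import numpy as np

rng = np.random.default_rng(0)

def task_loss_grad(w, a):
    # Task: fit y = a * x with squared loss on a small random batch.
    x = rng.normal(size=8)
    return np.mean(2 * (w * x - a * x) * x)

w = 0.0                      # shared initialization (the meta-parameters)
inner_lr, outer_lr = 0.1, 0.01
for step in range(1000):
    meta_grad = 0.0
    for a in rng.uniform(-2, 2, size=4):              # sample a batch of tasks
        w_task = w - inner_lr * task_loss_grad(w, a)  # inner adaptation
        meta_grad += task_loss_grad(w_task, a)        # first-order MAML update
    w -= outer_lr * meta_grad / 4
```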
no code implementations • 9 Dec 2024 • Wei Suo, Ji Ma, Mengyang Sun, Lin Yuanbo Wu, Peng Wang, Yanning Zhang
Although Large Vision-Language Models (LVLMs) have achieved impressive results, their high computational cost poses a significant barrier to wider application.
no code implementations • 28 Oct 2024 • Ji Ma
As Large Language Model (LLM)-based agents increasingly undertake real-world tasks and engage with human society, how well do we understand their behaviors?
no code implementations • 10 Aug 2024 • Jiang Yuan, Ji Ma, Bo Wang, Weiming Hu
Implicit degradation modeling-based blind super-resolution (SR) has attracted increasing attention in the community due to its excellent generalization to complex degradation scenarios and its wide application range.
no code implementations • 24 Jul 2024 • Dongyang Xu, Qingfan Wang, Ji Ma, Xiangyun Zeng, Lei Chen
Accurate driver attention prediction can serve as a critical reference for intelligent vehicles in understanding traffic scenes and making informed driving decisions.
no code implementations • 30 Jun 2024 • Pengying Wu, Yao Mu, Kangjie Zhou, Ji Ma, Junting Chen, Chang Liu
Visual navigation tasks are critical for household service robots.
no code implementations • 21 May 2024 • Ji Ma, Wei Suo, Peng Wang, Yanning Zhang
Vision-Language Instruction Tuning (VLIT) is a critical training phase for Large Vision-Language Models (LVLMs).
1 code implementation • 25 Apr 2024 • Zhe Chen, Weiyun Wang, Hao Tian, Shenglong Ye, Zhangwei Gao, Erfei Cui, Wenwen Tong, Kongzhi Hu, Jiapeng Luo, Zheng Ma, Ji Ma, Jiaqi Wang, Xiaoyi Dong, Hang Yan, Hewei Guo, Conghui He, Botian Shi, Zhenjiang Jin, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Bo Zhang, Pinlong Cai, Licheng Wen, Xiangchao Yan, Min Dou, Lewei Lu, Xizhou Zhu, Tong Lu, Dahua Lin, Yu Qiao, Jifeng Dai, Wenhai Wang
Compared to both open-source and proprietary models, InternVL 1.5 shows competitive performance, achieving state-of-the-art results in 8 of 18 benchmarks.
Ranked #6 on Multiple-choice on Neptune-Full
no code implementations • 29 Feb 2024 • Ji Ma, Hongming Dai, Yao Mu, Pengying Wu, Hao Wang, Xiaowei Chi, Yang Fei, Shanghang Zhang, Chang Liu
Zero-Shot Object Navigation (ZSON) requires agents to autonomously locate and approach unseen objects in unfamiliar environments and has emerged as a particularly challenging task within the domain of Embodied AI.
no code implementations • 5 Jan 2024 • Pengying Wu, Yao Mu, Bingxian Wu, Yi Hou, Ji Ma, Shanghang Zhang, Chang Liu
In the realm of household robotics, the Zero-Shot Object Navigation (ZSON) task empowers agents to adeptly traverse unfamiliar environments and locate objects from novel categories without prior explicit training.
1 code implementation • 19 Sep 2023 • Yang Gao, Ji Ma, Ivan Korotkov, Keith Hall, Dana Alon, Don Metzler
We propose the first multilingual scientific documents dataset, Open-access Multilingual Scientific Documents (OpenMSD), which has 74M papers in 103 languages and 778M citation pairs.
no code implementations • 20 Dec 2022 • Jing Lu, Keith Hall, Ji Ma, Jianmo Ni
We present Hybrid Infused Reranking for Passages Retrieval (HYRR), a framework for training rerankers based on a hybrid of BM25 and neural retrieval models.
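The hybrid mining step behind this kind of framework can be pictured roughly as blending a sparse and a dense score to select reranker training candidates. The sketch below uses toy stand-in scorers rather than the paper's BM25 and neural retrievers, and the blending weight alpha is an illustrative assumption.

```python
# Schematic hybrid candidate mining: blend a sparse score with a dense
# score to pick training examples for a reranker. Both scorers here are
# toy stand-ins, not the paper's BM25/dual-encoder models.
import math

def sparse_score(query, passage):
    q, p = set(query.split()), passage.split()
    return sum(p.count(t) for t in q) / math.sqrt(len(p))

def dense_score(query, passage):
    # Stand-in for a dual-encoder dot product.
    q, p = set(query.split()), set(passage.split())
    return len(q & p) / math.sqrt(len(q) * len(p) + 1)

def mine_candidates(query, passages, alpha=0.5, top_k=2):
    blended = [(alpha * sparse_score(query, p) +
                (1 - alpha) * dense_score(query, p), p) for p in passages]
    return [p for _, p in sorted(blended, reverse=True)[:top_k]]

passages = ["the cat sat on the mat", "dogs chase cats", "stock prices rose"]
print(mine_candidates("cat on mat", passages))
```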
1 code implementation • 15 Dec 2022 • Bernd Bohnet, Vinh Q. Tran, Pat Verga, Roee Aharoni, Daniel Andor, Livio Baldini Soares, Massimiliano Ciaramita, Jacob Eisenstein, Kuzman Ganchev, Jonathan Herzig, Kai Hui, Tom Kwiatkowski, Ji Ma, Jianmo Ni, Lierni Sestorain Saralegui, Tal Schuster, William W. Cohen, Michael Collins, Dipanjan Das, Donald Metzler, Slav Petrov, Kellie Webster
We take human annotations as a gold standard and show that a correlated automatic metric is suitable for development.
1 code implementation • 15 Nov 2022 • Priyanka Agrawal, Chris Alberti, Fantine Huot, Joshua Maynez, Ji Ma, Sebastian Ruder, Kuzman Ganchev, Dipanjan Das, Mirella Lapata
The availability of large, high-quality datasets has been one of the main drivers of recent progress in question answering (QA).
no code implementations • 12 Oct 2022 • Honglei Zhuang, Zhen Qin, Rolf Jagerman, Kai Hui, Ji Ma, Jing Lu, Jianmo Ni, Xuanhui Wang, Michael Bendersky
Recently, substantial progress has been made in text ranking based on pretrained language models such as BERT.
no code implementations • 23 Sep 2022 • Zhuyun Dai, Vincent Y. Zhao, Ji Ma, Yi Luan, Jianmo Ni, Jing Lu, Anton Bakalov, Kelvin Guu, Keith B. Hall, Ming-Wei Chang
To amplify the power of a few examples, we propose Prompt-based Query Generation for Retriever (Promptagator), which leverages large language models (LLMs) as few-shot query generators and creates task-specific retrievers based on the generated data.
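A minimal sketch of this few-shot query-generation step, assuming a generic text-completion LLM behind a hypothetical `call_llm` callable (the prompt format and the examples are illustrative, not the paper's):

```python
# Sketch of few-shot, prompt-based query generation: a handful of
# (document, query) examples are packed into a prompt, and an LLM
# completes a query for a new document. `call_llm` is a hypothetical
# placeholder for whatever LLM API is available.
FEW_SHOT = [
    ("Aspirin reduces fever and relieves mild pain.", "what does aspirin do"),
    ("The Nile is the longest river in Africa.", "longest river in africa"),
]

def build_prompt(document):
    lines = [f"Document: {doc}\nQuery: {query}\n" for doc, query in FEW_SHOT]
    lines.append(f"Document: {document}\nQuery:")
    return "\n".join(lines)

def generate_query(document, call_llm):
    return call_llm(build_prompt(document)).strip()

# The resulting synthetic (query, document) pairs would then train a
# task-specific retriever on the target corpus.
```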
no code implementations • Findings (ACL) 2022 • Kai Hui, Honglei Zhuang, Tao Chen, Zhen Qin, Jing Lu, Dara Bahri, Ji Ma, Jai Prakash Gupta, Cicero Nogueira dos Santos, Yi Tay, Don Metzler
This results in significant inference-time speedups, since the decoder only needs to interpret static, precomputed encoder embeddings at inference time.
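The inference pattern being described can be sketched as caching encoder outputs offline so that only the decoder runs at query time; `encoder` and `decoder` below are placeholder callables, not the paper's models.

```python
# Sketch of the serving pattern: document encoder outputs are computed
# once offline, so at query time only the decoder runs over the static,
# cached embeddings.
class CachedReranker:
    def __init__(self, encoder, decoder):
        self.encoder = encoder   # slow: run once per document, offline
        self.decoder = decoder   # fast: run per (query, doc) at serving time
        self.cache = {}

    def index(self, doc_id, doc_text):
        self.cache[doc_id] = self.encoder(doc_text)  # static embedding

    def score(self, query, doc_id):
        # No encoder call here: the decoder interprets the cached embedding.
        return self.decoder(query, self.cache[doc_id])
```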
2 code implementations • 15 Dec 2021 • Jianmo Ni, Chen Qu, Jing Lu, Zhuyun Dai, Gustavo Hernández Ábrego, Ji Ma, Vincent Y. Zhao, Yi Luan, Keith B. Hall, Ming-Wei Chang, Yinfei Yang
Surprisingly, with multi-stage training, scaling up the model size brings significant improvements on a variety of retrieval tasks, especially out-of-domain generalization.
Ranked #9 on Zero-shot Text Search on BEIR
2 code implementations • Findings (ACL) 2022 • Jianmo Ni, Gustavo Hernández Ábrego, Noah Constant, Ji Ma, Keith B. Hall, Daniel Cer, Yinfei Yang
To support our investigation, we establish a new sentence representation transfer benchmark, SentGLUE, which extends the SentEval toolkit to nine tasks from the GLUE benchmark.
no code implementations • 23 Oct 2020 • Jing Lu, Gustavo Hernandez Abrego, Ji Ma, Jianmo Ni, Yinfei Yang
In this paper we explore the effects of negative sampling in dual encoder models used to retrieve passages for automatic question answering.
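A common concrete instance of negative sampling in dual encoders is in-batch negatives, where each question's positive passage serves as a negative for every other question in the batch. The sketch below computes that softmax loss with random stand-in embeddings; it illustrates the general recipe, not necessarily the exact variants studied in the paper.

```python
# Minimal in-batch negative sampling loss for a dual encoder: each
# question is paired with its own passage as the positive, and the other
# passages in the batch act as negatives. Embeddings here are random
# stand-ins for real encoder outputs.
import numpy as np

rng = np.random.default_rng(0)
batch, dim = 4, 16
q = rng.normal(size=(batch, dim))   # question embeddings
p = rng.normal(size=(batch, dim))   # passage embeddings (row i matches q[i])

logits = q @ p.T                    # similarity of every question to every passage
logits -= logits.max(axis=1, keepdims=True)          # stabilize the softmax
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss = -np.mean(np.diag(log_probs))  # cross-entropy against the diagonal positives
print(loss)
```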
no code implementations • 1 Oct 2020 • Michael Bendersky, Honglei Zhuang, Ji Ma, Shuguang Han, Keith Hall, Ryan McDonald
In this paper, we report the results of our participation in the TREC-COVID challenge.
no code implementations • EACL 2021 • Ji Ma, Ivan Korotkov, Yinfei Yang, Keith Hall, Ryan McDonald
The question generation system is trained on general domain data, but is applied to documents in the targeted domain.
no code implementations • 23 Apr 2020 • Shashi Narayan, Gonçalo Simões, Ji Ma, Hannah Craighead, Ryan McDonald
Recent trends in natural language processing have shifted focus towards pretraining and fine-tuning approaches for text generation.
no code implementations • 28 Jun 2019 • Jean-François Kagy, Tolga Kayadelen, Ji Ma, Afshin Rostamizadeh, Jana Strnadova
In a live setting, we tested the use of active learning to select sentences for human annotation when training a Thai word segmentation model.
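A standard recipe for this kind of selection is pool-based uncertainty (margin) sampling, sketched below; `predict_proba` is a hypothetical stand-in for the current model, and this is the generic technique rather than the paper's exact strategy.

```python
# Pool-based active learning with margin sampling: send for annotation
# the unlabeled sentences the current model is least sure about.
import numpy as np

def select_for_annotation(predict_proba, pool, budget):
    """pool: list of unlabeled sentences; predict_proba returns an
    array of class probabilities (>= 2 classes) per sentence."""
    probs = np.asarray([predict_proba(s) for s in pool])
    top_two = np.sort(probs, axis=1)[:, -2:]
    margin = top_two[:, 1] - top_two[:, 0]   # small margin = most uncertain
    uncertain_first = np.argsort(margin)
    return [pool[i] for i in uncertain_first[:budget]]
```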
1 code implementation • EMNLP 2018 • Ji Ma, Kuzman Ganchev, David Weiss
A wide variety of neural-network architectures have been proposed for the task of Chinese word segmentation.
1 code implementation • EMNLP 2017 • Jan A. Botha, Emily Pitler, Ji Ma, Anton Bakalov, Alex Salcianu, David Weiss, Ryan McDonald, Slav Petrov
We show that small and shallow feed-forward neural networks can achieve near state-of-the-art results on a range of unstructured and structured language processing tasks while being considerably cheaper in memory and computational requirements than deep recurrent models.
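To make "small and shallow" concrete, here is a toy feed-forward classifier over hashed character n-gram features in the same spirit; all sizes and the feature scheme are arbitrary assumptions, not the paper's configurations.

```python
# Toy shallow feed-forward model: a bag of hashed character trigram
# features fed through a single hidden layer. Python's hash() is fine
# for a demo; a fixed hashing scheme would be used in practice.
import numpy as np

rng = np.random.default_rng(0)
VOCAB, HIDDEN, CLASSES = 1 << 16, 64, 5
W1 = rng.normal(scale=0.01, size=(VOCAB, HIDDEN))
W2 = rng.normal(scale=0.01, size=(HIDDEN, CLASSES))

def featurize(token):
    ngrams = [token[i:i + 3] for i in range(max(1, len(token) - 2))]
    return [hash(g) % VOCAB for g in ngrams]   # hashed feature ids

def predict(token):
    h = np.maximum(0, W1[featurize(token)].sum(axis=0))  # embed-sum + ReLU
    return int(np.argmax(h @ W2))

print(predict("parsing"))
```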
no code implementations • 15 Mar 2017 • Chris Alberti, Daniel Andor, Ivan Bogatyy, Michael Collins, Dan Gillick, Lingpeng Kong, Terry Koo, Ji Ma, Mark Omernick, Slav Petrov, Chayut Thanapirom, Zora Tung, David Weiss
We describe a baseline dependency parsing system for the CoNLL 2017 Shared Task.