no code implementations • 31 Oct 2024 • Xiang Deng, Youxin Pang, Xiaochen Zhao, Chao Xu, Lizhen Wang, Hongjiang Xiao, Shi Yan, Hongwen Zhang, Yebin Liu
This paper introduces Stereo-Talker, a novel one-shot audio-driven human video synthesis system that generates 3D talking videos with precise lip synchronization, expressive body gestures, temporally consistent photo-realistic quality, and continuous viewpoint control.
1 code implementation • 28 Jul 2024 • Letian Shi, Qi Lv, Xiang Deng, Liqiang Nie
To address the real-world egocentric task planning problem, we introduce a novel planning framework named EPD, which comprises three stages: long-term memory Extraction, context-aware Planning, and multi-iteration Decision.
1 code implementation • 11 Jul 2024 • Marcin Paluch, Florian Bolli, Xiang Deng, Antonio Rios Navarro, Chang Gao, Tobi Delbruck
We demonstrate kHz control rates for a physical cartpole and offloading control to the FPGA hardware on the F1TENTH car.
1 code implementation • 8 Jun 2024 • Qi Lv, Xiang Deng, Gongwei Chen, Michael Yu Wang, Liqiang Nie
To capture the relationship among state-action-RTG triplets, a fine-grained SSM module is designed and integrated into the original coarse-grained SSM in Mamba, resulting in a novel Mamba architecture tailored for offline RL.
no code implementations • CVPR 2024 • Jihyung Kil, Chan Hee Song, Boyuan Zheng, Xiang Deng, Yu Su, Wei-Lun Chao
Automatic web navigation aims to build a web agent that can follow language instructions to execute complex and diverse tasks on real-world websites.
1 code implementation • CVPR 2024 • Gongwei Chen, Leyang Shen, Rui Shao, Xiang Deng, Liqiang Nie
1) Progressive incorporation of fine-grained spatial-aware visual knowledge.
no code implementations • CVPR 2024 • Xiang Deng, Zerong Zheng, Yuxiang Zhang, Jingxiang Sun, Chao Xu, Xiaodong Yang, Lizhen Wang, Yebin Liu
This paper focuses on advancing the applicability of human avatar learning methods by proposing RAM-Avatar which learns a Real-time photo-realistic Avatar that supports full-body control from Monocular videos.
no code implementations • 12 Dec 2023 • Yibo Xia, Lizhen Wang, Xiang Deng, Xiaoyan Luo, Yebin Liu
Finally, we propose a personalized emotion-guided head generator with an emotion mapping network that can synthesize high-fidelity and faithful emotional video portraits.
1 code implementation • 20 Nov 2023 • Gongwei Chen, Leyang Shen, Rui Shao, Xiang Deng, Liqiang Nie
1) Progressive incorporation of fine-grained spatial-aware visual knowledge.
1 code implementation • 7 Aug 2023 • Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, Shudan Zhang, Xiang Deng, Aohan Zeng, Zhengxiao Du, Chenhui Zhang, Sheng Shen, Tianjun Zhang, Yu Su, Huan Sun, Minlie Huang, Yuxiao Dong, Jie Tang
We present AgentBench, a multi-dimensional evolving benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities in a multi-turn open-ended generation setting.
1 code implementation • 29 Jul 2023 • Lingbo Mo, Shijie Chen, Ziru Chen, Xiang Deng, Ashley Lewis, Sunit Singh, Samuel Stevens, Chang-You Tai, Zhen Wang, Xiang Yue, Tianshu Zhang, Yu Su, Huan Sun
We introduce TacoBot, a user-centered task-oriented digital assistant designed to guide users through complex real-world tasks with multiple steps.
1 code implementation • NeurIPS 2023 • Xiang Deng, Yu Gu, Boyuan Zheng, Shijie Chen, Samuel Stevens, Boshi Wang, Huan Sun, Yu Su
We introduce Mind2Web, the first dataset for developing and evaluating generalist agents for the web that can follow language instructions to complete complex tasks on any website.
no code implementations • 23 May 2023 • Chang-You Tai, Ziru Chen, Tianshu Zhang, Xiang Deng, Huan Sun
Thus, we systematically study how to enhance LLMs' reasoning ability through chain of thought (CoT) style prompting, including the original chain-of-thought prompting (Wei et al., 2022b) and least-to-most prompting (Zhou et al., 2023).
no code implementations • 21 Dec 2022 • Xiang Deng, Vasilisa Bashlovkina, Feng Han, Simon Baumgartner, Michael Bendersky
Market sentiment analysis on social media content requires knowledge of both financial markets and social media jargon, which makes it a challenging task for human raters.
2 code implementations • 20 Dec 2022 • Boshi Wang, Sewon Min, Xiang Deng, Jiaming Shen, You Wu, Luke Zettlemoyer, Huan Sun
Chain-of-Thought (CoT) prompting can dramatically improve the multi-step reasoning abilities of large language models (LLMs).
2 code implementations • 19 Dec 2022 • Yu Gu, Xiang Deng, Yu Su
Most existing work for grounded language understanding uses LMs to directly generate plans that can be executed in the environment to achieve the desired effects.
no code implementations • 11 Jul 2022 • Shijie Chen, Ziru Chen, Xiang Deng, Ashley Lewis, Lingbo Mo, Samuel Stevens, Zhen Wang, Xiang Yue, Tianshu Zhang, Yu Su, Huan Sun
We present TacoBot, a task-oriented dialogue system built for the inaugural Alexa Prize TaskBot Challenge, which assists users in completing multi-step cooking and home improvement tasks.
1 code implementation • 16 Mar 2022 • Xiang Deng, Yun Xiao, Bo Long, Zhongfei Zhang
Deep neural networks (DNNs) have been widely applied in various domains in artificial intelligence including computer vision and natural language processing.
1 code implementation • 16 Mar 2022 • Boshi Wang, Xiang Deng, Huan Sun
While Pre-trained Language Models (PLMs) internalize a great amount of world knowledge, they have been shown to be incapable of recalling this knowledge to solve tasks requiring complex, multi-step reasoning.
1 code implementation • 25 Jan 2022 • Xiang Deng, Prashant Shiralkar, Colin Lockard, Binxuan Huang, Huan Sun
We argue that the text and HTML structure together convey important semantics of the content and therefore warrant a special treatment for their representation learning.
Ranked #2 on Attribute Extraction on SWDE
1 code implementation • NeurIPS 2021 • Xiang Deng, Zhongfei Zhang
Knowledge distillation (KD) addresses model compression by distilling knowledge from a large model (teacher) to a smaller one (student).
1 code implementation • EMNLP 2021 • Xiang Deng, Yu Su, Alyssa Lees, You Wu, Cong Yu, Huan Sun
We present ReasonBert, a pre-training method that augments language models with the ability to reason over long-range relations and multiple, possibly hybrid contexts.
Ranked #1 on Semantic Parsing on GraphQuestions
1 code implementation • 16 May 2021 • Xiang Deng, Zhongfei Zhang
In this paper, we propose, to the best of our knowledge, the first dedicated approach to distilling knowledge from a GNN without graph data.
no code implementations • 1 Jan 2021 • Xiang Deng, Zhongfei Zhang
Through exploratory experiments, we find that model capacity differences are not necessarily the root cause, and that the distillation data matters when the student capacity is greater than a threshold.
3 code implementations • 24 Dec 2020 • Xiang Deng, Zhongfei Zhang
Deep neural networks have been successfully deployed in various domains of artificial intelligence, including computer vision and natural language processing.
no code implementations • 1 Nov 2020 • Xiang Deng, Zhongfei Zhang
However, the existing approaches to training ternary weight networks cannot control the sparsity (i.e., the percentage of 0s) of the ternary weights, which undermines the advantage of ternary weights.
no code implementations • NAACL 2021 • Xiang Deng, Ahmed Hassan Awadallah, Christopher Meek, Oleksandr Polozov, Huan Sun, Matthew Richardson
Additionally, to evaluate different methods under more realistic text-table alignment settings, we create a new evaluation set, Spider-Realistic, based on the Spider dev set with explicit mentions of column names removed, and adopt eight existing text-to-SQL datasets for cross-database evaluation.
no code implementations • 9 Oct 2020 • Xiang Deng, Zhongfei Zhang
As a result, the student is able to better capture the local shape of the teacher function and thus achieves better performance.
no code implementations • 17 Sep 2020 • Xiang Deng, Zhongfei Zhang
It is well observed in the deep learning and computer vision literature that visual data are always represented in a manually designed coding scheme (e.g., RGB images are represented as integers ranging from 0 to 255 for each channel) when they are input to an end-to-end deep neural network (DNN) for any learning task.
1 code implementation • 26 Jun 2020 • Xiang Deng, Huan Sun, Alyssa Lees, You Wu, Cong Yu
In this paper, we present TURL, a novel framework that introduces the pre-training/fine-tuning paradigm to relational Web tables.
Ranked #1 on Column Type Annotation on WikipediaGS-CTA
no code implementations • 27 Feb 2020 • Xiang Deng, Zhongfei Zhang
In this paper, we propose a novel meta-learning based training procedure (MLTP) for DNNs and demonstrate that the meta-learning idea can indeed improve the generalization abilities of DNNs.
no code implementations • 5 Dec 2019 • Jie Zhao, Xiang Deng, Huan Sun
This paper makes one of the first efforts toward automatically generating complex questions from knowledge graphs.
no code implementations • 20 Sep 2019 • Bortik Bandyopadhyay, Xiang Deng, Goonmeet Bajaj, Huan Sun, Srinivasan Parthasarathy
In this work, we propose to resolve a new type of heterogeneous query, namely the tabular query, which contains a natural language query description, column names of the desired table, and an example row.
1 code implementation • IJCNLP 2019 • Xiang Deng, Huan Sun
Given two entities, distant supervision exploits sentences that directly mention them for predicting their semantic relation.