Search Results for author: Yudong Liu

Found 20 papers, 13 papers with code

Generating Descriptive and Rules-Adhering Spells for Dungeons & Dragons Fifth Edition

no code implementations games (LREC) 2022 Pax Newman, Yudong Liu

We examine the task of generating unique content for the spell system of the tabletop roleplaying game Dungeons and Dragons Fifth Edition using several generative language models.

Descriptive

Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing

1 code implementation 13 Mar 2025 Yudong Liu, Jingwei Sun, Yueqian Lin, Jingyang Zhang, Ming Yin, Qinsi Wang, Jianyi Zhang, Hai Li, Yiran Chen

In this work, we propose KVTP (Keyframe-oriented Vision Token Pruning), a novel framework that overcomes the drawbacks of token pruning and keyframe selection.

EgoSchema Form +1

SpeechPrune: Context-aware Token Pruning for Speech Information Retrieval

no code implementations 16 Dec 2024 Yueqian Lin, Yuzhe Fu, Jingyang Zhang, Yudong Liu, Jianyi Zhang, Jingwei Sun, Hai "Helen" Li, Yiran Chen

We introduce Speech Information Retrieval (SIR), a new long-context task for Speech Large Language Models (Speech LLMs), and present SPIRAL, a 1,012-sample benchmark testing models' ability to extract critical details from approximately 90-second spoken inputs.

Form Information Retrieval +2

Towards Automated Model Design on Recommender Systems

1 code implementation 12 Nov 2024 Tunhou Zhang, Dehua Cheng, Yuchen He, Zhengxing Chen, Xiaoliang Dai, Liang Xiong, Yudong Liu, Feng Cheng, Yufan Cao, Feng Yan, Hai Li, Yiran Chen, Wei Wen

Designing recommender systems using deep neural networks requires careful architecture design, and further optimization demands extensive co-design efforts on jointly optimizing model architecture and hardware.

AutoML Click-Through Rate Prediction +2

Implicit Filtering for Learning Neural Signed Distance Functions from 3D Point Clouds

no code implementations 18 Jul 2024 Shengtao Li, Ge Gao, Yudong Liu, Ming Gu, Yu-Shen Liu

The neural network typically fits the shape with a rough surface and omits fine-grained geometric details such as shape edges and corners.

Surface Reconstruction

Practical offloading for fine-tuning LLM on commodity GPU via learned sparse projectors

1 code implementation 14 Jun 2024 Siyuan Chen, Zhuofeng Wang, Zelong Guan, Yudong Liu, Phillip B. Gibbons

However, this approach is hampered by the limited bandwidth of commodity hardware, which constrains communication between the CPU and GPU, and by slower matrix multiplications on the CPU.

CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark

1 code implementation 22 Jan 2024 Ge Zhang, Xinrun Du, Bei Chen, Yiming Liang, Tongxu Luo, Tianyu Zheng, Kang Zhu, Yuyang Cheng, Chunpu Xu, Shuyue Guo, Haoran Zhang, Xingwei Qu, Junjie Wang, Ruibin Yuan, Yizhi Li, Zekun Wang, Yudong Liu, Yu-Hsuan Tsai, Fengji Zhang, Chenghua Lin, Wenhao Huang, Jie Fu

We introduce CMMMU, a new Chinese Massive Multi-discipline Multimodal Understanding benchmark designed to evaluate LMMs on tasks demanding college-level subject knowledge and deliberate reasoning in a Chinese context.

Why does Prediction Accuracy Decrease over Time? Uncertain Positive Learning for Cloud Failure Prediction

no code implementations 8 Jan 2024 Haozhe Li, Minghua Ma, Yudong Liu, Pu Zhao, Lingling Zheng, Ze Li, Yingnong Dang, Murali Chintalapati, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang

Using two real-world datasets of disk failure prediction and conducting node prediction experiments in Microsoft Azure, which is a top-tier cloud provider that serves millions of users, we demonstrate Uptake can significantly improve the failure prediction accuracy by 5% on average.

Cloud Computing Prediction

GridFormer: Point-Grid Transformer for Surface Reconstruction

1 code implementation 4 Jan 2024 Shengtao Li, Ge Gao, Yudong Liu, Yu-Shen Liu, Ming Gu

Our method maximizes the spatial expressiveness of grid features and maintains computational efficiency.

Computational Efficiency Surface Reconstruction

MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response

1 code implementation 15 Sep 2023 Zihao Deng, Yinghao Ma, Yudong Liu, Rongchen Guo, Ge Zhang, Wenhu Chen, Wenhao Huang, Emmanouil Benetos

Large Language Models (LLMs) have shown immense potential in multimodal applications, yet the convergence of textual and musical domains remains underexplored.

Caption Generation Language Modelling +1

ImDiffusion: Imputed Diffusion Models for Multivariate Time Series Anomaly Detection

1 code implementation 3 Jul 2023 Yuhang Chen, Chaoyun Zhang, Minghua Ma, Yudong Liu, Ruomeng Ding, Bowen Li, Shilin He, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang

To the best of our knowledge, ImDiffusion represents a pioneering approach that combines imputation-based techniques with time series anomaly detection, while introducing the novel use of diffusion models to the field.

Anomaly Detection Imputation +2

Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications

1 code implementation 7 Jun 2023 Paul Pu Liang, Chun Kai Ling, Yun Cheng, Alex Obolenskiy, Yudong Liu, Rohan Pandey, Alex Wilf, Louis-Philippe Morency, Ruslan Salakhutdinov

In many machine learning systems that jointly learn from multiple modalities, a core research question is to understand the nature of multimodal interactions: how modalities combine to provide new task-relevant information that was not present in either alone.

Self-Supervised Learning

High-Modality Multimodal Transformer: Quantifying Modality & Interaction Heterogeneity for High-Modality Representation Learning

1 code implementation 2 Mar 2022 Paul Pu Liang, Yiwei Lyu, Xiang Fan, Jeffrey Tsaw, Yudong Liu, Shentong Mo, Dani Yogatama, Louis-Philippe Morency, Ruslan Salakhutdinov

Many real-world problems are inherently multimodal, from spoken language, gestures, and paralinguistics humans use to communicate, to force, proprioception, and visual sensors on robots.

Representation Learning Time Series Analysis +2

CBNet: A Composite Backbone Network Architecture for Object Detection

4 code implementations 1 Jul 2021 TingTing Liang, Xiaojie Chu, Yudong Liu, Yongtao Wang, Zhi Tang, Wei Chu, Jingdong Chen, Haibin Ling

With multi-scale testing, we push the current best single model result to a new record of 60.1% box AP and 52.3% mask AP without using extra training data.

Ranked #2 on Instance Segmentation on COCO test-dev (using extra training data)

Instance Segmentation Object +2

Some q-supercongruences from Rahman's summation formula

no code implementations 11 Mar 2021 Yudong Liu, Xiaoxia Wang

Inspired by the recent work on $q$-congruences and the quadratic summation formula of Rahman, we provide some new $q$-supercongruences.

Combinatorics 33D15, 11A07, 11B65

CBNet: A Novel Composite Backbone Network Architecture for Object Detection

6 code implementations 9 Sep 2019 Yudong Liu, Yongtao Wang, Siwei Wang, Ting-Ting Liang, Qijie Zhao, Zhi Tang, Haibin Ling

In existing CNN based detectors, the backbone network is a very important component for basic feature extraction, and the performance of the detectors highly depends on it.

Instance Segmentation object-detection +2
