Search Results for author: ZiYi Yang

Found 48 papers, 17 papers with code

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

no code implementations 22 Apr 2024 Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Masahiro Tanaka, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, ZiYi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.

3DGSR: Implicit Surface Reconstruction with 3D Gaussian Splatting

no code implementations 30 Mar 2024 Xiaoyang Lyu, Yang-tian Sun, Yi-Hua Huang, Xiuzhe Wu, ZiYi Yang, Yilun Chen, Jiangmiao Pang, Xiaojuan Qi

In this paper, we present an implicit surface reconstruction method with 3D Gaussian Splatting (3DGS), namely 3DGSR, that allows for accurate 3D reconstruction with intricate details while inheriting the high efficiency and rendering quality of 3DGS.

3D Reconstruction Surface Reconstruction

FuseChat: Knowledge Fusion of Chat Models

1 code implementation 25 Feb 2024 Fanqi Wan, ZiYi Yang, Longguang Zhong, Xiaojun Quan, Xinting Huang, Wei Bi

Recently, FuseLLM introduced the concept of knowledge fusion to transfer the collective knowledge of multiple structurally varied LLMs into a target LLM through lightweight continual training.

Spec-Gaussian: Anisotropic View-Dependent Appearance for 3D Gaussian Splatting

no code implementations 24 Feb 2024 ZiYi Yang, Xinyu Gao, Yangtian Sun, Yihua Huang, Xiaoyang Lyu, Wen Zhou, Shaohui Jiao, Xiaojuan Qi, Xiaogang Jin

The recent advancements in 3D Gaussian splatting (3D-GS) have not only facilitated real-time rendering through modern GPU rasterization pipelines but have also attained state-of-the-art rendering quality.

SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes

1 code implementation 4 Dec 2023 Yi-Hua Huang, Yang-tian Sun, ZiYi Yang, Xiaoyang Lyu, Yan-Pei Cao, Xiaojuan Qi

During learning, the location and number of control points are adaptively adjusted to accommodate varying motion complexities in different regions, and an ARAP loss following the principle of as rigid as possible is developed to enforce spatial continuity and local rigidity of learned motions.

Novel View Synthesis

CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation

no code implementations 30 Nov 2023 Zineng Tang, ZiYi Yang, Mahmoud Khademi, Yang Liu, Chenguang Zhu, Mohit Bansal

We present CoDi-2, a versatile and interactive Multimodal Large Language Model (MLLM) that can follow complex multimodal interleaved instructions, conduct in-context learning (ICL), reason, chat, edit, etc., in an any-to-any input-output modality paradigm.

Image Generation In-Context Learning +3

Soft Convex Quantization: Revisiting Vector Quantization with Convex Optimization

no code implementations 4 Oct 2023 Tanmay Gautam, Reid Pryzant, ZiYi Yang, Chenguang Zhu, Somayeh Sojoudi

SCQ works like a differentiable convex optimization (DCO) layer: in the forward pass, we solve for the optimal convex combination of codebook vectors that quantize the inputs.

Image Reconstruction Quantization
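
The SCQ summary above describes solving for a convex combination of codebook vectors. As a rough illustration only, the sketch below finds such weights with plain projected gradient descent onto the probability simplex. This is a swapped-in, generic technique and not the paper's differentiable convex optimization layer; the function names, the toy codebook, and the step size (chosen for this toy problem) are all assumptions.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    theta, running = 0.0, 0.0
    for i, u in enumerate(sorted(v, reverse=True), start=1):
        running += u
        t = (running - 1.0) / i
        if u - t > 0:
            theta = t  # keep the threshold from the last index that passes
    return [max(x - theta, 0.0) for x in v]


def convex_quantize(x, codebook, steps=500, lr=0.25):
    """Simplex weights w minimising ||x - sum_k w[k] * codebook[k]||^2.

    Hypothetical helper: lr is tuned for the toy codebook below only.
    """
    k, d = len(codebook), len(x)
    w = [1.0 / k] * k  # start from the uniform combination
    for _ in range(steps):
        recon = [sum(w[j] * codebook[j][i] for j in range(k)) for i in range(d)]
        resid = [recon[i] - x[i] for i in range(d)]
        grad = [2.0 * sum(codebook[j][i] * resid[i] for i in range(d))
                for j in range(k)]
        w = project_to_simplex([w[j] - lr * grad[j] for j in range(k)])
    return w


# Toy codebook: three 2-d vectors; the input lies inside their convex hull.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights = convex_quantize([0.25, 0.25], codebook)
```

The quantized output would then be the weighted sum of codebook vectors, which here reproduces the input because it lies inside the hull.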

Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction

1 code implementation 22 Sep 2023 ZiYi Yang, Xinyu Gao, Wen Zhou, Shaohui Jiao, Yuqing Zhang, Xiaogang Jin

Implicit neural representation has paved the way for new approaches to dynamic scene reconstruction and rendering.

Neural Rendering Novel View Synthesis

Plug in the Safety Chip: Enforcing Constraints for LLM-driven Robot Agents

no code implementations 18 Sep 2023 ZiYi Yang, Shreyas S. Raman, Ankit Shah, Stefanie Tellex

Recent advancements in large language models (LLMs) have enabled a new research domain, LLM agents, for solving robotics and planning tasks by leveraging the world knowledge and general reasoning abilities of LLMs obtained during pretraining.

World Knowledge

A General Implicit Framework for Fast NeRF Composition and Rendering

no code implementations 9 Aug 2023 Xinyu Gao, ZiYi Yang, Yunlu Zhao, Yuxiang Sun, Xiaogang Jin, Changqing Zou

Mainly, our work introduces a new surface representation known as Neural Depth Fields (NeDF) that quickly determines the spatial relationship between objects by allowing direct intersection computation between rays and implicit surfaces.

Multi-task Bioassay Pre-training for Protein-ligand Binding Affinity Prediction

1 code implementation 8 Jun 2023 Jiaxian Yan, Zhaofeng Ye, ZiYi Yang, Chengqiang Lu, Shengyu Zhang, Qi Liu, Jiezhong Qiu

By introducing multi-task pre-training to treat the prediction of different affinity labels as different tasks and classifying relative rankings between samples from the same bioassay, MBP learns robust and transferrable structural knowledge from our new ChEMBL-Dock dataset with varied and noisy labels.

Drug Discovery

i-Code Studio: A Configurable and Composable Framework for Integrative AI

no code implementations 23 May 2023 Yuwei Fang, Mahmoud Khademi, Chenguang Zhu, ZiYi Yang, Reid Pryzant, Yichong Xu, Yao Qian, Takuya Yoshioka, Lu Yuan, Michael Zeng, Xuedong Huang

Artificial General Intelligence (AGI) requires comprehensive understanding and generation capabilities for a variety of tasks spanning different modalities and functionalities.

Question Answering Retrieval +4

i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data

no code implementations 21 May 2023 ZiYi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Mei Gao, Yi-Ling Chen, Robert Gmyr, Naoyuki Kanda, Noel Codella, Bin Xiao, Yu Shi, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang

The convergence of text, visual, and audio data is a key step towards human-like artificial intelligence; however, the current Vision-Language-Speech landscape is dominated by encoder-only models, which lack generative abilities.

Any-to-Any Generation via Composable Diffusion

1 code implementation NeurIPS 2023 Zineng Tang, ZiYi Yang, Chenguang Zhu, Michael Zeng, Mohit Bansal

We present Composable Diffusion (CoDi), a novel generative model capable of generating any combination of output modalities, such as language, image, video, or audio, from any combination of input modalities.

Audio Generation

Real-Time Audio-Visual End-to-End Speech Enhancement

no code implementations 13 Mar 2023 Zirun Zhu, Hemin Yang, Min Tang, ZiYi Yang, Sefik Emre Eskimez, Huaming Wang

In this paper, we propose a low-latency real-time audio-visual end-to-end enhancement (AV-E3Net) model based on the recently proposed end-to-end enhancement network (E3Net).

Speech Enhancement Task 2

Grounding Complex Natural Language Commands for Temporal Tasks in Unseen Environments

no code implementations 22 Feb 2023 Jason Xinyu Liu, ZiYi Yang, Ifrah Idrees, Sam Liang, Benjamin Schornstein, Stefanie Tellex, Ankit Shah

We propose Lang2LTL, a modular system and a software package that leverages large language models (LLMs) to ground temporal navigational commands to LTL specifications in environments without prior language data.

APOLLO: A Simple Approach for Adaptive Pretraining of Language Models for Logical Reasoning

no code implementations 19 Dec 2022 Soumya Sanyal, Yichong Xu, Shuohang Wang, ZiYi Yang, Reid Pryzant, Wenhao Yu, Chenguang Zhu, Xiang Ren

Logical reasoning of text is an important ability that requires understanding the information present in the text, their interconnections, and then reasoning through them to infer new conclusions.

Data Augmentation Language Modelling +3

Unifying Vision, Text, and Layout for Universal Document Processing

2 code implementations CVPR 2023 Zineng Tang, ZiYi Yang, Guoxin Wang, Yuwei Fang, Yang Liu, Chenguang Zhu, Michael Zeng, Cha Zhang, Mohit Bansal

UDOP leverages the spatial correlation between textual content and document image to model image, text, and layout modalities with one uniform representation.

Ranked #5 on Visual Question Answering (VQA) on InfographicVQA (using extra training data)

document understanding Image Reconstruction +1

UniSumm and SummZoo: Unified Model and Diverse Benchmark for Few-Shot Summarization

1 code implementation 17 Nov 2022 Yulong Chen, Yang Liu, Ruochen Xu, ZiYi Yang, Chenguang Zhu, Michael Zeng, Yue Zhang

The high annotation costs and diverse demands of various summarization tasks motivate the development of few-shot summarization.

MACSum: Controllable Summarization with Mixed Attributes

1 code implementation 9 Nov 2022 Yusen Zhang, Yang Liu, ZiYi Yang, Yuwei Fang, Yulong Chen, Dragomir Radev, Chenguang Zhu, Michael Zeng, Rui Zhang

We propose two simple and effective parameter-efficient approaches for the new task of mixed controllable summarization based on hard prompt tuning and soft prefix tuning.

Attribute Specificity

Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners

1 code implementation 22 May 2022 Zhenhailong Wang, Manling Li, Ruochen Xu, Luowei Zhou, Jie Lei, Xudong Lin, Shuohang Wang, ZiYi Yang, Chenguang Zhu, Derek Hoiem, Shih-Fu Chang, Mohit Bansal, Heng Ji

The goal of this work is to build flexible video-language models that can generalize to various video-to-text tasks from few examples, such as domain-specific captioning, question answering, and future event prediction.

Attribute Automatic Speech Recognition +6

ODBO: Bayesian Optimization with Search Space Prescreening for Directed Protein Evolution

2 code implementations 19 May 2022 Lixue Cheng, ZiYi Yang, ChangYu Hsieh, Benben Liao, Shengyu Zhang

Directed evolution is a versatile technique in protein engineering that mimics the process of natural selection by iteratively alternating between mutagenesis and screening in order to search for sequences that optimize a given property of interest, such as catalytic activity and binding affinity to a specified target.

Bayesian Optimization Experimental Design +1

Automatic Rule Induction for Interpretable Semi-Supervised Learning

1 code implementation 18 May 2022 Reid Pryzant, ZiYi Yang, Yichong Xu, Chenguang Zhu, Michael Zeng

Semi-supervised learning has shown promise in allowing NLP models to generalize from small amounts of labeled data.

Relation Extraction

GASCN: Graph Attention Shape Completion Network

no code implementations 20 Jan 2022 Haojie Huang, ZiYi Yang, Robert Platt

Shape completion, the problem of inferring the complete geometry of an object given a partial point cloud, is an important problem in robotics and computer vision.

Graph Attention

SPLDExtraTrees: Robust machine learning approach for predicting kinase inhibitor resistance

no code implementations 15 Nov 2021 ZiYi Yang, Zhaofeng Ye, Yijia Xiao, ChangYu Hsieh, Shengyu Zhang

Drug resistance is a major threat to global health and a significant concern throughout the clinical treatment of diseases and drug development.

BIG-bench Machine Learning

Natural Language for Human-Robot Collaboration: Problems Beyond Language Grounding

no code implementations 9 Oct 2021 Seth Pate, Wei Xu, ZiYi Yang, Maxwell Love, Siddarth Ganguri, Lawson L. S. Wong

To enable robots to instruct humans in collaborations, we identify several aspects of language processing that are not commonly studied in this context.

A Simple and Effective Method To Eliminate the Self Language Bias in Multilingual Representations

1 code implementation EMNLP 2021 ZiYi Yang, Yinfei Yang, Daniel Cer, Eric Darve

A simple but highly effective method "Language Information Removal (LIR)" factors out language identity information from semantic related components in multilingual representations pre-trained on multi-monolingual data.

Cross-Lingual Transfer Retrieval

Universal Sentence Representations Learning with Conditional Masked Language Model

no code implementations 1 Jan 2021 ZiYi Yang, Yinfei Yang, Daniel M Cer, Jax Law, Eric Darve

This paper presents a novel training method, Conditional Masked Language Modeling (CMLM), to effectively learn sentence representations on large scale unlabeled corpora.

Language Modelling Masked Language Modeling +4

Universal Sentence Representation Learning with Conditional Masked Language Model

no code implementations EMNLP 2021 ZiYi Yang, Yinfei Yang, Daniel Cer, Jax Law, Eric Darve

This paper presents a novel training method, Conditional Masked Language Modeling (CMLM), to effectively learn sentence representations on large scale unlabeled corpora.

Language Modelling Masked Language Modeling +4

Multi-Constitutive Neural Network for Large Deformation Poromechanics Problem

no code implementations 11 Oct 2020 Qi Zhang, Yilin Chen, ZiYi Yang, Eric Darve

We propose a novel method "multi-constitutive neural network" (MCNN) such that one model can solve several different constitutive laws.

Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction

no code implementations 2 Sep 2020 Ziyi Yang, Jun Shu, Yong Liang, Deyu Meng, Zongben Xu

Current machine learning has made great progress in computer vision and many other fields thanks to large amounts of high-quality training samples, but it does not work very well on genomic data analysis, since genomic datasets are notoriously small.

feature selection Few-Shot Image Classification +1

Filtered Inner Product Projection for Crosslingual Embedding Alignment

no code implementations ICLR 2021 Vin Sachidananda, ZiYi Yang, Chenguang Zhu

Due to widespread interest in machine translation and transfer learning, there are numerous algorithms for mapping multiple embeddings to a shared representation space.

Machine Translation Transfer Learning +1

Anomaly Detection with Domain Adaptation

no code implementations 5 Jun 2020 Ziyi Yang, Iman Soltani Bozchalooi, Eric Darve

We study the problem of semi-supervised anomaly detection with domain adaptation.

Domain Adaptation Object Recognition +2

Memory Augmented Generative Adversarial Networks for Anomaly Detection

no code implementations 7 Feb 2020 Ziyi Yang, Teng Zhang, Iman Soltani Bozchalooi, Eric Darve

Decoded memory units in MEMGAN are more interpretable and disentangled than previous methods, which further demonstrates the effectiveness of the memory mechanism.

Anomaly Detection

Leveraging Lead Bias for Zero-shot Abstractive News Summarization

no code implementations 25 Dec 2019 Chenguang Zhu, Ziyi Yang, Robert Gmyr, Michael Zeng, Xuedong Huang

A typical journalistic convention in news articles is to deliver the most salient information in the beginning, also known as the lead bias.

Domain Adaptation News Summarization

Make Lead Bias in Your Favor: A Simple and Effective Method for News Summarization

no code implementations 25 Sep 2019 Chenguang Zhu, ZiYi Yang, Robert Gmyr, Michael Zeng, Xuedong Huang

For example, the pretrained model without finetuning outperforms pointer-generator network on CNN/DailyMail dataset.

News Summarization

Embedding Imputation with Grounded Language Information

1 code implementation ACL 2019 Ziyi Yang, Chenguang Zhu, Vin Sachidananda, Eric Darve

In this paper, we propose an approach for embedding imputation which uses grounded information in the form of a knowledge graph.

Imputation

Out-of-Vocabulary Embedding Imputation with Grounded Language Information by Graph Convolutional Networks

no code implementations ACL 2019 Ziyi Yang, Chenguang Zhu, Vin Sachidananda, Eric Darve

In this paper, we propose an approach for embedding imputation which uses grounded information in the form of a knowledge graph.

Imputation

Parameter-free Sentence Embedding via Orthogonal Basis

1 code implementation IJCNLP 2019 Ziyi Yang, Chenguang Zhu, Weizhu Chen

Inspired by the Gram-Schmidt Process in geometric theory, we build an orthogonal basis of the subspace spanned by a word and its surrounding context in a sentence.

Sentence Sentence Embedding +2
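
The summary above refers to building an orthogonal basis with the Gram-Schmidt process. For orientation, here is a minimal, generic Gram-Schmidt sketch in plain Python. It is not the paper's full sentence-embedding pipeline; the function name and the toy three-vector "context window" are hypothetical.

```python
import math


def gram_schmidt(vectors, eps=1e-10):
    """Return an orthonormal basis for the span of `vectors` (lists of floats)."""
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the component of w along each basis vector found so far.
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        # Skip vectors that already lie (numerically) in the current span.
        if norm > eps:
            basis.append([wi / norm for wi in w])
    return basis


# Hypothetical toy window of three 3-d word vectors; the last one plays
# the role of the word whose new direction we want to isolate.
window = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
basis = gram_schmidt(window)
```

The last basis vector is the part of the final input vector that is orthogonal to everything before it, which is the sense in which a word can contribute a direction not already covered by its context.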

Language Distribution Prediction based on Batch Markov Monte Carlo Simulation with Migration

no code implementations 26 Feb 2018 XingYu Fu, ZiYi Yang, XiuWen Duan

To model the randomness of language spreading, we propose the Batch Markov Monte Carlo Simulation with Migration (BMMCSM) algorithm, in which each agent is treated as a language stack.

Cultural Vocal Bursts Intensity Prediction
