Search Results for author: Zhen Zhang

Found 147 papers, 55 papers with code

PCEE-BERT: Accelerating BERT Inference via Patient and Confident Early Exiting

1 code implementation Findings (NAACL) 2022 Zhen Zhang, Wei Zhu, Jinfan Zhang, Peng Wang, Rize Jin, Tae-Sun Chung

In this work, we propose Patient and Confident Early Exiting BERT (PCEE-BERT), an off-the-shelf sample-dependent early exiting method that can work with different PLMs and can also work along with popular model compression methods.

Model Compression

HwTscSU’s Submissions on WAT 2022 Shared Task

no code implementations WAT 2022 Yilun Liu, Zhen Zhang, Shimin Tao, Junhui Li, Hao Yang

In this paper we describe our submission to the shared tasks of the 9th Workshop on Asian Translation (WAT 2022) on NICT–SAP under the team name “HwTscSU”.

Domain Adaptation NMT +1

Hybrid Polynomial Zonotopes: A Set Representation for Reachability Analysis in Hybrid Nonaffine Systems

no code implementations 16 Jun 2025 Peng Xie, Zhen Zhang, Amr Alanwar

Reachability analysis for hybrid nonaffine systems remains computationally challenging, as existing set representations--including constrained, polynomial, and hybrid zonotopes--either lose tightness under high-order nonaffine maps or suffer exponential blow-up after discrete jumps.

Computational Efficiency

TTrace: Lightweight Error Checking and Diagnosis for Distributed Training

no code implementations 10 Jun 2025 Haitian Jiang, Shaowei Zhu, Zhen Zhang, Zhenyu Song, Xinwei Fu, Zhen Jia, Yida Wang, Jinyang Li

Effectively detecting and localizing such silent bugs in distributed training is challenging.

Manipulating Elasto-Plastic Objects With 3D Occupancy and Learning-Based Predictive Control

no code implementations 22 May 2025 Zhen Zhang, Xiangyu Chu, Yunxi Tang, Lulu Zhao, Jing Huang, Zhongliang Jiang, K. W. Samuel Au

Manipulating elasto-plastic objects remains a significant challenge due to severe self-occlusion, difficulties of representation, and complicated dynamics.

Graph Neural Network

Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space

1 code implementation 21 May 2025 Zhen Zhang, Xuehai He, Weixiang Yan, Ao Shen, Chenyang Zhao, Shuohang Wang, Yelong Shen, Xin Eric Wang

In this work, we introduce Soft Thinking, a training-free method that emulates human-like "soft" reasoning by generating soft, abstract concept tokens in a continuous concept space.

APEX: Empowering LLMs with Physics-Based Task Planning for Real-time Insight

1 code implementation 20 May 2025 Wanjing Huang, Weixiang Yan, Zhen Zhang, Ambuj Singh

Large Language Models (LLMs) demonstrate strong reasoning and task planning capabilities but remain fundamentally limited in physical interaction modeling.

Causal Inference Decision Making +3

LGBQPC: Local Granular-Ball Quality Peaks Clustering

no code implementations 16 May 2025 Zihang Jia, Zhen Zhang, Witold Pedrycz

These modifications substantially improve the performance of GBDPC on datasets with complex manifold structures or non-uniform density distributions.

Clustering Computational Efficiency +1

Replication and Exploration of Generative Retrieval over Dynamic Corpora

no code implementations 24 Apr 2025 Zhen Zhang, Xinyu Ma, Weiwei Sun, Pengjie Ren, Zhumin Chen, Shuaiqiang Wang, Dawei Yin, Maarten de Rijke, Zhaochun Ren

We observe that the more fine-grained the docid design in the GR model, the better its performance over dynamic corpora, surpassing BM25 and even being comparable to dense retrieval methods.

Information Retrieval Retrieval

Equilibrium Conserving Neural Operators for Super-Resolution Learning

no code implementations 18 Apr 2025 Vivek Oommen, Andreas E. Robertson, Daniel Diaz, Coleman Alleman, Zhen Zhang, Anthony D. Rollett, George E. Karniadakis, Rémi Dingreville

We evaluate this ECO-based super-resolution framework that strongly enforces conservation-laws in the predicted solutions on two working examples: embedded pores in a homogenized matrix and randomly textured polycrystalline materials.

Super-Resolution

On the Value of Cross-Modal Misalignment in Multimodal Representation Learning

1 code implementation 14 Apr 2025 Yichao Cai, Yuhang Liu, Erdun Gao, Tianjiao Jiang, Zhen Zhang, Anton Van Den Hengel, Javen Qinfeng Shi

Multimodal representation learning, exemplified by multimodal contrastive learning (MMCL) using image-text pairs, aims to learn powerful representations by aligning cues across modalities.

Contrastive Learning Representation Learning +1

Data-Driven Nonconvex Reachability Analysis using Exact Multiplication

1 code implementation 2 Apr 2025 Zhen Zhang, M. Umar B. Niazi, Michelle S. Chong, Karl H. Johansson, Amr Alanwar

We propose a novel approach using constrained polynomial zonotopes to describe reachable sets for unknown LTI systems.

Exploring Training and Inference Scaling Laws in Generative Retrieval

1 code implementation 24 Mar 2025 Hongru Cai, Yongqi Li, Ruifeng Yuan, Wenjie Wang, Zhen Zhang, Wenjie Li, Tat-Seng Chua

Generative retrieval has emerged as a novel paradigm that leverages large language models (LLMs) to autoregressively generate document identifiers.

Decoder Retrieval

Analytic DAG Constraints for Differentiable DAG Learning

no code implementations 24 Mar 2025 Zhen Zhang, Ignavier Ng, Dong Gong, Yuhang Liu, Mingming Gong, Biwei Huang, Kun Zhang, Anton Van Den Hengel, Javen Qinfeng Shi

By developing the necessary theory to establish a connection between analytic functions and DAG constraints, we demonstrate that analytic functions from the set $\{f(x) = c_0 + \sum_{i=1}^{\infty}c_ix^i | \forall i > 0, c_i > 0; r = \lim_{i\rightarrow \infty}c_{i}/c_{i+1} > 0\}$ can be employed to formulate effective DAG constraints.

PyGDA: A Python Library for Graph Domain Adaptation

1 code implementation 13 Mar 2025 Zhen Zhang, Meihan Liu, Bingsheng He

As the first comprehensive library in this area, PyGDA covers more than 20 widely used graph domain adaptation methods together with different types of graph datasets.

Domain Adaptation GRAPH DOMAIN ADAPTATION +1

Learning Cascade Ranking as One Network

no code implementations 12 Mar 2025 Yunli Wang, Zhen Zhang, Zhiqiang Wang, Zixuan Yang, Yu Li, Jian Yang, Shiyang Wen, Peng Jiang, Kun Gai

Recent advances such as RankFlow and FS-LTR have introduced interaction-aware training paradigms but still struggle to 1) align training objectives with the goal of the entire cascade ranking (i.e., end-to-end recall) and 2) learn effective collaboration patterns for different stages.

Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference

no code implementations 25 Feb 2025 Zhuo Chen, Xinyu Wang, Yong Jiang, Zhen Zhang, Xinyu Geng, Pengjun Xie, Fei Huang, Kewei Tu

To mitigate the dependence on retrieval and simultaneously maintain, or even improve, the performance benefits provided by retrieval, we propose a method to detect the knowledge boundary of VLLMs, allowing for more efficient use of techniques like RAG.

Question Answering RAG +3

MaZO: Masked Zeroth-Order Optimization for Multi-Task Fine-Tuning of Large Language Models

no code implementations 17 Feb 2025 Zhen Zhang, Yifan Yang, Kai Zhen, Nathan Susanj, Athanasios Mouchtaris, Siegfried Kunzmann, Zheng Zhang

Large language models have demonstrated exceptional capabilities across diverse tasks, but their fine-tuning demands significant memory, posing challenges for resource-constrained environments.

Multi-Task Learning

Aggregate to Adapt: Node-Centric Aggregation for Multi-Source-Free Graph Domain Adaptation

no code implementations 5 Feb 2025 Zhen Zhang, Bingsheng He

Unsupervised graph domain adaptation (UGDA) focuses on transferring knowledge from labeled source graph to unlabeled target graph under domain discrepancies.

Domain Adaptation GRAPH DOMAIN ADAPTATION

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

4 code implementations 22 Jan 2025 DeepSeek-AI, Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, Xiaokang Zhang, Xingkai Yu, Yu Wu, Z. F. Wu, Zhibin Gou, Zhihong Shao, Zhuoshu Li, Ziyi Gao, Aixin Liu, Bing Xue, Bingxuan Wang, Bochao Wu, Bei Feng, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, Damai Dai, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fucong Dai, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Han Bao, Hanwei Xu, Haocheng Wang, Honghui Ding, Huajian Xin, Huazuo Gao, Hui Qu, Hui Li, JianZhong Guo, Jiashi Li, Jiawei Wang, Jingchang Chen, Jingyang Yuan, Junjie Qiu, Junlong Li, J. L. Cai, Jiaqi Ni, Jian Liang, Jin Chen, Kai Dong, Kai Hu, Kaige Gao, Kang Guan, Kexin Huang, Kuai Yu, Lean Wang, Lecong Zhang, Liang Zhao, Litong Wang, Liyue Zhang, Lei Xu, Leyi Xia, Mingchuan Zhang, Minghua Zhang, Minghui Tang, Meng Li, Miaojun Wang, Mingming Li, Ning Tian, Panpan Huang, Peng Zhang, Qiancheng Wang, Qinyu Chen, Qiushi Du, Ruiqi Ge, Ruisong Zhang, Ruizhe Pan, Runji Wang, R. J. Chen, R. L. Jin, Ruyi Chen, Shanghao Lu, Shangyan Zhou, Shanhuang Chen, Shiyu Wang, Shuiping Yu, Shunfeng Zhou, Shuting Pan, S. S. Li, Shuang Zhou, Shaoqing Wu, Shengfeng Ye, Tao Yun, Tian Pei, Tianyu Sun, T. Wang, Wangding Zeng, Wanjia Zhao, Wen Liu, Wenfeng Liang, Wenjun Gao, Wenqin Yu, Wentao Zhang, W. L. Xiao, Wei An, Xiaodong Liu, Xiaohan Wang, Xiaokang Chen, Xiaotao Nie, Xin Cheng, Xin Liu, Xin Xie, Xingchao Liu, Xinyu Yang, Xinyuan Li, Xuecheng Su, Xuheng Lin, X. Q. Li, Xiangyue Jin, Xiaojin Shen, Xiaosha Chen, Xiaowen Sun, Xiaoxiang Wang, Xinnan Song, Xinyi Zhou, Xianzu Wang, Xinxia Shan, Y. K. Li, Y. Q. Wang, Y. X. Wei, Yang Zhang, Yao Li, Yao Zhao, Yaofeng Sun, Yaohui Wang, Yi Yu, Yichao Zhang, Yifan Shi, Yiliang Xiong, Ying He, Yishi Piao, Yisong Wang, Yixuan Tan, Yiyang Ma, Yiyuan Liu, Yongqiang Guo, Yuan Ou, Yuduan Wang, Yue Gong, Yuheng Zou, Yujia He, Yunfan Xiong, Yuxiang Luo, Yuxiang You, Yuxuan Liu, Yuyang Zhou, Y. X. Zhu, Yanhong Xu, Yanping Huang, Yaohui Li, Yi Zheng, Yuchen Zhu, Yunxian Ma, Ying Tang, Yukun Zha, Yuting Yan, Z. Z. Ren, Zehui Ren, Zhangli Sha, Zhe Fu, Zhean Xu, Zhenda Xie, Zhengyan Zhang, Zhewen Hao, Zhicheng Ma, Zhigang Yan, Zhiyu Wu, Zihui Gu, Zijia Zhu, Zijun Liu, Zilin Li, Ziwei Xie, Ziyang Song, Zizheng Pan, Zhen Huang, Zhipeng Xu, Zhongyu Zhang, Zhen Zhang

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1.

Mathematical Reasoning Multi-task Language Understanding +2

A wideband amplifying and filtering reconfigurable intelligent surface for wireless relay

no code implementations 31 Dec 2024 Lijie Wu, Qun Yan Zhou, Jun Yan Dai, Siran Wang, Junwei Zhang, Zhen Jie Qi, Hanqing Yang, Ruizhe Jiang, Zheng Xing Wang, Huidong Li, Zhen Zhang, Jiang Luo, Qiang Cheng, Tie Jun Cui

Programmable metasurfaces have garnered significant attention due to their exceptional ability to manipulate electromagnetic (EM) waves in real time, leading to the emergence of a prominent area in wireless communication, namely reconfigurable intelligent surfaces (RISs), to control the signal propagation and coverage.

DeepSeek-V3 Technical Report

4 code implementations 27 Dec 2024 DeepSeek-AI, Aixin Liu, Bei Feng, Bing Xue, Bingxuan Wang, Bochao Wu, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fucong Dai, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Han Bao, Hanwei Xu, Haocheng Wang, Haowei Zhang, Honghui Ding, Huajian Xin, Huazuo Gao, Hui Li, Hui Qu, J. L. Cai, Jian Liang, JianZhong Guo, Jiaqi Ni, Jiashi Li, Jiawei Wang, Jin Chen, Jingchang Chen, Jingyang Yuan, Junjie Qiu, Junlong Li, Junxiao Song, Kai Dong, Kai Hu, Kaige Gao, Kang Guan, Kexin Huang, Kuai Yu, Lean Wang, Lecong Zhang, Lei Xu, Leyi Xia, Liang Zhao, Litong Wang, Liyue Zhang, Meng Li, Miaojun Wang, Mingchuan Zhang, Minghua Zhang, Minghui Tang, Mingming Li, Ning Tian, Panpan Huang, Peiyi Wang, Peng Zhang, Qiancheng Wang, Qihao Zhu, Qinyu Chen, Qiushi Du, R. J. Chen, R. L. Jin, Ruiqi Ge, Ruisong Zhang, Ruizhe Pan, Runji Wang, Runxin Xu, Ruoyu Zhang, Ruyi Chen, S. S. Li, Shanghao Lu, Shangyan Zhou, Shanhuang Chen, Shaoqing Wu, Shengfeng Ye, Shirong Ma, Shiyu Wang, Shuang Zhou, Shuiping Yu, Shunfeng Zhou, Shuting Pan, T. Wang, Tao Yun, Tian Pei, Tianyu Sun, W. L. Xiao, Wangding Zeng, Wanjia Zhao, Wei An, Wen Liu, Wenfeng Liang, Wenjun Gao, Wenqin Yu, Wentao Zhang, X. Q. Li, Xiangyue Jin, Xianzu Wang, Xiao Bi, Xiaodong Liu, Xiaohan Wang, Xiaojin Shen, Xiaokang Chen, Xiaokang Zhang, Xiaosha Chen, Xiaotao Nie, Xiaowen Sun, Xiaoxiang Wang, Xin Cheng, Xin Liu, Xin Xie, Xingchao Liu, Xingkai Yu, Xinnan Song, Xinxia Shan, Xinyi Zhou, Xinyu Yang, Xinyuan Li, Xuecheng Su, Xuheng Lin, Y. K. Li, Y. Q. Wang, Y. X. Wei, Y. X. Zhu, Yang Zhang, Yanhong Xu, Yanping Huang, Yao Li, Yao Zhao, Yaofeng Sun, Yaohui Li, Yaohui Wang, Yi Yu, Yi Zheng, Yichao Zhang, Yifan Shi, Yiliang Xiong, Ying He, Ying Tang, Yishi Piao, Yisong Wang, Yixuan Tan, Yiyang Ma, Yiyuan Liu, Yongqiang Guo, Yu Wu, Yuan Ou, Yuchen Zhu, Yuduan Wang, Yue Gong, Yuheng Zou, Yujia He, Yukun Zha, Yunfan Xiong, Yunxian Ma, Yuting Yan, Yuxiang Luo, Yuxiang You, Yuxuan Liu, Yuyang Zhou, Z. F. Wu, Z. Z. Ren, Zehui Ren, Zhangli Sha, Zhe Fu, Zhean Xu, Zhen Huang, Zhen Zhang, Zhenda Xie, Zhengyan Zhang, Zhewen Hao, Zhibin Gou, Zhicheng Ma, Zhigang Yan, Zhihong Shao, Zhipeng Xu, Zhiyu Wu, Zhongyu Zhang, Zhuoshu Li, Zihui Gu, Zijia Zhu, Zijun Liu, Zilin Li, Ziwei Xie, Ziyang Song, Ziyi Gao, Zizheng Pan

We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token.

Language Modeling Language Modelling +1

Adaptive$^2$: Adaptive Domain Mining for Fine-grained Domain Adaptation Modeling

no code implementations 11 Dec 2024 Wenxuan Sun, Zixuan Yang, Yunli Wang, Zhen Zhang, Zhiqiang Wang, Yu Li, Jian Yang, Yiming Yang, Shiyang Wen, Peng Jiang, Kun Gai

To the best of our knowledge, Adaptive$^2$ is the first approach to automatically learn both domain identification and adaptation in online advertising, opening new research directions for this area.

Domain Adaptation

Scaling Laws for Online Advertisement Retrieval

no code implementations 20 Nov 2024 Yunli Wang, Zixuan Yang, Zhen Zhang, Zhiqiang Wang, Jian Yang, Shiyang Wen, Peng Jiang, Kun Gai

To the best of our knowledge, this is the first work to study the scaling laws for online advertisement retrieval of real-world systems, showing great potential for scaling law in advertising system optimization.

Retrieval

Exploring Knowledge Boundaries in Large Language Models for Retrieval Judgment

no code implementations 9 Nov 2024 Zhen Zhang, Xinyu Wang, Yong Jiang, Zhuo Chen, Feiteng Mu, Mengting Hu, Pengjun Xie, Fei Huang

Actually, we find that the impact of RAG on the question answering capabilities of LLMs can be categorized into three groups: beneficial, neutral, and harmful.

Question Answering RAG +2

Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent

1 code implementation 5 Nov 2024 Yangning Li, Yinghui Li, Xinyu Wang, Yong Jiang, Zhen Zhang, Xinran Zheng, Hui Wang, Hai-Tao Zheng, Pengjun Xie, Philip S. Yu, Fei Huang, Jingren Zhou

To bridge the dataset gap, we first construct Dyn-VQA dataset, consisting of three types of "dynamic" questions, which require complex knowledge retrieval strategies variable in query, tool, and time: (1) Questions with rapidly changing answers.

Benchmarking Hallucination +4

DOFS: A Real-world 3D Deformable Object Dataset with Full Spatial Information for Dynamics Model Learning

no code implementations 29 Oct 2024 Zhen Zhang, Xiangyu Chu, Yunxi Tang, K. W. Samuel Au

This work proposes DOFS, a pilot dataset of 3D deformable objects (DOs) (e.g., elasto-plastic objects) with full spatial information (i.e., top, side, and bottom information) using a novel and low-cost data collection platform with a transparent operating plane.

ChannelGPT: A Large Model to Generate Digital Twin Channel for 6G Environment Intelligence

no code implementations 17 Oct 2024 Li Yu, Lianzheng Shi, Jianhua Zhang, Jialin Wang, Zhen Zhang, Yuxiang Zhang, Guangyi Liu

In practice, we also establish a ChannelGPT prototype to generate high-fidelity channel data for varied scenarios to validate the accuracy and generalization ability based on environment intelligence.

PipeFill: Using GPUs During Bubbles in Pipeline-parallel LLM Training

no code implementations 23 Sep 2024 Daiyaan Arfeen, Zhen Zhang, Xinwei Fu, Gregory R. Ganger, Yida Wang

Unfortunately, PP model training can use GPUs inefficiently, especially at large scale, due to idle GPU time caused by pipeline bubbles, which are often 15-30% and can exceed 60% of the training job's GPU allocation.

8k

Integrating Neural Operators with Diffusion Models Improves Spectral Representation in Turbulence Modeling

1 code implementation 13 Sep 2024 Vivek Oommen, Aniruddha Bora, Zhen Zhang, George Em Karniadakis

We integrate neural operators with diffusion models to address the spectral limitations of neural operators in surrogate modeling of turbulent flows.

Computational Efficiency

An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting

no code implementations 4 Sep 2024 Zhuolin Li, Zhen Zhang, Witold Pedrycz

Specifically, we first construct a max-margin optimization-based model to model potentially non-monotonic preferences and inconsistent assignment example preference information in each iteration of the incremental preference elicitation process.

Active Learning Question Selection

Lexicographic optimization-based approaches to learning a representative model for multi-criteria sorting with non-monotonic criteria

no code implementations 3 Sep 2024 Zhen Zhang, Zhuolin Li, Wenyu Yu

Deriving a representative model using value function-based methods from the perspective of preference disaggregation has emerged as a prominent and growing topic in multi-criteria sorting (MCS) problems.

SympGNNs: Symplectic Graph Neural Networks for identifying high-dimensional Hamiltonian systems and node classification

no code implementations 29 Aug 2024 Alan John Varghese, Zhen Zhang, George Em Karniadakis

Herein, we introduce Symplectic Graph Neural Networks (SympGNNs) that can effectively handle system identification in high-dimensional Hamiltonian systems, as well as node classification.

Node Classification

Rethinking State Disentanglement in Causal Reinforcement Learning

no code implementations 24 Aug 2024 Haiyao Cao, Zhen Zhang, Panpan Cai, Yuhang Liu, Jinan Zou, Ehsan Abbasnejad, Biwei Huang, Mingming Gong, Anton Van Den Hengel, Javen Qinfeng Shi

We revisit this research line and find that incorporating RL-specific context can reduce unnecessary assumptions in previous identifiability analyses for latent states.

Disentanglement reinforcement-learning +2

Can Wireless Environmental Information Decrease Pilot Overhead: A CSI Prediction Example

no code implementations 13 Aug 2024 Lianzheng Shi, Jianhua Zhang, Li Yu, Yuxiang Zhang, Zhen Zhang, Yichen Cai, Guangyi Liu

Finally, a CNN-based channel prediction network is designed to predict the complete CSI, using the environmental feature map and partial CSI.

Prediction

Dynamic Graph Transformer with Correlated Spatial-Temporal Positional Encoding

1 code implementation 24 Jul 2024 Zhe Wang, Sheng Zhou, Jiawei Chen, Zhen Zhang, Binbin Hu, Yan Feng, Chun Chen, Can Wang

To this end, we propose a novel Correlated Spatial-Temporal Positional encoding that incorporates a parameter-free personalized interaction intensity estimation under the weak assumption of the Poisson Point Process.

Representation Learning

Revisiting, Benchmarking and Understanding Unsupervised Graph Domain Adaptation

1 code implementation 9 Jul 2024 Meihan Liu, Zhen Zhang, Jiachen Tang, Jiajun Bu, Bingsheng He, Sheng Zhou

Unsupervised Graph Domain Adaptation (UGDA) involves the transfer of knowledge from a label-rich source graph to an unlabeled target graph under domain discrepancies.

Benchmarking Domain Adaptation +1

Resource Allocation and Workload Scheduling for Large-Scale Distributed Deep Learning: A Survey

no code implementations 12 Jun 2024 Feng Liang, Zhen Zhang, Haifeng Lu, Chengming Li, Victor C. M. Leung, Yanyi Guo, Xiping Hu

The large-scale environment with large volumes of datasets, models, and computational and communication resources raises various unique challenges for resource allocation and workload scheduling in distributed deep learning, such as scheduling complexity, resource and workload heterogeneity, and fault tolerance.

Deep Learning Scheduling +1

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

5 code implementations 7 May 2024 DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Deng, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, Huajian Xin, Huazuo Gao, Hui Li, Hui Qu, J. L. Cai, Jian Liang, JianZhong Guo, Jiaqi Ni, Jiashi Li, Jin Chen, Jingyang Yuan, Junjie Qiu, Junxiao Song, Kai Dong, Kaige Gao, Kang Guan, Lean Wang, Lecong Zhang, Lei Xu, Leyi Xia, Liang Zhao, Liyue Zhang, Meng Li, Miaojun Wang, Mingchuan Zhang, Minghua Zhang, Minghui Tang, Mingming Li, Ning Tian, Panpan Huang, Peiyi Wang, Peng Zhang, Qihao Zhu, Qinyu Chen, Qiushi Du, R. J. Chen, R. L. Jin, Ruiqi Ge, Ruizhe Pan, Runxin Xu, Ruyi Chen, S. S. Li, Shanghao Lu, Shangyan Zhou, Shanhuang Chen, Shaoqing Wu, Shengfeng Ye, Shirong Ma, Shiyu Wang, Shuang Zhou, Shuiping Yu, Shunfeng Zhou, Size Zheng, T. Wang, Tian Pei, Tian Yuan, Tianyu Sun, W. L. Xiao, Wangding Zeng, Wei An, Wen Liu, Wenfeng Liang, Wenjun Gao, Wentao Zhang, X. Q. Li, Xiangyue Jin, Xianzu Wang, Xiao Bi, Xiaodong Liu, Xiaohan Wang, Xiaojin Shen, Xiaokang Chen, Xiaosha Chen, Xiaotao Nie, Xiaowen Sun, Xiaoxiang Wang, Xin Liu, Xin Xie, Xingkai Yu, Xinnan Song, Xinyi Zhou, Xinyu Yang, Xuan Lu, Xuecheng Su, Y. Wu, Y. K. Li, Y. X. Wei, Y. X. Zhu, Yanhong Xu, Yanping Huang, Yao Li, Yao Zhao, Yaofeng Sun, Yaohui Li, Yaohui Wang, Yi Zheng, Yichao Zhang, Yiliang Xiong, Yilong Zhao, Ying He, Ying Tang, Yishi Piao, Yixin Dong, Yixuan Tan, Yiyuan Liu, Yongji Wang, Yongqiang Guo, Yuchen Zhu, Yuduan Wang, Yuheng Zou, Yukun Zha, Yunxian Ma, Yuting Yan, Yuxiang You, Yuxuan Liu, Z. Z. Ren, Zehui Ren, Zhangli Sha, Zhe Fu, Zhen Huang, Zhen Zhang, Zhenda Xie, Zhewen Hao, Zhihong Shao, Zhiniu Wen, Zhipeng Xu, Zhongyu Zhang, Zhuoshu Li, Zihan Wang, Zihui Gu, Zilin Li, Ziwei Xie

MLA guarantees efficient inference through significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation.

Language Modeling Language Modelling +2

Cross-IQA: Unsupervised Learning for Image Quality Assessment

no code implementations 7 May 2024 Zhen Zhang

Automatic perception of image quality is a challenging problem that impacts billions of Internet and social media users daily.

Image Reconstruction NR-IQA

Advancing the Robustness of Large Language Models through Self-Denoised Smoothing

1 code implementation 18 Apr 2024 Jiabao Ji, Bairu Hou, Zhen Zhang, Guanhua Zhang, Wenqi Fan, Qing Li, Yang Zhang, Gaowen Liu, Sijia Liu, Shiyu Chang

Although large language models (LLMs) have achieved significant success, their vulnerability to adversarial perturbations, including recent jailbreak attacks, has raised considerable concerns.

Communication-Efficient Large-Scale Distributed Deep Learning: A Comprehensive Survey

no code implementations 9 Apr 2024 Feng Liang, Zhen Zhang, Haifeng Lu, Victor C. M. Leung, Yanyi Guo, Xiping Hu

Due to intensive synchronization of models and sharing of data across GPUs and computing nodes during distributed training and inference processes, communication efficiency becomes the bottleneck for achieving high performance at a large scale.

Data Compression Deep Learning +2

The Double-Edged Sword of Input Perturbations to Robust Accurate Fairness

no code implementations 1 Apr 2024 Xuran Li, Peng Wu, Yanting Chen, Xingjun Ma, Zhen Zhang, Kaixiang Dong

Deep neural networks (DNNs) are known to be sensitive to adversarial input perturbations, leading to a reduction in either prediction accuracy or individual fairness.

Adversarial Attack Fairness

Identifiable Latent Neural Causal Models

no code implementations 23 Mar 2024 Yuhang Liu, Zhen Zhang, Dong Gong, Mingming Gong, Biwei Huang, Anton Van Den Hengel, Kun Zhang, Javen Qinfeng Shi

This work establishes a sufficient and necessary condition characterizing the types of distribution shifts for identifiability in the context of latent additive noise models.

Representation Learning

A Causal Inspired Early-Branching Structure for Domain Generalization

1 code implementation 13 Mar 2024 Liang Chen, Yong Zhang, Yibing Song, Zhen Zhang, Lingqiao Liu

By d-separation, we observe that the causal feature can be further characterized by being independent of the domain conditioned on the object, and we propose the following two strategies as complements for the basic framework.

Domain Generalization

Collaborate to Adapt: Source-Free Graph Domain Adaptation via Bi-directional Adaptation

1 code implementation 3 Mar 2024 Zhen Zhang, Meihan Liu, Anhui Wang, Hongyang Chen, Zhao Li, Jiajun Bu, Bingsheng He

Unsupervised Graph Domain Adaptation (UGDA) has emerged as a practical solution to transfer knowledge from a label-rich source graph to a completely unlabelled target graph.

Contrastive Learning Domain Adaptation +1

BuffGraph: Enhancing Class-Imbalanced Node Classification via Buffer Nodes

no code implementations 20 Feb 2024 Qian Wang, Zemin Liu, Zhen Zhang, Bingsheng He

Class imbalance in graph-structured data, where minor classes are significantly underrepresented, poses a critical challenge for Graph Neural Networks (GNNs).

Classification Node Classification

Distillation Enhanced Generative Retrieval

1 code implementation 16 Feb 2024 Yongqi Li, Zhen Zhang, Wenjie Wang, Liqiang Nie, Wenjie Li, Tat-Seng Chua

Generative retrieval is a promising new paradigm in text retrieval that generates identifier strings of relevant passages as the retrieval target.

Text Retrieval

LinkNER: Linking Local Named Entity Recognition Models to Large Language Models using Uncertainty

1 code implementation 16 Feb 2024 Zhen Zhang, Yuhua Zhao, Hang Gao, Mengting Hu

Named Entity Recognition (NER) serves as a fundamental task in natural language understanding, bearing direct implications for web content analysis, search engines, and information retrieval systems.

In-Context Learning Information Retrieval +4

Beyond DAGs: A Latent Partial Causal Model for Multimodal Learning

no code implementations 9 Feb 2024 Yuhang Liu, Zhen Zhang, Dong Gong, Erdun Gao, Biwei Huang, Mingming Gong, Anton Van Den Hengel, Kun Zhang, Javen Qinfeng Shi

In this work, we propose a novel latent partial causal model for multimodal data, featuring two latent coupled variables, connected by an undirected edge, to represent the transfer of knowledge across modalities.

Contrastive Learning Disentanglement +2

Rethinking Propagation for Unsupervised Graph Domain Adaptation

1 code implementation 8 Feb 2024 Meihan Liu, Zeyu Fang, Zhen Zhang, Ming Gu, Sheng Zhou, Xin Wang, Jiajun Bu

Motivated by our empirical analysis, we reevaluate the role of GNNs in graph domain adaptation and uncover the pivotal role of the propagation process in GNNs for adapting to different graph domains.

Domain Adaptation GRAPH DOMAIN ADAPTATION

PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler

no code implementations 12 Jan 2024 Gianpietro Consolaro, Zhen Zhang, Harenome Razanajato, Nelson Lossing, Nassim Tchoulak, Adilla Susungi, Artur Cesar Araujo Alves, Renwei Zhang, Denis Barthou, Corinne Ancourt, Cedric Bastoul

Different scenarios, depending on the target architecture, compilation environment, and application domain, may require different kinds of optimization to best exploit the architecture feature set.

Scheduling

A Video Coding Method Based on Neural Network for CLIC2024

no code implementations 8 Jan 2024 Zhengang Li, Jingchi Zhang, Yonghua Wang, Xing Zeng, Zhen Zhang, Yunlin Long, Menghu Jia, Ning Wang

Meanwhile, the deep learning methods propose a convolutional neural network-based loop filter (CNNLF), which is turned on/off based on the rate-distortion optimization at the CTU and frame level.

Deep Learning Quantization

GBSS: a global building semantic segmentation dataset for large-scale remote sensing building extraction

no code implementations 2 Jan 2024 Yuping Hu, Xin Huang, Jiayi Li, Zhen Zhang

Semantic segmentation techniques for extracting building footprints from high-resolution remote sensing images have been widely used in many fields such as urban planning.

Diversity Segmentation +2

Coreference Graph Guidance for Mind-Map Generation

1 code implementation 19 Dec 2023 Zhuowei Zhang, Mengting Hu, Yinhao Bai, Zhen Zhang

Then we employ a coreference graph encoder to mine the potential governing relations between sentences.

Contrastive Learning

CLAP: Isolating Content from Style through Contrastive Learning with Augmented Prompts

1 code implementation 28 Nov 2023 Yichao Cai, Yuhang Liu, Zhen Zhang, Javen Qinfeng Shi

To address this limitation, we adopt a causal generative perspective for multimodal data and propose contrastive learning with data augmentation to disentangle content features from the original representations.

Contrastive Learning Image Augmentation +1

Identifiable Latent Polynomial Causal Models Through the Lens of Change

no code implementations 24 Oct 2023 Yuhang Liu, Zhen Zhang, Dong Gong, Mingming Gong, Biwei Huang, Anton Van Den Hengel, Kun Zhang, Javen Qinfeng Shi

However, this progress rests on the assumption that the causal relationships among latent causal variables adhere strictly to linear Gaussian models.

Representation Learning

Interactive Navigation in Environments with Traversable Obstacles Using Large Language and Vision-Language Models

no code implementations 13 Oct 2023 Zhen Zhang, Anran Lin, Chun Wai Wong, Xiangyu Chu, Qi Dou, K. W. Samuel Au

This paper proposes an interactive navigation framework by using large language and vision-language models, allowing robots to navigate in environments with traversable obstacles.

Language Modeling Language Modelling +2

EX-Graph: A Pioneering Dataset Bridging Ethereum and X

1 code implementation 2 Oct 2023 Qian Wang, Zhen Zhang, Zemin Liu, Shengliang Lu, Bingqiao Luo, Bingsheng He

While numerous public blockchain datasets are available, their utility is constrained by an exclusive focus on blockchain data.

Link Prediction

Oobleck: Resilient Distributed Training of Large Models Using Pipeline Templates

2 code implementations 15 Sep 2023 Insu Jang, Zhenning Yang, Zhen Zhang, Xin Jin, Mosharaf Chowdhury

Oobleck enables resilient distributed training of large DNN models with guaranteed fault tolerance.

RADE: Reference-Assisted Dialogue Evaluation for Open-Domain Dialogue

no code implementations 15 Sep 2023 Zhengliang Shi, Weiwei Sun, Shuo Zhang, Zhen Zhang, Pengjie Ren, Zhaochun Ren

To this end, we propose the Reference-Assisted Dialogue Evaluation (RADE) approach under the multi-task learning framework, which leverages the pre-created utterance as a reference, other than the gold response, to relieve the one-to-many problem.

Dialogue Evaluation Multi-Task Learning +1

CFDBench: A Large-Scale Benchmark for Machine Learning Methods in Fluid Dynamics

1 code implementation 13 Sep 2023 Yining Luo, Yingfa Chen, Zhen Zhang

Appropriate modifications were made to apply popular deep neural networks to CFDBench and enable the accommodation of more changing inputs.

Depth analysis of battery performance based on a data-driven approach

no code implementations 30 Aug 2023 Zhen Zhang, Hongrui Sun, Hui Sun

Capacity attenuation is one of the most intractable issues in the current application of the cells.

Factor Graph Neural Networks

no code implementations NeurIPS 2020 Zhen Zhang, Mohammed Haroon Dupty, Fan Wu, Javen Qinfeng Shi, Wee Sun Lee

In recent years, we have witnessed a surge of Graph Neural Networks (GNNs), most of which can learn powerful representations in an end-to-end fashion with great success in many real-world applications.

Graph Neural Network Representation Learning

Discovering a reaction-diffusion model for Alzheimer's disease by combining PINNs with symbolic regression

no code implementations 16 Jul 2023 Zhen Zhang, Zongren Zou, Ellen Kuhl, George Em Karniadakis

Specifically, we integrate physics informed neural networks (PINNs) and symbolic regression to discover a reaction-diffusion type partial differential equation for tau protein misfolding and spreading.

regression Symbolic Regression

Certified Robustness for Large Language Models with Self-Denoising

1 code implementation 14 Jul 2023 Zhen Zhang, Guanhua Zhang, Bairu Hou, Wenqi Fan, Qing Li, Sijia Liu, Yang Zhang, Shiyu Chang

This largely falls into the study of certified robust LLMs, i.e., all predictions of the LLM are certified to be correct in a local region around the input.

Denoising

OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models

1 code implementation 5 Jul 2023 Shengding Hu, Ning Ding, Weilin Zhao, Xingtai Lv, Zhen Zhang, Zhiyuan Liu, Maosong Sun

The scale of large pre-trained models (PTMs) poses significant challenges in adapting to downstream tasks due to the high optimization overhead and storage costs associated with full-parameter fine-tuning.

Uncertainty in Natural Language Processing: Sources, Quantification, and Applications

no code implementations 5 Jun 2023 Mengting Hu, Zhen Zhang, Shiwan Zhao, Minlie Huang, Bingzhe Wu

Therefore, in this survey, we provide a comprehensive review of uncertainty-relevant works in the NLP field.

Uncertainty Quantification

E-NER: Evidential Deep Learning for Trustworthy Named Entity Recognition

1 code implementation 29 May 2023 Zhen Zhang, Mengting Hu, Shiwan Zhao, Minlie Huang, Haotian Wang, Lemao Liu, Zhirui Zhang, Zhe Liu, Bingzhe Wu

Most named entity recognition (NER) systems focus on improving model performance, ignoring the need to quantify model uncertainty, which is critical to the reliability of NER systems in open environments.

Deep Learning named-entity-recognition +2

RobustFair: Adversarial Evaluation through Fairness Confusion Directed Gradient Search

1 code implementation 18 May 2023 Xuran Li, Peng Wu, Kaixiang Dong, Zhen Zhang, Yanting Chen

This matrix categorizes predictions as true fair, true biased, false fair, and false biased, and the perturbations guided by it can produce a dual impact on instances and their similar counterparts to either undermine prediction accuracy (robustness) or cause biased predictions (individual fairness).

Data Augmentation Fairness +1

Parameter-Efficient Cross-lingual Transfer of Vision and Language Models via Translation-based Alignment

1 code implementation 2 May 2023 Zhen Zhang, Jialu Wang, Xin Eric Wang

Extensive experiments on XTD and Multi30K datasets, covering 11 languages under zero-shot, few-shot, and full-dataset learning scenarios, show that our framework significantly reduces the multilingual disparities among languages and improves cross-lingual transfer results, especially in low-resource scenarios, while only keeping and fine-tuning an extremely small number of parameters compared to the full model (e.g., our framework requires only 0.16% additional parameters of a full model for each language in the few-shot learning scenario).

Cross-Lingual Transfer Few-Shot Learning +2

A robust design of time-varying internal model principle-based control for ultra-precision tracking in a direct-drive servo stage

no code implementations 13 Apr 2023 Yue Cao, Zhen Zhang

By means of the ESO feedback, the plant model is kept as nominal, and hence the structural robustness is achieved for the time-varying internal model.

Robust Design

BERT4ETH: A Pre-trained Transformer for Ethereum Fraud Detection

1 code implementation 29 Mar 2023 Sihao Hu, Zhen Zhang, Bingqiao Luo, Shengliang Lu, Bingsheng He, Ling Liu

As various forms of fraud proliferate on Ethereum, it is imperative to safeguard against these malicious activities to protect susceptible users from being victimized.

Fraud Detection

Multi-pooling 3D Convolutional Neural Network for fMRI Classification of Visual Brain States

no code implementations 25 Mar 2023 Zhen Zhang, Masaki Takeda, Makoto Iwata

Neural decoding of visual object classification via functional magnetic resonance imaging (fMRI) data is challenging and is vital to understand underlying brain mechanisms.

Classification Object

Slapo: A Schedule Language for Progressive Optimization of Large Deep Learning Model Training

1 code implementation 16 Feb 2023 Hongzheng Chen, Cody Hao Yu, Shuai Zheng, Zhen Zhang, Zhiru Zhang, Yida Wang

Specifically, Slapo works on a PyTorch model and uses a set of schedule primitives to convert the model for common model training optimizations such as high-performance kernels, effective 3D parallelism, and efficient activation checkpointing.

Scheduling

Cost-minimization predictive energy management of a postal-delivery fuel cell electric vehicle with intelligent battery State-of-Charge Planner

no code implementations 28 Dec 2022 Yang Zhou, Fuzeng Li, Xianfeng Xu, Zhen Zhang, Alexandre Ravey, Marie-Cécile Péra, Ruiqing Ma

Fuel cell electric vehicles have earned substantial attention in recent decades due to their high-efficiency and zero-emission features, while the high operating costs remain the major barrier towards their large-scale commercialization.

energy management Management +1

Sparse Structure Search for Delta Tuning

1 code implementation NeurIPS 2022 Shengding Hu, Zhen Zhang, Ning Ding, Yadao Wang, Yasheng Wang, Zhiyuan Liu, Maosong Sun

Generally, DT methods exquisitely design delta modules (DT modules) which could be applied to arbitrary fine-grained positions inside PTMs.

Uncertainty Sentence Sampling by Virtual Adversarial Perturbation

no code implementations 26 Oct 2022 Hanshan Zhang, Zhen Zhang, Hongfei Jiang, Yang Song

Active learning for sentence understanding attempts to reduce the annotation cost by identifying the most informative examples.

Active Learning Diversity +2

How to Define the Propagation Environment Semantics and Its Application in Scatterer-Based Beam Prediction

no code implementations 17 Sep 2022 Yutong Sun, Jianhua Zhang, Li Yu, Zhen Zhang, Ping Zhang

Inspired by task-oriented semantic communication and machine learning (ML) powered environment-channel mapping methods, this work aims to provide a new view of the environment from the semantic level, which defines the propagation environment semantics (PES) as a limited set of propagation environment semantic symbols (PESS) for diverse application tasks.

Beam Prediction Semantic Communication

Latent Covariate Shift: Unlocking Partial Identifiability for Multi-Source Domain Adaptation

no code implementations 30 Aug 2022 Yuhang Liu, Zhen Zhang, Dong Gong, Mingming Gong, Biwei Huang, Anton Van Den Hengel, Kun Zhang, Javen Qinfeng Shi

Within this new paradigm, we present an intricate causal generative model by introducing latent noises across domains, along with a latent content variable and a latent style variable to achieve more nuanced rendering of observational data.

Domain Adaptation

Truncated Matrix Power Iteration for Differentiable DAG Learning

1 code implementation 30 Aug 2022 Zhen Zhang, Ignavier Ng, Dong Gong, Yuhang Liu, Ehsan M Abbasnejad, Mingming Gong, Kun Zhang, Javen Qinfeng Shi

Recovering underlying Directed Acyclic Graph (DAG) structures from observational data is highly challenging due to the combinatorial nature of the DAG-constrained optimization problem.

Identifying Weight-Variant Latent Causal Models

no code implementations 30 Aug 2022 Yuhang Liu, Zhen Zhang, Dong Gong, Mingming Gong, Biwei Huang, Anton Van Den Hengel, Kun Zhang, Javen Qinfeng Shi

The task of causal representation learning aims to uncover latent higher-level causal representations that affect lower-level observations.

Representation Learning

Sparse Structure Search for Parameter-Efficient Tuning

no code implementations 15 Jun 2022 Shengding Hu, Zhen Zhang, Ning Ding, Yadao Wang, Yasheng Wang, Zhiyuan Liu, Maosong Sun

The searched structures preserve more than 99% of fine-tuning performance with 0.01% trainable parameters.

NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results

no code implementations 25 May 2022 Eduardo Pérez-Pellitero, Sibi Catley-Chandar, Richard Shaw, Aleš Leonardis, Radu Timofte, Zexin Zhang, Cen Liu, Yunbo Peng, Yue Lin, Gaocheng Yu, Jin Zhang, Zhe Ma, Hongbin Wang, Xiangyu Chen, Xintao Wang, Haiwei Wu, Lin Liu, Chao Dong, Jiantao Zhou, Qingsen Yan, Song Zhang, Weiye Chen, Yuhang Liu, Zhen Zhang, Yanning Zhang, Javen Qinfeng Shi, Dong Gong, Dan Zhu, Mengdi Sun, Guannan Chen, Yang Hu, Haowei Li, Baozhu Zou, Zhen Liu, Wenjie Lin, Ting Jiang, Chengzhi Jiang, Xinpeng Li, Mingyan Han, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Juan Marín-Vega, Michael Sloth, Peter Schneider-Kamp, Richard Röttger, Chunyang Li, Long Bao, Gang He, Ziyao Xu, Li Xu, Gen Zhan, Ming Sun, Xing Wen, Junlin Li, Shuang Feng, Fei Lei, Rui Liu, Junxiang Ruan, Tianhong Dai, Wei Li, Zhan Lu, Hengyan Liu, Peian Huang, Guangyu Ren, Yonglin Luo, Chang Liu, Qiang Tu, Fangya Li, Ruipeng Gang, Chenghua Li, Jinjing Li, Sai Ma, Chenming Liu, Yizhen Cao, Steven Tel, Barthelemy Heyrman, Dominique Ginhac, Chul Lee, Gahyeon Kim, Seonghyun Park, An Gia Vien, Truong Thanh Nhat Mai, Howoon Yoon, Tu Vo, Alexander Holston, Sheir Zaheer, Chan Y. Park

The challenge is composed of two tracks with an emphasis on fidelity and complexity constraints: In Track 1, participants are asked to optimize objective fidelity scores while imposing a low-complexity constraint (i.e., solutions cannot exceed a given number of operations).

Image Restoration Vocal Bursts Intensity Prediction

MolMiner: You only look once for chemical structure recognition

no code implementations 23 May 2022 Youjun Xu, Jinchuan Xiao, Chia-Han Chou, Jianhang Zhang, Jintao Zhu, Qiwan Hu, Hemin Li, Ningsheng Han, Bingyu Liu, Shuaipeng Zhang, Jinyu Han, Zhen Zhang, Shuhao Zhang, Weilin Zhang, Luhua Lai, Jianfeng Pei

Due to a backlog of decades and an increasing amount of this printed literature, there is a high demand for the translation of printed depictions into machine-readable formats, which is known as Optical Chemical Structure Recognition (OCSR).

object-detection Object Detection +1

Adversarial Training-Aided Time-Varying Channel Prediction for TDD/FDD Systems

no code implementations 25 Apr 2022 Zhen Zhang, Yuxiang Zhang, Jianhua Zhang, Feifei Gao

In this paper, a time-varying channel prediction method based on conditional generative adversarial network (CPcGAN) is proposed for time division duplexing/frequency division duplexing (TDD/FDD) systems.

Generative Adversarial Network Prediction

Sequence-Based Target Coin Prediction for Cryptocurrency Pump-and-Dump

1 code implementation 21 Apr 2022 Sihao Hu, Zhen Zhang, Shengliang Lu, Bingsheng He, Zhao Li

With the proliferation of pump-and-dump schemes (P&Ds) in the cryptocurrency market, it becomes imperative to detect such fraudulent activities in advance to alert potentially susceptible investors.

Enhanced Contour Tracking: a Time-Varying Internal Model Principle-Based Approach

no code implementations 23 Mar 2022 Yue Cao, Zhen Zhang

The proposed TV-IMCC is twofold, including an extended position domain framework with master-slave structures for contour regulation, and a time-varying internal model principle-based controller for each axial tracking precision improvement.

Position

Systems Biology: Identifiability analysis and parameter identification via systems-biology informed neural networks

2 code implementations 3 Feb 2022 Mitchell Daneker, Zhen Zhang, George Em Karniadakis, Lu Lu

The dynamics of systems biological processes are usually modeled by a system of ordinary differential equations (ODEs) with many unknown parameters that need to be inferred from noisy and sparse measurements.

parameter estimation

SympOCnet: Solving optimal control problems with applications to high-dimensional multi-agent path planning problems

1 code implementation 14 Jan 2022 Tingwei Meng, Zhen Zhang, Jérôme Darbon, George Em Karniadakis

Solving high-dimensional optimal control problems in real-time is an important but challenging problem, with applications to multi-agent path planning problems, which have drawn increased attention given the growing popularity of drones in recent years.

GFINNs: GENERIC Formalism Informed Neural Networks for Deterministic and Stochastic Dynamical Systems

1 code implementation 31 Aug 2021 Zhen Zhang, Yeonjong Shin, George Em Karniadakis

We propose the GENERIC formalism informed neural networks (GFINNs) that obey the symmetric degeneracy conditions of the GENERIC formalism.

Adaptive Optimizers with Sparse Group Lasso for Neural Networks in CTR Prediction

2 code implementations 30 Jul 2021 Yun Yue, Yongchao Liu, Suo Tong, Minghao Li, Zhen Zhang, Chunyang Wen, Huanjun Bao, Lihong Gu, Jinjie Gu, Yixiang Mu

We develop a novel framework that adds the regularizers of the sparse group lasso to a family of adaptive optimizers in deep learning, such as Momentum, Adagrad, Adam, AMSGrad, AdaHessian, and create a new class of optimizers, which are named Group Momentum, Group Adagrad, Group Adam, Group AMSGrad and Group AdaHessian, etc., accordingly.

Click-Through Rate Prediction

Uncertainty-Guided Mixup for Semi-Supervised Domain Adaptation without Source Data

no code implementations 14 Jul 2021 Ning Ma, Jiajun Bu, Zhen Zhang, Sheng Zhou

Present domain adaptation methods usually perform explicit representation alignment by simultaneously accessing the source data and target data.

Privacy Preserving Semi-supervised Domain Adaptation +1

Semi-Supervised Hypothesis Transfer for Source-Free Domain Adaptation

no code implementations 14 Jul 2021 Ning Ma, Jiajun Bu, Lixian Lu, Jun Wen, Zhen Zhang, Sheng Zhou, Xifeng Yan

Domain Adaptation has been widely used to deal with the distribution shift in vision, language, multimedia, etc.

Source-Free Domain Adaptation

Adaptive Optimizers with Sparse Group Lasso

no code implementations 1 Jan 2021 Yun Yue, Suo Tong, Zhen Zhang, Yongchao Liu, Chunyang Wen, Huanjun Bao, Jinjie Gu, Yixiang Mu

We develop a novel framework that adds the regularizers to a family of adaptive optimizers in deep learning, such as MOMENTUM, ADAGRAD, ADAM, AMSGRAD, ADAHESSIAN, and create a new class of optimizers, which are named GROUP MOMENTUM, GROUP ADAGRAD, GROUP ADAM, GROUP AMSGRAD and GROUP ADAHESSIAN, etc., accordingly.

Deep Learning

Market Impact in Trader-Agents: Adding Multi-Level Order-Flow Imbalance-Sensitivity to Automated Trading Systems

no code implementations 23 Dec 2020 Zhen Zhang, Dave Cliff

We demonstrate that the new imbalance-sensitive trader-agents introduced here do exhibit market impact effects, and hence are better-suited to operating in markets where impact is a factor of concern or interest, but do not suffer the weaknesses of the methods used by Church & Cliff.

Learning Poisson systems and trajectories of autonomous systems via Poisson neural networks

1 code implementation 5 Dec 2020 Pengzhan Jin, Zhen Zhang, Ioannis G. Kevrekidis, George Em Karniadakis

We propose the Poisson neural networks (PNNs) to learn Poisson systems and trajectories of autonomous systems from data.

Cyclic Label Propagation for Graph Semi-supervised Learning

no code implementations 24 Nov 2020 Zhao Li, Yixin Liu, Zhen Zhang, Shirui Pan, Jianliang Gao, Jiajun Bu

To overcome these limitations, we introduce a novel framework for graph semi-supervised learning termed Cyclic Label Propagation (CycProp for short), which integrates GNNs into the process of label propagation in a cyclic and mutually reinforcing manner to exploit the advantages of both GNNs and LPA.

Node Classification

Deep Reinforcement Learning of Transition States

no code implementations 13 Nov 2020 Jun Zhang, Yao-Kun Lei, Zhen Zhang, Xu Han, Maodong Li, Lijiang Yang, Yi Isaac Yang, Yi Qin Gao

Combining reinforcement learning (RL) and molecular dynamics (MD) simulations, we propose a machine-learning approach (RL$^\ddag$) to automatically unravel chemical reaction mechanisms.

Deep Reinforcement Learning reinforcement-learning +1

CL-MAPF: Multi-Agent Path Finding for Car-Like Robots with Kinematic and Spatiotemporal Constraints

1 code implementation 1 Nov 2020 Licheng Wen, Zhen Zhang, Zhe Chen, Xiangrui Zhao, Yong Liu

In this paper, we give a mathematical formalization of the Multi-Agent Path Finding for Car-Like robots (CL-MAPF) problem.

Robotics Multiagent Systems

Brain Tumor Segmentation Network Using Attention-based Fusion and Spatial Relationship Constraint

no code implementations 29 Oct 2020 Chenyu Liu, Wangbin Ding, Lei Li, Zhen Zhang, Chenhao Pei, Liqin Huang, Xiahai Zhuang

Considering that multi-modal MR images can reflect different tumor biological properties, we develop a novel multi-modal tumor segmentation network (MMTSN) to robustly segment brain tumors based on multi-modal MR images.

Brain Tumor Segmentation Tumor Segmentation

Kagome quantum anomalous Hall effect with high Chern number and large band gap

no code implementations 15 Oct 2020 Zhen Zhang, Jing-Yang You, Xing-Yu Ma, Bo Gu, Gang Su

For the bilayer compound Co6Sn5Se4, it becomes a half-metal, with a relatively flat plateau in its anomalous Hall conductivity corresponding to |C| = 3 near the Fermi level.

Materials Science

Multi-Modality Pathology Segmentation Framework: Application to Cardiac Magnetic Resonance Images

1 code implementation 13 Aug 2020 Zhen Zhang, Chenyu Liu, Wangbin Ding, Sihan Wang, Chenhao Pei, Mingjing Yang, Liqin Huang

The PRSN is designed to segment the pathological region based on the result of ASSN, in which a fusion block based on channel attention is proposed to better aggregate multi-modality information from multi-modality CMR images.

Denoising Segmentation

Is Network the Bottleneck of Distributed Training?

1 code implementation 17 Jun 2020 Zhen Zhang, Chaokun Chang, Haibin Lin, Yida Wang, Raman Arora, Xin Jin

As such, we advocate that the real challenge of distributed training is for the network community to develop high-performance network transport to fully utilize the network capacity and achieve linear scale-out.

Time-Stretched Femtosecond Lidar Using Microwave Photonic Signal Processing

no code implementations 29 May 2020 Lijie Zhao, Haiyun Xia, Yihua Hu, Tengfei Wu, Zhen Zhang, Jibo Han, Yunbin Wu, Tiancheng Luo

After that, the frequency variation of the microwave pulse is uploaded to the first-order sidebands.

Multiplicative Gaussian Particle Filter

no code implementations 29 Feb 2020 Xuan Su, Wee Sun Lee, Zhen Zhang

We propose a new sampling-based approach for approximate inference in filtering problems.

SympNets: Intrinsic structure-preserving symplectic networks for identifying Hamiltonian systems

1 code implementation 11 Jan 2020 Pengzhan Jin, Zhen Zhang, Aiqing Zhu, Yifa Tang, George Em Karniadakis

We propose new symplectic networks (SympNets) for identifying Hamiltonian systems from data based on a composition of linear, activation and gradient modules.

KerGM: Kernelized Graph Matching

1 code implementation NeurIPS 2019 Zhen Zhang, Yijian Xiang, Lingfei Wu, Bing Xue, Arye Nehorai

Graph matching plays a central role in such fields as computer vision, pattern recognition, and bioinformatics.

Graph Matching

Visual Relationship Detection with Low Rank Non-Negative Tensor Decomposition

no code implementations 22 Nov 2019 Mohammed Haroon Dupty, Zhen Zhang, Wee Sun Lee

We address the problem of Visual Relationship Detection (VRD) which aims to describe the relationships between pairs of objects in the form of triplets of (subject, predicate, object).

Form Relationship Detection +3

Hierarchical Graph Pooling with Structure Learning

3 code implementations 14 Nov 2019 Zhen Zhang, Jiajun Bu, Martin Ester, Jianfeng Zhang, Chengwei Yao, Zhi Yu, Can Wang

HGP-SL incorporates graph pooling and structure learning into a unified module to generate hierarchical representations of graphs.

Graph Classification Graph Neural Network +1

Learning Clustered Representation for Complex Free Energy Landscapes

no code implementations 7 Jun 2019 Jun Zhang, Yao-Kun Lei, Xing Che, Zhen Zhang, Yi Isaac Yang, Yi Qin Gao

In this paper we first analyzed the inductive bias underlying the data scattered across complex free energy landscapes (FEL), and exploited it to train deep neural networks which yield reduced and clustered representation for the FEL.

Clustering Dimensionality Reduction +1

Factor Graph Neural Network

1 code implementation 3 Jun 2019 Zhen Zhang, Fan Wu, Wee Sun Lee

Most of the successful deep neural network architectures are structured, often consisting of elements like convolutional neural networks and gated recurrent neural networks.

Graph Neural Network

phq: a Fortran code to compute phonon quasiparticle properties and dispersions

1 code implementation 18 Feb 2019 Zhen Zhang, Dong-Bo Zhang, Tao Sun, Renata Wentzcovitch

We here introduce a Fortran code that computes the anharmonic free energy of solids from first principles based on our phonon quasiparticle approach.

Materials Science

Aligning Infinite-Dimensional Covariance Matrices in Reproducing Kernel Hilbert Spaces for Domain Adaptation

no code implementations CVPR 2018 Zhen Zhang, Mianzhi Wang, Yan Huang, Arye Nehorai

Domain shift, which occurs when there is a mismatch between the distributions of training (source) and testing (target) datasets, usually results in poor performance of the trained model on the target domain.

Domain Adaptation

OmicsMapNet: Transforming omics data to take advantage of Deep Convolutional Neural Network for discovery

no code implementations 14 Apr 2018 Shiyong Ma, Zhen Zhang

We developed the OmicsMapNet approach to take advantage of existing deep learning frameworks to analyze high-dimensional omics data as 2-dimensional images.

Learning Deep Gradient Descent Optimization for Image Deconvolution

1 code implementation 10 Apr 2018 Dong Gong, Zhen Zhang, Qinfeng Shi, Anton Van Den Hengel, Chunhua Shen, Yanning Zhang

Extensive experiments on synthetic benchmarks and challenging real-world images demonstrate that the proposed deep optimization method is effective and robust to produce favorable results as well as practical for real-world image deblurring applications.

Image Deblurring Image Deconvolution

Depth and Image Restoration From Light Field in a Scattering Medium

no code implementations ICCV 2017 Jiandong Tian, Zachary Murez, Tong Cui, Zhen Zhang, David Kriegman, Ravi Ramamoorthi

First, we present a new single image restoration algorithm which removes backscatter and attenuation from images better than existing methods, and apply it to each view in the light field.

Depth Estimation Image Restoration

Joint Probabilistic Matching Using m-Best Solutions

no code implementations CVPR 2016 Seyed Hamid Rezatofighi, Anton Milan, Zhen Zhang, Qinfeng Shi, Anthony Dick, Ian Reid

Matching between two sets of objects is typically approached by finding the object pairs that collectively maximize the joint matching score.

Person Re-Identification

Joint Probabilistic Data Association Revisited

1 code implementation ICCV 2015 Seyed Hamid Rezatofighi, Anton Milan, Zhen Zhang, Qinfeng Shi, Anthony Dick, Ian Reid

In this paper, we revisit the joint probabilistic data association (JPDA) technique and propose a novel solution based on recent developments in finding the m-best solutions to an integer linear program.

Constraint Reduction using Marginal Polytope Diagrams for MAP LP Relaxations

no code implementations 17 Dec 2013 Zhen Zhang, Qinfeng Shi, Yanning Zhang, Chunhua Shen, Anton Van Den Hengel

We show that using Marginal Polytope Diagrams allows the number of constraints to be reduced without loosening the LP relaxations.

A feasible roadmap for unsupervised deconvolution of two-source mixed gene expressions

no code implementations 25 Oct 2013 Niya Wang, Eric P. Hoffman, Robert Clarke, Zhen Zhang, David M. Herrington, Ie-Ming Shih, Douglas A. Levine, Guoqiang Yu, Jianhua Xuan, Yue Wang

Tissue heterogeneity is a major confounding factor in studying individual populations that cannot be resolved directly by global profiling.
