Search Results for author: Cheng Tan

Found 59 papers, 35 papers with code

Advances of Deep Learning in Protein Science: A Comprehensive Survey

no code implementations 8 Mar 2024 Bozhen Hu, Cheng Tan, Lirong Wu, Jiangbin Zheng, Jun Xia, Zhangyang Gao, Zicheng Liu, Fandi Wu, Guijun Zhang, Stan Z. Li

Protein representation learning plays a crucial role in understanding the structure and function of proteins, which are essential biomolecules involved in various biological processes.

Drug Discovery Protein Function Prediction +2

Decoupling Weighing and Selecting for Integrating Multiple Graph Pre-training Tasks

1 code implementation 3 Mar 2024 Tianyu Fan, Lirong Wu, Yufei Huang, Haitao Lin, Cheng Tan, Zhangyang Gao, Stan Z. Li

In this paper, we identify two important collaborative processes for this topic: (1) select: how to select an optimal task combination from a given task pool based on their compatibility, and (2) weigh: how to weigh the selected tasks based on their importance.

Graph Representation Learning

Re-Dock: Towards Flexible and Realistic Molecular Docking with Diffusion Bridge

no code implementations 18 Feb 2024 Yufei Huang, Odin Zhang, Lirong Wu, Cheng Tan, Haitao Lin, Zhangyang Gao, Siyuan Li, Stan. Z. Li

Accurate prediction of protein-ligand binding structures, a task known as molecular docking, is crucial for drug design but remains challenging.

Molecular Docking

Switch EMA: A Free Lunch for Better Flatness and Sharpness

2 code implementations 14 Feb 2024 Siyuan Li, Zicheng Liu, Juanxi Tian, Ge Wang, Zedong Wang, Weiyang Jin, Di wu, Cheng Tan, Tao Lin, Yang Liu, Baigui Sun, Stan Z. Li

Exponential Moving Average (EMA) is a widely used weight averaging (WA) regularization to learn flat optima for better generalizations without extra cost in deep neural network (DNN) optimization.

Attribute Image Classification +7

PSC-CPI: Multi-Scale Protein Sequence-Structure Contrasting for Efficient and Generalizable Compound-Protein Interaction Prediction

1 code implementation 13 Feb 2024 Lirong Wu, Yufei Huang, Cheng Tan, Zhangyang Gao, Bozhen Hu, Haitao Lin, Zicheng Liu, Stan Z. Li

Compound-Protein Interaction (CPI) prediction aims to predict the pattern and strength of compound-protein interactions for rational drug discovery.

Drug Discovery

A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer

no code implementations 4 Feb 2024 Zhangyang Gao, Daize Dong, Cheng Tan, Jun Xia, Bozhen Hu, Stan Z. Li

Despite recent GNN and Graphformer efforts to encode graphs as Euclidean vectors, recovering the original graph from the vectors remains a challenge.

Graph Classification Graph Generation +1

MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning

no code implementations 3 Feb 2024 Zhe Li, Laurence T. Yang, Bocheng Ren, Xin Nie, Zhangyang Gao, Cheng Tan, Stan Z. Li

The scarcity of annotated data has sparked significant interest in unsupervised pre-training methods that leverage medical reports as auxiliary signals for medical visual representation learning.

Contrastive Learning Image Classification +5

DCS-Net: Pioneering Leakage-Free Point Cloud Pretraining Framework with Global Insights

no code implementations 3 Feb 2024 Zhe Li, Zhangyang Gao, Cheng Tan, Stan Z. Li, Laurence T. Yang

Experimental results demonstrate that our method enhances the expressive capacity of existing point cloud models and effectively addresses the issue of information leakage.

Deep Manifold Transformation for Protein Representation Learning

no code implementations 12 Jan 2024 Bozhen Hu, Zelin Zang, Cheng Tan, Stan Z. Li

Protein representation learning is critical in various tasks in biology, such as drug design and protein structure or function prediction, which has primarily benefited from protein language models and graph neural networks.

Representation Learning

Deep Manifold Graph Auto-Encoder for Attributed Graph Embedding

no code implementations 12 Jan 2024 Bozhen Hu, Zelin Zang, Jun Xia, Lirong Wu, Cheng Tan, Stan Z. Li

Representing graph data in a low-dimensional space for subsequent tasks is the purpose of attributed graph embedding.

Graph Embedding

Masked Modeling for Self-supervised Representation Learning on Vision and Beyond

1 code implementation 31 Dec 2023 Siyuan Li, Luyuan Zhang, Zedong Wang, Di wu, Lirong Wu, Zicheng Liu, Jun Xia, Cheng Tan, Yang Liu, Baigui Sun, Stan Z. Li

As the deep learning revolution marches on, self-supervised learning has garnered increasing attention in recent years thanks to its remarkable representation learning ability and the low dependence on labeled data.

Representation Learning Self-Supervised Learning

Efficiently Predicting Protein Stability Changes Upon Single-point Mutation with Large Language Models

no code implementations 7 Dec 2023 Yijie Zhang, Zhangyang Gao, Cheng Tan, Stan Z. Li

Predicting protein stability changes induced by single-point mutations has been a persistent challenge over the years, attracting immense interest from numerous researchers.

Computational Efficiency

Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training

1 code implementation 23 Nov 2023 Cheng Tan, Jingxuan Wei, Zhangyang Gao, Linzhuang Sun, Siyuan Li, Xihong Yang, Stan Z. Li

Remarkably, we show that even smaller base models, when equipped with our proposed approach, can achieve results comparable to those of larger models, illustrating the potential of our approach in harnessing the power of rationales for improved multimodal reasoning.

Multimodal Reasoning

Segment Anything in Defect Detection

no code implementations 17 Nov 2023 Bozhen Hu, Bin Gao, Cheng Tan, Tongle Wu, Stan Z. Li

Defect detection plays a crucial role in infrared non-destructive testing systems, offering non-contact, safe, and efficient inspection capabilities.

Defect Detection

General Point Model with Autoencoding and Autoregressive

no code implementations 25 Oct 2023 Zhe Li, Zhangyang Gao, Cheng Tan, Stan Z. Li, Laurence T. Yang

This model is versatile, allowing fine-tuning for downstream point cloud representation tasks, as well as unconditional and conditional generation tasks.

Language Modelling Large Language Model +2

Revisiting the Temporal Modeling in Spatio-Temporal Predictive Learning under A Unified View

no code implementations 9 Oct 2023 Cheng Tan, Jue Wang, Zhangyang Gao, Siyuan Li, Lirong Wu, Jun Xia, Stan Z. Li

In this paper, we re-examine the two dominant temporal modeling approaches within the realm of spatio-temporal predictive learning, offering a unified perspective.

Self-Supervised Learning

Dropout Attacks

no code implementations 4 Sep 2023 Andrew Yuan, Alina Oprea, Cheng Tan

DROPOUTATTACK attacks the dropout operator by manipulating the selection of neurons to drop instead of selecting them uniformly at random.

CONVERT: Contrastive Graph Clustering with Reliable Augmentation

2 code implementations 17 Aug 2023 Xihong Yang, Cheng Tan, Yue Liu, Ke Liang, Siwei Wang, Sihang Zhou, Jun Xia, Stan Z. Li, Xinwang Liu, En Zhu

To address these problems, we propose a novel CONtrastiVe Graph ClustEring network with Reliable AugmenTation (CONVERT).

Clustering Contrastive Learning +4

Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework

1 code implementation 24 Jul 2023 Jingxuan Wei, Cheng Tan, Zhangyang Gao, Linzhuang Sun, Siyuan Li, Bihui Yu, Ruifeng Guo, Stan Z. Li

Multimodal reasoning is a critical component in the pursuit of artificial intelligence systems that exhibit human-like intelligence, especially when tackling complex tasks.

Contrastive Learning Multimodal Reasoning +2

OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning

2 code implementations NeurIPS 2023 Cheng Tan, Siyuan Li, Zhangyang Gao, Wenfei Guan, Zedong Wang, Zicheng Liu, Lirong Wu, Stan Z. Li

Spatio-temporal predictive learning is a learning paradigm that enables models to learn spatial and temporal patterns by predicting future frames from given past frames in an unsupervised manner.

Weather Forecasting

Knowledge-Design: Pushing the Limit of Protein Design via Knowledge Refinement

1 code implementation 20 May 2023 Zhangyang Gao, Cheng Tan, Stan Z. Li

After witnessing the great success of pretrained models on diverse protein-related tasks and the fact that recovery is highly correlated with confidence, we wonder whether this knowledge can push the limits of protein design further.

Protein Design Retrieval +1

Cross-Gate MLP with Protein Complex Invariant Embedding is A One-Shot Antibody Designer

1 code implementation 21 Apr 2023 Cheng Tan, Zhangyang Gao, Lirong Wu, Jun Xia, Jiangbin Zheng, Xihong Yang, Yue Liu, Bozhen Hu, Stan Z. Li

In this paper, we propose a simple yet effective model that can co-design 1D sequences and 3D structures of CDRs in a one-shot manner.

Specificity

Lightweight Contrastive Protein Structure-Sequence Transformation

no code implementations 19 Mar 2023 Jiangbin Zheng, Ge Wang, Yufei Huang, Bozhen Hu, Siyuan Li, Cheng Tan, Xinwen Fan, Stan Z. Li

In this work, we introduce a novel unsupervised protein structure representation pretraining with a robust protein language model.

Masked Language Modeling Protein Design +1

CVT-SLR: Contrastive Visual-Textual Transformation for Sign Language Recognition with Variational Alignment

1 code implementation CVPR 2023 Jiangbin Zheng, Yile Wang, Cheng Tan, Siyuan Li, Ge Wang, Jun Xia, Yidong Chen, Stan Z. Li

In this work, we propose a novel contrastive visual-textual transformation for SLR, CVT-SLR, to fully explore the pretrained knowledge of both the visual and language modalities.

Sign Language Recognition

PrefixMol: Target- and Chemistry-aware Molecule Design via Prefix Embedding

no code implementations 14 Feb 2023 Zhangyang Gao, Yuqi Hu, Cheng Tan, Stan Z. Li

Is there a unified model for generating molecules considering different conditions, such as binding pockets and chemical properties?

Multi-Task Learning

RDesign: Hierarchical Data-efficient Representation Learning for Tertiary Structure-based RNA Design

1 code implementation 25 Jan 2023 Cheng Tan, Yijie Zhang, Zhangyang Gao, Bozhen Hu, Siyuan Li, Zicheng Liu, Stan Z. Li

We crafted a large, well-curated benchmark dataset and designed a comprehensive structural modeling approach to represent the complex RNA tertiary structure.

Contrastive Learning Protein Design +2

DiffSDS: A language diffusion model for protein backbone inpainting under geometric conditions and constraints

1 code implementation 22 Jan 2023 Zhangyang Gao, Cheng Tan, Stan Z. Li

Have you ever been troubled by the complexity and computational cost of SE(3) protein structure modeling and been amazed by the simplicity and power of language modeling?

Denoising Language Modelling

RFold: RNA Secondary Structure Prediction with Decoupled Optimization

1 code implementation 2 Dec 2022 Cheng Tan, Zhangyang Gao, Stan Z. Li

The secondary structure of ribonucleic acid (RNA) is more stable and accessible in the cell than its tertiary structure, making it essential for functional prediction.

Protein Language Models and Structure Prediction: Connection and Progression

1 code implementation 30 Nov 2022 Bozhen Hu, Jun Xia, Jiangbin Zheng, Cheng Tan, Yufei Huang, Yongjie Xu, Stan Z. Li

The prediction of protein structures from sequences is an important task for function prediction, drug design, and the understanding of related biological processes.

Protein Folding Protein Language Model +1

Personalized Reward Learning with Interaction-Grounded Learning (IGL)

1 code implementation 28 Nov 2022 Jessica Maghakian, Paul Mineiro, Kishan Panaganti, Mark Rucker, Akanksha Saran, Cheng Tan

In an era of countless content offerings, recommender systems alleviate information overload by providing users with personalized content suggestions.

Recommendation Systems

SimVP: Towards Simple yet Powerful Spatiotemporal Predictive Learning

2 code implementations 22 Nov 2022 Cheng Tan, Zhangyang Gao, Siyuan Li, Stan Z. Li

Without introducing any extra tricks and strategies, SimVP can achieve superior performance on various benchmark datasets.

Video Prediction

MogaNet: Multi-order Gated Aggregation Network

6 code implementations 7 Nov 2022 Siyuan Li, Zedong Wang, Zicheng Liu, Cheng Tan, Haitao Lin, Di wu, ZhiYuan Chen, Jiangbin Zheng, Stan Z. Li

Notably, MogaNet hits 80.0% and 87.8% accuracy with 5.2M and 181M parameters on ImageNet-1K, outperforming ParC-Net and ConvNeXt-L, while saving 59% FLOPs and 17M parameters, respectively.

3D Human Pose Estimation Image Classification +6

Leveraging Graph-based Cross-modal Information Fusion for Neural Sign Language Translation

no code implementations 1 Nov 2022 Jiangbin Zheng, Siyuan Li, Cheng Tan, Chong Wu, Yidong Chen, Stan Z. Li

Therefore, we propose to introduce additional word-level semantic knowledge of sign language linguistics to assist in improving current end-to-end neural SLT models.

Sign Language Translation Translation

PiFold: Toward effective and efficient protein inverse folding

1 code implementation 22 Sep 2022 Zhangyang Gao, Cheng Tan, Pablo Chacón, Stan Z. Li

How can we design protein sequences folding into the desired structures effectively and efficiently?

Protein Design

OpenMixup: A Comprehensive Mixup Benchmark for Visual Classification

1 code implementation 11 Sep 2022 Siyuan Li, Zedong Wang, Zicheng Liu, Di wu, Cheng Tan, Weiyang Jin, Stan Z. Li

Data mixing, or mixup, is a data-dependent augmentation technique that has greatly enhanced the generalizability of modern deep neural networks.

Benchmarking Classification +3

A Survey on Generative Diffusion Model

1 code implementation 6 Sep 2022 Hanqun Cao, Cheng Tan, Zhangyang Gao, Yilun Xu, Guangyong Chen, Pheng-Ann Heng, Stan Z. Li

Deep generative models are a prominent approach for data generation, and have been used to produce high quality samples in various domains.

Dimensionality Reduction

NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers

1 code implementation 26 Jul 2022 Jiawei Liu, JinKun Lin, Fabian Ruffy, Cheng Tan, Jinyang Li, Aurojit Panda, Lingming Zhang

In this work, we propose a new fuzz testing approach for finding bugs in deep-learning compilers.

valid

CoSP: Co-supervised pretraining of pocket and ligand

no code implementations 23 Jun 2022 Zhangyang Gao, Cheng Tan, Lirong Wu, Stan Z. Li

Can we inject the pocket-ligand interaction knowledge into the pre-trained model and jointly learn their chemical space?

Contrastive Learning Specificity

SimVP: Simpler yet Better Video Prediction

3 code implementations CVPR 2022 Zhangyang Gao, Cheng Tan, Lirong Wu, Stan Z. Li

From CNN, RNN, to ViT, we have witnessed remarkable advancements in video prediction, incorporating auxiliary inputs, elaborate neural architectures, and sophisticated training strategies.

Video Prediction

Hyperspherical Consistency Regularization

1 code implementation CVPR 2022 Cheng Tan, Zhangyang Gao, Lirong Wu, Siyuan Li, Stan Z. Li

Though it benefits from both the feature-dependent information of self-supervised learning and the label-dependent information of supervised learning, this scheme still suffers from classifier bias.

Contrastive Learning Self-Supervised Learning +1

Generative De Novo Protein Design with Global Context

1 code implementation 21 Apr 2022 Cheng Tan, Zhangyang Gao, Jun Xia, Bozhen Hu, Stan Z. Li

Thus, we propose the Global-Context Aware generative de novo protein design method (GCA), consisting of local and global modules.

Protein Design Protein Structure Prediction

Harnessing Hard Mixed Samples with Decoupled Regularizer

1 code implementation NeurIPS 2023 Zicheng Liu, Siyuan Li, Ge Wang, Cheng Tan, Lirong Wu, Stan Z. Li

However, we found that the extra optimizing step may be redundant because label-mismatched mixed samples are informative hard mixed samples for deep models to localize discriminative features.

Data Augmentation

I-GCN: A Graph Convolutional Network Accelerator with Runtime Locality Enhancement through Islandization

no code implementations 7 Mar 2022 Tong Geng, Chunshu Wu, Yongan Zhang, Cheng Tan, Chenhao Xie, Haoran You, Martin C. Herbordt, Yingyan Lin, Ang Li

In this paper we propose a novel hardware accelerator for GCN inference, called I-GCN, that significantly improves data locality and reduces unnecessary computation.

SemiRetro: Semi-template framework boosts deep retrosynthesis prediction

no code implementations 12 Feb 2022 Zhangyang Gao, Cheng Tan, Lirong Wu, Stan Z. Li

Experimental results show that SemiRetro significantly outperforms both existing TB and TF methods.

Graph Learning Retrosynthesis

Target-aware Molecular Graph Generation

no code implementations 10 Feb 2022 Cheng Tan, Zhangyang Gao, Stan Z. Li

Building on the recent advantages of flow-based molecular generation models, we propose SiamFlow, which forces the flow to fit the distribution of target sequence embeddings in latent space.

Drug Discovery Graph Generation +1

AlphaDesign: A graph protein design method and benchmark on AlphaFoldDB

1 code implementation 1 Feb 2022 Zhangyang Gao, Cheng Tan, Stan Z. Li

While DeepMind has tentatively solved protein folding, its inverse problem -- protein design which predicts protein sequences from their 3D structures -- still faces significant challenges.

Protein Design Protein Folding

Prediction of GPU Failures Under Deep Learning Workloads

no code implementations 27 Jan 2022 Heting Liu, Zhichao Li, Cheng Tan, Rongqiu Yang, Guohong Cao, Zherui Liu, Chuanxiong Guo

To improve the precision and stability of predictions, we propose several techniques, including parallel and cascade model-ensemble mechanisms and a sliding training method.

An Empirical Study: Extensive Deep Temporal Point Process

1 code implementation 19 Oct 2021 Haitao Lin, Cheng Tan, Lirong Wu, Zhangyang Gao, Stan. Z. Li

In this paper, we first review recent research emphases and difficulties in modeling asynchronous event sequences with deep temporal point processes, which can be grouped into four areas: encoding of the history sequence, formulation of the conditional intensity function, relational discovery of events, and learning approaches for optimization.

Graph structure learning Variational Inference

Git: Clustering Based on Graph of Intensity Topology

2 code implementations 4 Oct 2021 Zhangyang Gao, Haitao Lin, Cheng Tan, Lirong Wu, Stan. Z Li

Accuracy, Robustness to noises and scales, Interpretability, Speed, and Easy to use (ARISE) are crucial requirements of a good clustering algorithm.

Clustering Clustering Algorithms Evaluation

Co-learning: Learning from Noisy Labels with Self-supervision

1 code implementation 5 Aug 2021 Cheng Tan, Jun Xia, Lirong Wu, Stan Z. Li

Noisy labels, resulting from mistakes in manual labeling or web data collection for supervised learning, can cause neural networks to overfit the misleading information and degrade the generalization performance.

Learning with noisy labels Self-Supervised Learning

Self-supervised Learning on Graphs: Contrastive, Generative, or Predictive

1 code implementation 16 May 2021 Lirong Wu, Haitao Lin, Zhangyang Gao, Cheng Tan, Stan. Z. Li

In this survey, we extend the concept of SSL, which first emerged in the fields of computer vision and natural language processing, to present a timely and comprehensive review of existing SSL techniques for graph data.

Self-Supervised Learning

Recursive Exponential Weighting for Online Non-convex Optimization

no code implementations 13 Sep 2017 Lin Yang, Cheng Tan, Wing Shing Wong

In this paper, we investigate the online non-convex optimization problem, which generalizes the classic online convex optimization problem by relaxing the convexity assumption on the cost function.
