no code implementations • NAACL 2022 • Jiangang Bai, Yujing Wang, Hong Sun, Ruonan Wu, Tianmeng Yang, Pengfei Tang, Defu Cao, Mingliang Zhang, Yunhai Tong, Yaming Yang, Jing Bai, Ruofei Zhang, Hao Sun, Wei Shen
Large-scale pre-trained language models have attracted extensive attention in the research community and shown promising results on various natural language processing tasks.
no code implementations • 12 May 2025 • Xiaokun Wang, Chris, Jiangbo Pei, Wei Shen, Yi Peng, Yunzhuo Hao, Weijie Qiu, Ai Jian, Tianyidan Xie, Xuchen Song, Yang Liu, Yahui Zhou
We propose Skywork-VL Reward, a multimodal reward model that provides reward signals for both multimodal understanding and reasoning tasks.
no code implementations • 6 May 2025 • Fangling Jiang, Qi Li, Weining Wang, Wei Shen, Bing Liu, Zhenan Sun
Experimental results on nine datasets demonstrate that the learned prompts effectively transfer the knowledge of vision-language models, enabling state-of-the-art generalization ability against diverse unknown attack types across unseen target domains without using any spoof face images.
1 code implementation • 23 Apr 2025 • Chris, Yichen Wei, Yi Peng, Xiaokun Wang, Weijie Qiu, Wei Shen, Tianyidan Xie, Jiangbo Pei, Jianhao Zhang, Yunzhuo Hao, Xuchen Song, Yang Liu, Yahui Zhou
We present Skywork R1V2, a next-generation multimodal reasoning model and a major leap forward from its predecessor, Skywork R1V.
no code implementations • 20 Apr 2025 • Yunhui Xia, Wei Shen, Yan Wang, Jason Klein Liu, Huifeng Sun, Siyue Wu, Jian Hu, Xiaolong Xu
We introduce LeetCodeDataset, a high-quality benchmark for evaluating and training code-generation models, addressing two key challenges in LLM research: the lack of reasoning-focused coding benchmarks and self-contained training testbeds.
1 code implementation • 12 Apr 2025 • Jialun Zhong, Wei Shen, Yanzeng Li, Songyang Gao, Hua Lu, Yicheng Chen, Yang Zhang, Wei Zhou, Jinjie Gu, Lei Zou
Reward models (RMs) have demonstrated impressive potential for enhancing large language models (LLMs), as an RM can serve as a proxy for human preferences, providing signals to guide LLMs' behavior in various tasks.
no code implementations • 10 Apr 2025 • ByteDance Seed, Jiaze Chen, Tiantian Fan, Xin Liu, Lingjun Liu, Zhiqi Lin, Mingxuan Wang, Chengyi Wang, Xiangpeng Wei, Wenyuan Xu, Yufeng Yuan, Yu Yue, Lin Yan, Qiying Yu, Xiaochen Zuo, Chi Zhang, Ruofei Zhu, Zhecheng An, Zhihao Bai, Yu Bao, Xingyan Bin, Jiangjie Chen, Feng Chen, Hongmin Chen, Riwei Chen, Liangqiang Chen, Zixin Chen, Jinsong Chen, Siyan Chen, Kaiyuan Chen, Zhi Chen, Jin Chen, Jiecao Chen, Jinxin Chi, Weinan Dai, Ning Dai, Jiahui Dai, Shihan Dou, Yantao Du, Zhengyin Du, Jianhui Duan, Chen Dun, Ting-Han Fan, Jiazhan Feng, Junda Feng, Ziyuan Feng, Yuwei Fu, Wenqi Fu, Hanjie Fu, Hao Ge, Hongyi Guo, Mingji Han, Li Han, Wenhao Hao, Xintong Hao, Qianyu He, Jerry He, Feng He, Wen Heng, Zehua Hong, Qi Hou, Liang Hu, Shengding Hu, Nan Hu, Kai Hua, Qi Huang, Ziyue Huang, Hongzhi Huang, Zihao Huang, Ting Huang, Wenhao Huang, Wei Jia, Bin Jia, Xiaoying Jia, Yuhua Jiang, Haobin Jiang, Ziheng Jiang, Kaihua Jiang, Chengquan Jiang, Jianpeng Jiao, Xiaoran Jin, Xing Jin, Xunhao Lai, Xiang Li, Liyi Li, Hongkai Li, Zheng Li, Shengxian Wan, Ya Wang, Yunshui Li, Chenggang Li, Niuniu Li, Siyu Li, Xi Li, Xiao Li, Aoyan Li, Yuntao Li, Nianning Liang, Xinnian Liang, Haibin Lin, Weijian Lin, Ye Lin, Zhicheng Liu, Guanlin Liu, Chenxiao Liu, Yan Liu, Gaohong Liu, Juncai Liu, Chundian Liu, Deyi Liu, Kaibo Liu, Siyao Liu, Qi Liu, Yongfei Liu, Kang Liu, Gan Liu, Boyi Liu, Rui Long, Weiqiang Lou, Chenwei Lou, Xiang Luo, Yao Luo, Caiping Lv, Heyang Lv, Bole Ma, Qianli Ma, Hongzhi Ma, Yiyuan Ma, Jin Ma, Wenchang Ma, Tingting Ma, Chen Mao, Qiyang Min, Zhe Nan, Guanghan Ning, Jinxiang Ou, Haojie Pan, Renming Pang, Yanghua Peng, Tao Peng, Lihua Qian, Mu Qiao, Meng Qu, Cheng Ren, Hongbin Ren, Yong Shan, Wei Shen, Ke Shen, Kai Shen, Guangming Sheng, Jinlong Shi, Wenlei Shi, Guang Shi, Shuai Shuai Cao, Yuxin Song, Zuquan Song, Jing Su, Yifan Sun, Tao Sun, Zewei Sun, Borui Wan, Xiaohui Wang, Xi Wang, Shuguang Wang, Jun Wang, Qinlong Wang, Chenyuan Wang, Shuai Wang, Zihan Wang, Changbao Wang, Jiaqiang Wang, Shihang Wang, Xuwu Wang, Zaiyuan Wang, Yuxuan Wang, Wenqi Wang, Taiqing Wang, Chengzhi Wei, Houmin Wei, Ziyun Wei, Shufa Wei, Zheng Wu, Yonghui Wu, Yangjun Wu, Bohong Wu, Shuang Wu, Jingqiao Wu, Ning Wu, Shuangzhi Wu, Jianmin Wu, Chenguang Xi, Fan Xia, Yuqiao Xian, Liang Xiang, Boren Xiang, Bowen Xiao, Zhen Xiao, Xia Xiao, Yongsheng Xiao, Chao Xin, Shulin Xin, Yuwen Xiong, Jingjing Xu, Ziwen Xu, Chenyin Xu, Jiayi Xu, Yifan Xu, Wei Xu, Yufei Xu, Shikun Xu, Shipeng Yan, Shen Yan, Qingping Yang, Xi Yang, Tianhao Yang, Yuehang Yang, Yuan Yang, Ximing Yang, Zeyu Yang, Guang Yang, Yifan Yang, Xuesong Yao, Bairen Yi, Fan Yin, Jianian Yin, Ziqiang Ying, Xiangyu Yu, Hongli Yu, Song Yu, Menghan Yu, Huan Yu, Siyu Yuan, Jun Yuan, Yutao Zeng, Tianyang Zhan, Zheng Zhang, Yun Zhang, Mofan Zhang, Wang Zhang, Ru Zhang, Zhi Zhang, Tianqi Zhang, Xinyi Zhang, Zhexi Zhang, Sijun Zhang, Wenqiang Zhang, Xiangxiang Zhang, Yongtao Zhang, Yuyu Zhang, Ge Zhang, He Zhang, Yue Zhang, Renjie Zheng, Ningxin Zheng, Zhuolin Zheng, Yaowei Zheng, Chen Zheng, Xiaoyun Zhi, Wanjun Zhong, Cheng Zhong, Zheng Zhong, Baoquan Zhong, Xun Zhou, Na Zhou, Huan Zhou, Hang Zhu, Defa Zhu, Wenjia Zhu, Lei Zuo
We introduce Seed1.5-Thinking, capable of reasoning through thinking before responding, resulting in improved performance on a wide range of benchmarks.
no code implementations • 28 Mar 2025 • Wei Shen, Guanlin Liu, Zheng Wu, Ruofei Zhu, Qingping Yang, Chao Xin, Yu Yue, Lin Yan
This work highlights the importance of careful data construction and provides practical methods to overcome performance barriers in RLHF.
1 code implementation • 21 Mar 2025 • Jichen Hu, Chen Yang, Zanwei Zhou, Jiemin Fang, Xiaokang Yang, Qi Tian, Wei Shen
Reflection removal of a single image remains a highly challenging task due to the complex entanglement between target scenes and unwanted reflections.
Ranked #1 on Reflection Removal on Nature
no code implementations • 18 Mar 2025 • Zining Wang, Tongkun Guan, Pei Fu, Chen Duan, Qianyi Jiang, Zhentao Guo, Shan Guo, Junfeng Luo, Wei Shen, Xiaokang Yang
Multi-modal Large Language Models (MLLMs) have introduced a novel dimension to document understanding, i.e., they endow large language models with visual comprehension capabilities; however, how to design a suitable image-text pre-training task for bridging the visual and language modality in document-level MLLMs remains underexplored.
no code implementations • 4 Mar 2025 • Tongkun Guan, Zining Wang, Pei Fu, Zhengtao Guo, Wei Shen, Kai Zhou, Tiezhu Yue, Chen Duan, Hao Sun, Qianyi Jiang, Junfeng Luo, Xiaokang Yang
In recent years, general visual foundation models (VFMs) have witnessed increasing adoption, particularly as image encoders for popular multi-modal large language models (MLLMs).
no code implementations • 24 Feb 2025 • ShiJie Lin, Boxiang Yun, Wei Shen, Qingli Li, Anqiang Yang, Yan Wang
Medical Hyperspectral Imaging (MHSI) offers potential for computational pathology and precision medicine.
1 code implementation • 19 Feb 2025 • Yuliang Liu, Junjie Lu, Zhaoling Chen, Chaofeng Qu, Jason Klein Liu, Chonghan Liu, Zefan Cai, Yunhui Xia, Li Zhao, Jiang Bian, Chuheng Zhang, Wei Shen, Zhouhan Lin
Current approaches for training Process Reward Models (PRMs) often involve breaking down responses into multiple reasoning steps using rule-based techniques, such as using predefined placeholder tokens or setting the reasoning step's length into a fixed size.
no code implementations • 18 Jan 2025 • Chongjie Si, Jingjing Jiang, Wei Shen
We find that transformation weights can be derived from Gaussian noise, and they primarily serve to increase the standard deviation of pre-trained weights, with their standard deviation growing with layer depth.
1 code implementation • 30 Dec 2024 • Wei Shen, Ming Fang, Yuxia Wang, Jiafeng Xiao, Diping Li, Huangqun Chen, Ling Xu, Weifeng Zhang
Prior works adopt image and text encoders pre-trained on unimodal data to extract global and local features from image and text respectively, and then global-local alignment is achieved explicitly.
no code implementations • 12 Dec 2024 • Yabo Chen, Chen Yang, Jiemin Fang, Xiaopeng Zhang, Lingxi Xie, Wei Shen, Wenrui Dai, Hongkai Xiong, Qi Tian
Single-image 3D reconstruction remains a fundamental challenge in computer vision due to inherent geometric ambiguities and limited viewpoint information.
no code implementations • 3 Dec 2024 • Kailing Wang, Chen Yang, Keyang Zhao, Xiaokang Yang, Wei Shen
To build a surgical simulation environment, we maintain a canonical 3D scene composed of 3D Gaussians coupled with a deformation field to represent a dynamic surgical scene.
no code implementations • 31 Oct 2024 • Zheng Ruan, Ruixuan Liu, Shimin Chen, Mengying Zhou, Xinquan Yang, Wei Li, Chen Chen, Wei Shen
For the dense video captioning task on the SoccerNet dataset, we propose to generate a video caption for each soccer action and locate the timestamp of the caption.
no code implementations • 21 Oct 2024 • Ming Li, Wei Shen, Qingli Li, Yan Wang
The fundamental idea of label filling is to supervise the segmentation model with a subset of pixels that have trustworthy labels, while filling in the labels of the other pixels through mixed supervision.
no code implementations • 15 Oct 2024 • Wei Shen, Ruida Zhou, Jing Yang, Cong Shen
While transformers have demonstrated impressive capacities for in-context learning (ICL) in practice, theoretical understanding of the underlying mechanism enabling transformers to perform ICL is still in its infant stage.
1 code implementation • 13 Oct 2024 • Enyu Zhou, Guodong Zheng, Binghai Wang, Zhiheng Xi, Shihan Dou, Rong Bao, Wei Shen, Limao Xiong, Jessica Fan, Yurong Mou, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
However, the current evaluation of RMs may not directly correspond to their alignment performance due to the limited distribution of evaluation data and evaluation methods that are not closely related to alignment objectives.
no code implementations • 12 Oct 2024 • Jialian Li, Yipin Zhang, Wei Shen, Yuzi Yan, Jian Xie, Dong Yan
Logical reasoning is a crucial task for Large Language Models (LLMs), enabling them to tackle complex problems.
no code implementations • 1 Oct 2024 • Ziyi Ye, Xiangsheng Li, Qiuchi Li, Qingyao Ai, Yujia Zhou, Wei Shen, Dong Yan, Yiqun Liu
Conventionally, preference data is learned and encoded into a scalar reward model that connects a value head with an LLM to produce a scalar score as preference or reward.
no code implementations • 1 Oct 2024 • Xingzhou Lou, Dong Yan, Wei Shen, Yuzi Yan, Jian Xie, Junge Zhang
Reward models (RM) play a critical role in aligning generations of large language models (LLM) to human expectations.
1 code implementation • 11 Sep 2024 • Wei Shen, Chuheng Zhang
Reinforcement learning from human feedback (RLHF) is one of the key techniques that helps large language models (LLMs) to follow instructions and provide helpful and harmless responses.
1 code implementation • 2 Sep 2024 • Chongjie Si, Zhiyi Shi, Shifan Zhang, Xiaokang Yang, Hanspeter Pfister, Wei Shen
Additionally, based on our exploration of TSD, we focus on an important issue in PEFT: the initialization of LoRA.
1 code implementation • 31 Aug 2024 • Zhiyuan Hu, Yuliang Liu, Jinman Zhao, Suyuchen Wang, Yan Wang, Wei Shen, Qing Gu, Anh Tuan Luu, See-Kiong Ng, Zhiwei Jiang, Bryan Hooi
Large language models (LLMs) face significant challenges in handling long-context tasks because of their limited effective context window size during pretraining, which restricts their ability to generalize over extended sequences.
no code implementations • 19 Aug 2024 • Haoyu Zhao, Hao Wang, Chen Yang, Wei Shen
Existing approaches for human avatar generation--both NeRF-based and 3D Gaussian Splatting (3DGS) based--struggle with maintaining 3D consistency and exhibit degraded detail reconstruction, particularly when training with sparse inputs.
no code implementations • 19 Aug 2024 • Haoyu Zhao, Chen Yang, Hao Wang, Xingyue Zhao, Wei Shen
To capture the explicit topological structure of the human body, we employ a 3D network that integrates both topological and geometric associations for human avatar deformation.
1 code implementation • 15 Aug 2024 • Jing Zhou, Chenglin Jiang, Wei Shen, Xiao Zhou, Xiaonan He
Most large language models are fine-tuned using either expensive human-annotated data or GPT-4 generated data which cannot guarantee performance in certain domains.
1 code implementation • 30 Jul 2024 • Huiyu Duan, Xiongkuo Min, Sijing Wu, Wei Shen, Guangtao Zhai
In this paper, we propose a text-induced unified image processor for low-level vision tasks, termed UniProcessor, which can effectively process various degradation types and levels, and support multimodal control.
1 code implementation • 10 Jul 2024 • Tongkun Guan, Chengyu Lin, Wei Shen, Xiaokang Yang
To overcome this challenge, we propose a position forest transformer (PosFormer) for HMER, which jointly optimizes two tasks: expression recognition and position recognition, to explicitly enable position-aware symbol feature representation learning.
1 code implementation • 7 Jul 2024 • Chongjie Si, Xiaokang Yang, Wei Shen
The rapid expansion of large foundation models within the pre-training and fine-tuning framework has underscored that larger models often yield better results.
no code implementations • 29 May 2024 • Zelin Peng, Zhengqin Xu, Zhilin Zeng, Yaoming Wang, Wei Shen
Since the PEFT strategy is conducted symmetrically to the two CLIP modalities, the misalignment between them is mitigated.
Open Vocabulary Semantic Segmentation
no code implementations • 28 May 2024 • Yong Qi, Gabriel Kyebambo, Siyuan Xie, Wei Shen, ShengHui Wang, Bitao Xie, Bin He, Zhipeng Wang, Shuo Jiang
Safety limitations in service robotics across various industries have raised significant concerns about the need for robust mechanisms ensuring that robots adhere to safe practices, thereby preventing actions that might harm humans or cause property damage.
1 code implementation • 25 May 2024 • Mang Ye, Wei Shen, Bo Du, Eduard Snezhko, Vassili Kovalev, Pong C. Yuen
Vertical Federated Learning (VFL) is a privacy-preserving distributed learning paradigm where different parties collaboratively learn models using partitioned features of shared samples, without leaking private data.
1 code implementation • 23 May 2024 • Chongjie Si, Xuehui Wang, Xue Yang, Zhengqin Xu, Qingyun Li, Jifeng Dai, Yu Qiao, Xiaokang Yang, Wei Shen
To tackle the diversity of dimensional spaces across different foundation models and provide a more precise representation of the changes within these spaces, this paper introduces a generalized parameter-efficient fine-tuning framework, FLoRA, designed for parameter spaces of various dimensions.
no code implementations • 18 Apr 2024 • Chongjie Si, Xuehui Wang, Xiaokang Yang, Wei Shen
However, a scenario usually arises where a pixel is concurrently predicted as an old class by the pre-trained segmentation model and a new class by the seed areas.
no code implementations • 18 Apr 2024 • Zunran Wang, Zhonghua Li, Wei Shen, Qi Ye, Liqiang Nie
To effectively enrich the feature context representations of term weight, the Feature Context Module (FCM) is introduced, which leverages the power of BERT's representation to determine dynamic weights for each element in the embedding.
no code implementations • 22 Mar 2024 • Kailing Wang, Chen Yang, Yuehao Wang, Sikuang Li, Yan Wang, Qi Dou, Xiaokang Yang, Wei Shen
Precise camera tracking, high-fidelity 3D tissue reconstruction, and real-time online visualization are critical for intrabody medical imaging devices such as endoscopes and capsule robots.
no code implementations • 12 Mar 2024 • Wei Shen, Xiaoying Zhang, Yuanshun Yao, Rui Zheng, Hongyi Guo, Yang Liu
Reinforcement learning from human feedback (RLHF) is the mainstream paradigm used to align large language models (LLMs) with human preferences.
no code implementations • 8 Mar 2024 • Xiaoying Zhang, Jean-Francois Ton, Wei Shen, Hongning Wang, Yang Liu
We introduce Adversarial Policy Optimization (AdvPO), a novel solution to the pervasive issue of reward over-optimization in Reinforcement Learning from Human Feedback (RLHF) for Large Language Models (LLMs).
2 code implementations • 15 Feb 2024 • Chen Yang, Sikuang Li, Jiemin Fang, Ruofan Liang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian
Then we construct a Gaussian repair model based on diffusion models to supplement the omitted object information, where Gaussians are further refined.
1 code implementation • 8 Feb 2024 • Zhiheng Xi, Wenxiang Chen, Boyang Hong, Senjie Jin, Rui Zheng, Wei He, Yiwen Ding, Shichun Liu, Xin Guo, Junzhe Wang, Honglin Guo, Wei Shen, Xiaoran Fan, Yuhao Zhou, Shihan Dou, Xiao Wang, Xinbo Zhang, Peng Sun, Tao Gui, Qi Zhang, Xuanjing Huang
In this paper, we propose R$^3$: Learning Reasoning through Reverse Curriculum Reinforcement Learning (RL), a novel method that employs only outcome supervision to achieve the benefits of process supervision for large language models.
1 code implementation • 2 Feb 2024 • Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Wei Shen, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Xuanjing Huang, Tao Gui
The advancement of large language models (LLMs) has significantly propelled the field of code generation.
no code implementations • 30 Jan 2024 • Danning Lao, Qi Liu, Jiazi Bu, Junchi Yan, Wei Shen
As computer vision continues to advance and finds widespread applications across various domains, the need for interpretability in deep learning models becomes paramount.
1 code implementation • 21 Jan 2024 • Songyang Gao, Qiming Ge, Wei Shen, Shihan Dou, Junjie Ye, Xiao Wang, Rui Zheng, Yicheng Zou, Zhi Chen, Hang Yan, Qi Zhang, Dahua Lin
This reliance limits the applicability of RLHF and hinders the development of professional assistants tailored to diverse human preferences.
1 code implementation • 11 Jan 2024 • Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, Songyang Gao, Nuo Xu, Yuhao Zhou, Xiaoran Fan, Zhiheng Xi, Jun Zhao, Xiao Wang, Tao Ji, Hang Yan, Lixing Shen, Zhan Chen, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
We introduce a series of novel methods to mitigate the influence of incorrect and ambiguous preferences in the dataset and fully leverage high-quality preference data.
no code implementations • 6 Jan 2024 • Hongyi Guo, Yuanshun Yao, Wei Shen, Jiaheng Wei, Xiaoying Zhang, Zhaoran Wang, Yang Liu
The key idea is to first retrieve high-quality samples related to the target domain and use them as In-context Learning examples to generate more samples.
2 code implementations • 23 Dec 2023 • Chen Yang, Kailing Wang, Yuehao Wang, Qi Dou, Xiaokang Yang, Wei Shen
Intraoperative imaging techniques for reconstructing deformable tissues in vivo are pivotal for advanced surgical systems.
no code implementations • 18 Dec 2023 • Chongjie Si, Xuehui Wang, Yan Wang, Xiaokang Yang, Wei Shen
In partial label learning (PLL), each instance is associated with a set of candidate labels among which only one is ground-truth.
1 code implementation • 15 Dec 2023 • Shihan Dou, Enyu Zhou, Yan Liu, Songyang Gao, Jun Zhao, Wei Shen, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Xiaoran Fan, ShiLiang Pu, Jiang Zhu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
Supervised fine-tuning (SFT) is a crucial step for large language models (LLMs), enabling them to align with human instructions and enhance their capabilities in downstream tasks.
1 code implementation • 8 Dec 2023 • Tongkun Guan, Wei Shen, Xue Yang, Xuehui Wang, Xiaokang Yang
Existing scene text detection methods typically rely on extensive real data for training.
no code implementations • 1 Dec 2023 • Jiazhong Cen, Jiemin Fang, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian
This is achieved by attaching a scale-gated affinity feature to each 3D Gaussian to endow it with a new property towards multi-granularity segmentation.
no code implementations • CVPR 2024 • Zelin Peng, Zhengqin Xu, Zhilin Zeng, Lingxi Xie, Qi Tian, Wei Shen
Parameter-efficient fine-tuning (PEFT) is an effective methodology to unleash the potential of large foundation models in novel scenarios with limited training data.
no code implementations • 14 Nov 2023 • Yuhan Li, Jian Wu, Zhiwei Yu, Börje F. Karlsson, Wei Shen, Manabu Okumura, Chin-Yew Lin
To close this gap in data availability and enable cross-modality IE, while alleviating labeling costs, we propose a semi-supervised pipeline for annotating entities in text, as well as entities and relations in tables, in an iterative procedure.
no code implementations • 2 Nov 2023 • Wei Shen, Minhui Huang, Jiawei Zhang, Cong Shen
In recent years, federated minimax optimization has attracted growing interest due to its extensive applications in various machine learning tasks.
no code implementations • 18 Oct 2023 • Rui Zheng, Wei Shen, Yuan Hua, Wenbin Lai, Shihan Dou, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Haoran Huang, Tao Gui, Qi Zhang, Xuanjing Huang
In this work, we propose a novel approach that can learn a consistent policy via RL across various data groups or domains.
no code implementations • 8 Oct 2023 • Wei Shen, Rui Zheng, WenYu Zhan, Jun Zhao, Shihan Dou, Tao Gui, Qi Zhang, Xuanjing Huang
Reinforcement learning from human feedback serves as a crucial bridge, aligning large language models with human and societal values.
2 code implementations • 12 Sep 2023 • Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim, Chen Chen, Fabian Deuser, Feng Yan, Fufu Yu, Gal Shitrit, Guanshuo Wang, Gyusik Choi, Hankyul Kim, Hao Guo, Hasby Fahrudin, Hidenari Koguchi, Håkan Ardö, Ibrahim Salah, Ido Yerushalmy, Iftikar Muhammad, Ikuma Uchida, Ishay Be'ery, Jaonary Rabarisoa, Jeongae Lee, Jiajun Fu, Jianqin Yin, Jinghang Xu, Jongho Nang, Julien Denize, Junjie Li, Junpei Zhang, Juntae Kim, Kamil Synowiec, Kenji Kobayashi, Kexin Zhang, Konrad Habel, Kota Nakajima, Licheng Jiao, Lin Ma, Lizhi Wang, Luping Wang, Menglong Li, Mengying Zhou, Mohamed Nasr, Mohamed Abdelwahed, Mykola Liashuha, Nikolay Falaleev, Norbert Oswald, Qiong Jia, Quoc-Cuong Pham, Ran Song, Romain Hérault, Rui Peng, Ruilong Chen, Ruixuan Liu, Ruslan Baikulov, Ryuto Fukushima, Sergio Escalera, Seungcheon Lee, Shimin Chen, Shouhong Ding, Taiga Someya, Thomas B. Moeslund, Tianjiao Li, Wei Shen, Wei Zhang, Wei Li, Wei Dai, Weixin Luo, Wending Zhao, Wenjie Zhang, Xinquan Yang, Yanbiao Ma, Yeeun Joo, Yingsen Zeng, Yiyang Gan, Yongqiang Zhu, Yujie Zhong, Zheng Ruan, Zhiheng Li, Zhijian Huang, Ziyu Meng
More information on the tasks, challenges, and leaderboards is available at https://www.soccer-net.org.
no code implementations • 28 Aug 2023 • Zelin Peng, Zhengqin Xu, Zhilin Zeng, Xiaokang Yang, Wei Shen
Most existing fine-tuning methods attempt to bridge the gaps among different scenarios by introducing a set of new parameters to modify SAM's original parameter space.
no code implementations • 26 Aug 2023 • Danyang Tu, Wei Shen, Wei Sun, Xiongkuo Min, Guangtao Zhai
In contrast, we reframe the gaze following detection task as detecting human head locations and their gaze followings simultaneously, aiming to jointly detect human gaze location and gaze object in a unified, single-stage pipeline.
no code implementations • ICCV 2023 • Danyang Tu, Wei Sun, Guangtao Zhai, Wei Shen
We propose an agglomerative Transformer (AGER) that enables Transformer-based human-object interaction (HOI) detectors to flexibly exploit extra instance-level cues in a single-stage and end-to-end manner for the first time.
1 code implementation • 11 Jul 2023 • Rui Zheng, Shihan Dou, Songyang Gao, Yuan Hua, Wei Shen, Binghai Wang, Yan Liu, Senjie Jin, Qin Liu, Yuhao Zhou, Limao Xiong, Lu Chen, Zhiheng Xi, Nuo Xu, Wenbin Lai, Minghao Zhu, Cheng Chang, Zhangyue Yin, Rongxiang Weng, Wensen Cheng, Haoran Huang, Tianxiang Sun, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang
Therefore, we explore PPO-max, an advanced version of the PPO algorithm, to efficiently improve the training stability of the policy model.
no code implementations • 18 Jun 2023 • Hangjian Li, Dong Xu, Konstantin Shmakov, Kuang-Chih Lee, Wei Shen
Online retailers often use third-party demand-side-platforms (DSPs) to conduct offsite advertising and reach shoppers across the Internet on behalf of their advertisers.
2 code implementations • 31 May 2023 • Chen Yang, Kailing Wang, Yuehao Wang, Xiaokang Yang, Wei Shen
Reconstructing deformable tissues from endoscopic stereo videos in robotic surgery is crucial for various clinical applications.
2 code implementations • CVPR 2023 • Yunhao Bai, Duowen Chen, Qingli Li, Wei Shen, Yan Wang
In semi-supervised medical image segmentation, there exist empirical mismatch problems between labeled and unlabeled data distribution.
Image Segmentation
Semi-supervised Medical Image Segmentation
1 code implementation • NeurIPS 2023 • Jiazhong Cen, Jiemin Fang, Zanwei Zhou, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian
The Segment Anything Model (SAM) emerges as a powerful vision foundation model to generate high-quality 2D segmentation results.
no code implementations • CVPR 2023 • Chen Yang, Peihao Li, Zanwei Zhou, Shanxin Yuan, Bingbing Liu, Xiaokang Yang, Weichao Qiu, Wei Shen
We present NeRFVS, a novel neural radiance fields (NeRF) based method to enable free navigation in a room.
1 code implementation • 30 Mar 2023 • Huiyu Duan, Wei Shen, Xiongkuo Min, Danyang Tu, Long Teng, Jia Wang, Guangtao Zhai
Recently, masked autoencoders (MAE) for feature pre-training have further unleashed the potential of Transformers, leading to state-of-the-art performances on various high-level vision tasks.
Ranked #4 on Image Defocus Deblurring on DPD (Dual-view)
no code implementations • ICCV 2023 • Zelin Peng, Guanchun Wang, Lingxi Xie, Dongsheng Jiang, Wei Shen, Qi Tian
Seed area generation is usually the starting point of weakly supervised semantic segmentation (WSSS).
no code implementations • 3 Mar 2023 • Yuanying Cai, Chuheng Zhang, Wei Shen, Xuyun Zhang, Wenjie Ruan, Longbo Huang
Inspired by the recent success of sequence modeling in RL and the use of masked language model for pre-training, we propose a masked model for pre-training in RL, RePreM (Representation Pre-training with Masked Model), which trains the encoder combined with transformer blocks to predict the masked states or actions in a trajectory.
no code implementations • 9 Jan 2023 • Yang Peng, Changzheng Liu, Wei Shen
Customer-centric marketing campaigns generate a large portion of e-commerce website traffic for Walmart.
1 code implementation • CVPR 2023 • Duowen Chen, Yunhao Bai, Wei Shen, Qingli Li, Lequan Yu, Yan Wang
Our strategy encourages unlabeled images to learn organ semantics in relative locations from the labeled images (cross-branch) and enhances the learning ability for small organs (within-branch).
no code implementations • 6 Dec 2022 • Zanwei Zhou, RuiZhe Zhong, Chen Yang, Yan Wang, Xiaokang Yang, Wei Shen
In this study, we point out that the current tokenization strategy in MTSF Transformer architectures ignores the token uniformity inductive bias of Transformers.
no code implementations • 5 Dec 2022 • Wei Shen, Xiaonan He, Chuheng Zhang, Xuyun Zhang, Jian Xie
Moreover, they are trained and evaluated on the benchmark datasets with adequate labels, which are expensive to obtain in a commercial dialogue system.
1 code implementation • 5 Dec 2022 • Yuanying Cai, Chuheng Zhang, Li Zhao, Wei Shen, Xuyun Zhang, Lei Song, Jiang Bian, Tao Qin, TieYan Liu
There are two challenges for this setting: 1) The optimal trade-off between optimizing the RL signal and the behavior cloning (BC) signal changes on different states due to the variation of the action coverage induced by different behavior policies.
no code implementations • Proceedings of the 2021 International Conference on Management of Data 2021 • Yinan Liu, Wei Shen, Yuanfei Wang, Jianyong Wang, Zhenglu Yang, Xiaojie Yuan
However, noun phrases (NPs) and relation phrases (RPs) in OKBs are not canonicalized and often appear in different paraphrased textual variants, which leads to redundant and ambiguous facts.
no code implementations • 28 Nov 2022 • Yinan Liu, Hu Chen, Wei Shen, Jiaoyan Chen
Previous studies often rely on a relative number of resources such as labeled utterances and external data, yet the attribute knowledge embedded in unlabeled utterances is underutilized and their performance of predicting some difficult personal attributes is still unsatisfactory.
1 code implementation • ICCV 2023 • Tongkun Guan, Wei Shen, Xue Yang, Qi Feng, Zekun Jiang, Xiaokang Yang
Therefore, exploring the robust text feature representations on unlabeled real images by self-supervised learning is a good solution.
1 code implementation • 9 Oct 2022 • Yunhao Li, Zhenbo Yu, Yucheng Zhu, Bingbing Ni, Guangtao Zhai, Wei Shen
Stage I introduces a test time adaptation strategy, which improves the physical plausibility of synthesized human skeleton motions by optimizing skeleton joint locations.
no code implementations • 22 Sep 2022 • Cheng Jie, Da Xu, Zigeng Wang, Wei Shen
Organic search comprises a large portion of the total traffic for e-commerce companies.
no code implementations • 12 Sep 2022 • Xue Li, Wei Shen, Denis Charles
In this paper, we propose TEDL, a two-stage learning approach to quantify uncertainty for deep learning models in classification tasks, inspired by our findings in experimenting with the Evidential Deep Learning (EDL) method, a recently proposed uncertainty quantification approach based on the Dempster-Shafer theory.
no code implementations • 30 Aug 2022 • Li Lyna Zhang, Youkow Homma, Yujing Wang, Min Wu, Mao Yang, Ruofei Zhang, Ting Cao, Wei Shen
Remarkably, under our latency requirement of 1900us on CPU, SwiftPruner achieves a 0.86% higher AUC than the state-of-the-art uniform sparse baseline for BERT-Mini on a large-scale real-world dataset.
1 code implementation • 29 Aug 2022 • Yinan Liu, Hu Chen, Wei Shen
Personal knowledge bases (PKBs) are critical to many applications, such as Web-based chatbots and personalized recommendation.
1 code implementation • 8 Aug 2022 • Chenwei Ran, Wei Shen, Jianbo Gao, Yuhan Li, Jianyong Wang, Yantao Jia
Entity linking (EL) is the process of linking entity mentions appearing in text with their corresponding entities in a knowledge base.
1 code implementation • 6 Jul 2022 • Yuan Yao, Fengze Liu, Zongwei Zhou, Yan Wang, Wei Shen, Alan Yuille, Yongyi Lu
Previous methods proposed Variational Autoencoder (VAE) based models to learn the distribution of shape for a particular organ and used it to automatically evaluate the quality of a segmentation prediction by fitting it into the learned shape distribution.
no code implementations • 4 Jul 2022 • Wei Shen, Zelin Peng, Xuehui Wang, Huayu Wang, Jiazhong Cen, Dongsheng Jiang, Lingxi Xie, Xiaokang Yang, Qi Tian
Next, we summarize the existing label-efficient image segmentation methods from a unified perspective that discusses an important question: how to bridge the gap between weak supervision and dense prediction -- the current methods are mostly based on heuristic priors, such as cross-pixel similarity, cross-label constraint, cross-view consistency, and cross-image relation.
2 code implementations • 22 Jun 2022 • Wei Shen, Yang Yang, Yinan Liu
In this paper, we propose CMVC, a novel unsupervised framework that leverages these two views of knowledge jointly for canonicalizing OKBs without the need of manually annotated labels.
no code implementations • 4 Jun 2022 • Danyang Tu, Wei Sun, Xiongkuo Min, Guangtao Zhai, Wei Shen
We present a novel vision Transformer, named TUTOR, which is able to learn tubelet tokens, which serve as highly abstracted spatiotemporal representations, for video-based human-object interaction (V-HOI) detection.
2 code implementations • 24 May 2022 • Yuhan Li, Wei Shen, Jianbo Gao, Yadong Wang
Community Question Answering (CQA) platforms contain plenty of CQA texts (i.e., questions and answers corresponding to the question) where named entities appear ubiquitously.
1 code implementation • 21 Apr 2022 • Yuzhi Zhao, Lai-Man Po, Xuehui Wang, Qiong Yan, Wei Shen, Yujia Zhang, Wei Liu, Chun-Kit Wong, Chiu-Sing Pang, Weifeng Ou, Wing-Yin Yu, Buhua Liu
On this basis, we formulate predictions as a mapping from parents' genetic factors to children's genetic factors, and disentangle them from external and variety factors.
Age-Invariant Face Recognition
Image-to-Image Translation
1 code implementation • 18 Apr 2022 • Huiyu Duan, Wei Shen, Xiongkuo Min, Danyang Tu, Jing Li, Guangtao Zhai
Therefore, in this paper, we mainly analyze the interaction effect between background (BG) scenes and AR contents, and study the saliency prediction problem in AR.
1 code implementation • 22 Mar 2022 • Feng Wang, Huiyu Wang, Chen Wei, Alan Yuille, Wei Shen
Recent advances in self-supervised contrastive learning yield good image-level representation, which favors classification tasks but usually neglects pixel-level detailed information, leading to unsatisfactory transfer performance to dense prediction tasks such as semantic segmentation.
no code implementations • 20 Mar 2022 • Danyang Tu, Xiongkuo Min, Huiyu Duan, Guodong Guo, Guangtao Zhai, Wei Shen
Iwin Transformer is a hierarchical Transformer which progressively performs token representation learning and token agglomeration within irregular windows.
no code implementations • CVPR 2022 • Danyang Tu, Xiongkuo Min, Huiyu Duan, Guodong Guo, Guangtao Zhai, Wei Shen
In contrast, we redefine the HGT detection task as detecting human head locations and their gaze targets, simultaneously.
1 code implementation • CVPR 2022 • Xuehui Wang, Kai Zhao, Ruixin Zhang, Shouhong Ding, Yan Wang, Wei Shen
In this framework, annotated masks of seen categories and pseudo masks of unseen categories serve as a prior for contrastive learning, where features from the mask regions (foreground) are pulled together, and are contrasted against those from the background, and vice versa.
no code implementations • 11 Mar 2022 • Kai Zhao, Lei Shen, Yingyi Zhang, Chuhan Zhou, Tao Wang, Ruixin Zhang, Shouhong Ding, Wei Jia, Wei Shen
In this paper, by observing that palmar creases are the key information to deep-learning-based palmprint recognition, we propose to synthesize training data by manipulating palmar creases.
1 code implementation • CVPR 2023 • Tongkun Guan, Chaochen Gu, Jingzheng Tu, Xue Yang, Qi Feng, Yudi Zhao, Xiaokang Yang, Wei Shen
Supervised attention can alleviate the above issue, but it is character category-specific, which requires extra laborious character-level bounding box annotations and would be memory-intensive when handling languages with larger character categories.
Ranked #2 on Scene Text Recognition on ICDAR 2003
no code implementations • 4 Jan 2022 • Yuyin Zhou, David Dreizin, Yan Wang, Fengze Liu, Wei Shen, Alan L. Yuille
The spleen is one of the most commonly injured solid organs in blunt abdominal trauma.
no code implementations • 24 Nov 2021 • Jiazhong Cen, Zenkun Jiang, Lingxi Xie, Qi Tian, Xiaokang Yang, Wei Shen
Anomaly segmentation is a crucial task for safety-critical applications, such as autonomous driving in urban scenes, where the goal is to detect out-of-distribution (OOD) objects with categories which are unseen during training.
Ranked #10 on Anomaly Detection on Fishyscapes L&F
5 code implementations • 23 Nov 2021 • Xintian Mao, Yiming Liu, Fengze Liu, Qingli Li, Wei Shen, Yan Wang
Blur was naturally analyzed in the frequency domain, by estimating the latent sharp image and the blur kernel given a blurry image.
Ranked #5 on Deblurring on RealBlur-R (trained on GoPro)
2 code implementations • 15 Nov 2021 • Jinghao Zhou, Chen Wei, Huiyu Wang, Wei Shen, Cihang Xie, Alan Yuille, Tao Kong
We present a self-supervised framework iBOT that can perform masked prediction with an online tokenizer.
Ranked #4 on Unsupervised Image Classification on ImageNet
no code implementations • ICLR 2022 • Jinghao Zhou, Chen Wei, Huiyu Wang, Wei Shen, Cihang Xie, Alan Yuille, Tao Kong
The success of language Transformers is primarily attributed to the pretext task of masked language modeling (MLM), where texts are first tokenized into semantically meaningful pieces.
no code implementations • 26 Sep 2021 • Wei Shen, Yuhan Li, Yinan Liu, Jiawei Han, Jianyong Wang, Xiaojie Yuan
Entity linking (EL) is the process of linking entity mentions appearing in web text with their corresponding entities in a knowledge base.
2 code implementations • 25 Aug 2021 • Wei Shen, Chuheng Zhang, Yun Tian, Liang Zeng, Xiaonan He, Wanchun Dou, Xiaolong Xu
However, without node content (i.e., side information) for training, the user (or item) specific representation cannot be learned in the inductive setting, that is, a model trained on one group of users (or items) cannot adapt to new users (or items).
Ranked #3 on Recommendation Systems on MovieLens 1M
no code implementations • 24 Jun 2021 • Cheng Jie, Da Xu, Zigeng Wang, Lu Wang, Wei Shen
With the increasing scale of search engine marketing, designing an efficient bidding system is becoming paramount for the success of e-commerce companies.
1 code implementation • CVPR 2021 • Yi Fang, Jiapeng Tang, Wang Shen, Wei Shen, Xiao Gu, Li Song, Guangtao Zhai
In the third stage, we use the generated dual attention as guidance to perform two sub-tasks: (1) identifying whether the gaze target is inside or out of the image; (2) locating the target if inside.
1 code implementation • 10 Jun 2021 • Michela Antonelli, Annika Reinke, Spyridon Bakas, Keyvan Farahani, Annette Kopp-Schneider, Bennett A. Landman, Geert Litjens, Bjoern Menze, Olaf Ronneberger, Ronald M. Summers, Bram van Ginneken, Michel Bilello, Patrick Bilic, Patrick F. Christ, Richard K. G. Do, Marc J. Gollub, Stephan H. Heckers, William R. Jarnagin, Maureen K. McHugo, Sandy Napel, Jennifer S. Goli Pernicka, Kawal Rhode, Catalina Tobon-Gomez, Eugene Vorontsov, Henkjan Huisman, James A. Meakin, Sebastien Ourselin, Manuel Wiesenfarth, Pablo Arbelaez, Byeonguk Bae, Sihong Chen, Laura Daza, Jianjiang Feng, Baochun He, Fabian Isensee, Yuanfeng Ji, Fucang Jia, Namkug Kim, Ildoo Kim, Dorit Merhof, Akshay Pai, Beomhee Park, Mathias Perslev, Ramin Rezaiifar, Oliver Rippel, Ignacio Sarasua, Wei Shen, Jaemin Son, Christian Wachinger, Liansheng Wang, Yan Wang, Yingda Xia, Daguang Xu, Zhanwei Xu, Yefeng Zheng, Amber L. Simpson, Lena Maier-Hein, M. Jorge Cardoso
Segmentation is so far the most widely investigated medical image processing task, but the various segmentation challenges have typically been organized in isolation, such that algorithm development was driven by the need to tackle a single specific clinical problem.
no code implementations • 5 Jun 2021 • Yilin Wang, Shaozuo Yu, Xiaokang Yang, Wei Shen
In this paper, we propose a generic model transfer scheme to make Convolutional Neural Networks (CNNs) interpretable, while maintaining their high classification accuracy.
1 code implementation • NeurIPS 2021 • Qihang Yu, Yingda Xia, Yutong Bai, Yongyi Lu, Alan Yuille, Wei Shen
It is motivated by the Glance and Gaze behavior of human beings when recognizing objects in natural scenes, with the ability to efficiently model both long-range dependencies and local context.
no code implementations • 31 May 2021 • Yan Wang, Peng Tang, Yuyin Zhou, Wei Shen, Elliot K. Fishman, Alan L. Yuille
We instantiate both the global and the local classifiers by multiple instance learning (MIL), where the attention guidance, indicating roughly where the PDAC regions are, is the key to bridging them: For global MIL based normal/PDAC classification, attention serves as a weight for each instance (voxel) during MIL pooling, which eliminates the distraction from the background; For local MIL based semi-supervised PDAC segmentation, the attention guidance is inductive, which not only provides bag-level pseudo-labels to training data without per-voxel annotations for MIL training, but also acts as a proxy of an instance-level classifier.
1 code implementation • IEEE Transactions on Knowledge and Data Engineering 2021 • Wei Shen, Yuwei Yin, Yang Yang, Jiawei Han, Jianyong Wang, Xiaojie Yuan
The task of linking an entity mention in a tweet with its corresponding entity in a heterogeneous information network is of great importance, for the purpose of enriching heterogeneous information networks with the abundant and fresh knowledge embedded in tweets.
1 code implementation • 5 Mar 2021 • Boxiang Yun, Yan Wang, Jieneng Chen, Huiyu Wang, Wei Shen, Qingli Li
Hyperspectral imaging (HSI) unlocks huge potential for a wide variety of applications that rely on high-precision pathology image segmentation, such as computational pathology and precision medicine.
no code implementations • ICCV 2021 • Yunhao Li, Wei Shen, Zhongpai Gao, Yucheng Zhu, Guangtao Zhai, Guodong Guo
Specifically, the local region is obtained as a 2D cone-shaped field along the 2D projection of the sight line starting at the human subject's head position, and the distant region is obtained by searching along the sight line in 3D sphere space.
1 code implementation • 28 Nov 2020 • Yuhui Xu, Lingxi Xie, Cihang Xie, Jieru Mei, Siyuan Qiao, Wei Shen, Hongkai Xiong, Alan Yuille
Batch normalization (BN) is a fundamental unit in modern deep networks, in which a linear transformation module was designed for improving BN's flexibility of fitting complex data distributions.
no code implementations • 29 Oct 2020 • Yingwei Li, Zhuotun Zhu, Yuyin Zhou, Yingda Xia, Wei Shen, Elliot K. Fishman, Alan L. Yuille
Although deep neural networks have been a dominant method for many 2D vision tasks, it is still challenging to apply them to 3D tasks, such as medical image segmentation, due to the limited amount of annotated 3D data and limited computational resources.
no code implementations • 14 Oct 2020 • Yiren Chen, Yaming Yang, Hong Sun, Yujing Wang, Yu Xu, Wei Shen, Rong Zhou, Yunhai Tong, Jing Bai, Ruofei Zhang
We add the model designed by AutoADR as a sub-model into the production Ad Relevance model.
1 code implementation • ICLR 2021 • Yingwei Li, Qihang Yu, Mingxing Tan, Jieru Mei, Peng Tang, Wei Shen, Alan Yuille, Cihang Xie
To prevent models from exclusively attending to a single cue in representation learning, we augment training data with images with conflicting shape and texture information (e.g., an image of chimpanzee shape but with lemon texture) and, most importantly, provide the corresponding supervisions from shape and texture simultaneously.
Ranked #1 on Image Classification on ImageNet (Hardware Burden metric)
no code implementations • ICLR 2021 • Chen Wei, Huiyu Wang, Wei Shen, Alan Yuille
Regarding the similarity of the query crop to each crop from other images as "unlabeled", the consistency term takes the corresponding similarity of a positive crop as a pseudo label, and encourages consistency between these two similarities.
no code implementations • 25 Aug 2020 • Wei Shen, Xiaonan He, Chuheng Zhang, Qiang Ni, Wanchun Dou, Yan Wang
Therefore, it is crucial to design a participant selection algorithm that applies to different MCS systems to achieve multiple goals.
no code implementations • 9 Jul 2020 • Daniil Pakhomov, Wei Shen, Nassir Navab
Surgical tool segmentation in endoscopic images is an important problem: it is a crucial step towards full instrument pose estimation and it is used for integration of pre- and intra-operative images into the endoscopic view.
no code implementations • 21 May 2020 • R. Daniel Meyer, Bohdana Ratitch, Marcel Wolbers, Olga Marchenko, Hui Quan, Daniel Li, Chrissie Fletcher, Xin Li, David Wright, Yue Shentu, Stefan Englert, Wei Shen, Jyotirmoy Dey, Thomas Liu, Ming Zhou, Norman Bohidar, Peng-Liang Zhao, Michael Hale
The COVID-19 pandemic has had and continues to have major impacts on planned and ongoing clinical trials.
no code implementations • 18 May 2020 • Shuhao Fu, Yongyi Lu, Yan Wang, Yuyin Zhou, Wei Shen, Elliot Fishman, Alan Yuille
In this paper, we present a novel unsupervised domain adaptation (UDA) method, named Domain Adaptive Relational Reasoning (DARR), to generalize 3D multi-organ segmentation models to medical data collected from different scanners and/or protocols (domains).
no code implementations • 4 Apr 2020 • Zhuotun Zhu, Yongyi Lu, Wei Shen, Elliot K. Fishman, Alan L. Yuille
This work presents comprehensive results on early-stage detection of pancreatic neuroendocrine tumors (PNETs), a group of endocrine tumors arising in the pancreas and the second most common type of pancreatic cancer, by checking abdominal CT scans.
1 code implementation • ECCV 2020 • Yingda Xia, Yi Zhang, Fengze Liu, Wei Shen, Alan Yuille
The ability to detect failures and anomalies is a fundamental requirement for building reliable systems for computer vision applications, especially safety-critical applications of semantic segmentation, such as autonomous driving and medical image analysis.
Ranked #10 on Anomaly Detection on Road Anomaly (using extra training data)
no code implementations • 18 Mar 2020 • Yingda Xia, Qihang Yu, Wei Shen, Yuyin Zhou, Elliot K. Fishman, Alan L. Yuille
Pancreatic ductal adenocarcinoma (PDAC) is one of the most lethal cancers among the population.
no code implementations • CVPR 2020 • Yan Wang, Xu Wei, Fengze Liu, Jieneng Chen, Yuyin Zhou, Wei Shen, Elliot K. Fishman, Alan L. Yuille
Tubular structure segmentation in medical images, e.g., segmenting vessels in CT scans, serves as a vital step in the use of computers to aid in screening early stages of related diseases.
1 code implementation • CVPR 2021 • Hao Ding, Siyuan Qiao, Alan Yuille, Wei Shen
The key to a successful cascade architecture for precise instance segmentation is to fully leverage the relationship between bounding box detection and mask segmentation across multiple stages.
1 code implementation • 21 Nov 2019 • Siyuan Qiao, Huiyu Wang, Chenxi Liu, Wei Shen, Alan Yuille
To address this issue, we propose BatchChannel Normalization (BCN), which uses batch knowledge to avoid the elimination singularities in the training of channel-normalized models.
no code implementations • 9 Sep 2019 • Mingqing Xiao, Adam Kortylewski, Ruihai Wu, Siyuan Qiao, Wei Shen, Alan Yuille
Despite their great success in object classification, deep convolutional neural networks suffer from a severe drop in generalization performance under occlusion due to the inconsistency between training and testing data.
no code implementations • 23 Jul 2019 • Wei Shen, Yilu Guo, Yan Wang, Kai Zhao, Bo Wang, Alan Yuille
Both of them connect split nodes to the top layer of convolutional neural networks (CNNs) and deal with inhomogeneous data by jointly learning input-dependent data partitions at the split nodes and age distributions at the leaf nodes.
no code implementations • 30 Jun 2019 • Wei Shen, Fei Li, Rujie Liu
We argue that the discard of the correlated discriminative information is partially caused by the fact that the minimization of the classification loss doesn't ensure to learn the overall discriminative information but only the most discriminative information.
no code implementations • 25 Apr 2019 • Wei Shen, Zhenhuan Yang, Yiming Ying, Xiaoming Yuan
From this fundamental trade-off, we obtain lower bounds for the optimization error of SGD algorithms and the excess expected risk over a class of pairwise losses.
7 code implementations • 25 Mar 2019 • Siyuan Qiao, Huiyu Wang, Chenxi Liu, Wei Shen, Alan Yuille
Batch Normalization (BN) has become an out-of-box technique to improve deep network training.
Ranked #79 on Instance Segmentation on COCO minival
no code implementations • 25 Mar 2019 • Wei Shen, Ziqiang Shi, Jun Sun
Then we use the adversarial region attention to aggregate the feature maps to obtain the adversarial features.
no code implementations • ICLR 2018 • Wei Shen, Rujie Liu
In this paper, we propose to generate sample-specific filters for convolutional layers in the forward pass.
1 code implementation • 28 Nov 2018 • Zhishuai Zhang, Wei Shen, Siyuan Qiao, Yan Wang, Bo Wang, Alan Yuille
In this paper, we propose that the robustness of a face detector against hard faces can be improved by learning small faces on hard images.
Ranked #7 on Face Detection on WIDER Face (Hard)
no code implementations • 27 Nov 2018 • Wei Shen, Rujie Liu
Recent advances in fine-grained recognition utilize attention maps to localize objects of interest.
no code implementations • 27 Nov 2018 • Wei Shen, Rujie Liu
However, we find that choosing squared Euclidean distance may cause distance explosion leading gradients to be extremely sparse in the early stage of back propagation.
4 code implementations • 9 Jul 2018 • Peng Tang, Xinggang Wang, Song Bai, Wei Shen, Xiang Bai, Wenyu Liu, Alan Yuille
The iterative instance classifier refinement is implemented online using multiple streams in convolutional neural networks, where the first is an MIL network and the others are for instance classifier refinement supervised by the preceding one.
Ranked #1 on Weakly Supervised Object Detection on ImageNet
no code implementations • 1 Jul 2018 • Kai Zhao, Wei Shen, ShangHua Gao, Dandan Li, Ming-Ming Cheng
In natural images, the scales (thickness) of object skeletons may dramatically vary among objects and object parts.
no code implementations • 16 May 2018 • Yunhan Zhao, Ye Tian, Charless Fowlkes, Wei Shen, Alan Yuille
Experimental results verify that our approach significantly improves the ability of deep networks to resist large variations between training and testing data and achieves classification accuracy improvements on several benchmark datasets, including MNIST, affNIST, SVHN, CIFAR-10 and miniImageNet.
no code implementations • 23 Apr 2018 • Yan Wang, Yuyin Zhou, Wei Shen, Seyoun Park, Elliot K. Fishman, Alan L. Yuille
To address these challenges, we introduce a novel framework for multi-organ segmentation by using organ-attention networks with reverse connections (OAN-RCs) which are applied to 2D views, of the 3D CT volume, and output estimates which are combined by statistical fusion exploiting structural similarity.
no code implementations • 7 Apr 2018 • Yuyin Zhou, Yan Wang, Peng Tang, Song Bai, Wei Shen, Elliot K. Fishman, Alan L. Yuille
In multi-organ segmentation of abdominal CT scans, most existing fully supervised deep learning algorithms require lots of voxel-wise annotations, which are usually difficult, expensive, and slow to obtain.
no code implementations • 7 Apr 2018 • Yan Wang, Yuyin Zhou, Peng Tang, Wei Shen, Elliot K. Fishman, Alan L. Yuille
Based on the fact that very hard samples might have annotation errors, we propose a new sample selection policy, named Relaxed Upper Confident Bound (RUCB).
1 code implementation • ECCV 2018 • Siyuan Qiao, Wei Shen, Zhishuai Zhang, Bo Wang, Alan Yuille
We present Deep Co-Training, a deep learning based method inspired by the Co-Training framework.
no code implementations • 5 Jan 2018 • Kai Zhao, Wei Shen, Shang-Hua Gao, Dandan Li, Ming-Ming Cheng
In natural images, the scales (thickness) of object skeletons may dramatically vary among objects and object parts, making object skeleton detection a challenging problem.
Ranked #2 on Object Skeleton Detection on SK-LARGE
2 code implementations • CVPR 2018 • Wei Shen, Yilu Guo, Yan Wang, Kai Zhao, Bo Wang, Alan Yuille
Age estimation from facial images is typically cast as a nonlinear regression problem.
Ranked #6 on Age Estimation on FGNET
no code implementations • 1 Dec 2017 • Zhuotun Zhu, Yingda Xia, Wei Shen, Elliot K. Fishman, Alan L. Yuille
In this paper, we adopt 3D Convolutional Neural Networks to segment volumetric medical images.
no code implementations • CVPR 2018 • Zhishuai Zhang, Siyuan Qiao, Cihang Xie, Wei Shen, Bo Wang, Alan L. Yuille
Our motivation is to enrich the semantics of object detection features within a typical deep detector, by a semantic segmentation branch and a global activation module.
no code implementations • ICML 2018 • Siyuan Qiao, Zhishuai Zhang, Wei Shen, Bo Wang, Alan Yuille
Our method is by introducing computation orderings to the channels within convolutional layers or blocks, based on which we gradually compute the outputs in a channel-wise manner.
1 code implementation • CVPR 2018 • Siyuan Qiao, Chenxi Liu, Wei Shen, Alan Yuille
In this paper, we are interested in the few-shot learning problem.
no code implementations • ICCV 2017 • Siyuan Qiao, Wei Shen, Weichao Qiu, Chenxi Liu, Alan Yuille
We argue that estimation of object scales in images is helpful for generating object proposals, especially for supermarket images where object scales are usually within a small range.
no code implementations • ICCV 2017 • Wei Shen, Bin Wang, Yuan Jiang, Yan Wang, Alan Yuille
This design is biologically plausible, as it resembles how a human visual system compares different possible segmentation solutions to address the ambiguous boundary issue.
no code implementations • NeurIPS 2017 • Wei Shen, Kai Zhao, Yilu Guo, Alan Yuille
This paper presents label distribution learning forests (LDLFs) - a novel label distribution learning algorithm based on differentiable decision trees, which have several advantages: 1) Decision trees have the potential to model any general form of label distributions by a mixture of leaf node predictions.
Ranked #11 on Age Estimation on MORPH album2 (Caucasian)
3 code implementations • 25 Dec 2016 • Yuyin Zhou, Lingxi Xie, Wei Shen, Yan Wang, Elliot K. Fishman, Alan L. Yuille
Deep neural networks have been widely adopted for automatic organ segmentation from abdominal CT scans.
1 code implementation • CVPR 2017 • Wei Shen, Rujie Liu
The transformation networks are responsible for the attribute manipulation and its dual operation and the discriminative network is used to distinguish the generated images from real images.
1 code implementation • 13 Sep 2016 • Wei Shen, Kai Zhao, Yuan Jiang, Yan Wang, Xiang Bai, Alan Yuille
By observing the relationship between the receptive field sizes of the different layers in the network and the skeleton scales they can capture, we introduce two scale-associated side outputs to each stage of the network.
no code implementations • 20 May 2016 • Wei Shen, Yuan Jiang, Wenjing Gao, Dan Zeng, Xinggang Wang
Contour and skeleton are two complementary representations for shape recognition.
1 code implementation • CVPR 2016 • Zheng Zhang, Chengquan Zhang, Wei Shen, Cong Yao, Wenyu Liu, Xiang Bai
In this paper, we propose a novel approach for text detection in natural images.
Ranked #40 on Scene Text Detection on ICDAR 2015
no code implementations • CVPR 2016 • Wei Shen, Kai Zhao, Yuan Jiang, Yan Wang, Zhijiang Zhang, Xiang Bai
Object skeleton is a useful cue for object detection, complementary to the object contour, as it provides a structural representation to describe the relationship among object parts.
no code implementations • CVPR 2015 • Wei Shen, Xinggang Wang, Yan Wang, Xiang Bai, Zhijiang Zhang
Contour detection serves as the basis of a variety of computer vision tasks such as image segmentation and object recognition.
no code implementations • CVPR 2015 • Zheng Zhang, Wei Shen, Cong Yao, Xiang Bai
Recently, a variety of real-world applications have triggered huge demand for techniques that can extract textual information from natural scenes.