Search Results for author: Li Zhang

Found 219 papers, 99 papers with code

Label Definitions Improve Semantic Role Labeling

1 code implementation NAACL 2022 Li Zhang, Ishan Jindal, Yunyao Li

Given a sentence and the predicate, a semantic role label is assigned to each argument of the predicate.

Semantic Role Labeling

SmartCiteCon: Implicit Citation Context Extraction from Academic Literature Using Supervised Learning

no code implementations WOSP 2020 Chenrui Guo, Haoran Cui, Li Zhang, Jiamin Wang, Wei Lu, Jian Wu

The tool is built on a Support Vector Machine (SVM) model trained on a set of 7,058 manually annotated citation context sentences, curated from 34,000 papers from the ACL Anthology.

Is “My Favorite New Movie” My Favorite Movie? Probing the Understanding of Recursive Noun Phrases

no code implementations NAACL 2022 Qing Lyu, Zheng Hua, Daoxin Li, Li Zhang, Marianna Apidianaki, Chris Callison-Burch

We introduce the Recursive Noun Phrase Challenge (RNPC), a dataset of three textual inference tasks involving textual entailment and event plausibility comparison, precisely targeting the understanding of recursive NPs.

Common Sense Reasoning Natural Language Inference

Generative Semantic Segmentation

1 code implementation 20 Mar 2023 Jiaqi Chen, Jiachen Lu, Xiatian Zhu, Li Zhang

To that end, the segmentation mask is expressed with a special type of image (dubbed as maskige).

Semantic Segmentation

Single-view Neural Radiance Fields with Depth Teacher

no code implementations 17 Mar 2023 Yurui Chen, Chun Gu, Feihu Zhang, Li Zhang

Moreover, it has poor generalizations to new scenes and requires retraining or fine-tuning on each scene.

Novel View Synthesis

RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose

1 code implementation 13 Mar 2023 Tao Jiang, Peng Lu, Li Zhang, Ningsheng Ma, Rui Han, Chengqi Lyu, Yining Li, Kai Chen

Recent studies on 2D pose estimation have achieved excellent performance on public benchmarks, yet its application in the industrial community still suffers from heavy model parameters and high latency.

2D Human Pose Estimation Multi-Person Pose Estimation

QVRF: A Quantization-error-aware Variable Rate Framework for Learned Image Compression

1 code implementation 10 Mar 2023 Kedeng Tong, Yaojun Wu, Yue Li, Kai Zhang, Li Zhang, Xin Jin

In this paper, we present a Quantization-error-aware Variable Rate Framework (QVRF) that utilizes a univariate quantization regulator $a$ to achieve wide-range variable rates within a single model.

Image Compression Quantization

Self-Asymmetric Invertible Network for Compression-Aware Image Rescaling

1 code implementation 4 Mar 2023 Jinhai Yang, Mengxi Guo, Shijie Zhao, Junlin Li, Li Zhang

In this paper, we propose the Self-Asymmetric Invertible Network (SAIN) for compression-aware image rescaling.

Image Compression

FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks

1 code implementation 4 Mar 2023 Xiao Han, Xiatian Zhu, Licheng Yu, Li Zhang, Yi-Zhe Song, Tao Xiang

In the fashion domain, there exists a variety of vision-and-language (V+L) tasks, including cross-modal retrieval, text-guided image retrieval, multi-modal classification, and image captioning.

Cross-Modal Retrieval Image Captioning +4

S-NeRF: Neural Radiance Fields for Street Views

no code implementations 1 Mar 2023 Ziyang Xie, Junge Zhang, Wenye Li, Feihu Zhang, Li Zhang

Specifically, we improve the scene parameterization function and the camera poses for learning better neural representations from street views.

Novel View Synthesis Self-Driving Cars

Multi-Task Differential Privacy Under Distribution Skew

no code implementations 15 Feb 2023 Walid Krichene, Prateek Jain, Shuang Song, Mukund Sundararajan, Abhradeep Thakurta, Li Zhang

We study the problem of multi-task learning under user-level differential privacy, in which $n$ users contribute data to $m$ tasks, each involving a subset of users.

Multi-Task Learning

Preconditioned Score-based Generative Models

1 code implementation 13 Feb 2023 Li Zhang, Hengyuan Ma, Xiatian Zhu, Jianfeng Feng

Compared with the latest generative models (e.g., CLD-SGM, DDIM, and Analytic-DDIM), PDS can achieve the best sampling quality on CIFAR-10 at a FID score of 1.99.

Image Generation

Syntax and Domain Aware Model for Unsupervised Program Translation

no code implementations 8 Feb 2023 Fang Liu, Jia Li, Li Zhang

The experimental results on function translation tasks between Python, Java, and C++ show that SDA-Trans outperforms many large-scale pre-trained models, especially for unseen language translation.

Cross-Lingual Transfer Translation

SimMTM: A Simple Pre-Training Framework for Masked Time-Series Modeling

no code implementations 2 Feb 2023 Jiaxiang Dong, Haixu Wu, Haoran Zhang, Li Zhang, Jianmin Wang, Mingsheng Long

By relating masked modeling to manifold learning, SimMTM proposes to recover masked time points by the weighted aggregation of multiple neighbors outside the manifold, which eases the reconstruction task by assembling ruined but complementary temporal variations from multiple masked series.

Representation Learning Time Series Analysis

Faithful Chain-of-Thought Reasoning

1 code implementation 31 Jan 2023 Qing Lyu, Shreya Havaldar, Adam Stein, Li Zhang, Delip Rao, Eric Wong, Marianna Apidianaki, Chris Callison-Burch

Together with self-consistency decoding, we achieve new state-of-the-art few-shot performance on 7 out of the 10 datasets, showing a strong synergy between faithfulness and accuracy.

Multi-hop Question Answering Question Answering

SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation

1 code implementation 30 Jan 2023 Qiang Wan, Zilong Huang, Jiachen Lu, Gang Yu, Li Zhang

Coupled with a light segmentation head, we achieve the best trade-off between segmentation accuracy and latency on the ARM-based mobile devices on the ADE20K and Cityscapes datasets.

Image Classification Semantic Segmentation

Causal Reasoning of Entities and Events in Procedural Texts

1 code implementation 26 Jan 2023 Li Zhang, Hainiu Xu, Yue Yang, Shuyan Zhou, Weiqiu You, Manni Arora, Chris Callison-Burch

By injecting the causal relations between entities and events as intermediate reasoning steps in our representation, we further boost the performance to 0.67 F1.

LB-SimTSC: An Efficient Similarity-Aware Graph Neural Network for Semi-Supervised Time Series Classification

no code implementations 12 Jan 2023 Wenjie Xi, Arnav Jain, Li Zhang, Jessica Lin

Recently, Similarity-aware Time Series Classification (SimTSC) is proposed to address this problem by using a graph neural network classification model on the graph generated from pairwise Dynamic Time Warping (DTW) distance of batch data.

Classification Dynamic Time Warping +2

PMP: Privacy-Aware Matrix Profile against Sensitive Pattern Inference for Time Series

no code implementations 4 Jan 2023 Li Zhang, Jiahao Ding, Yifeng Gao, Jessica Lin

During the process, data sharing is often involved to allow third-party modelers to perform specific time series data mining (TSDM) tasks based on the needs of the data owner.

Privacy Preserving Time Series Analysis

Language Models are Drummers: Drum Composition with Natural Language Pre-Training

no code implementations 3 Jan 2023 Li Zhang, Chris Callison-Burch

Automatic music generation with artificial intelligence typically requires a large amount of data which is hard to obtain for many less common genres and musical instruments.

Music Generation Transfer Learning

MSV Challenge 2022: NPU-HC Speaker Verification System for Low-resource Indian Languages

no code implementations 30 Nov 2022 Yue Li, Li Zhang, Namin Wang, Jie Liu, Lei Xie

Specifically, the weight transfer fine-tuning aims to constrain the distance of the weights between the pre-trained model and the fine-tuned model, which takes advantage of the previously acquired discriminative ability from the large-scale out-domain datasets and avoids catastrophic forgetting and overfitting at the same time.

Speaker Verification

Panoramic Video Salient Object Detection with Ambisonic Audio Guidance

no code implementations 26 Nov 2022 Xiang Li, Haoyuan Cao, Shijie Zhao, Junlin Li, Li Zhang, Bhiksha Raj

In this paper, we aim to tackle the video salient object detection problem for panoramic videos, with their corresponding ambisonic audios.

object-detection Salient Object Detection +1

Robust Time Series Chain Discovery with Incremental Nearest Neighbors

no code implementations 3 Nov 2022 Li Zhang, Yan Zhu, Yifeng Gao, Jessica Lin

Inspired by a recent work that tracks how the nearest neighbor of a time series subsequence changes over time, we introduce a new TSC definition that is much more robust to noise in the data, in the sense that it can better locate the evolving patterns while excluding the non-evolving ones.

Time Series Analysis

TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge

no code implementations 26 Oct 2022 Bowen Pang, Huan Zhao, Gaosheng Zhang, Xiaoyue Yang, Yang Sun, Li Zhang, Qing Wang, Lei Xie

In this challenge, we explore three typical kinds of speaker diarization systems: spectral clustering (SC) based diarization, target-speaker voice activity detection (TS-VAD), and end-to-end neural diarization (EEND).

Action Detection Activity Detection +2

Generative Model Watermarking Based on Human Visual System

no code implementations 30 Sep 2022 Li Zhang, Yong Liu, Shaoteng Liu, Tianshu Yang, Yexin Wang, Xinpeng Zhang, Hanzhou Wu

Intellectual property protection of deep neural networks is receiving attention from more and more researchers, and the latest research applies model watermarking to generative models for image processing.

NWPU-ASLP System for the VoicePrivacy 2022 Challenge

no code implementations 24 Sep 2022 Jixun Yao, Qing Wang, Li Zhang, Pengcheng Guo, Yuhao Liang, Lei Xie

Our system consists of four modules, including feature extractor, acoustic model, anonymization module, and neural vocoder.

Speaker Verification

Dynamic Graph Message Passing Networks for Visual Recognition

2 code implementations 20 Sep 2022 Li Zhang, Mohan Chen, Anurag Arnab, Xiangyang Xue, Philip H. S. Torr

A fully-connected graph, such as the self-attention operation in Transformers, is beneficial for such modelling; however, its computational overhead is prohibitive.

Image Classification object-detection +3

Model-Guided Multi-Contrast Deep Unfolding Network for MRI Super-resolution Reconstruction

1 code implementation 15 Sep 2022 Gang Yang, Li Zhang, Man Zhou, Aiping Liu, Xun Chen, Zhiwei Xiong, Feng Wu

Interpretable neural network models are of significant interest since they enhance the trustworthiness required in clinical practice when dealing with medical images.

Super-Resolution

Data-Driven Deep Supervision for Skin Lesion Classification

no code implementations 4 Sep 2022 Suraj Mishra, Yizhe Zhang, Li Zhang, Tianyu Zhang, X. Sharon Hu, Danny Z. Chen

Specifically, we analyze the convolutional network's behavior (field-of-view) to find the location of deep supervision for improved feature extraction.

Classification Lesion Classification +1

Scalable Nanophotonic-Electronic Spiking Neural Networks

no code implementations 28 Aug 2022 Luis El Srouji, Yun-jhu Lee, Mehmet Berkay On, Li Zhang, S. J. Ben Yoo

Photonic devices are ideal for the design of high-bandwidth, parallel architectures matching the SNN computational paradigm.

Hierarchical Reinforcement Learning Based Video Semantic Coding for Segmentation

no code implementations 24 Aug 2022 Guangqi Xie, Xin Li, Shiqi Lin, Li Zhang, Kai Zhang, Yue Li, Zhibo Chen

In this paper, we take a step forward to video semantic compression and propose the Hierarchical Reinforcement Learning based task-driven Video Semantic Coding, named as HRLVSC.

Hierarchical Reinforcement Learning reinforcement-learning +3

DeepInteraction: 3D Object Detection via Modality Interaction

2 code implementations 23 Aug 2022 Zeyu Yang, Jiaqi Chen, Zhenwei Miao, Wei Li, Xiatian Zhu, Li Zhang

Existing top-performance 3D object detectors typically rely on the multi-modal fusion strategy.

3D Object Detection object-detection

CelebV-HQ: A Large-Scale Video Facial Attributes Dataset

1 code implementation 25 Jul 2022 Hao Zhu, Wayne Wu, Wentao Zhu, Liming Jiang, Siwei Tang, Li Zhang, Ziwei Liu, Chen Change Loy

Large-scale datasets have played indispensable roles in the recent success of face generation/editing and significantly facilitated the advances of emerging research fields.

Face Generation Unconditional Video Generation

RCLane: Relay Chain Prediction for Lane Detection

no code implementations 19 Jul 2022 Shenghua Xu, Xinyue Cai, Bin Zhao, Li Zhang, Hang Xu, Yanwei Fu, Xiangyang Xue

This is because most existing lane detection methods treat lane detection as either a dense prediction or a detection task; few of them consider the unique topologies (Y-shape, fork-shape, nearly horizontal lanes) of the lane markers, which leads to sub-optimal solutions.

Lane Detection

Vision Transformers: From Semantic Segmentation to Dense Prediction

1 code implementation 19 Jul 2022 Li Zhang, Jiachen Lu, Sixiao Zheng, Xinxuan Zhao, Xiatian Zhu, Yanwei Fu, Tao Xiang, Jianfeng Feng

In this work, for the first time we explore the global context learning potentials of ViTs for dense visual prediction (e.g., semantic segmentation).

Image Classification Instance Segmentation +4

FashionViL: Fashion-Focused Vision-and-Language Representation Learning

1 code implementation 17 Jul 2022 Xiao Han, Licheng Yu, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang

We thus propose a Multi-View Contrastive Learning task for pulling closer the visual representation of one image to the compositional multimodal representation of another image+text.

Contrastive Learning Image Retrieval +2

What Makes for Automatic Reconstruction of Pulmonary Segments

1 code implementation 7 Jul 2022 Kaiming Kuang, Li Zhang, Jingyu Li, Hongwei Li, Jiajun Chen, Bo Du, Jiancheng Yang

The automatic reconstruction of pulmonary segments by ImPulSe is accurate in metrics and visually appealing.

3D Reconstruction

SiamMask: A Framework for Fast Online Object Tracking and Segmentation

no code implementations 5 Jul 2022 Weiming Hu, Qiang Wang, Li Zhang, Luca Bertinetto, Philip H. S. Torr

In this paper we introduce SiamMask, a framework to perform both visual object tracking and video object segmentation, in real-time, with the same simple method.

Multiple Object Tracking Semantic Segmentation +3

Accelerating Score-based Generative Models with Preconditioned Diffusion Sampling

1 code implementation 5 Jul 2022 Hengyuan Ma, Li Zhang, Xiatian Zhu, Jianfeng Feng

However, a fundamental limitation is that their inference is very slow due to a need for many (e.g., 2000) iterations of sequential computations.

Image Generation

Softmax-free Linear Transformers

1 code implementation 5 Jul 2022 Jiachen Lu, Li Zhang, Junge Zhang, Xiatian Zhu, Hang Xu, Jianfeng Feng

Crucially, with a linear complexity, much longer token sequences are permitted in SOFT, resulting in superior trade-off between accuracy and complexity.

Knowledge-aware Neural Collective Matrix Factorization for Cross-domain Recommendation

no code implementations 27 Jun 2022 Li Zhang, Yan Ge, Jun Ma, Jianmo Ni, Haiping Lu

In this paper, we propose to incorporate the knowledge graph (KG) for CDR, which enables items in different domains to share knowledge.

General Knowledge

GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

no code implementations 22 Jun 2022 Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo F. R. Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanchi, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Ronald Cardenas, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Štajner, Sebastien Montella, Shailza, Shailza Jolly, Simon Mille, Tahmid Hasan, Tianhao Shen, Tosin Adewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou

This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims.

Benchmarking Text Generation

Intra Encoding Complexity Control with a Time-Cost Model for Versatile Video Coding

no code implementations 13 Jun 2022 Yan Huang, Jizheng Xu, Li Zhang, Yan Zhao, Li Song

Inspired by rate control algorithms, we propose a scheme to precisely control the intra encoding complexity of VVC.

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

1 code implementation9 Jun 2022 Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. 
Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Jones, Joshua B. Tenenbaum, Joshua S. 
Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramón Risco Delgado, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. 
Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima, Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Timothy Telleen-Lawton, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu

BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.

Common Sense Reasoning Memorization

Learning Ego 3D Representation as Ray Tracing

1 code implementation 8 Jun 2022 Jiachen Lu, Zheyuan Zhou, Xiatian Zhu, Hang Xu, Li Zhang

A self-driving perception model aims to extract 3D semantic representations from multiple cameras collectively into the bird's-eye-view (BEV) coordinate frame of the ego car in order to ground the downstream planner.

3D Object Detection Depth Estimation +3

Accelerating Score-based Generative Models for High-Resolution Image Synthesis

no code implementations 8 Jun 2022 Hengyuan Ma, Li Zhang, Xiatian Zhu, Jingfeng Zhang, Jianfeng Feng

To ensure stability of convergence in sampling and generation quality, however, this sequential sampling process has to take a small step size and many sampling iterations (e.g., 2000).

Image Generation

Region-Aware Metric Learning for Open World Semantic Segmentation via Meta-Channel Aggregation

1 code implementation 17 May 2022 Hexin Dong, ZiFan Chen, Mingze Yuan, Yutong Xie, Jie Zhao, Fei Yu, Bin Dong, Li Zhang

Therefore, we propose a method called region-aware metric learning (RAML), which first separates the regions of the images and generates region-aware features for further metric learning.

Few-Shot Learning Metric Learning +1

Reasoning about Procedures with Natural Language Processing: A Tutorial

no code implementations 16 May 2022 Li Zhang

This tutorial provides a comprehensive and in-depth view of the research on procedures, primarily in Natural Language Processing.

In Defense of Subspace Tracker: Orthogonal Embedding for Visual Tracking

no code implementations 17 Apr 2022 Yao Sui, Guanghui Wang, Li Zhang

The paper focuses on a classical tracking model, subspace learning, grounded on the fact that the targets in successive frames are considered to reside in a low-dimensional subspace or manifold due to the similarity in their appearances.

Visual Tracking

Bidirectional Self-Training with Multiple Anisotropic Prototypes for Domain Adaptive Semantic Segmentation

1 code implementation 16 Apr 2022 Yulei Lu, Yawei Luo, Li Zhang, Zheyang Li, Yi Yang, Jun Xiao

A thriving trend in domain adaptive segmentation endeavors to generate high-quality pseudo labels for the target domain and retrain the segmentor on them.

Pseudo Label Semantic Segmentation +2

UIGR: Unified Interactive Garment Retrieval

1 code implementation 6 Apr 2022 Xiao Han, Sen He, Li Zhang, Yi-Zhe Song, Tao Xiang

In this paper, we propose a Unified Interactive Garment Retrieval (UIGR) framework to unify TGR and VCR.

Retrieval

ImpDet: Exploring Implicit Fields for 3D Object Detection

no code implementations 31 Mar 2022 Xuelin Qian, Li Wang, Yi Zhu, Li Zhang, Yanwei Fu, Xiangyang Xue

Conventional 3D object detection approaches concentrate on bounding boxes representation learning with several parameters, i.e., localization, dimension, and orientation.

3D Object Detection object-detection +1

Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Data

1 code implementation ACL 2022 Shuyan Zhou, Li Zhang, Yue Yang, Qing Lyu, Pengcheng Yin, Chris Callison-Burch, Graham Neubig

To this end, we develop a simple and efficient method that links steps (e.g., "purchase a camera") in an article to other articles with similar goals (e.g., "how to choose a camera"), recursively constructing the KB.

Retrieval Video Retrieval

A general framework for adaptive two-index fusion attribute weighted naive Bayes

no code implementations 24 Feb 2022 Xiaoliang Zhou, Dongyang Wu, Zitong You, Li Zhang, Ning Ye

In addition, the ATFNB framework can improve the existing two-index NB model by introducing the adaptive switching factor $\beta$.

Is "My Favorite New Movie" My Favorite Movie? Probing the Understanding of Recursive Noun Phrases

1 code implementation 15 Dec 2021 Qing Lyu, Hua Zheng, Daoxin Li, Li Zhang, Marianna Apidianaki, Chris Callison-Burch

We introduce the Recursive Noun Phrase Challenge (RNPC), a dataset of three textual inference tasks involving textual entailment and event plausibility comparison, precisely targeting the understanding of recursive NPs.

Common Sense Reasoning Natural Language Inference

Persistent Object Identification Leveraging Non-Visual Markers

1 code implementation 13 Dec 2021 Michael P. J. Camilleri, Li Zhang, Rasneer S. Bains, Andrew Zisserman, Christopher K. I. Williams

Our objective is to locate and provide a unique identifier for each mouse in a cluttered home-cage environment through time, as a precursor to automated behaviour recognition for biological research.

Visual Tracking

SGM3D: Stereo Guided Monocular 3D Object Detection

1 code implementation 3 Dec 2021 Zheyuan Zhou, Liang Du, Xiaoqing Ye, Zhikang Zou, Xiao Tan, Li Zhang, Xiangyang Xue, Jianfeng Feng

Monocular 3D object detection aims to predict the object location, dimension and orientation in 3D space alongside the object category given only a monocular image.

Autonomous Driving Depth Estimation +3

ALX: Large Scale Matrix Factorization on TPUs

no code implementations 3 Dec 2021 Harsh Mehta, Steffen Rendle, Walid Krichene, Li Zhang

We present ALX, an open-source library for distributed matrix factorization using Alternating Least Squares, written in JAX.

Link Prediction

Learning from Mistakes -- A Framework for Neural Architecture Search

1 code implementation 11 Nov 2021 Bhanu Garg, Li Zhang, Pradyumna Sridhara, Ramtin Hosseini, Eric Xing, Pengtao Xie

We propose a novel machine learning method called Learning From Mistakes (LFM), wherein the learner improves its ability to learn by focusing more on the mistakes during revision.

BIG-bench Machine Learning Neural Architecture Search

Revisiting the Performance of iALS on Item Recommendation Benchmarks

1 code implementation 26 Oct 2021 Steffen Rendle, Walid Krichene, Li Zhang, Yehuda Koren

Matrix factorization learned by implicit alternating least squares (iALS) is a popular baseline in recommender system research publications.

Collaborative Filtering Recommendation Systems

iALS++: Speeding up Matrix Factorization with Subspace Optimization

1 code implementation 26 Oct 2021 Steffen Rendle, Walid Krichene, Li Zhang, Yehuda Koren

However, iALS does not scale well with large embedding dimensions, d, due to its cubic runtime dependency on d. Coordinate descent variations, iCD, have been proposed to lower the complexity to quadratic in d. In this work, we show that iCD approaches are not well suited for modern processors and can be an order of magnitude slower than a careful iALS implementation for small to mid scale embedding sizes (d ~ 100) and only perform better than iALS on large embeddings d ~ 1000.

SOFT: Softmax-free Transformer with Linear Complexity

2 code implementations NeurIPS 2021 Jiachen Lu, Jinghan Yao, Junge Zhang, Xiatian Zhu, Hang Xu, Weiguo Gao, Chunjing Xu, Tao Xiang, Li Zhang

Crucially, with a linear complexity, much longer token sequences are permitted in SOFT, resulting in superior trade-off between accuracy and complexity.

Text-Based Person Search with Limited Data

1 code implementation 20 Oct 2021 Xiao Han, Sen He, Li Zhang, Tao Xiang

Firstly, to fully utilize the existing small-scale benchmarking datasets for more discriminative feature learning, we introduce a cross-modal momentum contrastive learning framework to enrich the training data for a given mini-batch.

Ranked #3 on Text based Person Retrieval on CUHK-PEDES (using extra training data)

Benchmarking Contrastive Learning +6

Multi-Frequency Wireless Channel Measurements and Characteristics Analysis in Indoor Corridor Scenarios

no code implementations 14 Aug 2021 ZiHao Zhou, Li Zhang, Xinyue Chen, Cheng-Xiang Wang, Jie Huang

In this paper, we conduct wireless channel measurements in indoor corridor scenarios at 2.4, 5 and 6 GHz bands with bandwidth of 320 MHz.

A Unified Efficient Pyramid Transformer for Semantic Segmentation

no code implementations 29 Jul 2021 Fangrui Zhu, Yi Zhu, Li Zhang, Chongruo wu, Yanwei Fu, Mu Li

Semantic segmentation is a challenging problem due to difficulties in modeling context in complex scenes and class confusions along boundaries.

Semantic Segmentation

Goal-Oriented Script Construction

1 code implementation INLG (ACL) 2021 Qing Lyu, Li Zhang, Chris Callison-Burch

The knowledge of scripts, common chains of events in stereotypical scenarios, is a valuable asset for task-oriented natural language understanding systems.

Language Modelling Natural Language Understanding +1

Global Aggregation then Local Distribution for Scene Parsing

1 code implementation 28 Jul 2021 Xiangtai Li, Li Zhang, Guangliang Cheng, Kuiyuan Yang, Yunhai Tong, Xiatian Zhu, Tao Xiang

Modelling long-range contextual relationships is critical for pixel-wise prediction tasks such as semantic segmentation.

Scene Parsing Semantic Segmentation

Oneshot Differentially Private Top-k Selection

no code implementations 18 May 2021 Gang Qiao, Weijie J. Su, Li Zhang

Being able to efficiently and accurately select the top-$k$ elements with differential privacy is an integral component of various private data analysis tasks.

Composite Localization for Human Pose Estimation

no code implementations 15 May 2021 ZiFan Chen, Xin Qin, Chao Yang, Li Zhang

This work proposes a novel deep learning framework for human pose estimation called composite localization to divide the complex learning objective into two simpler ones: a sparse heatmap to find the keypoint's approximate location and two short-distance offsetmaps to obtain its final precise coordinates.

Pose Estimation

Prediction of clinical tremor severity using Rank Consistent Ordinal Regression

no code implementations 3 May 2021 Li Zhang, Vijay Yadav, Vidya Koesmahargyo, Anzar Abbas, Isaac Galatzer-Levy

The videos are coupled with clinician assessed TETRAS scores, which are used as ground truth labels to train the DNN.

regression Transfer Learning

Delving into Data: Effectively Substitute Training for Black-box Attack

no code implementations CVPR 2021 Wenxuan Wang, Bangjie Yin, Taiping Yao, Li Zhang, Yanwei Fu, Shouhong Ding, Jilin Li, Feiyue Huang, xiangyang xue

Previous substitute training approaches focus on stealing the knowledge of the target model based on real training data or synthetic data, without exploring what kind of data can further improve the transferability between the substitute and target models.

Adversarial Attack

Optimize Neural Fictitious Self-Play in Regret Minimization Thinking

no code implementations 22 Apr 2021 Yuxuan Chen, Li Zhang, Shijian Li, Gang Pan

Optimization of deep learning algorithms to approach Nash Equilibrium remains a significant problem in imperfect information games, e.g., StarCraft and poker.

Starcraft

Improving Weakly-supervised Object Localization via Causal Intervention

1 code implementation 21 Apr 2021 Feifei Shao, Yawei Luo, Li Zhang, Lu Ye, Siliang Tang, Yi Yang, Jun Xiao

The recently emerged weakly supervised object localization (WSOL) methods can learn to localize an object in the image only using image-level labels.

Weakly-Supervised Object Localization

Visual Goal-Step Inference using wikiHow

1 code implementation EMNLP 2021 Yue Yang, Artemis Panagopoulou, Qing Lyu, Li Zhang, Mark Yatskar, Chris Callison-Burch

Understanding what sequence of steps are needed to complete a goal can help artificial intelligence systems reason about human activities.

VGSI

Hierarchical Road Topology Learning for Urban Map-less Driving

no code implementations 31 Mar 2021 Li Zhang, Faezeh Tafazzoli, Gunther Krehl, Runsheng Xu, Timo Rehfeld, Manuel Schier, Arunava Seal

The majority of current approaches in autonomous driving rely on High-Definition (HD) maps which detail the road geometry and surrounding area.

Autonomous Driving

Learning Dynamic Alignment via Meta-filter for Few-shot Learning

1 code implementation CVPR 2021 Chengming Xu, Chen Liu, Li Zhang, Chengjie Wang, Jilin Li, Feiyue Huang, xiangyang xue, Yanwei Fu

Our insight is that these methods would lead to poor adaptation with redundant matching, and leveraging channel-wise adjustment is the key to well adapting the learned knowledge to new classes.

Few-Shot Learning

Robust and Accurate Object Detection via Adversarial Learning

1 code implementation CVPR 2021 Xiangning Chen, Cihang Xie, Mingxing Tan, Li Zhang, Cho-Jui Hsieh, Boqing Gong

Data augmentation has become a de facto component for training high-performance deep image classifiers, but its potential is under-explored for object detection.

AutoML Data Augmentation +2

MoViNets: Mobile Video Networks for Efficient Video Recognition

3 code implementations CVPR 2021 Dan Kondratyuk, Liangzhe Yuan, Yandong Li, Li Zhang, Mingxing Tan, Matthew Brown, Boqing Gong

We present Mobile Video Networks (MoViNets), a family of computation and memory efficient video networks that can operate on streaming video for online inference.

Action Classification Action Recognition +3

Automatically detecting the conflicts between software requirements based on finer semantic analysis

1 code implementation 3 Mar 2021 Weize Guo, Li Zhang, Xiaoli Lian

Besides, our approach is capable of transforming the natural language functional requirements into eight semantic tuples, which is useful not only for detecting conflicts between requirements but also for other tasks, such as constructing the association between requirements.

Association

EEGFuseNet: Hybrid Unsupervised Deep Feature Characterization and Fusion for High-Dimensional EEG with An Application to Emotion Recognition

no code implementations 7 Feb 2021 Zhen Liang, Rushuang Zhou, Li Zhang, Linling Li, Gan Huang, Zhiguo Zhang, Shin Ishii

The performance of the extracted deep and low-dimensional features by EEGFuseNet is carefully evaluated in an unsupervised emotion recognition application based on three public emotion databases.

Electroencephalogram (EEG) Emotion Recognition

Failure Prediction in Production Line Based on Federated Learning: An Empirical Study

no code implementations 25 Jan 2021 Ning Ge, Guanghao Li, Li Zhang, Yi Liu

Data protection across organizations is limiting the application of centralized learning (CL) techniques.

Federated Learning

Few-shot Action Recognition with Prototype-centered Attentive Learning

1 code implementation 20 Jan 2021 Xiatian Zhu, Antoine Toisoul, Juan-Manuel Perez-Rua, Li Zhang, Brais Martinez, Tao Xiang

Extensive experiments on four standard few-shot action benchmarks show that our method clearly outperforms previous state-of-the-art methods, with the improvement particularly significant (10+%) on the most challenging fine-grained action recognition benchmark.

Contrastive Learning Few-Shot action recognition +3

TEAC: Intergrating Trust Region and Max Entropy Actor Critic for Continuous Control

1 code implementation 1 Jan 2021 Hongyu Zang, Xin Li, Li Zhang, Peiyao Zhao, Mingzhong Wang

Trust region methods and maximum entropy methods are two state-of-the-art branches used in reinforcement learning (RL) for the benefits of stability and exploration in continuous environments, respectively.

Continuous Control Reinforcement Learning (RL)

Hop-Hop Relation-aware Graph Neural Networks

no code implementations 21 Dec 2020 Li Zhang, Yan Ge, Haiping Lu

Graph Neural Networks (GNNs) are widely used in graph representation learning.

Knowledge Graph Embedding

Unifying Homophily and Heterophily Network Transformation via Motifs

no code implementations 21 Dec 2020 Yan Ge, Jun Ma, Li Zhang, Haiping Lu

Because H2NT can sparsify networks with motif structures, it can also improve the computational efficiency of existing network embedding methods when integrated.

Network Embedding Node Classification

A Systematic Literature Review on Federated Learning: From A Model Quality Perspective

no code implementations 1 Dec 2020 Yi Liu, Li Zhang, Ning Ge, Guanghao Li

In this process, the server uses an incentive mechanism to encourage clients to contribute high-quality and large-volume data to improve the global model.

Federated Learning

Direct Classification of Emotional Intensity

no code implementations 15 Nov 2020 Jacob Ouyang, Isaac R Galatzer-Levy, Vidya Koesmahargyo, Li Zhang

In this paper, we present a model that can directly predict emotion intensity score from video inputs, instead of deriving from action units.

Classification General Classification

Skin disease diagnosis with deep learning: a review

no code implementations 11 Nov 2020 Hongfeng Li, Yini Pan, Jie Zhao, Li Zhang

As an important part of this article, we then review the literature involving deep learning methods for skin disease diagnosis from several aspects according to the specific tasks.

Towards Efficient Scene Understanding via Squeeze Reasoning

1 code implementation 6 Nov 2020 Xiangtai Li, Xia Li, Ansheng You, Li Zhang, Guangliang Cheng, Kuiyuan Yang, Yunhai Tong, Zhouchen Lin

Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector and perform reasoning within the single vector where the computation cost can be significantly reduced.

Instance Segmentation object-detection +3

Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition

1 code implementation 20 Oct 2020 Yuqian Fu, Li Zhang, Junke Wang, Yanwei Fu, Yu-Gang Jiang

Humans can easily recognize actions with only a few examples given, while the existing video recognition models still heavily rely on the large-scale labeled data inputs.

Few Shot Action Recognition Meta-Learning +2

Towards Optimal Filter Pruning with Balanced Performance and Pruning Speed

1 code implementation 14 Oct 2020 Dong Li, Sitong Chen, Xudong Liu, YunDa Sun, Li Zhang

In this paper, we propose a balanced filter pruning method for both performance and pruning speed.

Holistic Grid Fusion Based Stop Line Estimation

no code implementations 18 Sep 2020 Runsheng Xu, Faezeh Tafazzoli, Li Zhang, Timo Rehfeld, Gunther Krehl, Arunava Seal

Intersection scenarios provide the most complex traffic situations in Autonomous Driving and Driving Assistance Systems.

Autonomous Driving

Reasoning about Goals, Steps, and Temporal Ordering with WikiHow

1 code implementation EMNLP 2020 Li Zhang, Qing Lyu, Chris Callison-Burch

We propose a suite of reasoning tasks on two types of relations between procedural events: goal-step relations ("learn poses" is a step in the larger goal of "doing yoga") and step-step temporal relations ("buy a yoga mat" typically precedes "learn poses").

Cloze Test

Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior

no code implementations 8 Sep 2020 Yan Zhang, Zhao Zhang, Yang Wang, Zheng Zhang, Li Zhang, Shuicheng Yan, Meng Wang

Nonnegative matrix factorization is usually powerful for learning the "shallow" parts-based representation, but it clearly fails to discover deep hierarchical information within both the basis and representation spaces.

Graph Learning Representation Learning

Spatial Language Representation with Multi-Level Geocoding

no code implementations 21 Aug 2020 Sayali Kulkarni, Shailee Jain, Mohammad Javad Hosseini, Jason Baldridge, Eugene Ie, Li Zhang

We present a multi-level geocoding model (MLG) that learns to associate texts to geographic locations.

Toponym Resolution

Zero-Shot Heterogeneous Transfer Learning from Recommender Systems to Cold-Start Search Retrieval

no code implementations 7 Aug 2020 Tao Wu, Ellie Ka-In Chio, Heng-Tze Cheng, Yu Du, Steffen Rendle, Dima Kuzmin, Ritesh Agarwal, Li Zhang, John Anderson, Sarvjeet Singh, Tushar Chandra, Ed H. Chi, Wen Li, Ankit Kumar, Xiang Ma, Alex Soares, Nitin Jindal, Pei Cao

In light of these problems, we observed that most online content platforms have both a search and a recommender system that, while having heterogeneous input spaces, can be connected through their common output item space and a shared semantic representation.

Information Retrieval Recommendation Systems +2

Hybrid Template Canonical Correlation Analysis Method for Enhancing SSVEP Recognition under data-limited Condition

no code implementations 7 Aug 2020 Runfeng Miao, Li Zhang, Qiang Sun

In this study, an advanced CCA-based algorithm called hybrid template canonical correlation analysis (HTCCA) was proposed to improve the performance of brain-computer interface (BCI) based on steady state visual evoked potential (SSVEP) under data-limited condition.

Electroencephalogram (EEG) Transfer Learning

Learning-based Computer-aided Prescription Model for Parkinson's Disease: A Data-driven Perspective

no code implementations 31 Jul 2020 Yinghuan Shi, Wanqi Yang, Kim-Han Thung, Hao Wang, Yang Gao, Yang Pan, Li Zhang, Dinggang Shen

Then, we build a novel computer-aided prescription model by learning the relation between observed symptoms and prescription drug.

A Survey on Concept Factorization: From Shallow to Deep Representation Learning

no code implementations 31 Jul 2020 Zhao Zhang, Yan Zhang, Mingliang Xu, Li Zhang, Yi Yang, Shuicheng Yan

In this paper, we therefore survey the recent advances on CF methodologies and the potential benchmarks by categorizing and summarizing the current methods.

Representation Learning

Improving Semantic Segmentation via Decoupled Body and Edge Supervision

2 code implementations ECCV 2020 Xiangtai Li, Xia Li, Li Zhang, Guangliang Cheng, Jianping Shi, Zhouchen Lin, Shaohua Tan, Yunhai Tong

Our insight is that appealing performance of semantic segmentation requires explicitly modeling the object body and edge, which correspond to the high and low frequency of the image.

Semantic Segmentation

A novel deep learning-based method for monochromatic image synthesis from spectral CT using photon-counting detectors

no code implementations 20 Jul 2020 Ao Zheng, Hongkai Yang, Li Zhang, Yuxiang Xing

To solve this problem, in this paper, we proposed a novel deep learning-based monochromatic image synthesis method working in sinogram domain.

Image Generation

XingGAN for Person Image Generation

2 code implementations ECCV 2020 Hao Tang, Song Bai, Li Zhang, Philip H. S. Torr, Nicu Sebe

We propose a novel Generative Adversarial Network (XingGAN or CrossingGAN) for person image generation tasks, i.e., translating the pose of a given person to a desired one.

Ranked #1 on Pose Transfer on Market-1501 (IS metric)

Pose Transfer

How to trust unlabeled data? Instance Credibility Inference for Few-Shot Learning

2 code implementations 15 Jul 2020 Yikai Wang, Li Zhang, Yuan YAO, Yanwei Fu

We rank the credibility of pseudo-labeled instances along the regularization path of their corresponding incidental parameters, and the most trustworthy pseudo-labeled examples are preserved as the augmented labeled instances.

Data Augmentation Few-Shot Learning

Egocentric Action Recognition by Video Attention and Temporal Context

no code implementations 3 Jul 2020 Juan-Manuel Perez-Rua, Antoine Toisoul, Brais Martinez, Victor Escorcia, Li Zhang, Xiatian Zhu, Tao Xiang

In this challenge, action recognition is posed as the problem of simultaneously predicting a single 'verb' and 'noun' class label given an input trimmed video clip.

Action Recognition

PriceAggregator: An Intelligent System for Hotel Price Fetching

no code implementations 30 Jun 2020 Jiangwei Zhang, Li Zhang, Vigneshwaran Raveendran, Ziv Ben-Zuk, Leonard Lu

The major challenge is that each supplier only allows Agoda to fetch hotel prices at a limited number of Queries Per Second (QPS).

Self-supervised Video Object Segmentation

no code implementations 22 Jun 2020 Fangrui Zhu, Li Zhang, Yanwei Fu, Guodong Guo, Weidi Xie

The objective of this paper is self-supervised representation learning, with the goal of solving semi-supervised video object segmentation (a.k.a.

One-shot visual object segmentation Representation Learning +2

Long-Term Cloth-Changing Person Re-identification

no code implementations 26 May 2020 Xuelin Qian, Wenxuan Wang, Li Zhang, Fangrui Zhu, Yanwei Fu, Tao Xiang, Yu-Gang Jiang, xiangyang xue

Specifically, we consider that under cloth-changes, soft-biometrics such as body shape would be more reliable.

Person Re-Identification

SentPWNet: A Unified Sentence Pair Weighting Network for Task-specific Sentence Embedding

no code implementations 22 May 2020 Li Zhang, Han Wang, Lingxiao Li

Our model, SentPWNet, exploits the neighboring spatial distribution of each sentence as locality weight to indicate the informative level of sentence pair.

Metric Learning Sentence Embedding +2

A Survey on Deep Learning for Neuroimaging-based Brain Disorder Analysis

no code implementations 10 May 2020 Li Zhang, Mingliang Wang, Mingxia Liu, Daoqiang Zhang

Deep learning has been recently used for the analysis of neuroimages, such as structural magnetic resonance imaging (MRI), functional MRI, and positron emission tomography (PET), and has achieved significant performance improvements over traditional machine learning in computer-aided diagnosis of brain disorders.

In-Vehicle Object Detection in the Wild for Driverless Vehicles

no code implementations 27 Apr 2020 Ranjith Dinakaran, Li Zhang, Richard Jiang

In-vehicle human object identification plays an important role in vision-based automated vehicle driving systems while objects such as pedestrians and vehicles on roads or streets are the primary targets to protect from driverless vehicles.

object-detection Object Detection

Universal Adversarial Perturbations Generative Network for Speaker Recognition

1 code implementation 7 Apr 2020 Jiguo Li, Xinfeng Zhang, Chuanmin Jia, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao

Attacking deep learning based biometric systems has drawn more and more attention with the wide deployment of fingerprint/face/speaker recognition systems, given the fact that the neural networks are vulnerable to the adversarial examples, which have been intentionally perturbed to remain almost imperceptible for human.

Speaker Recognition

Direct Speech-to-image Translation

1 code implementation 7 Apr 2020 Jiguo Li, Xinfeng Zhang, Chuanmin Jia, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao

In this paper, we attempt to translate the speech signals into the image signals without the transcription stage.

Multimedia Sound Audio and Speech Processing

Learning to fool the speaker recognition

1 code implementation 7 Apr 2020 Jiguo Li, Xinfeng Zhang, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao

Due to the widespread deployment of fingerprint/face/speaker recognition systems, attacking deep learning based biometric systems has drawn more and more attention.

Audio and Speech Processing Cryptography and Security Sound