Search Results for author: Yao Zhao

Found 195 papers, 88 papers with code

ForumSum: A Multi-Speaker Conversation Summarization Dataset

no code implementations • Findings (EMNLP) 2021 • Misha Khalman, Yao Zhao, Mohammad Saleh

We also show that using a conversational corpus for pre-training improves the quality of the chat summarization model.

Implicit Relation Linking for Question Answering over Knowledge Graph

no code implementations • Findings (ACL) 2022 • Yao Zhao, Jiacheng Huang, Wei Hu, Qijin Chen, Xiaoxia Qiu, Chengfu Huo, Weijun Ren

In this paper, we propose an implicit RL method called ImRL, which links relation phrases in NL to relation paths in KG.

Question Answering Relation +1

Digging into contrastive learning for robust depth estimation with diffusion models

no code implementations • 15 Apr 2024 • Jiyuan Wang, Chunyu Lin, Lang Nie, Kang Liao, Shuwei Shao, Yao Zhao

In this paper, we propose a novel robust depth estimation method called D4RD, featuring a custom contrastive learning mode tailored for diffusion models to mitigate performance degradation in complex environments.

Contrastive Learning Denoising +2

Transferable and Principled Efficiency for Open-Vocabulary Segmentation

2 code implementations • 11 Apr 2024 • Jingxuan Xu, Wuyang Chen, Yao Zhao, Yunchao Wei

In the context of efficient OVS, we target achieving performance that is comparable to or even better than prior OVS works based on large vision-language foundation models, by utilizing smaller models that incur lower training costs.

Model Compression

131

TransformerLSR: Attentive Joint Model of Longitudinal Data, Survival, and Recurrent Events with Concurrent Latent Structure

no code implementations • 4 Apr 2024 • Zhiyue Zhang, Yao Zhao, Yanxun Xu

However, current methods only address joint modeling of longitudinal measurements at regularly-spaced observation times and survival events, neglecting recurrent events.

Epidemiology Point Processes

Learning Trimaps via Clicks for Image Matting

1 code implementation • 30 Mar 2024 • Chenyi Zhang, Yihan Hu, Henghui Ding, Humphrey Shi, Yao Zhao, Yunchao Wei

Despite significant advancements in image matting, existing models heavily depend on manually-drawn trimaps for accurate results in natural image scenarios.

Image Matting

BlindDiff: Empowering Degradation Modelling in Diffusion Models for Blind Image Super-Resolution

1 code implementation • 15 Mar 2024 • Feng Li, Yixuan Wu, Zichao Liang, Runmin Cong, Huihui Bai, Yao Zhao, Meng Wang

BlindDiff seamlessly integrates the MAP-based optimization into DMs, which constructs a joint distribution of the low-resolution (LR) observation, high-resolution (HR) data, and degradation kernels for the data and kernel priors, and solves the blind SR problem by unfolding MAP approach along with the reverse process.

Image Restoration Image Super-Resolution

Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning

1 code implementation • 12 Mar 2024 • Chuangchuang Tan, Yao Zhao, Shikui Wei, Guanghua Gu, Ping Liu, Yunchao Wei

Consequently, these detectors have exhibited a lack of proficiency in learning the frequency domain and tend to overfit to the artifacts present in the training data, leading to suboptimal performance on unseen sources.

DeepFake Detection Face Swapping

Learning Hierarchical Color Guidance for Depth Map Super-Resolution

no code implementations • 12 Mar 2024 • Runmin Cong, Ronghui Sheng, Hao Wu, Yulan Guo, Yunchao Wei, WangMeng Zuo, Yao Zhao, Sam Kwong

On the one hand, the low-level detail embedding module is designed to supplement high-frequency color information of depth features in a residual mask manner at the low-level stages.

Depth Map Super-Resolution

Data-Independent Operator: A Training-Free Artifact Representation Extractor for Generalizable Deepfake Detection

1 code implementation • 11 Mar 2024 • Chuangchuang Tan, Ping Liu, Renshuai Tao, Huan Liu, Yao Zhao, Baoyuan Wu, Yunchao Wei

Due to its unbias towards both the training and test sources, we define it as Data-Independent Operator (DIO) to achieve appealing improvements on unseen sources.

DeepFake Detection Face Swapping

SiGNN: A Spike-induced Graph Neural Network for Dynamic Graph Representation Learning

no code implementations • 11 Mar 2024 • Dong Chen, Shuai Zheng, Muhao Xu, Zhenfeng Zhu, Yao Zhao

In the domain of dynamic graph representation learning (DGRL), the efficient and comprehensive capture of temporal evolution within real-world networks is crucial.

Graph Representation Learning Node Classification

Eliminating Warping Shakes for Unsupervised Online Video Stitching

1 code implementation • 11 Mar 2024 • Lang Nie, Chunyu Lin, Kang Liao, Yun Zhang, Shuaicheng Liu, Yao Zhao

In this paper, we retarget video stitching to an emerging issue, named warping shake, when extending image stitching to video stitching.

Image Stitching Video Stabilization

Towards the Uncharted: Density-Descending Feature Perturbation for Semi-supervised Semantic Segmentation

no code implementations • 11 Mar 2024 • Xiaoyang Wang, Huihui Bai, Limin Yu, Yao Zhao, Jimin Xiao

Inspired by the low-density separation assumption in semi-supervised learning, our key insight is that feature density can shed a light on the most promising direction for the segmentation classifier to explore, which is the regions with lower density.

Semi-Supervised Semantic Segmentation

Query-guided Prototype Evolution Network for Few-Shot Segmentation

no code implementations • 11 Mar 2024 • Runmin Cong, Hang Xiong, Jinpeng Chen, Wei Zhang, Qingming Huang, Yao Zhao

To address this, we present the Query-guided Prototype Evolution Network (QPENet), a new method that integrates query features into the generation process of foreground and background prototypes, thereby yielding customized prototypes attuned to specific queries.

Segmentation

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

no code implementations • 8 Mar 2024 • Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry Lepikhin, Timothy Lillicrap, Jean-Baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser, Ioannis Antonoglou, Rohan Anil, Sebastian Borgeaud, Andrew Dai, Katie Millican, Ethan Dyer, Mia Glaese, Thibault Sottiaux, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, James Molloy, Jilin Chen, Michael Isard, Paul Barham, Tom Hennigan, Ross Mcilroy, Melvin Johnson, Johan Schalkwyk, Eli Collins, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, Clemens Meyer, Gregory Thornton, Zhen Yang, Henryk Michalewski, Zaheer Abbas, Nathan Schucher, Ankesh Anand, Richard Ives, James Keeling, Karel Lenc, Salem Haykal, Siamak Shakeri, Pranav Shyam, Aakanksha Chowdhery, Roman Ring, Stephen Spencer, Eren Sezener, Luke Vilnis, Oscar Chang, Nobuyuki Morioka, George Tucker, Ce Zheng, Oliver Woodman, Nithya Attaluri, Tomas Kocisky, Evgenii Eltyshev, Xi Chen, Timothy Chung, Vittorio Selo, Siddhartha Brahma, Petko Georgiev, Ambrose Slone, Zhenkai Zhu, James Lottes, Siyuan Qiao, Ben Caine, Sebastian Riedel, Alex Tomala, Martin Chadwick, Juliette Love, Peter Choy, Sid Mittal, Neil Houlsby, Yunhao Tang, Matthew Lamm, Libin Bai, Qiao Zhang, Luheng He, Yong Cheng, Peter Humphreys, Yujia Li, Sergey Brin, Albin Cassirer, Yingjie Miao, Lukas Zilka, Taylor Tobin, Kelvin Xu, Lev Proleev, Daniel Sohn, Alberto Magni, Lisa Anne Hendricks, Isabel Gao, Santiago Ontañón, Oskar Bunyan, Nathan Byrd, Abhanshu Sharma, Biao Zhang, Mario Pinto, Rishika Sinha, Harsh Mehta, Dawei Jia, Sergi Caelles, Albert Webson, Alex Morris, Becca Roelofs, Yifan Ding, Robin Strudel, Xuehan Xiong, Marvin Ritter, Mostafa Dehghani, Rahma Chaabouni, Abhijit Karmarkar, Guangda Lai, Fabian Mentzer, Bibo Xu, Yaguang Li, Yujing Zhang, Tom Le Paine, Alex Goldin, Behnam Neyshabur, Kate Baumli, Anselm Levskaya, Michael Laskin, Wenhao Jia, Jack W. 
Rae, Kefan Xiao, Antoine He, Skye Giordano, Lakshman Yagati, Jean-Baptiste Lespiau, Paul Natsev, Sanjay Ganapathy, Fangyu Liu, Danilo Martins, Nanxin Chen, Yunhan Xu, Megan Barnes, Rhys May, Arpi Vezer, Junhyuk Oh, Ken Franko, Sophie Bridgers, Ruizhe Zhao, Boxi Wu, Basil Mustafa, Sean Sechrist, Emilio Parisotto, Thanumalayan Sankaranarayana Pillai, Chris Larkin, Chenjie Gu, Christina Sorokin, Maxim Krikun, Alexey Guseynov, Jessica Landon, Romina Datta, Alexander Pritzel, Phoebe Thacker, Fan Yang, Kevin Hui, Anja Hauth, Chih-Kuan Yeh, David Barker, Justin Mao-Jones, Sophia Austin, Hannah Sheahan, Parker Schuh, James Svensson, Rohan Jain, Vinay Ramasesh, Anton Briukhov, Da-Woon Chung, Tamara von Glehn, Christina Butterfield, Priya Jhakra, Matthew Wiethoff, Justin Frye, Jordan Grimstad, Beer Changpinyo, Charline Le Lan, Anna Bortsova, Yonghui Wu, Paul Voigtlaender, Tara Sainath, Charlotte Smith, Will Hawkins, Kris Cao, James Besley, Srivatsan Srinivasan, Mark Omernick, Colin Gaffney, Gabriela Surita, Ryan Burnell, Bogdan Damoc, Junwhan Ahn, Andrew Brock, Mantas Pajarskas, Anastasia Petrushkina, Seb Noury, Lorenzo Blanco, Kevin Swersky, Arun Ahuja, Thi Avrahami, Vedant Misra, Raoul de Liedekerke, Mariko Iinuma, Alex Polozov, Sarah York, George van den Driessche, Paul Michel, Justin Chiu, Rory Blevins, Zach Gleicher, Adrià Recasens, Alban Rrustemi, Elena Gribovskaya, Aurko Roy, Wiktor Gworek, Séb Arnold, Lisa Lee, James Lee-Thorp, Marcello Maggioni, Enrique Piqueras, Kartikeya Badola, Sharad Vikram, Lucas Gonzalez, Anirudh Baddepudi, Evan Senter, Jacob Devlin, James Qin, Michael Azzam, Maja Trebacz, Martin Polacek, Kashyap Krishnakumar, Shuo-Yiin Chang, Matthew Tung, Ivo Penchev, Rishabh Joshi, Kate Olszewska, Carrie Muir, Mateo Wirth, Ale Jakse Hartman, Josh Newlan, Sheleem Kashem, Vijay Bolina, Elahe Dabir, Joost van Amersfoort, Zafarali Ahmed, James Cobon-Kerr, Aishwarya Kamath, Arnar Mar Hrafnkelsson, Le Hou, Ian Mackinnon, Alexandre Frechette, Eric Noland, Xiance 
Si, Emanuel Taropa, Dong Li, Phil Crone, Anmol Gulati, Sébastien Cevey, Jonas Adler, Ada Ma, David Silver, Simon Tokumine, Richard Powell, Stephan Lee, Michael Chang, Samer Hassan, Diana Mincu, Antoine Yang, Nir Levine, Jenny Brennan, Mingqiu Wang, Sarah Hodkinson, Jeffrey Zhao, Josh Lipschultz, Aedan Pope, Michael B. Chang, Cheng Li, Laurent El Shafey, Michela Paganini, Sholto Douglas, Bernd Bohnet, Fabio Pardo, Seth Odoom, Mihaela Rosca, Cicero Nogueira dos santos, Kedar Soparkar, Arthur Guez, Tom Hudson, Steven Hansen, Chulayuth Asawaroengchai, Ravi Addanki, Tianhe Yu, Wojciech Stokowiec, Mina Khan, Justin Gilmer, Jaehoon Lee, Carrie Grimes Bostock, Keran Rong, Jonathan Caton, Pedram Pejman, Filip Pavetic, Geoff Brown, Vivek Sharma, Mario Lučić, Rajkumar Samuel, Josip Djolonga, Amol Mandhane, Lars Lowe Sjösund, Elena Buchatskaya, Elspeth White, Natalie Clay, Jiepu Jiang, Hyeontaek Lim, Ross Hemsley, Jane Labanowski, Nicola De Cao, David Steiner, Sayed Hadi Hashemi, Jacob Austin, Anita Gergely, Tim Blyth, Joe Stanton, Kaushik Shivakumar, Aditya Siddhant, Anders Andreassen, Carlos Araya, Nikhil Sethi, Rakesh Shivanna, Steven Hand, Ankur Bapna, Ali Khodaei, Antoine Miech, Garrett Tanzer, Andy Swing, Shantanu Thakoor, Zhufeng Pan, Zachary Nado, Stephanie Winkler, Dian Yu, Mohammad Saleh, Loren Maggiore, Iain Barr, Minh Giang, Thais Kagohara, Ivo Danihelka, Amit Marathe, Vladimir Feinberg, Nimesh Ghelani, Dan Horgan, Helen Miller, Lexi Walker, Richard Tanburn, Mukarram Tariq, Disha Shrivastava, Fei Xia, Chung-Cheng Chiu, Khuslen Baatarsukh, Sina Samangooei, Fred Alcober, Axel Stjerngren, Paul Komarek, Katerina Tsihlas, Anudhyan Boral, Ramona Comanescu, Jeremy Chen, Ruibo Liu, Dawn Bloxwich, Charlie Chen, Yanhua Sun, Fangxiaoyu Feng, Matthew Mauger, Xerxes Dotiwalla, Vincent Hellendoorn, Michael Sharman, Ivy Zheng, Krishna Haridasan, Gabe Barth-Maron, Craig Swanson, Dominika Rogozińska, Alek Andreev, Paul Kishan Rubenstein, Ruoxin Sang, Dan Hurt, Gamaleldin Elsayed, 
Renshen Wang, Dave Lacey, Anastasija Ilić, Yao Zhao, Lora Aroyo, Chimezie Iwuanyanwu, Vitaly Nikolaev, Balaji Lakshminarayanan, Sadegh Jazayeri, Raphaël Lopez Kaufman, Mani Varadarajan, Chetan Tekur, Doug Fritz, Misha Khalman, David Reitter, Kingshuk Dasgupta, Shourya Sarcar, Tina Ornduff, Javier Snaider, Fantine Huot, Johnson Jia, Rupert Kemp, Nejc Trdin, Anitha Vijayakumar, Lucy Kim, Christof Angermueller, Li Lao, Tianqi Liu, Haibin Zhang, David Engel, Somer Greene, Anaïs White, Jessica Austin, Lilly Taylor, Shereen Ashraf, Dangyi Liu, Maria Georgaki, Irene Cai, Yana Kulizhskaya, Sonam Goenka, Brennan Saeta, Kiran Vodrahalli, Christian Frank, Dario de Cesare, Brona Robenek, Harry Richardson, Mahmoud Alnahlawi, Christopher Yew, Priya Ponnapalli, Marco Tagliasacchi, Alex Korchemniy, Yelin Kim, Dinghua Li, Bill Rosgen, Zoe Ashwood, Kyle Levin, Jeremy Wiesner, Praseem Banzal, Praveen Srinivasan, Hongkun Yu, Çağlar Ünlü, David Reid, Zora Tung, Daniel Finchelstein, Ravin Kumar, Andre Elisseeff, Jin Huang, Ming Zhang, Rui Zhu, Ricardo Aguilar, Mai Giménez, Jiawei Xia, Olivier Dousse, Willi Gierke, Soheil Hassas Yeganeh, Damion Yates, Komal Jalan, Lu Li, Eri Latorre-Chimoto, Duc Dung Nguyen, Ken Durden, Praveen Kallakuri, Yaxin Liu, Matthew Johnson, Tomy Tsai, Alice Talbert, Jasmine Liu, Alexander Neitz, Chen Elkind, Marco Selvi, Mimi Jasarevic, Livio Baldini Soares, Albert Cui, Pidong Wang, Alek Wenjiao Wang, Xinyu Ye, Krystal Kallarackal, Lucia Loher, Hoi Lam, Josef Broder, Dan Holtmann-Rice, Nina Martin, Bramandia Ramadhana, Daniel Toyama, Mrinal Shukla, Sujoy Basu, Abhi Mohan, Nick Fernando, Noah Fiedel, Kim Paterson, Hui Li, Ankush Garg, Jane Park, DongHyun Choi, Diane Wu, Sankalp Singh, Zhishuai Zhang, Amir Globerson, Lily Yu, John Carpenter, Félix de Chaumont Quitry, Carey Radebaugh, Chu-Cheng Lin, Alex Tudor, Prakash Shroff, Drew Garmon, Dayou Du, Neera Vats, Han Lu, Shariq Iqbal, Alex Yakubovich, Nilesh Tripuraneni, James Manyika, Haroon Qureshi, Nan Hua, 
Christel Ngani, Maria Abi Raad, Hannah Forbes, Anna Bulanova, Jeff Stanway, Mukund Sundararajan, Victor Ungureanu, Colton Bishop, Yunjie Li, Balaji Venkatraman, Bo Li, Chloe Thornton, Salvatore Scellato, Nishesh Gupta, Yicheng Wang, Ian Tenney, Xihui Wu, Ashish Shenoy, Gabriel Carvajal, Diana Gage Wright, Ben Bariach, Zhuyun Xiao, Peter Hawkins, Sid Dalmia, Clement Farabet, Pedro Valenzuela, Quan Yuan, Chris Welty, Ananth Agarwal, Mia Chen, Wooyeol Kim, Brice Hulse, Nandita Dukkipati, Adam Paszke, Andrew Bolt, Elnaz Davoodi, Kiam Choo, Jennifer Beattie, Jennifer Prendki, Harsha Vashisht, Rebeca Santamaria-Fernandez, Luis C. Cobo, Jarek Wilkiewicz, David Madras, Ali Elqursh, Grant Uy, Kevin Ramirez, Matt Harvey, Tyler Liechty, Heiga Zen, Jeff Seibert, Clara Huiyi Hu, Mohamed Elhawaty, Andrey Khorlin, Maigo Le, Asaf Aharoni, Megan Li, Lily Wang, Sandeep Kumar, Alejandro Lince, Norman Casagrande, Jay Hoover, Dalia El Badawy, David Soergel, Denis Vnukov, Matt Miecnikowski, Jiri Simsa, Anna Koop, Praveen Kumar, Thibault Sellam, Daniel Vlasic, Samira Daruki, Nir Shabat, John Zhang, Guolong Su, Jiageng Zhang, Jeremiah Liu, Yi Sun, Evan Palmer, Alireza Ghaffarkhah, Xi Xiong, Victor Cotruta, Michael Fink, Lucas Dixon, Ashwin Sreevatsa, Adrian Goedeckemeyer, Alek Dimitriev, Mohsen Jafari, Remi Crocker, Nicholas FitzGerald, Aviral Kumar, Sanjay Ghemawat, Ivan Philips, Frederick Liu, Yannie Liang, Rachel Sterneck, Alena Repina, Marcus Wu, Laura Knight, Marin Georgiev, Hyo Lee, Harry Askham, Abhishek Chakladar, Annie Louis, Carl Crous, Hardie Cate, Dessie Petrova, MICHAEL QUINN, Denese Owusu-Afriyie, Achintya Singhal, Nan Wei, Solomon Kim, Damien Vincent, Milad Nasr, Christopher A. 
Choquette-Choo, Reiko Tojo, Shawn Lu, Diego de Las Casas, Yuchung Cheng, Tolga Bolukbasi, Katherine Lee, Saaber Fatehi, Rajagopal Ananthanarayanan, Miteyan Patel, Charbel Kaed, Jing Li, Jakub Sygnowski, Shreyas Rammohan Belle, Zhe Chen, Jaclyn Konzelmann, Siim Põder, Roopal Garg, Vinod Koverkathu, Adam Brown, Chris Dyer, Rosanne Liu, Azade Nova, Jun Xu, Slav Petrov, Demis Hassabis, Koray Kavukcuoglu, Jeffrey Dean, Oriol Vinyals

In this report, we present the latest model of the Gemini family, Gemini 1.5 Pro, a highly compute-efficient multimodal mixture-of-experts model capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio.

Ranked #19 on Code Generation on HumanEval

Code Generation Retrieval

Region-Adaptive Transform with Segmentation Prior for Image Compression

1 code implementation • 1 Mar 2024 • Yuxi Liu, Wenhan Yang, Huihui Bai, Yunchao Wei, Yao Zhao

However, there is no prior research on neural transform that focuses on specific regions.

Ranked #1 on Image Compression on kodak

Image Compression Segmentation

Direct Language Model Alignment from Online AI Feedback

no code implementations • 7 Feb 2024 • Shangmin Guo, Biao Zhang, Tianlin Liu, Tianqi Liu, Misha Khalman, Felipe Llinares, Alexandre Rame, Thomas Mesnard, Yao Zhao, Bilal Piot, Johan Ferret, Mathieu Blondel

Moreover, responses in these datasets are often sampled from a language model distinct from the one being aligned, and since the model evolves over training, the alignment phase is inevitably off-policy.

Language Modelling

LiPO: Listwise Preference Optimization through Learning-to-Rank

no code implementations • 2 Feb 2024 • Tianqi Liu, Zhen Qin, Junru Wu, Jiaming Shen, Misha Khalman, Rishabh Joshi, Yao Zhao, Mohammad Saleh, Simon Baumgartner, Jialu Liu, Peter J. Liu, Xuanhui Wang

In this work, we formulate the LM alignment as a listwise ranking problem and describe the Listwise Preference Optimization (LiPO) framework, where the policy can potentially learn more effectively from a ranked list of plausible responses given the prompt.

Learning-To-Rank

EASRec: Elastic Architecture Search for Efficient Long-term Sequential Recommender Systems

no code implementations • 1 Feb 2024 • Sheng Zhang, Maolin Wang, Yao Zhao, Chenyi Zhuang, Jinjie Gu, Ruocheng Guo, Xiangyu Zhao, Zijian Zhang, Hongzhi Yin

Our research addresses the computational and resource inefficiencies that current Sequential Recommender Systems (SRSs) suffer from.

Neural Architecture Search Recommendation Systems

One for all: A novel Dual-space Co-training baseline for Large-scale Multi-View Clustering

no code implementations • 28 Jan 2024 • Zisen Kong, Zhiqiang Fu, Dongxia Chang, Yiming Wang, Yao Zhao

We jointly optimize the construction of the latent consistent anchor graph and the feature transformation to generate a discriminative anchor graph.

Clustering

Semi-Supervised Coupled Thin-Plate Spline Model for Rotation Correction and Beyond

1 code implementation • 24 Jan 2024 • Lang Nie, Chunyu Lin, Kang Liao, Shuaicheng Liu, Yao Zhao

To break this bottleneck, we propose the coupled thin-plate spline model (CoupledTPS), which iteratively couples multiple TPS with limited control points into a more flexible and powerful transformation.

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

1 code implementation • 5 Jan 2024 • DeepSeek-AI, Xiao Bi, Deli Chen, Guanting Chen, Shanhuang Chen, Damai Dai, Chengqi Deng, Honghui Ding, Kai Dong, Qiushi Du, Zhe Fu, Huazuo Gao, Kaige Gao, Wenjun Gao, Ruiqi Ge, Kang Guan, Daya Guo, JianZhong Guo, Guangbo Hao, Zhewen Hao, Ying He, Wenjie Hu, Panpan Huang, Erhang Li, Guowei Li, Jiashi Li, Yao Li, Y. K. Li, Wenfeng Liang, Fangyun Lin, A. X. Liu, Bo Liu, Wen Liu, Xiaodong Liu, Xin Liu, Yiyuan Liu, Haoyu Lu, Shanghao Lu, Fuli Luo, Shirong Ma, Xiaotao Nie, Tian Pei, Yishi Piao, Junjie Qiu, Hui Qu, Tongzheng Ren, Zehui Ren, Chong Ruan, Zhangli Sha, Zhihong Shao, Junxiao Song, Xuecheng Su, Jingxiang Sun, Yaofeng Sun, Minghui Tang, Bingxuan Wang, Peiyi Wang, Shiyu Wang, Yaohui Wang, Yongji Wang, Tong Wu, Y. Wu, Xin Xie, Zhenda Xie, Ziwei Xie, Yiliang Xiong, Hanwei Xu, R. X. Xu, Yanhong Xu, Dejian Yang, Yuxiang You, Shuiping Yu, Xingkai Yu, B. Zhang, Haowei Zhang, Lecong Zhang, Liyue Zhang, Mingchuan Zhang, Minghua Zhang, Wentao Zhang, Yichao Zhang, Chenggang Zhao, Yao Zhao, Shangyan Zhou, Shunfeng Zhou, Qihao Zhu, Yuheng Zou

The rapid development of open-source large language models (LLMs) has been truly remarkable.

1,114

4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency

no code implementations • 28 Dec 2023 • Yuyang Yin, Dejia Xu, Zhangyang Wang, Yao Zhao, Yunchao Wei

Our pipeline facilitates conditional 4D generation, enabling users to specify geometry (3D assets) and motion (monocular videos), thus offering superior control over content creation.

Prompt Engineering

Forgery-aware Adaptive Transformer for Generalizable Synthetic Image Detection

no code implementations • 27 Dec 2023 • Huan Liu, Zichang Tan, Chuangchuang Tan, Yunchao Wei, Yao Zhao, Jingdong Wang

In this paper, we study the problem of generalizable synthetic image detection, aiming to detect forgery images from diverse generative methods, e.g., GANs and diffusion models.

Attribute Synthetic Image Detection

360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception

no code implementations • 26 Dec 2023 • Zhijie Shen, Chunyu Lin, Junsong Zhang, Lang Nie, Kang Liao, Yao Zhao

Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results as the compression process often muddles the semantics between various planes.

Disentanglement

Lookahead: An Inference Acceleration Framework for Large Language Model with Lossless Generation Accuracy

1 code implementation • 20 Dec 2023 • Yao Zhao, Zhitian Xie, Chenyi Zhuang, Jinjie Gu

Hence, this paper presents a generic framework for accelerating the inference process, resulting in a substantial increase in speed and cost reduction for our RAG system, with lossless generation accuracy.

Language Modelling Large Language Model +3

236

Gemini: A Family of Highly Capable Multimodal Models

no code implementations • The Keyword 2023 • Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, Ryan Doherty, Eli Collins, Clemens Meyer, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, Jack Krawczyk, Ed Chi, Heng-Tze Cheng, Eric Ni, Purvi Shah, Patrick Kane, Betty Chan, Manaal Faruqui, Aliaksei Severyn, Hanzhao Lin, Yaguang Li, Yong Cheng, Mahdis Mahdieh, Mia Chen, Pei Sun, Dustin Tran, Sumit Bagri, Balaji Lakshminarayanan, Jeremiah Liu, Andras Orban, Fabian Güra, Hao Zhou, Xinying Song, Aurelien Boffy, Harish Ganapathy, Steven Zheng, HyunJeong Choe, Ágoston Weisz, Tao Zhu, Yifeng Lu, Siddharth Gopal, Jarrod Kahn, Maciej Kula, Jeff Pitman, Rushin Shah, Emanuel Taropa, Majd Al Merey, Martin Baeuml, Zhifeng Chen, Laurent El Shafey, Yujing Zhang, Olcan Sercinoglu, George Tucker, Enrique Piqueras, Maxim Krikun, Iain Barr, Nikolay Savinov, Ivo Danihelka, Becca Roelofs, Anaïs White, Anders Andreassen, Tamara von Glehn, Lakshman Yagati, Mehran Kazemi, Lucas Gonzalez, Misha Khalman, Jakub Sygnowski, Alexandre Frechette, Charlotte Smith, Laura Culp, Lev Proleev, Yi Luan, Xi Chen, James Lottes, Nathan Schucher, Federico Lebron, Alban Rrustemi, Natalie Clay, Phil Crone, Tomas Kocisky, Jeffrey Zhao, Bartek Perz, Dian Yu, Heidi Howard, Adam Bloniarz, Jack W. 
Rae, Han Lu, Laurent SIfre, Marcello Maggioni, Fred Alcober, Dan Garrette, Megan Barnes, Shantanu Thakoor, Jacob Austin, Gabriel Barth-Maron, William Wong, Rishabh Joshi, Rahma Chaabouni, Deeni Fatiha, Arun Ahuja, Gaurav Singh Tomar, Evan Senter, Martin Chadwick, Ilya Kornakov, Nithya Attaluri, Iñaki Iturrate, Ruibo Liu, Yunxuan Li, Sarah Cogan, Jeremy Chen, Chao Jia, Chenjie Gu, Qiao Zhang, Jordan Grimstad, Ale Jakse Hartman, Xavier Garcia, Thanumalayan Sankaranarayana Pillai, Jacob Devlin, Michael Laskin, Diego de Las Casas, Dasha Valter, Connie Tao, Lorenzo Blanco, Adrià Puigdomènech Badia, David Reitter, Mianna Chen, Jenny Brennan, Clara Rivera, Sergey Brin, Shariq Iqbal, Gabriela Surita, Jane Labanowski, Abhi Rao, Stephanie Winkler, Emilio Parisotto, Yiming Gu, Kate Olszewska, Ravi Addanki, Antoine Miech, Annie Louis, Denis Teplyashin, Geoff Brown, Elliot Catt, Jan Balaguer, Jackie Xiang, Pidong Wang, Zoe Ashwood, Anton Briukhov, Albert Webson, Sanjay Ganapathy, Smit Sanghavi, Ajay Kannan, Ming-Wei Chang, Axel Stjerngren, Josip Djolonga, Yuting Sun, Ankur Bapna, Matthew Aitchison, Pedram Pejman, Henryk Michalewski, Tianhe Yu, Cindy Wang, Juliette Love, Junwhan Ahn, Dawn Bloxwich, Kehang Han, Peter Humphreys, Thibault Sellam, James Bradbury, Varun Godbole, Sina Samangooei, Bogdan Damoc, Alex Kaskasoli, Sébastien M. R. 
Arnold, Vijay Vasudevan, Shubham Agrawal, Jason Riesa, Dmitry Lepikhin, Richard Tanburn, Srivatsan Srinivasan, Hyeontaek Lim, Sarah Hodkinson, Pranav Shyam, Johan Ferret, Steven Hand, Ankush Garg, Tom Le Paine, Jian Li, Yujia Li, Minh Giang, Alexander Neitz, Zaheer Abbas, Sarah York, Machel Reid, Elizabeth Cole, Aakanksha Chowdhery, Dipanjan Das, Dominika Rogozińska, Vitaliy Nikolaev, Pablo Sprechmann, Zachary Nado, Lukas Zilka, Flavien Prost, Luheng He, Marianne Monteiro, Gaurav Mishra, Chris Welty, Josh Newlan, Dawei Jia, Miltiadis Allamanis, Clara Huiyi Hu, Raoul de Liedekerke, Justin Gilmer, Carl Saroufim, Shruti Rijhwani, Shaobo Hou, Disha Shrivastava, Anirudh Baddepudi, Alex Goldin, Adnan Ozturel, Albin Cassirer, Yunhan Xu, Daniel Sohn, Devendra Sachan, Reinald Kim Amplayo, Craig Swanson, Dessie Petrova, Shashi Narayan, Arthur Guez, Siddhartha Brahma, Jessica Landon, Miteyan Patel, Ruizhe Zhao, Kevin Villela, Luyu Wang, Wenhao Jia, Matthew Rahtz, Mai Giménez, Legg Yeung, James Keeling, Petko Georgiev, Diana Mincu, Boxi Wu, Salem Haykal, Rachel Saputro, Kiran Vodrahalli, James Qin, Zeynep Cankara, Abhanshu Sharma, Nick Fernando, Will Hawkins, Behnam Neyshabur, Solomon Kim, Adrian Hutter, Priyanka Agrawal, Alex Castro-Ros, George van den Driessche, Tao Wang, Shuo-Yiin Chang, Paul Komarek, Ross Mcilroy, Mario Lučić, Guodong Zhang, Wael Farhan, Michael Sharman, Paul Natsev, Paul Michel, Yamini Bansal, Siyuan Qiao, Kris Cao, Siamak Shakeri, Christina Butterfield, Justin Chung, Paul Kishan Rubenstein, Shivani Agrawal, Arthur Mensch, Kedar Soparkar, Karel Lenc, Timothy Chung, Aedan Pope, Loren Maggiore, Jackie Kay, Priya Jhakra, Shibo Wang, Joshua Maynez, Mary Phuong, Taylor Tobin, Andrea Tacchetti, Maja Trebacz, Kevin Robinson, Yash Katariya, Sebastian Riedel, Paige Bailey, Kefan Xiao, Nimesh Ghelani, Lora Aroyo, Ambrose Slone, Neil Houlsby, Xuehan Xiong, Zhen Yang, Elena Gribovskaya, Jonas Adler, Mateo Wirth, Lisa Lee, Music Li, Thais Kagohara, Jay Pavagadhi, 
Sophie Bridgers, Anna Bortsova, Sanjay Ghemawat, Zafarali Ahmed, Tianqi Liu, Richard Powell, Vijay Bolina, Mariko Iinuma, Polina Zablotskaia, James Besley, Da-Woon Chung, Timothy Dozat, Ramona Comanescu, Xiance Si, Jeremy Greer, Guolong Su, Martin Polacek, Raphaël Lopez Kaufman, Simon Tokumine, Hexiang Hu, Elena Buchatskaya, Yingjie Miao, Mohamed Elhawaty, Aditya Siddhant, Nenad Tomasev, Jinwei Xing, Christina Greer, Helen Miller, Shereen Ashraf, Aurko Roy, Zizhao Zhang, Ada Ma, Angelos Filos, Milos Besta, Rory Blevins, Ted Klimenko, Chih-Kuan Yeh, Soravit Changpinyo, Jiaqi Mu, Oscar Chang, Mantas Pajarskas, Carrie Muir, Vered Cohen, Charline Le Lan, Krishna Haridasan, Amit Marathe, Steven Hansen, Sholto Douglas, Rajkumar Samuel, Mingqiu Wang, Sophia Austin, Chang Lan, Jiepu Jiang, Justin Chiu, Jaime Alonso Lorenzo, Lars Lowe Sjösund, Sébastien Cevey, Zach Gleicher, Thi Avrahami, Anudhyan Boral, Hansa Srinivasan, Vittorio Selo, Rhys May, Konstantinos Aisopos, Léonard Hussenot, Livio Baldini Soares, Kate Baumli, Michael B. 
Chang, Adrià Recasens, Ben Caine, Alexander Pritzel, Filip Pavetic, Fabio Pardo, Anita Gergely, Justin Frye, Vinay Ramasesh, Dan Horgan, Kartikeya Badola, Nora Kassner, Subhrajit Roy, Ethan Dyer, Víctor Campos Campos, Alex Tomala, Yunhao Tang, Dalia El Badawy, Elspeth White, Basil Mustafa, Oran Lang, Abhishek Jindal, Sharad Vikram, Zhitao Gong, Sergi Caelles, Ross Hemsley, Gregory Thornton, Fangxiaoyu Feng, Wojciech Stokowiec, Ce Zheng, Phoebe Thacker, Çağlar Ünlü, Zhishuai Zhang, Mohammad Saleh, James Svensson, Max Bileschi, Piyush Patil, Ankesh Anand, Roman Ring, Katerina Tsihlas, Arpi Vezer, Marco Selvi, Toby Shevlane, Mikel Rodriguez, Tom Kwiatkowski, Samira Daruki, Keran Rong, Allan Dafoe, Nicholas FitzGerald, Keren Gu-Lemberg, Mina Khan, Lisa Anne Hendricks, Marie Pellat, Vladimir Feinberg, James Cobon-Kerr, Tara Sainath, Maribeth Rauh, Sayed Hadi Hashemi, Richard Ives, Yana Hasson, Eric Noland, Yuan Cao, Nathan Byrd, Le Hou, Qingze Wang, Thibault Sottiaux, Michela Paganini, Jean-Baptiste Lespiau, Alexandre Moufarek, Samer Hassan, Kaushik Shivakumar, Joost van Amersfoort, Amol Mandhane, Pratik Joshi, Anirudh Goyal, Matthew Tung, Andrew Brock, Hannah Sheahan, Vedant Misra, Cheng Li, Nemanja Rakićević, Mostafa Dehghani, Fangyu Liu, Sid Mittal, Junhyuk Oh, Seb Noury, Eren Sezener, Fantine Huot, Matthew Lamm, Nicola De Cao, Charlie Chen, Sidharth Mudgal, Romina Stella, Kevin Brooks, Gautam Vasudevan, Chenxi Liu, Mainak Chain, Nivedita Melinkeri, Aaron Cohen, Venus Wang, Kristie Seymore, Sergey Zubkov, Rahul Goel, Summer Yue, Sai Krishnakumaran, Brian Albert, Nate Hurley, Motoki Sano, Anhad Mohananey, Jonah Joughin, Egor Filonov, Tomasz Kępa, Yomna Eldawy, Jiawern Lim, Rahul Rishi, Shirin Badiezadegan, Taylor Bos, Jerry Chang, Sanil Jain, Sri Gayatri Sundara Padmanabhan, Subha Puttagunta, Kalpesh Krishna, Leslie Baker, Norbert Kalb, Vamsi Bedapudi, Shuntong Lei, Anthony Yu, Oren Litvin, Xiang Zhou, Zhichun Wu, Sam Sobell, Andrea Siciliano, Alan Papir, Robby Neale, 
Jonas Bragagnolo, Tej Toor, Tina Chen, Valentin Anklin, Feiran Wang, Richie Feng, Milad Gholami, Kevin Ling, Lijuan Liu, Jules Walter, Hamid Moghaddam, Arun Kishore, Jakub Adamek, Tyler Mercado, Jonathan Mallinson, Siddhinita Wandekar, Stephen Cagle, Eran Ofek, Guillermo Garrido, Clemens Lombriser, Maksim Mukha, Botu Sun, Hafeezul Rahman Mohammad, Josip Matak, Yadi Qian, Vikas Peswani, Pawel Janus, Quan Yuan, Leif Schelin, Oana David, Ankur Garg, Yifan He, Oleksii Duzhyi, Anton Älgmyr, Timothée Lottaz, Qi Li, Vikas Yadav, Luyao Xu, Alex Chinien, Rakesh Shivanna, Aleksandr Chuklin, Josie Li, Carrie Spadine, Travis Wolfe, Kareem Mohamed, Subhabrata Das, Zihang Dai, Kyle He, Daniel von Dincklage, Shyam Upadhyay, Akanksha Maurya, Luyan Chi, Sebastian Krause, Khalid Salama, Pam G Rabinovitch, Pavan Kumar Reddy M, Aarush Selvan, Mikhail Dektiarev, Golnaz Ghiasi, Erdem Guven, Himanshu Gupta, Boyi Liu, Deepak Sharma, Idan Heimlich Shtacher, Shachi Paul, Oscar Akerlund, François-Xavier Aubet, Terry Huang, Chen Zhu, Eric Zhu, Elico Teixeira, Matthew Fritze, Francesco Bertolini, Liana-Eleonora Marinescu, Martin Bölle, Dominik Paulus, Khyatti Gupta, Tejasi Latkar, Max Chang, Jason Sanders, Roopa Wilson, Xuewei Wu, Yi-Xuan Tan, Lam Nguyen Thiet, Tulsee Doshi, Sid Lall, Swaroop Mishra, Wanming Chen, Thang Luong, Seth Benjamin, Jasmine Lee, Ewa Andrejczuk, Dominik Rabiej, Vipul Ranjan, Krzysztof Styrc, Pengcheng Yin, Jon Simon, Malcolm Rose Harriott, Mudit Bansal, Alexei Robsky, Geoff Bacon, David Greene, Daniil Mirylenka, Chen Zhou, Obaid Sarvana, Abhimanyu Goyal, Samuel Andermatt, Patrick Siegler, Ben Horn, Assaf Israel, Francesco Pongetti, Chih-Wei "Louis" Chen, Marco Selvatici, Pedro Silva, Kathie Wang, Jackson Tolins, Kelvin Guu, Roey Yogev, Xiaochen Cai, Alessandro Agostini, Maulik Shah, Hung Nguyen, Noah Ó Donnaile, Sébastien Pereira, Linda Friso, Adam Stambler, Adam Kurzrok, Chenkai Kuang, Yan Romanikhin, Mark Geller, ZJ Yan, Kane Jang, Cheng-Chun Lee, Wojciech Fica, Eric 
Malmi, Qijun Tan, Dan Banica, Daniel Balle, Ryan Pham, Yanping Huang, Diana Avram, Hongzhi Shi, Jasjot Singh, Chris Hidey, Niharika Ahuja, Pranab Saxena, Dan Dooley, Srividya Pranavi Potharaju, Eileen O'Neill, Anand Gokulchandran, Ryan Foley, Kai Zhao, Mike Dusenberry, YuAn Liu, Pulkit Mehta, Ragha Kotikalapudi, Chalence Safranek-Shrader, Andrew Goodman, Joshua Kessinger, Eran Globen, Prateek Kolhar, Chris Gorgolewski, Ali Ibrahim, Yang song, Ali Eichenbaum, Thomas Brovelli, Sahitya Potluri, Preethi Lahoti, Cip Baetu, Ali Ghorbani, Charles Chen, Andy Crawford, Shalini Pal, Mukund Sridhar, Petru Gurita, Asier Mujika, Igor Petrovski, Pierre-Louis Cedoz, Chenmei Li, Shiyuan Chen, Niccolò Dal Santo, Siddharth Goyal, Jitesh Punjabi, Karthik Kappaganthu, Chester Kwak, Pallavi LV, Sarmishta Velury, Himadri Choudhury, Jamie Hall, Premal Shah, Ricardo Figueira, Matt Thomas, Minjie Lu, Ting Zhou, Chintu Kumar, Thomas Jurdi, Sharat Chikkerur, Yenai Ma, Adams Yu, Soo Kwak, Victor Ähdel, Sujeevan Rajayogam, Travis Choma, Fei Liu, Aditya Barua, Colin Ji, Ji Ho Park, Vincent Hellendoorn, Alex Bailey, Taylan Bilal, Huanjie Zhou, Mehrdad Khatir, Charles Sutton, Wojciech Rzadkowski, Fiona Macintosh, Konstantin Shagin, Paul Medina, Jinjing Zhou, Pararth Shah, Yingying Bi, Attila Dankovics, Shipra Banga, Sabine Lehmann, Marissa Bredesen, Zifan Lin, John Eric Hoffmann, Jonathan Lai, Raynald Chung, Kai Yang, Nihal Balani, Arthur Bražinskas, Andrei Sozanschi, Matthew Hayes, Héctor Fernández Alcalde, Peter Makarov, Will Chen, Antonio Stella, Liselotte Snijders, Michael Mandl, Ante Kärrman, Paweł Nowak, Xinyi Wu, Alex Dyck, Krishnan Vaidyanathan, Raghavender R, Jessica Mallet, Mitch Rudominer, Eric Johnston, Sushil Mittal, Akhil Udathu, Janara Christensen, Vishal Verma, Zach Irving, Andreas Santucci, Gamaleldin Elsayed, Elnaz Davoodi, Marin Georgiev, Ian Tenney, Geoffrey Cideron, Edouard Leurent, Mahmoud Alnahlawi, Ionut Georgescu, Nan Wei, Ivy Zheng, Dylan Scandinaro, Heinrich Jiang, 
Jasper Snoek, Mukund Sundararajan, Xuezhi Wang, Zack Ontiveros, Itay Karo, Jeremy Cole, Vinu Rajashekhar, Lara Tumeh, Eyal Ben-David, Rishub Jain, Jonathan Uesato, Romina Datta, Oskar Bunyan, Shimu Wu, John Zhang, Piotr Stanczyk, Ye Zhang, David Steiner, Subhajit Naskar, Michael Azzam, Matthew Johnson, Adam Paszke, Chung-Cheng Chiu, Jaume Sanchez Elias, Afroz Mohiuddin, Faizan Muhammad, Jin Miao, Andrew Lee, Nino Vieillard, Jane Park, Jiageng Zhang, Jeff Stanway, Drew Garmon, Abhijit Karmarkar, Zhe Dong, Jong Lee, Aviral Kumar, Luowei Zhou, Jonathan Evens, William Isaac, Geoffrey Irving, Edward Loper, Michael Fink, Isha Arkatkar, Nanxin Chen, Izhak Shafran, Ivan Petrychenko, Zhe Chen, Johnson Jia, Anselm Levskaya, Zhenkai Zhu, Peter Grabowski, Yu Mao, Alberto Magni, Kaisheng Yao, Javier Snaider, Norman Casagrande, Evan Palmer, Paul Suganthan, Alfonso Castaño, Irene Giannoumis, Wooyeol Kim, Mikołaj Rybiński, Ashwin Sreevatsa, Jennifer Prendki, David Soergel, Adrian Goedeckemeyer, Willi Gierke, Mohsen Jafari, Meenu Gaba, Jeremy Wiesner, Diana Gage Wright, Yawen Wei, Harsha Vashisht, Yana Kulizhskaya, Jay Hoover, Maigo Le, Lu Li, Chimezie Iwuanyanwu, Lu Liu, Kevin Ramirez, Andrey Khorlin, Albert Cui, Tian Lin, Marcus Wu, Ricardo Aguilar, Keith Pallo, Abhishek Chakladar, Ginger Perng, Elena Allica Abellan, Mingyang Zhang, Ishita Dasgupta, Nate Kushman, Ivo Penchev, Alena Repina, Xihui Wu, Tom van der Weide, Priya Ponnapalli, Caroline Kaplan, Jiri Simsa, Shuangfeng Li, Olivier Dousse, Jeff Piper, Nathan Ie, Rama Pasumarthi, Nathan Lintz, Anitha Vijayakumar, Daniel Andor, Pedro Valenzuela, Minnie Lui, Cosmin Paduraru, Daiyi Peng, Katherine Lee, Shuyuan Zhang, Somer Greene, Duc Dung Nguyen, Paula Kurylowicz, Cassidy Hardin, Lucas Dixon, Lili Janzer, Kiam Choo, Ziqiang Feng, Biao Zhang, Achintya Singhal, Dayou Du, Dan McKinnon, Natasha Antropova, Tolga Bolukbasi, Orgad Keller, David Reid, Daniel Finchelstein, Maria Abi Raad, Remi Crocker, Peter Hawkins, Robert Dadashi, 
Colin Gaffney, Ken Franko, Anna Bulanova, Rémi Leblond, Shirley Chung, Harry Askham, Luis C. Cobo, Kelvin Xu, Felix Fischer, Jun Xu, Christina Sorokin, Chris Alberti, Chu-Cheng Lin, Colin Evans, Alek Dimitriev, Hannah Forbes, Dylan Banarse, Zora Tung, Mark Omernick, Colton Bishop, Rachel Sterneck, Rohan Jain, Jiawei Xia, Ehsan Amid, Francesco Piccinno, Xingyu Wang, Praseem Banzal, Daniel J. Mankowitz, Alex Polozov, Victoria Krakovna, Sasha Brown, Mohammadhossein Bateni, Dennis Duan, Vlad Firoiu, Meghana Thotakuri, Tom Natan, Matthieu Geist, Sertan Girgin, Hui Li, Jiayu Ye, Ofir Roval, Reiko Tojo, Michael Kwong, James Lee-Thorp, Christopher Yew, Danila Sinopalnikov, Sabela Ramos, John Mellor, Abhishek Sharma, Kathy Wu, David Miller, Nicolas Sonnerat, Denis Vnukov, Rory Greig, Jennifer Beattie, Emily Caveness, Libin Bai, Julian Eisenschlos, Alex Korchemniy, Tomy Tsai, Mimi Jasarevic, Weize Kong, Phuong Dao, Zeyu Zheng, Frederick Liu, Fan Yang, Rui Zhu, Tian Huey Teh, Jason Sanmiya, Evgeny Gladchenko, Nejc Trdin, Daniel Toyama, Evan Rosen, Sasan Tavakkol, Linting Xue, Chen Elkind, Oliver Woodman, John Carpenter, George Papamakarios, Rupert Kemp, Sushant Kafle, Tanya Grunina, Rishika Sinha, Alice Talbert, Diane Wu, Denese Owusu-Afriyie, Cosmo Du, Chloe Thornton, Jordi Pont-Tuset, Pradyumna Narayana, Jing Li, Saaber Fatehi, John Wieting, Omar Ajmeri, Benigno Uria, Yeongil Ko, Laura Knight, Amélie Héliou, Ning Niu, Shane Gu, Chenxi Pang, Yeqing Li, Nir Levine, Ariel Stolovich, Rebeca Santamaria-Fernandez, Sonam Goenka, Wenny Yustalim, Robin Strudel, Ali Elqursh, Charlie Deck, Hyo Lee, Zonglin Li, Kyle Levin, Raphael Hoffmann, Dan Holtmann-Rice, Olivier Bachem, Sho Arora, Christy Koh, Soheil Hassas Yeganeh, Siim Põder, Mukarram Tariq, Yanhua Sun, Lucian Ionita, Mojtaba Seyedhosseini, Pouya Tafti, Zhiyu Liu, Anmol Gulati, Jasmine Liu, Xinyu Ye, Bart Chrzaszcz, Lily Wang, Nikhil Sethi, Tianrun Li, Ben Brown, Shreya Singh, Wei Fan, Aaron Parisi, Joe Stanton, Vinod
Koverkathu, Christopher A. Choquette-Choo, Yunjie Li, TJ Lu, Abe Ittycheriah, Prakash Shroff, Mani Varadarajan, Sanaz Bahargam, Rob Willoughby, David Gaddy, Guillaume Desjardins, Marco Cornero, Brona Robenek, Bhavishya Mittal, Ben Albrecht, Ashish Shenoy, Fedor Moiseev, Henrik Jacobsson, Alireza Ghaffarkhah, Morgane Rivière, Alanna Walton, Clément Crepy, Alicia Parrish, Zongwei Zhou, Clement Farabet, Carey Radebaugh, Praveen Srinivasan, Claudia van der Salm, Andreas Fidjeland, Salvatore Scellato, Eri Latorre-Chimoto, Hanna Klimczak-Plucińska, David Bridson, Dario de Cesare, Tom Hudson, Piermaria Mendolicchio, Lexi Walker, Alex Morris, Matthew Mauger, Alexey Guseynov, Alison Reid, Seth Odoom, Lucia Loher, Victor Cotruta, Madhavi Yenugula, Dominik Grewe, Anastasia Petrushkina, Tom Duerig, Antonio Sanchez, Steve Yadlowsky, Amy Shen, Amir Globerson, Lynette Webb, Sahil Dua, Dong Li, Surya Bhupatiraju, Dan Hurt, Haroon Qureshi, Ananth Agarwal, Tomer Shani, Matan Eyal, Anuj Khare, Shreyas Rammohan Belle, Lei Wang, Chetan Tekur, Mihir Sanjay Kale, Jinliang Wei, Ruoxin Sang, Brennan Saeta, Tyler Liechty, Yao Zhao, Stephan Lee, Pandu Nayak, Doug Fritz, Manish Reddy Vuyyuru, John Aslanides, Nidhi Vyas, Martin Wicke, Xiao Ma, Evgenii Eltyshev, Nina Martin, Hardie Cate, James Manyika, Keyvan Amiri, Yelin Kim, Xi Xiong, Kai Kang, Florian Luisier, Nilesh Tripuraneni, David Madras, Mandy Guo, Austin Waters, Oliver Wang, Joshua Ainslie, Jason Baldridge, Han Zhang, Garima Pruthi, Jakob Bauer, Feng Yang, Riham Mansour, Jason Gelman, Yang Xu, George Polovets, Ji Liu, Honglong Cai, Warren Chen, XiangHai Sheng, Emily Xue, Sherjil Ozair, Christof Angermueller, Xiaowei Li, Anoop Sinha, Weiren Wang, Julia Wiesinger, Emmanouil Koukoumidis, Yuan Tian, Anand Iyer, Madhu Gurumurthy, Mark Goldenson, Parashar Shah, MK Blake, Hongkun Yu, Anthony Urbanowicz, Jennimaria Palomaki, Chrisantha Fernando, Ken Durden, Harsh Mehta, Nikola Momchev, Elahe Rahimtoroghi, Maria Georgaki, Amit Raul, Sebastian 
Ruder, Morgan Redshaw, Jinhyuk Lee, Denny Zhou, Komal Jalan, Dinghua Li, Blake Hechtman, Parker Schuh, Milad Nasr, Kieran Milan, Vladimir Mikulik, Juliana Franco, Tim Green, Nam Nguyen, Joe Kelley, Aroma Mahendru, Andrea Hu, Joshua Howland, Ben Vargas, Jeffrey Hui, Kshitij Bansal, Vikram Rao, Rakesh Ghiya, Emma Wang, Ke Ye, Jean Michel Sarr, Melanie Moranski Preston, Madeleine Elish, Steve Li, Aakash Kaku, Jigar Gupta, Ice Pasupat, Da-Cheng Juan, Milan Someswar, Tejvi M., Xinyun Chen, Aida Amini, Alex Fabrikant, Eric Chu, Xuanyi Dong, Amruta Muthal, Senaka Buthpitiya, Sarthak Jauhari, Nan Hua, Urvashi Khandelwal, Ayal Hitron, Jie Ren, Larissa Rinaldi, Shahar Drath, Avigail Dabush, Nan-Jiang Jiang, Harshal Godhia, Uli Sachs, Anthony Chen, Yicheng Fan, Hagai Taitelbaum, Hila Noga, Zhuyun Dai, James Wang, Chen Liang, Jenny Hamer, Chun-Sung Ferng, Chenel Elkind, Aviel Atias, Paulina Lee, Vít Listík, Mathias Carlen, Jan van de Kerkhof, Marcin Pikus, Krunoslav Zaher, Paul Müller, Sasha Zykova, Richard Stefanec, Vitaly Gatsko, Christoph Hirnschall, Ashwin Sethi, Xingyu Federico Xu, Chetan Ahuja, Beth Tsai, Anca Stefanoiu, Bo Feng, Keshav Dhandhania, Manish Katyal, Akshay Gupta, Atharva Parulekar, Divya Pitta, Jing Zhao, Vivaan Bhatia, Yashodha Bhavnani, Omar Alhadlaq, Xiaolin Li, Peter Danenberg, Dennis Tu, Alex Pine, Vera Filippova, Abhipso Ghosh, Ben Limonchik, Bhargava Urala, Chaitanya Krishna Lanka, Derik Clive, Yi Sun, Edward Li, Hao Wu, Kevin Hongtongsak, Ianna Li, Kalind Thakkar, Kuanysh Omarov, Kushal Majmundar, Michael Alverson, Michael Kucharski, Mohak Patel, Mudit Jain, Maksim Zabelin, Paolo Pelagatti, Rohan Kohli, Saurabh Kumar, Joseph Kim, Swetha Sankar, Vineet Shah, Lakshmi Ramachandruni, Xiangkai Zeng, Ben Bariach, Laura Weidinger, Amar Subramanya, Sissie Hsiao, Demis Hassabis, Koray Kavukcuoglu, Adam Sadovsky, Quoc Le, Trevor Strohman, Yonghui Wu, Slav Petrov, Jeffrey Dean, Oriol Vinyals

This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding.

Ranked #1 on Multi-task Language Understanding on MMLU (using extra training data)

Arithmetic Reasoning Code Generation +3

Paper
Add Code

SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete Diffusion Process

1 code implementation • NeurIPS 2023 • Mengyu Wang, Henghui Ding, Jun Hao Liew, Jiajun Liu, Yao Zhao, Yunchao Wei

We propose a model-agnostic solution called SegRefiner, which offers a novel perspective on this problem by interpreting segmentation refinement as a data generation process.

Denoising Dichotomous Image Segmentation +4

117

Paper
Code

Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection

2 code implementations • 16 Dec 2023 • Chuangchuang Tan, Huan Liu, Yao Zhao, Shikui Wei, Guanghua Gu, Ping Liu, Yunchao Wei

Recently, the proliferation of highly realistic synthetic images, facilitated through a variety of GANs and Diffusions, has significantly heightened the susceptibility to misuse.

DeepFake Detection Face Swapping

Paper
Code

Multiple Instance Learning for Uplift Modeling

no code implementations • 15 Dec 2023 • Yao Zhao, Haipeng Zhang, Shiwei Lyu, Ruiying Jiang, Jinjie Gu, Guannan Zhang

Uplift modeling is widely used in performance marketing to estimate effects of promotion campaigns (e.g., increase of customer retention rate).

Marketing Multiple Instance Learning

Paper
Add Code

Self-Evaluation Improves Selective Generation in Large Language Models

no code implementations • 14 Dec 2023 • Jie Ren, Yao Zhao, Tu Vu, Peter J. Liu, Balaji Lakshminarayanan

Safe deployment of large language models (LLMs) may benefit from a reliable method for assessing their generated content to determine when to abstain or to selectively generate.

Multiple-choice

Paper
Add Code

Semantic Lens: Instance-Centric Semantic Alignment for Video Super-Resolution

1 code implementation • 13 Dec 2023 • Qi Tang, Yao Zhao, Meiqin Liu, Jian Jin, Chao Yao

As a critical clue of video super-resolution (VSR), inter-frame alignment significantly impacts overall performance.

Video Super-Resolution

Paper
Code

Diffusion for Natural Image Matting

1 code implementation • 10 Dec 2023 • Yihan Hu, Yiheng Lin, Wei Wang, Yao Zhao, Yunchao Wei, Humphrey Shi

However, the presence of high computational overhead and the inconsistency of noise sampling between the training and inference processes pose significant obstacles to achieving this goal.

Ranked #1 on Image Matting on Distinctions-646

Image Matting

Paper
Code

Large Multimodal Model Compression via Efficient Pruning and Distillation at AntGroup

no code implementations • 10 Dec 2023 • Maolin Wang, Yao Zhao, Jiajia Liu, Jingdong Chen, Chenyi Zhuang, Jinjie Gu, Ruocheng Guo, Xiangyu Zhao

In our research, we constructed a dataset, the Multimodal Advertisement Audition Dataset (MAAD), from real-world scenarios within Alipay, and conducted experiments to validate the reliability of our proposed strategy.

Model Compression

Paper
Add Code

PixelLM: Pixel Reasoning with Large Multimodal Model

no code implementations • 4 Dec 2023 • Zhongwei Ren, Zhicheng Huang, Yunchao Wei, Yao Zhao, Dongmei Fu, Jiashi Feng, Xiaojie Jin

PixelLM excels across various pixel-level image reasoning and understanding tasks, outperforming well-established methods in multiple benchmarks, including MUSE, single- and multi-referring segmentation.

Segmentation

Paper
Add Code

On What Basis? Predicting Text Preference Via Structured Comparative Reasoning

no code implementations • 14 Nov 2023 • Jing Nathan Yan, Tianqi Liu, Justin T Chiu, Jiaming Shen, Zhen Qin, Yue Yu, Yao Zhao, Charu Lakshmanan, Yair Kurzion, Alexander M. Rush, Jialu Liu, Michael Bendersky

Comparative reasoning plays a crucial role in text preference prediction; however, large language models (LLMs) often demonstrate inconsistencies in their reasoning.

Hallucination Retrieval

Paper
Add Code

On the Opportunities of Green Computing: A Survey

no code implementations • 1 Nov 2023 • You Zhou, Xiujing Lin, Xiang Zhang, Maolin Wang, Gangwei Jiang, Huakang Lu, Yupeng Wu, Kai Zhang, Zhe Yang, Kehang Wang, Yongduo Sui, Fengwei Jia, Zuoli Tang, Yao Zhao, Hongxuan Zhang, Tiannuo Yang, Weibo Chen, Yunong Mao, Yi Li, De Bao, Yu Li, Hongrui Liao, Ting Liu, Jingwen Liu, Jinchi Guo, Xiangyu Zhao, Ying WEI, Hong Qian, Qi Liu, Xiang Wang, Wai Kin, Chan, Chenliang Li, Yusen Li, Shiyu Yang, Jining Yan, Chao Mou, Shuai Han, Wuxia Jin, Guannan Zhang, Xiaodong Zeng

To tackle the challenges of computing resources and environmental impact of AI, Green Computing has become a hot research topic.

Fairness Speech Synthesis +1

Paper
Add Code

Unleashing the potential of GNNs via Bi-directional Knowledge Transfer

no code implementations • 26 Oct 2023 • Shuai Zheng, Zhizhe Liu, Zhenfeng Zhu, Xingxing Zhang, JianXin Li, Yao Zhao

On this basis, BiKT not only allows us to acquire knowledge from both the GNN and its derived model but promotes each other by injecting the knowledge into the other.

Domain Adaptation Representation Learning +1

Paper
Add Code

WeatherDepth: Curriculum Contrastive Learning for Self-Supervised Depth Estimation under Adverse Weather Conditions

1 code implementation • 9 Oct 2023 • Jiyuan Wang, Chunyu Lin, Lang Nie, Shujun Huang, Yao Zhao, Xing Pan, Rui Ai

In this paper, we propose WeatherDepth, a self-supervised robust depth estimation model with curriculum contrastive learning, to tackle performance degradation in complex weather conditions.

Contrastive Learning Depth Estimation +2

Paper
Code

Learning Mask-aware CLIP Representations for Zero-Shot Segmentation

1 code implementation • NeurIPS 2023 • Siyu Jiao, Yunchao Wei, YaoWei Wang, Yao Zhao, Humphrey Shi

However, in the paper, we reveal that CLIP is insensitive to different mask proposals and tends to produce similar predictions for various mask proposals of the same image.

Ranked #7 on Open Vocabulary Semantic Segmentation on PascalVOC-20

Open Vocabulary Semantic Segmentation Zero Shot Segmentation

Paper
Code

IBVC: Interpolation-driven B-frame Video Compression

1 code implementation • 25 Sep 2023 • Chenming Xu, Meiqin Liu, Chao Yao, Weisi Lin, Yao Zhao

Learned B-frame video compression aims to adopt bi-directional motion estimation and motion compensation (MEMC) coding for middle frame reconstruction.

Motion Compensation Motion Estimation +4

Paper
Code

Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-Modal Manipulation

no code implementations • 18 Sep 2023 • Huan Liu, Zichang Tan, Qiang Chen, Yunchao Wei, Yao Zhao, Jingdong Wang

Moreover, to address the semantic conflicts between image and frequency domains, the forgery-aware mutual module is developed to further enable the effective interaction of disparate image and frequency features, resulting in aligned and comprehensive visual forgery representations.

Misinformation

Paper
Add Code

Statistical Rejection Sampling Improves Preference Optimization

no code implementations • 13 Sep 2023 • Tianqi Liu, Yao Zhao, Rishabh Joshi, Misha Khalman, Mohammad Saleh, Peter J. Liu, Jialu Liu

DPO's lack of a reward model constrains its ability to sample preference pairs from the optimal policy, and SLiC is restricted to sampling preference pairs only from the SFT policy.

Language Modelling Large Language Model

Paper
Add Code

SDDNet: Style-guided Dual-layer Disentanglement Network for Shadow Detection

1 code implementation • 17 Aug 2023 • Runmin Cong, Yuchen Guan, Jinpeng Chen, Wei zhang, Yao Zhao, Sam Kwong

Despite significant progress in shadow detection, current methods still struggle with the adverse impact of background color, which may lead to errors when shadows are present on complex backgrounds.

Disentanglement Shadow Detection

Paper
Code

Frequency Perception Network for Camouflaged Object Detection

2 code implementations • 17 Aug 2023 • Runmin Cong, Mengyao Sun, Sanyi Zhang, Xiaofei Zhou, Wei zhang, Yao Zhao

Camouflaged object detection (COD) aims to accurately detect objects hidden in the surrounding environment.

Object object-detection +1

Paper
Code

Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation

2 code implementations • ICCV 2023 • Huan Liu, Qiang Chen, Zichang Tan, Jiang-Jiang Liu, Jian Wang, Xiangbo Su, Xiaolong Li, Kun Yao, Junyu Han, Errui Ding, Yao Zhao, Jingdong Wang

State-of-the-art solutions adopt the DETR-like framework, and mainly develop the complex decoder, e.g., regarding pose estimation as keypoint box detection and combining with human detection in ED-Pose, hierarchically predicting with pose decoder and joint (keypoint) decoder in PETR.

Human Detection Multi-Person Pose Estimation

Paper
Code

CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation

1 code implementation • 14 Aug 2023 • Hongguang Zhu, Yunchao Wei, Xiaodan Liang, Chunjie Zhang, Yao Zhao

Regarding the growing nature of real-world data, such an offline training paradigm on ever-expanding data is unsustainable, because models lack the continual learning ability to accumulate knowledge constantly.

Continual Learning Continual Pretraining

Paper
Code

CLE Diffusion: Controllable Light Enhancement Diffusion Model

no code implementations • 13 Aug 2023 • Yuyang Yin, Dejia Xu, Chuangchuang Tan, Ping Liu, Yao Zhao, Yunchao Wei

Low light enhancement has gained increasing importance with the rapid development of visual creation and editing.

Low-Light Image Enhancement

Paper
Add Code

You Can Mask More For Extremely Low-Bitrate Image Compression

1 code implementation • 27 Jun 2023 • Anqi Li, Feng Li, Jiaxin Han, Huihui Bai, Runmin Cong, Chunjie Zhang, Meng Wang, Weisi Lin, Yao Zhao

Extensive experiments have demonstrated that our approach outperforms recent state-of-the-art methods in R-D performance, visual quality, and downstream applications, at very low bitrates.

Image Compression

Paper
Code

Exploring Resolution Fields for Scalable Image Compression with Uncertainty Guidance

1 code implementation • 15 Jun 2023 • Dongyi Zhang, Feng Li, Man Liu, Runmin Cong, Huihui Bai, Meng Wang, Yao Zhao

In this work, we explore the potential of resolution fields in scalable image compression and propose the reciprocal pyramid network (RPN) that fulfills the need for more adaptable and versatile compression.

Image Compression

Paper
Code

NPVForensics: Jointing Non-critical Phonemes and Visemes for Deepfake Detection

no code implementations • 12 Jun 2023 • Yu Chen, Yang Yu, Rongrong Ni, Yao Zhao, Haoliang Li

Next, we design a phoneme-viseme awareness module for cross-modal feature fusion and representation alignment, so that the modality gap can be reduced and the intrinsic complementarity of the two modalities can be better explored.

DeepFake Detection Face Swapping

Paper
Add Code

Tensorized Hypergraph Neural Networks

no code implementations • 5 Jun 2023 • Maolin Wang, Yaoming Zhen, Yu Pan, Yao Zhao, Chenyi Zhuang, Zenglin Xu, Ruocheng Guo, Xiangyu Zhao

THNN is a faithful hypergraph modeling framework through high-order outer product feature message passing and is a natural tensor extension of the adjacency-matrix-based graph neural networks.

Paper
Add Code

A Hybrid Approach for Smart Alert Generation

no code implementations • 2 Jun 2023 • Yao Zhao, Sophine Zhang, Zhiyuan Yao

Anomaly detection is an important task in network management.

Anomaly Detection Feature Engineering +2

Paper
Add Code

SLiC-HF: Sequence Likelihood Calibration with Human Feedback

no code implementations • 17 May 2023 • Yao Zhao, Rishabh Joshi, Tianqi Liu, Misha Khalman, Mohammad Saleh, Peter J. Liu

Past work has often relied on Reinforcement Learning from Human Feedback (RLHF), which optimizes the language model using reward scores assigned from a reward model trained on human preference data.

Language Modelling Offline RL

Paper
Add Code

Lyapunov-Stable Deep Equilibrium Models

no code implementations • 25 Apr 2023 • Haoyu Chu, Shikui Wei, Ting Liu, Yao Zhao, Yuto Miyatake

Deep equilibrium (DEQ) models have emerged as a promising class of implicit layer models, which abandon traditional depth by solving for the fixed points of a single nonlinear layer.

Adversarial Defense Adversarial Robustness

Paper
Add Code

Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning

1 code implementation • CVPR 2023 • Man Liu, Feng Li, Chunjie Zhang, Yunchao Wei, Huihui Bai, Yao Zhao

Generalized Zero-Shot Learning (GZSL) identifies unseen categories by knowledge transferred from the seen domain, relying on the intrinsic interactions between visual and semantic information.

Attribute Generalized Zero-Shot Learning

Paper
Code

MF-JMoDL-Net: A Deep Network for Azimuth Undersampling Pattern Design and Ambiguity Suppression for Sparse SAR Imaging

no code implementations • 20 Mar 2023 • Yuwei Wu, Zhe Zhang, Xiaolan Qiu, Yao Zhao, Weidong Yu

repetition frequency (PRF).

Paper
Add Code

Deep Learning for Camera Calibration and Beyond: A Survey

1 code implementation • 19 Mar 2023 • Kang Liao, Lang Nie, Shujuan Huang, Chunyu Lin, Jing Zhang, Yao Zhao, Moncef Gabbouj, DaCheng Tao

In this paper, we provide a comprehensive survey of learning-based camera calibration techniques, by analyzing their strengths and limitations.

Camera Calibration

392

Paper
Code

Global Knowledge Calibration for Fast Open-Vocabulary Segmentation

1 code implementation • ICCV 2023 • Kunyang Han, Yong liu, Jun Hao Liew, Henghui Ding, Yunchao Wei, Jiajun Liu, Yitong Wang, Yansong Tang, Yujiu Yang, Jiashi Feng, Yao Zhao

Recent advancements in pre-trained vision-language models, such as CLIP, have enabled the segmentation of arbitrary concepts solely from textual inputs, a process commonly referred to as open-vocabulary semantic segmentation (OVS).

Knowledge Distillation Open Vocabulary Semantic Segmentation +4

Paper
Code

SigVIC: Spatial Importance Guided Variable-Rate Image Compression

no code implementations • 16 Mar 2023 • Jiaming Liang, Meiqin Liu, Chao Yao, Chunyu Lin, Yao Zhao

Variable-rate mechanism has improved the flexibility and efficiency of learning-based image compression that trains multiple models for different rate-distortion tradeoffs.

Image Compression

Paper
Add Code

Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness

1 code implementation • CVPR 2023 • Zhijie Shen, Zishuo Zheng, Chunyu Lin, Lang Nie, Kang Liao, Shuai Zheng, Yao Zhao

Based on the Manhattan World assumption, most existing indoor layout estimation schemes focus on recovering layouts from vertically compressed 1D sequences.

Room Layout Estimation Segmentation

Paper
Code

Unsupervised OmniMVS: Efficient Omnidirectional Depth Inference via Establishing Pseudo-Stereo Supervision

no code implementations • 20 Feb 2023 • Zisong Chen, Chunyu Lin, Lang Nie, Kang Liao, Yao Zhao

In this paper, we propose the first unsupervised omnidirectional MVS framework based on multiple fisheye images.

Paper
Add Code

Parallax-Tolerant Unsupervised Deep Image Stitching

1 code implementation • ICCV 2023 • Lang Nie, Chunyu Lin, Kang Liao, Shuaicheng Liu, Yao Zhao

First, we propose a robust and flexible warp to model the image registration from global homography to local thin-plate spline motion.

Image Registration Image Stitching

115

Paper
Code

Spatiotemporal Deformation Perception for Fisheye Video Rectification

1 code implementation • 8 Feb 2023 • Shangrong Yang, Chunyu Lin, Kang Liao, Yao Zhao

Subsequently, we observe that the inter-frame optical flow of the video is facilitated to perceive the local spatial deformation of the fisheye video.

Optical Flow Estimation

Paper
Code

Dual Diffusion Architecture for Fisheye Image Rectification: Synthetic-to-Real Generalization

no code implementations • 26 Jan 2023 • Shangrong Yang, Chunyu Lin, Kang Liao, Yao Zhao

To this end, we propose a Dual Diffusion Architecture (DDA) for the fisheye rectification with a better generalization ability.

Denoising

Paper
Add Code

RecRecNet: Rectangling Rectified Wide-Angle Images by Thin-Plate Spline Model and DoF-based Curriculum Learning

1 code implementation • ICCV 2023 • Kang Liao, Lang Nie, Chunyu Lin, Zishuo Zheng, Yao Zhao

In this work, we explore constructing a win-win representation on both content and boundary by contributing a new learning model, i.e., Rectangling Rectification Network (RecRecNet).

Paper
Code

Locating Noise is Halfway Denoising for Semi-Supervised Segmentation

no code implementations • ICCV 2023 • Yan Fang, Feng Zhu, Bowen Cheng, Luoqi Liu, Yao Zhao, Yunchao Wei

This work shows that locating the patch-wise noisy region is a better way to deal with noise.

Denoising Semi-Supervised Semantic Segmentation

Paper
Add Code

CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation

1 code implementation • ICCV 2023 • Hongguang Zhu, Yunchao Wei, Xiaodan Liang, Chunjie Zhang, Yao Zhao

Continual Learning Continual Pretraining

Paper
Code

Learning To Segment Every Referring Object Point by Point

1 code implementation • CVPR 2023 • Mengxue Qu, Yu Wu, Yunchao Wei, Wu Liu, Xiaodan Liang, Yao Zhao

Extensive experiments show that our model achieves 52.06% in terms of accuracy (versus 58.93% in fully supervised setting) on RefCOCO+@testA, when only using 1% of the mask annotations.

Object Referring Expression +1

Paper
Code

Innovating Real Fisheye Image Correction with Dual Diffusion Architecture

no code implementations • ICCV 2023 • Shangrong Yang, Chunyu Lin, Kang Liao, Yao Zhao

Fisheye image rectification is hindered by synthetic models producing poor results for real-world correction.

Denoising

Paper
Add Code

Learning on Gradients: Generalized Artifacts Representation for GAN-Generated Images Detection

1 code implementation • CVPR 2023 • Chuangchuang Tan, Yao Zhao, Shikui Wei, Guanghua Gu, Yunchao Wei

The key of fake image detection is to develop a generalized representation to describe the artifacts produced by generation models.

Fake Image Detection Image Generation

Paper
Code

An In-Depth Exploration of Person Re-Identification and Gait Recognition in Cloth-Changing Conditions

2 code implementations • CVPR 2023 • Weijia Li, Saihui Hou, Chunjie Zhang, Chunshui Cao, Xu Liu, Yongzhen Huang, Yao Zhao

For the cloth-changing problem, video-based ReID is rarely studied due to the lack of a suitable cloth-changing benchmark, and gait recognition is often researched under controlled conditions.

16k Gait Recognition +1

Paper
Code

Multi-Projection Fusion and Refinement Network for Salient Object Detection in 360° Omnidirectional Image

no code implementations • 23 Dec 2022 • Runmin Cong, Ke Huang, Jianjun Lei, Yao Zhao, Qingming Huang, Sam Kwong

Salient object detection (SOD) aims to determine the most visually attractive objects in an image.

object-detection Object Detection +1

Paper
Add Code

Improving the Robustness of Summarization Models by Detecting and Removing Input Noise

no code implementations • 20 Dec 2022 • Kundan Krishna, Yao Zhao, Jie Ren, Balaji Lakshminarayanan, Jiaming Luo, Mohammad Saleh, Peter J. Liu

We present a large empirical study quantifying the sometimes severe loss in performance (up to 12 ROUGE-1 points) from different types of input noise for a range of datasets and model sizes.

Abstractive Text Summarization

Paper
Add Code

Fully and Weakly Supervised Referring Expression Segmentation with End-to-End Learning

no code implementations • 17 Dec 2022 • Hui Li, MingJie Sun, Jimin Xiao, Eng Gee Lim, Yao Zhao

To validate our framework on a weakly-supervised setting, we annotated three RES benchmark datasets (RefCOCO, RefCOCO+ and RefCOCOg) with click annotations. Our method is simple but surprisingly effective, outperforming all previous state-of-the-art RES methods on fully- and weakly-supervised settings by a large margin.

Position Referring Expression +3

Paper
Add Code

Node-oriented Spectral Filtering for Graph Neural Networks

1 code implementation • 7 Dec 2022 • Shuai Zheng, Zhenfeng Zhu, Zhizhe Liu, Youru Li, Yao Zhao

Graph neural networks (GNNs) have shown remarkable performance on homophilic graph data while being far less impressive when handling non-homophilic graph data due to the inherent low-pass filtering property of GNNs.

Ranked #5 on Node Classification on Squirrel (60%/20%/20% random splits)

Node Classification

Paper
Code

Mask Matching Transformer for Few-Shot Segmentation

1 code implementation • 5 Dec 2022 • Siyu Jiao, Gengwei Zhang, Shant Navasardyan, Ling Chen, Yao Zhao, Yunchao Wei, Humphrey Shi

Typical methods follow the paradigm to firstly learn prototypical features from support images and then match query features in pixel-level to obtain segmentation results.

Few-Shot Semantic Segmentation Segmentation

Paper
Code

Learning Detail-Structure Alternative Optimization for Blind Super-Resolution

1 code implementation • 3 Dec 2022 • Feng Li, Yixuan Wu, Huihui Bai, Weisi Lin, Runmin Cong, Yao Zhao

Recent blind SR methods suggest to reconstruct SR images relying on blur kernel estimation.

Blind Super-Resolution Image Super-Resolution +1

Paper
Code

Bridging Component Learning with Degradation Modelling for Blind Image Super-Resolution

1 code implementation • 3 Dec 2022 • Yixuan Wu, Feng Li, Huihui Bai, Weisi Lin, Runmin Cong, Yao Zhao

In this paper, we analyze the degradation of a high-resolution (HR) image from image intrinsic components according to a degradation-based formulation model.

Image Super-Resolution

Paper
Code

HGV4Risk: Hierarchical Global View-guided Sequence Representation Learning for Risk Prediction

1 code implementation • 15 Nov 2022 • Youru Li, Zhenfeng Zhu, Xiaobo Guo, Shaoshuai Li, Yuchen Yang, Yao Zhao

Moreover, the hierarchical representations at both instance level and channel level can be coordinated by the heterogeneous information aggregation under the guidance of global view.

Graph Embedding Representation Learning +1

Paper
Code

FF2: A Feature Fusion Two-Stream Framework for Punctuation Restoration

no code implementations • 9 Nov 2022 • Yangjun Wu, Kebin Fang, Yao Zhao, Hao Zhang, Lifeng Shi, Mengqi Zhang

To accomplish punctuation restoration, most existing methods focus on introducing extra information (e.g., part-of-speech) or addressing the class imbalance problem.

Language Modelling Punctuation Restoration +1

Paper
Add Code

Cross-view Graph Contrastive Representation Learning on Partially Aligned Multi-view Data

no code implementations • 8 Nov 2022 • Yiming Wang, Dongxia Chang, Zhiqiang Fu, Jie Wen, Yao Zhao

Multi-view representation learning has developed rapidly over the past decades and has been applied in many fields.

Contrastive Learning Representation Learning

Paper
Add Code

Temporal Consistency Learning of inter-frames for Video Super-Resolution

1 code implementation • 3 Nov 2022 • Meiqin Liu, Shuo Jin, Chao Yao, Chunyu Lin, Yao Zhao

A spatio-temporal stability module is designed to learn the self-alignment from inter-frames.

Video Super-Resolution

Paper
Code

Revisiting Simple Regret: Fast Rates for Returning a Good Arm

no code implementations • 30 Oct 2022 • Yao Zhao, Connor James Stephens, Csaba Szepesvári, Kwang-Sung Jun

Simple regret is a natural and parameter-free performance criterion for pure exploration in multi-armed bandits yet is less popular than the probability of missing the best arm or an $\epsilon$-good arm, perhaps due to lack of easy ways to characterize it.

Multi-Armed Bandits

Paper
Add Code

PSNet: Parallel Symmetric Network for Video Salient Object Detection

no code implementations • 12 Oct 2022 • Runmin Cong, Weiyu Song, Jianjun Lei, Guanghui Yue, Yao Zhao, Sam Kwong

Finally, we use the Importance Perception Fusion (IPF) module to fuse the features from two parallel branches according to their different importance in different scenarios.

Object object-detection +4

Paper
Add Code

Does Thermal Really Always Matter for RGB-T Salient Object Detection?

2 code implementations • 9 Oct 2022 • Runmin Cong, Kepu Zhang, Chen Zhang, Feng Zheng, Yao Zhao, Qingming Huang, Sam Kwong

In addition, considering the role of thermal modality, we set up different cross-modality interaction mechanisms in the encoding phase and the decoding phase.

object-detection Object Detection +2

Paper
Code

CIR-Net: Cross-modality Interaction and Refinement for RGB-D Salient Object Detection

3 code implementations • 6 Oct 2022 • Runmin Cong, Qinwei Lin, Chen Zhang, Chongyi Li, Xiaochun Cao, Qingming Huang, Yao Zhao

Focusing on the issue of how to effectively capture and utilize cross-modality information in RGB-D salient object detection (SOD) task, we present a convolutional neural network (CNN) model, named CIR-Net, based on the novel cross-modality interaction and refinement.

object-detection RGB-D Salient Object Detection +1

Paper
Code

Out-of-Distribution Detection and Selective Generation for Conditional Language Models

no code implementations • 30 Sep 2022 • Jie Ren, Jiaming Luo, Yao Zhao, Kundan Krishna, Mohammad Saleh, Balaji Lakshminarayanan, Peter J. Liu

Furthermore, the space of potential low-quality outputs is larger as arbitrary text can be generated and it is important to know when to trust the generated output.

Abstractive Text Summarization Out-of-Distribution Detection +1

Paper
Add Code

Calibrating Sequence likelihood Improves Conditional Language Generation

no code implementations • 30 Sep 2022 • Yao Zhao, Misha Khalman, Rishabh Joshi, Shashi Narayan, Mohammad Saleh, Peter J. Liu

Conditional language models are predominantly trained with maximum likelihood estimation (MLE), giving probability mass to sparsely observed target sequences.

Ranked #1 on Abstractive Text Summarization on CNN / Daily Mail

abstractive question answering Abstractive Text Summarization +5

Paper
Add Code

A Weakly Supervised Learning Framework for Salient Object Detection via Hybrid Labels

3 code implementations • 7 Sep 2022 • Runmin Cong, Qi Qin, Chen Zhang, Qiuping Jiang, Shiqi Wang, Yao Zhao, Sam Kwong

In this paper, we focus on a new weakly-supervised SOD task under hybrid labels, where the supervision labels include a large number of coarse labels generated by the traditional unsupervised method and a small number of real labels.

Ranked #7 on RGB Salient Object Detection on PASCAL-S

object-detection RGB Salient Object Detection +3

Paper
Code

Boundary Guided Semantic Learning for Real-time COVID-19 Lung Infection Segmentation System

1 code implementation • 7 Sep 2022 • Runmin Cong, Yumo Zhang, Ning Yang, Haisheng Li, Xueqi Zhang, Ruochen Li, Zewen Chen, Yao Zhao, Sam Kwong

The coronavirus disease 2019 (COVID-19) continues to have a negative impact on healthcare systems around the world, even though vaccines have been developed and national vaccination coverage rates are steadily increasing.

Paper
Code

HVS-Inspired Signal Degradation Network for Just Noticeable Difference Estimation

1 code implementation • 16 Aug 2022 • Jian Jin, Yuan Xue, Xingxing Zhang, Lili Meng, Yao Zhao, Weisi Lin

However, they have a major drawback that the generated JND is assessed in the real-world signal domain instead of in the perceptual domain in the human brain.

Paper
Code

Investigating Efficiently Extending Transformers for Long Input Summarization

1 code implementation • 8 Aug 2022 • Jason Phang, Yao Zhao, Peter J. Liu

While large pretrained Transformer models have proven highly capable at tackling natural language tasks, handling long sequence inputs continues to be a significant challenge.

Ranked #2 on Long-range modeling on SCROLLS (GovRep metric)

16k Long-range modeling +1

1,588

Paper
Code

Neural Contourlet Network for Monocular 360 Depth Estimation

1 code implementation • 3 Aug 2022 • Zhijie Shen, Chunyu Lin, Lang Nie, Kang Liao, Yao Zhao

For a monocular 360 image, depth estimation is challenging because the distortion increases along the latitude.

Ranked #8 on Depth Estimation on Stanford2D3D Panoramic

Depth Estimation

Paper
Code

SMART: Sentences as Basic Units for Text Evaluation

no code implementations • 1 Aug 2022 • Reinald Kim Amplayo, Peter J. Liu, Yao Zhao, Shashi Narayan

Specifically, we treat sentences as basic units of matching instead of tokens, and use a sentence matching function to soft-match candidate and reference sentences.

Sentence Text Generation

Paper
Add Code

SiRi: A Simple Selective Retraining Mechanism for Transformer-based Visual Grounding

1 code implementation • 27 Jul 2022 • Mengxue Qu, Yu Wu, Wu Liu, Qiqi Gong, Xiaodan Liang, Olga Russakovsky, Yao Zhao, Yunchao Wei

Particularly, SiRi conveys a significant principle to the research of visual grounding, i.e., a better initialized vision-language encoder would help the model converge to a better local minimum, advancing the performance accordingly.

Visual Grounding

Paper
Code

BCS-Net: Boundary, Context and Semantic for Automatic COVID-19 Lung Infection Segmentation from CT Images

3 code implementations • 17 Jul 2022 • Runmin Cong, Haowei Yang, Qiuping Jiang, Wei Gao, Haisheng Li, Cong Wang, Yao Zhao, Sam Kwong

The spread of COVID-19 has brought a huge disaster to the world, and the automatic segmentation of infection regions can help doctors make diagnoses quickly and reduce their workload.

Segmentation

Paper
Code

Deep Rotation Correction without Angle Prior

1 code implementation • 7 Jul 2022 • Lang Nie, Chunyu Lin, Kang Liao, Shuaicheng Liu, Yao Zhao

To this end, we leverage a neural network to predict the optical flows that can warp the tilted images to be perceptually horizontal.

Optical Flow Estimation

Paper
Code

Complementary Bi-directional Feature Compression for Indoor 360° Semantic Segmentation with Self-distillation

no code implementations • 6 Jul 2022 • Zishuo Zheng, Chunyu Lin, Lang Nie, Kang Liao, Zhijie Shen, Yao Zhao

In this paper, we combine the two different representations and propose a novel 360° semantic segmentation solution from a complementary perspective.

Ranked #1 on Semantic Segmentation on Stanford2D3D Panoramic - RGBD

Feature Compression Semantic Segmentation

Paper
Add Code

FishFormer: Annulus Slicing-based Transformer for Fisheye Rectification with Efficacy Domain Exploration

no code implementations • 5 Jul 2022 • Shangrong Yang, Chunyu Lin, Kang Liao, Yao Zhao

To leverage these two characteristics, we introduce FishFormer, which processes the fisheye image as a sequence to enhance global and local perception.

Paper
Add Code

FisheyeEX: Polar Outpainting for Extending the FoV of Fisheye Lens

1 code implementation • 12 Jun 2022 • Kang Liao, Chunyu Lin, Yunchao Wei, Yao Zhao

For the distortion synthesis, we propose a spiral distortion-aware perception module, in which the learning path keeps consistent with the distortion prior of the fisheye image.

Image Outpainting

Paper
Code

JNMR: Joint Non-linear Motion Regression for Video Frame Interpolation

1 code implementation • 9 Jun 2022 • Meiqin Liu, Chenming Xu, Chao Yao, Chunyu Lin, Yao Zhao

Video frame interpolation (VFI) aims to generate predictive frames by warping learnable motions from the bidirectional historical references.

Motion Estimation regression +1

Paper
Code

TALM: Tool Augmented Language Models

no code implementations • 24 May 2022 • Aaron Parisi, Yao Zhao, Noah Fiedel

Transformer based language models (LMs) demonstrate increasing performance with scale across a wide variety of tasks.

Math

Paper
Add Code

Global-and-Local Collaborative Learning for Co-Salient Object Detection

2 code implementations • 19 Apr 2022 • Runmin Cong, Ning Yang, Chongyi Li, Huazhu Fu, Yao Zhao, Qingming Huang, Sam Kwong

In this paper, we propose a global-and-local collaborative learning architecture, which includes a global correspondence modeling (GCM) and a local correspondence modeling (LCM) to capture comprehensive inter-image corresponding relationship among different images from the global and local perspectives.

8k Co-Salient Object Detection +2

Paper
Code

Cylin-Painting: Seamless 360° Panoramic Image Outpainting and Beyond

1 code implementation • 18 Apr 2022 • Kang Liao, Xiangyu Xu, Chunyu Lin, Wenqi Ren, Yunchao Wei, Yao Zhao

Motivated by this analysis, we present a Cylin-Painting framework that involves meaningful collaborations between inpainting and outpainting and efficiently fuses the different arrangements, with a view to leveraging their complementary benefits on a seamless cylinder.

Depth Estimation Image Outpainting +3

Paper
Code

Towards Reliable Image Outpainting: Learning Structure-Aware Multimodal Fusion with Depth Guidance

no code implementations • 12 Apr 2022 • Lei Zhang, Kang Liao, Chunyu Lin, Yao Zhao

Concretely, we propose a Depth-Guided Outpainting Network to model different feature representations of two modalities and learn the structure-aware cross-modal fusion.

Image Outpainting

Paper
Add Code

A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation

1 code implementation • ACL 2022 • Shashi Narayan, Gonçalo Simões, Yao Zhao, Joshua Maynez, Dipanjan Das, Michael Collins, Mirella Lapata

We propose Composition Sampling, a simple but effective method to generate diverse outputs for conditional generation of higher quality compared to previous stochastic decoding strategies.

Question Generation Question-Generation

1,558

Paper
Code

A Context-Aware Feature Fusion Framework for Punctuation Restoration

1 code implementation • 23 Mar 2022 • Yangjun Wu, Kebin Fang, Yao Zhao

To accomplish the punctuation restoration task, most existing approaches focused on leveraging extra information (e.g., part-of-speech tags) or addressing the class imbalance problem.

Punctuation Restoration

Paper
Code

Distortion-Tolerant Monocular Depth Estimation On Omnidirectional Images Using Dual-cubemap

no code implementations • 18 Mar 2022 • Zhijie Shen, Chunyu Lin, Lang Nie, Kang Liao, Yao Zhao

It comprises two modules: a Dual-Cubemap Depth Estimation (DCDE) module and a Boundary Revision (BR) module.

Monocular Depth Estimation

Paper
Add Code

PanoFormer: Panorama Transformer for Indoor 360 Depth Estimation

1 code implementation • 17 Mar 2022 • Zhijie Shen, Chunyu Lin, Kang Liao, Lang Nie, Zishuo Zheng, Yao Zhao

In particular, we divide patches on the spherical tangent domain into tokens to reduce the negative effect of panoramic distortions.

Ranked #4 on Depth Estimation on Stanford2D3D Panoramic

Depth Estimation Semantic Segmentation

Paper
Code

Multi-modal Graph Learning for Disease Prediction

1 code implementation • 11 Mar 2022 • Shuai Zheng, Zhenfeng Zhu, Zhizhe Liu, Zhenyu Guo, Yang Liu, Yuchen Yang, Yao Zhao

For disease prediction tasks, most existing graph-based methods tend to define the graph manually based on a specified modality (e.g., demographic information), and then integrate other modalities to obtain the patient representation by Graph Representation Learning (GRL).

Disease Prediction Graph Learning +1

Paper
Code

Improving Neural ODEs via Knowledge Distillation

no code implementations • 10 Mar 2022 • Haoyu Chu, Shikui Wei, Qiming Lu, Yao Zhao

We propose a new training method based on knowledge distillation to construct more powerful and robust Neural ODEs for image recognition tasks.

Knowledge Distillation

Paper
Add Code

Deep Rectangling for Image Stitching: A Learning Baseline

1 code implementation • CVPR 2022 • Lang Nie, Chunyu Lin, Kang Liao, Shuaicheng Liu, Yao Zhao

In this paper, we address these issues by proposing the first deep learning solution to image rectangling.

Image Stitching

217

Paper
Code

ACTIVE: Augmentation-Free Graph Contrastive Learning for Partial Multi-View Clustering

no code implementations • 1 Mar 2022 • Yiming Wang, Dongxia Chang, Zhiqiang Fu, Jie Wen, Yao Zhao

In this paper, we propose an augmentation-free graph contrastive learning framework, namely ACTIVE, to solve the problem of partial multi-view clustering.

Clustering Contrastive Learning +1

Paper
Add Code

Toward a More Populous Online Platform: The Economic Impacts of Compensated Reviews

no code implementations • 26 Jan 2022 • Peng Li, Arim Park, Soohyun Cho, Yao Zhao

In this paper, we study the effect of compensated reviews on non-compensated reviews by utilizing online reviews on 1,240 auto shipping companies over a ten-year period from a transportation website.

text-classification Text Classification

Paper
Add Code

Trustworthy Knowledge Graph Completion Based on Multi-sourced Noisy Data

1 code implementation • 21 Jan 2022 • Jiacheng Huang, Yao Zhao, Wei Hu, Zhen Ning, Qijin Chen, Xiaoxia Qiu, Chengfu Huo, Weijun Ren

In this paper, we propose a new trustworthy method that exploits facts for a KG based on multi-sourced noisy data and existing facts in the KG.

Paper
Code

Auto-Weighted Layer Representation Based View Synthesis Distortion Estimation for 3-D Video Coding

no code implementations • 7 Jan 2022 • Jian Jin, Xingxing Zhang, Lili Meng, Weisi Lin, Jie Liang, Huaxiang Zhang, Yao Zhao

Experimental results show that the VSD can be accurately estimated with the weights learnt by the nonlinear mapping function once its associated S-VSDs are available.

Paper
Add Code

FPPN: Future Pseudo-LiDAR Frame Prediction for Autonomous Driving

no code implementations • 8 Dec 2021 • Xudong Huang, Chunyu Lin, Haojie Liu, Lang Nie, Yao Zhao

LiDAR sensors are widely used in autonomous driving due to their reliable 3D spatial information.

Autonomous Driving Optical Flow Estimation

Paper
Add Code

Incomplete Multi-view Clustering via Cross-view Relation Transfer

no code implementations • 1 Dec 2021 • Yiming Wang, Dongxia Chang, Zhiqiang Fu, Yao Zhao

In this paper, we consider the problem of multi-view clustering on incomplete views.

Clustering Incomplete multi-view clustering +1

Paper
Add Code

RRNet: Relational Reasoning Network with Parallel Multi-scale Attention for Salient Object Detection in Optical Remote Sensing Images

2 code implementations • 27 Oct 2021 • Runmin Cong, Yumo Zhang, Leyuan Fang, Jun Li, Yao Zhao, Sam Kwong

Salient object detection (SOD) for optical remote sensing images (RSIs) aims at locating and extracting visually distinctive objects/regions from the optical RSIs.

object-detection Object Detection +2

Paper
Code

Cross-modality Discrepant Interaction Network for RGB-D Salient Object Detection

1 code implementation • 4 Aug 2021 • Chen Zhang, Runmin Cong, Qinwei Lin, Lin Ma, Feng Li, Yao Zhao, Sam Kwong

For the cross-modality interaction in feature encoder, existing methods either indiscriminately treat RGB and depth modalities, or only habitually utilize depth cues as auxiliary information of the RGB branch.

object-detection RGB-D Salient Object Detection +1

Paper
Code

Dynamic Feature Regularized Loss for Weakly Supervised Semantic Segmentation

no code implementations • 3 Aug 2021 • Bingfeng Zhang, Jimin Xiao, Yao Zhao

In this paper, we propose a new regularized loss which utilizes both shallow and deep features that are dynamically updated in order to aggregate sufficient information to represent the relationship of different pixels.

Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation

Paper
Add Code

BridgeNet: A Joint Learning Network of Depth Map Super-Resolution and Monocular Depth Estimation

no code implementations • 27 Jul 2021 • Qi Tang, Runmin Cong, Ronghui Sheng, Lingzhi He, Dan Zhang, Yao Zhao, Sam Kwong

The other is the content guidance bridge (CGBdg) designed for the depth map reconstruction process, which provides the content guidance learned from the DSR task for the MDE task.

Depth Map Super-Resolution Monocular Depth Estimation +1

Paper
Add Code

Depth-Aware Multi-Grid Deep Homography Estimation with Contextual Correlation

1 code implementation • 6 Jul 2021 • Lang Nie, Chunyu Lin, Kang Liao, Shuaicheng Liu, Yao Zhao

Homography estimation is an important task in computer vision applications, such as image stitching, video stabilization, and camera calibration.

Camera Calibration Homography Estimation +2

Paper
Code

Multi-modal Graph Learning for Disease Prediction

no code implementations • 1 Jul 2021 • Shuai Zheng, Zhenfeng Zhu, Zhizhe Liu, Zhenyu Guo, Yang Liu, Yao Zhao

However, it is not easy for these approaches to generalize to unseen samples.

Disease Prediction Graph Learning +1

Paper
Add Code

Unsupervised Deep Image Stitching: Reconstructing Stitched Features to Images

1 code implementation • 24 Jun 2021 • Lang Nie, Chunyu Lin, Kang Liao, Shuaicheng Liu, Yao Zhao

Even compared with the supervised solutions, our image stitching quality is still preferred by users.

Image Reconstruction Image Stitching

295

Paper
Code

Double Low-Rank Representation With Projection Distance Penalty for Clustering

no code implementations • CVPR 2021 • Zhiqiang Fu, Yao Zhao, Dongxia Chang, Xingxing Zhang, Yiming Wang

This paper presents a novel, simple yet robust self-representation method, i.e., Double Low-Rank Representation with Projection Distance penalty (DLRRPD) for clustering.

Clustering

Paper
Add Code

Affinity Attention Graph Neural Network for Weakly Supervised Semantic Segmentation

1 code implementation • 8 Jun 2021 • Bingfeng Zhang, Jimin Xiao, Jianbo Jiao, Yunchao Wei, Yao Zhao

More importantly, our approach can be readily applied to the bounding box supervised instance segmentation task or other weakly supervised semantic segmentation tasks, with state-of-the-art or comparable performance among almost all weakly supervised tasks on the PASCAL VOC or COCO dataset.

Box-supervised Instance Segmentation Model Optimization +3

Paper
Code

Consistent Multiple Graph Embedding for Multi-View Clustering

no code implementations • 11 May 2021 • Yiming Wang, Dongxia Chang, Zhiqiang Fu, Yao Zhao

Specifically, a multiple graph auto-encoder (M-GAE) is designed to flexibly encode the complementary information of multi-view data using a multi-graph attention fusion encoder.

Clustering Graph Attention +1

Paper
Add Code

Seeing All From a Few: Nodes Selection Using Graph Pooling for Graph Clustering

no code implementations • 30 Apr 2021 • Yiming Wang, Dongxia Chang, Zhiqian Fu, Yao Zhao

This paper is the first attempt to employ graph pooling technique for node clustering and we propose a novel dual graph embedding network (DGEN), which is designed as a two-step graph encoder connected by a graph pooling layer to learn the graph embedding.

Clustering Graph Clustering +2

Paper
Add Code

Auto-weighted low-rank representation for clustering

no code implementations • 26 Apr 2021 • Zhiqiang Fu, Yao Zhao, Dongxia Chang, Xingxing Zhang, Yiming Wang

In this paper, a novel unsupervised low-rank representation model, i.e., Auto-weighted Low-Rank Representation (ALRR), is proposed to construct a more favorable similarity graph (SG) for clustering.

Clustering Representation Learning

Paper
Add Code

Planning with Learned Entity Prompts for Abstractive Summarization

no code implementations • 15 Apr 2021 • Shashi Narayan, Yao Zhao, Joshua Maynez, Gonçalo Simoes, Vitaly Nikolaev, Ryan Mcdonald

Moreover, we demonstrate empirically that planning with entity chains provides a mechanism to control hallucinations in abstractive summaries.

Abstractive Text Summarization Specificity +1

Paper
Add Code

Towards Fast and Accurate Real-World Depth Super-Resolution: Benchmark Dataset and Baseline

1 code implementation • CVPR 2021 • Lingzhi He, Hongguang Zhu, Feng Li, Huihui Bai, Runmin Cong, Chunjie Zhang, Chunyu Lin, Meiqin Liu, Yao Zhao

Depth maps obtained by commercial depth sensors are always low-resolution, making them difficult to use in various computer vision tasks.

Depth Map Super-Resolution

Paper
Code

Progressively Complementary Network for Fisheye Image Rectification Using Appearance Flow

1 code implementation • CVPR 2021 • Shangrong Yang, Chunyu Lin, Kang Liao, Chunjie Zhang, Yao Zhao

We embed a correction layer in skip-connection and leverage the appearance flows in different layers to pre-correct the image features.

Paper
Code

Margin Preserving Self-paced Contrastive Learning Towards Domain Adaptation for Medical Image Segmentation

1 code implementation • 15 Mar 2021 • Zhizhe Liu, Zhenfeng Zhu, Shuai Zheng, Yang Liu, Jiayu Zhou, Yao Zhao

To bridge the gap between the source and target domains in unsupervised domain adaptation (UDA), the most common strategy puts focus on matching the marginal distributions in the feature space through adversarial learning.

Cardiac Segmentation Contrastive Learning +4

Paper
Code

Adversarial Graph Disentanglement

1 code implementation • 12 Mar 2021 • Shuai Zheng, Zhenfeng Zhu, Zhizhe Liu, Jian Cheng, Yao Zhao

For them, a component-specific aggregation approach is proposed to achieve micro-disentanglement by inferring latent components that cause the links between nodes.

Disentanglement Graph Representation Learning

Paper
Code

Just Noticeable Difference for Deep Machine Vision

no code implementations • 16 Feb 2021 • Jian Jin, Xingxing Zhang, Xin Fu, Huan Zhang, Weisi Lin, Jian Lou, Yao Zhao

Experimental results on image classification demonstrate that we successfully find the JND for deep machine vision.

Image Classification Neural Network Security +1

Paper
Add Code

Image Splicing Detection, Localization and Attribution via JPEG Primary Quantization Matrix Estimation and Clustering

no code implementations • 2 Feb 2021 • Yakun Niu, Benedetta Tondi, Yao Zhao, Rongrong Ni, Mauro Barni

We assume that both the spliced regions and the background image have undergone a double JPEG compression, and use a local estimate of the primary quantization matrix to distinguish between spliced regions taken from different sources.

Clustering Quantization

Paper
Add Code

Efficient video integrity analysis through container characterization

no code implementations • 26 Jan 2021 • Pengpeng Yang, Daniele Baracchi, Massimo Iuliani, Dasara Shullani, Rongrong Ni, Yao Zhao, Alessandro Piva

Furthermore, it is capable of correctly identifying the operating system of the source device for most of the tampered videos.

Paper
Add Code

Multi-Level Curriculum for Training a Distortion-Aware Barrel Distortion Rectification Model

no code implementations • ICCV 2021 • Kang Liao, Chunyu Lin, Lixin Liao, Yao Zhao, Weiyao Lin

In this paper, inspired by the curriculum learning, we analyze the barrel distortion rectification task in a progressive and meaningful manner.

Paper
Add Code

Towards Complete Scene and Regular Shape for Distortion Rectification by Curve-Aware Extrapolation

no code implementations • ICCV 2021 • Kang Liao, Chunyu Lin, Yunchao Wei, Feng Li, Shangrong Yang, Yao Zhao

To our knowledge, we are the first to tackle the challenging rectification via outpainting, and our curve-aware strategy can reach a rectification construction with complete content and regular shape.

Paper
Add Code

Learning Edge-Preserved Image Stitching from Large-Baseline Deep Homography

no code implementations • 11 Dec 2020 • Lang Nie, Chunyu Lin, Kang Liao, Yao Zhao

In this paper, we propose an image stitching learning framework, which consists of a large-baseline deep homography module and an edge-preserved deformation module.

Image Stitching

Paper
Add Code

Towards Natural Robustness Against Adversarial Examples

no code implementations • 4 Dec 2020 • Haoyu Chu, Shikui Wei, Yao Zhao

Thus, Neural ODEs have natural robustness against adversarial examples.

Adversarial Attack

Paper
Add Code

Dense Attention Fluid Network for Salient Object Detection in Optical Remote Sensing Images

3 code implementations • 26 Nov 2020 • Qijian Zhang, Runmin Cong, Chongyi Li, Ming-Ming Cheng, Yuming Fang, Xiaochun Cao, Yao Zhao, Sam Kwong

Despite the remarkable advances in visual saliency analysis for natural scene images (NSIs), salient object detection (SOD) for optical remote sensing images (RSIs) still remains an open and challenging problem.

object-detection Object Detection +1

Paper
Code

CoADNet: Collaborative Aggregation-and-Distribution Networks for Co-Salient Object Detection

1 code implementation • NeurIPS 2020 • Qijian Zhang, Runmin Cong, Junhui Hou, Chongyi Li, Yao Zhao

In the first stage, we propose a group-attentional semantic aggregation module that models inter-image relationships to generate the group-wise semantic representations.

Co-Salient Object Detection object-detection +1

Paper
Code

Learning Deep Interleaved Networks with Asymmetric Co-Attention for Image Restoration

1 code implementation • 29 Oct 2020 • Feng Li, Runmin Cong, Huihui Bai, Yifan He, Yao Zhao, Ce Zhu

In this paper, we present a deep interleaved network (DIN) that learns how information at different states should be combined for high-quality (HQ) image reconstruction.

Deblurring Image Deblurring +2

Paper
Code

Mining Generalized Features for Detecting AI-Manipulated Fake Faces

no code implementations • 27 Oct 2020 • Yang Yu, Rongrong Ni, Yao Zhao

Recently, AI-manipulated face techniques have developed rapidly and constantly, which has raised new security issues in society.

Paper
Add Code

LID 2020: The Learning from Imperfect Data Challenge Results

no code implementations • 17 Oct 2020 • Yunchao Wei, Shuai Zheng, Ming-Ming Cheng, Hang Zhao, LiWei Wang, Errui Ding, Yi Yang, Antonio Torralba, Ting Liu, Guolei Sun, Wenguan Wang, Luc van Gool, Wonho Bae, Junhyug Noh, Jinhwan Seo, Gunhee Kim, Hao Zhao, Ming Lu, Anbang Yao, Yiwen Guo, Yurong Chen, Li Zhang, Chuangchuang Tan, Tao Ruan, Guanghua Gu, Shikui Wei, Yao Zhao, Mariia Dobko, Ostap Viniavskyi, Oles Dobosevych, Zhendong Wang, Zhenyuan Chen, Chen Gong, Huanqing Yan, Jun He

The purpose of the Learning from Imperfect Data (LID) workshop is to inspire and facilitate the research in developing novel approaches that would harness the imperfect data and improve the data-efficiency during training.

object-detection Object Detection +5

Paper
Add Code

Taking Modality-free Human Identification as Zero-shot Learning

no code implementations • 2 Oct 2020 • Zhizhe Liu, Xingxing Zhang, Zhenfeng Zhu, Shuai Zheng, Yao Zhao, Jian Cheng

There have been numerous methods proposed for human identification, such as face identification, person re-identification, and gait identification.

Attribute Event Detection +4

Paper
Add Code

A Parallel Down-Up Fusion Network for Salient Object Detection in Optical Remote Sensing Images

no code implementations • 2 Oct 2020 • Chongyi Li, Runmin Cong, Chunle Guo, Hua Li, Chunjie Zhang, Feng Zheng, Yao Zhao

In this paper, we propose a novel Parallel Down-up Fusion network (PDF-Net) for SOD in optical RSIs, which takes full advantage of the in-path low- and high-level features and cross-path multi-resolution features to distinguish diversely scaled salient objects and suppress the cluttered backgrounds.

object-detection Object Detection +1

Paper
Add Code

A Deep Ordinal Distortion Estimation Approach for Distortion Rectification

no code implementations • 21 Jul 2020 • Kang Liao, Chunyu Lin, Yao Zhao

Distortion widely exists in images captured by popular wide-angle cameras and fisheye cameras.

Paper
Add Code

Pseudo-LiDAR Point Cloud Interpolation Based on 3D Motion Representation and Spatial Supervision

no code implementations • 20 Jun 2020 • Haojie Liu, Kang Liao, Chunyu Lin, Yao Zhao, Yulan Guo

Pseudo-LiDAR point cloud interpolation is a novel and challenging task in the field of autonomous driving, which aims to address the frequency mismatching problem between camera and LiDAR.

Autonomous Driving Optical Flow Estimation

Paper
Add Code

SEAL: Segment-wise Extractive-Abstractive Long-form Text Summarization

no code implementations • 18 Jun 2020 • Yao Zhao, Mohammad Saleh, Peter J. Liu

Most prior work in the sequence-to-sequence paradigm focused on datasets with input sequence lengths in the hundreds of tokens due to the computational constraints of common RNN and Transformer architectures.

Abstractive Text Summarization

Paper
Add Code

Referring Image Segmentation by Generative Adversarial Learning

no code implementations • IEEE 2020 • Shuang Qiu, Yao Zhao, Jianbo Jiao, Yunchao Wei, Shikui Wei

To this end, we propose to train the referring image segmentation model in a generative adversarial fashion, which well addresses the distribution similarity problem.

Image Segmentation Referring Expression +4

Paper
Add Code

Fast Template Matching and Update for Video Object Tracking and Segmentation

1 code implementation • CVPR 2020 • Mingjie Sun, Jimin Xiao, Eng Gee Lim, Bingfeng Zhang, Yao Zhao

Specifically, the reinforcement learning agent learns to decide whether to update the target template according to the quality of the predicted result.

reinforcement-learning Reinforcement Learning (RL) +5

Paper
Code

From Anchor Generation to Distribution Alignment: Learning a Discriminative Embedding Space for Zero-Shot Recognition

no code implementations • 10 Feb 2020 • Fuzhen Li, Zhenfeng Zhu, Xingxing Zhang, Jian Cheng, Yao Zhao

In zero-shot learning (ZSL), the samples to be classified are usually projected into side information templates such as attributes.

Zero-Shot Learning

Paper
Add Code

Concurrently Extrapolating and Interpolating Networks for Continuous Model Generation

1 code implementation • 12 Jan 2020 • Lijun Zhao, Jinjing Zhang, Fan Zhang, Anhong Wang, Huihui Bai, Yao Zhao

Most deep image smoothing operators are trained repeatedly when different explicit structure-texture pairs are employed as label images for each algorithm configured with different parameters.

image smoothing

Paper
Code

Deep Optimized Multiple Description Image Coding via Scalar Quantization Learning

2 code implementations • 12 Jan 2020 • Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

In this paper, we introduce a deep multiple description coding (MDC) framework optimized by minimizing multiple description (MD) compressive loss.

Quantization

Paper
Code

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

16 code implementations • ICML 2020 • Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu

Recent work pre-training Transformers with self-supervised objectives on large text corpora has shown great success when fine-tuned on downstream NLP tasks including text summarization.

Ranked #1 on Abstractive Text Summarization on AESLC

Abstractive Text Summarization

124,527

Paper
Code

To See in the Dark: N2DGAN for Background Modeling in Nighttime Scene

no code implementations • 12 Dec 2019 • Zhenfeng Zhu, Yingying Meng, Deqiang Kong, Xingxing Zhang, Yandong Guo, Yao Zhao

Due to the deteriorated conditions of insufficient illumination and uneven lighting, nighttime images have lower contrast and higher noise than their daytime counterparts of the same scene, which seriously limits the performance of conventional background modeling methods.

Paper
Add Code

Distribution-induced Bidirectional Generative Adversarial Network for Graph Representation Learning

1 code implementation • CVPR 2020 • Shuai Zheng, Zhenfeng Zhu, Xingxing Zhang, Zhizhe Liu, Jian Cheng, Yao Zhao

Graph representation learning aims to encode all nodes of a graph into low-dimensional vectors that will serve as input to many computer vision tasks.

Generative Adversarial Network Graph Representation Learning

Paper
Code

Progressive Sample Mining and Representation Learning for One-Shot Person Re-identification with Adversarial Samples

1 code implementation • 2 Nov 2019 • Hui Li, Jimin Xiao, Ming-Jie Sun, Eng Gee Lim, Yao Zhao

To tackle this problem, we propose to iteratively guess pseudo labels for the unlabeled image samples, which are later used to update the re-identification model together with the labelled samples.

Person Re-Identification Pseudo Label +1

Paper
Code

Hierarchical Prototype Learning for Zero-Shot Recognition

no code implementations • 24 Oct 2019 • Xingxing Zhang, Shupeng Gui, Zhenfeng Zhu, Yao Zhao, Ji Liu

Specifically, HPL is able to obtain discriminability on both seen and unseen class domains by learning visual prototypes respectively under the transductive setting.

Attribute Image Captioning +3

Paper
Add Code

ATZSL: Defensive Zero-Shot Recognition in the Presence of Adversaries

no code implementations • 24 Oct 2019 • Xingxing Zhang, Shupeng Gui, Zhenfeng Zhu, Yao Zhao, Ji Liu

In this paper, we make an initial attempt and propose a generic formulation to provide a systematic solution (named ATZSL) for learning a robust ZSL model.

Image Captioning Object Recognition +2

Paper
Add Code

ProLFA: Representative Prototype Selection for Local Feature Aggregation

1 code implementation • 24 Oct 2019 • Xingxing Zhang, Zhenfeng Zhu, Yao Zhao

Given a set of hand-crafted local features, acquiring a global representation via aggregation is a promising technique to boost computational efficiency and improve task performance.

Computational Efficiency Prototype Selection

Paper
Code

Convolutional Prototype Learning for Zero-Shot Recognition

no code implementations • 22 Oct 2019 • Zhizhe Liu, Xingxing Zhang, Zhenfeng Zhu, Shuai Zheng, Yao Zhao, Jian Cheng

The key to ZSL is to transfer knowledge from the seen to the unseen classes via auxiliary class attribute vectors.

Attribute Image Captioning +3

Paper
Add Code

PLIN: A Network for Pseudo-LiDAR Point Cloud Interpolation

no code implementations • 16 Sep 2019 • Haojie Liu, Kang Liao, Chunyu Lin, Yao Zhao, Yulan Guo

In this paper, we propose a novel Pseudo-LiDAR interpolation network (PLIN) to increase the frequency of LiDAR sensors.

Autonomous Driving

Paper
Add Code

Segmentation Mask Guided End-to-End Person Search

1 code implementation • 27 Aug 2019 • Dingyuan Zheng, Jimin Xiao, Kai-Zhu Huang, Yao Zhao

Person search aims to search for a target person among multiple images recorded by multiple surveillance cameras, which faces various challenges from both pedestrian detection and person re-identification.

Pedestrian Detection Person Re-Identification +2

Paper
Code

Primary quantization matrix estimation of double compressed JPEG images via CNN

1 code implementation • 9 Aug 2019 • Yakun Niu, Benedetta Tondi, Yao Zhao, Mauro Barni

Available model-based techniques for the estimation of the primary quantization matrix in double-compressed JPEG images work only under specific conditions regarding the relationship between the first and second compression quality factors, and the alignment of the first and second JPEG compression grids.

Quantization

Paper
Code

Edge Heuristic GAN for Non-uniform Blind Deblurring

no code implementations • 11 Jul 2019 • Shuai Zheng, Zhenfeng Zhu, Jian Cheng, Yandong Guo, Yao Zhao

Non-uniform blur, mainly caused by camera shake and motions of multiple objects, is one of the most common causes of image quality degradation.

Deblurring Generative Adversarial Network

Paper
Add Code

Architecture Selection via the Trade-off Between Accuracy and Robustness

no code implementations • 4 Jun 2019 • Zhun Deng, Cynthia Dwork, Jialiang Wang, Yao Zhao

We provide a general framework for characterizing the trade-off between accuracy and robustness in supervised learning.

Adversarial Attack

Paper
Add Code

EA-LSTM: Evolutionary Attention-based LSTM for Time Series Prediction

no code implementations • 9 Nov 2018 • Youru Li, Zhenfeng Zhu, Deqiang Kong, Hua Han, Yao Zhao

To address this issue, an evolutionary attention-based LSTM training with competitive random search is proposed for multivariate time series prediction.

Time Series Time Series Prediction

Paper
Add Code

Correlation Filter Selection for Visual Tracking Using Reinforcement Learning

no code implementations • 8 Nov 2018 • Yanchun Xie, Jimin Xiao, Kai-Zhu Huang, Jeyarajan Thiyagalingam, Yao Zhao

In this paper, we propose a novel approach to address the correlation filter update problem.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Add Code

Deep Multiple Description Coding by Learning Scalar Quantization

1 code implementation • 5 Nov 2018 • Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

Secondly, two entropy estimation networks are learned to estimate the informative amounts of the quantized tensors, which can further supervise the learning of multiple description encoder network to represent the input image delicately.

Quantization

Paper
Code

Paragraph-level Neural Question Generation with Maxout Pointer and Gated Self-attention Networks

1 code implementation • EMNLP 2018 • Yao Zhao, Xiaochuan Ni, Yuanyuan Ding, Qifa Ke

Long text has posed challenges for sequence-to-sequence neural models in question generation: worse performance was reported when using the whole paragraph (with multiple sentences) as the input.

Question Answering Question Generation +4

137

Paper
Code

Devil in the Details: Towards Accurate Single and Multiple Human Parsing

2 code implementations • 17 Sep 2018 • Tao Ruan, Ting Liu, Zilong Huang, Yunchao Wei, Shikui Wei, Yao Zhao, Thomas Huang

Human parsing has received considerable interest due to its wide application potentials.

Ranked #2 on Person Re-Identification on Market-1501-C

Human Parsing Person Re-Identification +1

212

Paper
Code

Virtual Codec Supervised Re-Sampling Network for Image Compression

1 code implementation • 22 Jun 2018 • Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

In order to train the RSN network and the IDN network together in an end-to-end fashion, our VCN network imitates the projection from the re-sampled vectors to the IDN-decoded image.

Dimensionality Reduction Image Compression +1

Paper
Code

Adversarial Attacks and Defences Competition

1 code implementation • 31 Mar 2018 • Alexey Kurakin, Ian Goodfellow, Samy Bengio, Yinpeng Dong, Fangzhou Liao, Ming Liang, Tianyu Pang, Jun Zhu, Xiaolin Hu, Cihang Xie, Jian-Yu Wang, Zhishuai Zhang, Zhou Ren, Alan Yuille, Sangxia Huang, Yao Zhao, Yuzhe Zhao, Zhonglin Han, Junjiajia Long, Yerkebulan Berdibekov, Takuya Akiba, Seiya Tokui, Motoki Abe

To accelerate research on adversarial examples and robustness of machine learning classifiers, Google Brain organized a NIPS 2017 competition that encouraged researchers to develop new methods to generate adversarial examples as well as to develop new ways to defend against them.

BIG-bench Machine Learning

145

Paper
Code

Security Consideration For Deep Learning-Based Image Forensics

no code implementations • 29 Mar 2018 • Wei Zhao, Pengpeng Yang, Rongrong Ni, Yao Zhao, Haorui Wu

Instead of improving it, in this paper, the safety of deep learning based methods in the field of image forensics is taken into account.

Image Forensics

Paper
Add Code

Non-Local Graph-Based Prediction For Reversible Data Hiding In Images

no code implementations • 20 Feb 2018 • Qi Chang, Gene Cheung, Yao Zhao, Xiaolong Li, Rongrong Ni

If sufficiently smooth, we pose a maximum a posteriori (MAP) problem using either a quadratic Laplacian regularizer or a graph total variation (GTV) term as signal prior.

Paper
Add Code

Mixed-Resolution Image Representation and Compression with Convolutional Neural Networks

no code implementations • 2 Feb 2018 • Lijun Zhao, Huihui Bai, Feng Li, Anhong Wang, Yao Zhao

Firstly, given one input image, feature description neural network (FDNN) is used to generate a new representation of this image, so that this image representation can be more efficiently compressed by standard codec, as compared to the input image.

Image Compression Quantization

Paper
Add Code

Secure Detection of Image Manipulation by means of Random Feature Selection

no code implementations • 2 Feb 2018 • Zhipeng Chen, Benedetta Tondi, Xiaolong Li, Rongrong Ni, Yao Zhao, Mauro Barni

We address the problem of data-driven image manipulation detection in the presence of an attacker with limited knowledge about the detector.

Cryptography and Security

Paper
Add Code

Multiple Description Convolutional Neural Networks for Image Compression

no code implementations • 20 Jan 2018 • Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

Thirdly, multiple description virtual codec network (MDVCN) is proposed to bridge the gap between MDGN network and MDRN network in order to train an end-to-end MDC framework.

Image Compression

Paper
Add Code

Learning a Virtual Codec Based on Deep Convolutional Neural Network to Compress Image

1 code implementation • 16 Dec 2017 • Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

Due to the challenge of directly learning a non-linear function for a standard codec based on convolutional neural network, we propose to learn a virtual codec neural network to approximate the projection from the valid description image to the post-processed compressed image, so that the gradient could be efficiently back-propagated from the post-processing neural network to the feature description neural network during training.

Blocking Image Compression +2

Paper
Code

Simultaneously Color-Depth Super-Resolution with Conditional Generative Adversarial Network

no code implementations • 30 Aug 2017 • Lijun Zhao, Huihui Bai, Jie Liang, Bing Zeng, Anhong Wang, Yao Zhao

Firstly, given the low-resolution depth image and low-resolution color image, a generative network is proposed to leverage mutual information of color image and depth image to enhance each other in consideration of the geometry structural dependency of color-depth image in the same scene.

Edge Detection Generative Adversarial Network +5

Paper
Add Code

Local Activity-tuned Image Filtering for Noise Removal and Image Smoothing

no code implementations • 9 Jul 2017 • Lijun Zhao, Jie Liang, Huihui Bai, Lili Meng, Anhong Wang, Yao Zhao

Both frameworks employ the division of gradient and the local activity measurement to achieve noise removal.

Image Denoising image smoothing

Paper
Add Code

Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach

no code implementations • CVPR 2017 • Yunchao Wei, Jiashi Feng, Xiaodan Liang, Ming-Ming Cheng, Yao Zhao, Shuicheng Yan

We investigate a principled way to progressively mine discriminative object regions using classification networks to address the weakly-supervised semantic segmentation problem.

Classification General Classification +4

Paper
Add Code

Source Camera Identification Based On Content-Adaptive Fusion Network

no code implementations • 15 Mar 2017 • Pengpeng Yang, Wei Zhao, Rongrong Ni, Yao Zhao

In this paper, we propose a solution to identify the source camera of the small-size images: content-adaptive fusion network.

Paper
Add Code

A New Evaluation Protocol and Benchmarking Results for Extendable Cross-media Retrieval

no code implementations • 10 Mar 2017 • Ruoyu Liu, Yao Zhao, Liang Zheng, Shikui Wei, Yi Yang

Additionally, a trivial solution, i.e., directly using the predicted class label for cross-media retrieval, is tested.

Benchmarking Image Retrieval +1

Paper
Add Code

Camera Fingerprint: A New Perspective for Identifying User's Identity

no code implementations • 25 Oct 2016 • Xiang Jiang, Shikui Wei, Ruizhen Zhao, Yao Zhao, Xindong Wu

The underlying assumption is that multiple accounts belonging to the same person contain the same or similar camera fingerprint information.

Product Recommendation

Paper
Add Code

STC: A Simple to Complex Framework for Weakly-supervised Semantic Segmentation

1 code implementation • 10 Sep 2015 • Yunchao Wei, Xiaodan Liang, Yunpeng Chen, Xiaohui Shen, Ming-Ming Cheng, Jiashi Feng, Yao Zhao, Shuicheng Yan

Then, a better network called Enhanced-DCNN is learned with supervision from the predicted segmentation masks of simple images based on the Initial-DCNN as well as the image-level annotations.

object-detection RGB Salient Object Detection +4

Paper
Code

Indexing of CNN Features for Large Scale Image Search

no code implementations • 2 Aug 2015 • Ruoyu Liu, Yao Zhao, Shikui Wei, Yi Yang

The convolutional neural network (CNN) features can give a good description of image content, which usually represent images with unique global vectors.

Clustering Image Retrieval +2

Paper
Add Code

Modality-dependent Cross-media Retrieval

no code implementations • 22 Jun 2015 • Yunchao Wei, Yao Zhao, Zhenfeng Zhu, Shikui Wei, Yanhui Xiao, Jiashi Feng, Shuicheng Yan

Specifically, by jointly optimizing the correlation between images and text and the linear regression from one modal space (image or text) to the semantic space, two couples of mappings are learned to project images and text from their original feature spaces into two common latent subspaces (one for I2T and the other for T2I).

Retrieval

Paper
Add Code

CNN: Single-label to Multi-label

no code implementations • 22 Jun 2014 • Yunchao Wei, Wei Xia, Junshi Huang, Bingbing Ni, Jian Dong, Yao Zhao, Shuicheng Yan

Convolutional Neural Network (CNN) has demonstrated promising performance in single-label image classification tasks.

Image Classification

Paper
Add Code

Kernel Reconstruction ICA for Sparse Representation

no code implementations • 9 Apr 2013 • Yanhui Xiao, Zhenfeng Zhu, Yao Zhao

However, ICA is not only sensitive to whitening but also has difficulty learning an over-complete basis.

Image Classification

Paper
Add Code
