Search Results for author: Bo Li

Found 577 papers, 246 papers with code

Relative Pose Estimation of Calibrated Cameras with Known SE(3) Invariants

1 code implementation ECCV 2020 Bo Li, Evgeniy Martyushev, Gim Hee Lee

In this paper, we present a complete and comprehensive study of the relative pose estimation problem for a calibrated camera constrained by a known $\mathrm{SE}(3)$ invariant, which involves 5 minimal problems in total.

Pose Estimation Translation

Profanity-Avoiding Training Framework for Seq2seq Models with Certified Robustness

no code implementations EMNLP 2021 Hengtong Zhang, Tianhang Zheng, Yaliang Li, Jing Gao, Lu Su, Bo Li

To address this problem, we propose a training framework with certified robustness to eliminate the causes that trigger the generation of profanity.

Dialogue Generation Style Transfer

Alibaba Speech Translation Systems for IWSLT 2018

no code implementations IWSLT (EMNLP) 2018 Nguyen Bach, Hongjie Chen, Kai Fan, Cheung-Chi Leung, Bo Li, Chongjia Ni, Rong Tong, Pei Zhang, Boxing Chen, Bin Ma, Fei Huang

This work describes the En→De Alibaba speech translation system developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2018.

Sentence Translation

Introducing v0.5 of the AI Safety Benchmark from MLCommons

1 code implementation 18 Apr 2024 Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bomassani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, Ram Gandikota, Agasthya Gangavarapu, Ananya Gangavarapu, James Gealy, Rajat Ghosh, James Goel, Usman Gohar, Sujata Goswami, Scott A. Hale, Wiebke Hutiri, Joseph Marvin Imperial, Surgan Jandial, Nick Judd, Felix Juefei-Xu, Foutse Khomh, Bhavya Kailkhura, Hannah Rose Kirk, Kevin Klyman, Chris Knotz, Michael Kuchnik, Shachi H. Kumar, Chris Lengerich, Bo Li, Zeyi Liao, Eileen Peters Long, Victor Lu, Yifan Mai, Priyanka Mary Mammen, Kelvin Manyeki, Sean McGregor, Virendra Mehta, Shafee Mohammed, Emanuel Moss, Lama Nachman, Dinesh Jinenhally Naganna, Amin Nikanjam, Besmira Nushi, Luis Oala, Iftach Orr, Alicia Parrish, Cigdem Patlak, William Pietri, Forough Poursabzi-Sangdeh, Eleonora Presani, Fabrizio Puletti, Paul Röttger, Saurav Sahay, Tim Santos, Nino Scherrer, Alice Schoenauer Sebag, Patrick Schramowski, Abolfazl Shahbazi, Vin Sharma, Xudong Shen, Vamsi Sistla, Leonard Tang, Davide Testuggine, Vithursan Thangarasa, Elizabeth Anne Watkins, Rebecca Weiss, Chris Welty, Tyler Wilbers, Adina Williams, Carole-Jean Wu, Poonam Yadav, Xianjun Yang, Yi Zeng, Wenhui Zhang, Fedor Zhdanov, Jiacheng Zhu, Percy Liang, Peter Mattson, Joaquin Vanschoren

We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0.5 benchmark.

SambaLingo: Teaching Large Language Models New Languages

no code implementations 8 Apr 2024 Zoltan Csaki, Bo Li, Jonathan Li, Qiantong Xu, Pian Pawakapan, Leon Zhang, Yun Du, Hengyu Zhao, Changran Hu, Urmish Thakker

In this paper, we present a comprehensive investigation into the adaptation of LLMs to new languages.

ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming

1 code implementation 6 Apr 2024 Simone Tedeschi, Felix Friedrich, Patrick Schramowski, Kristian Kersting, Roberto Navigli, Huu Nguyen, Bo Li

When building Large Language Models (LLMs), it is paramount to bear safety in mind and protect them with guardrails.

KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking

1 code implementation 3 Apr 2024 Jiawei Zhang, Chejian Xu, Yu Gai, Freddy Lecue, Dawn Song, Bo Li

This paper introduces KnowHalu, a novel approach for detecting hallucinations in text generated by large language models (LLMs), utilizing step-wise reasoning, multi-formulation query, multi-form knowledge for factual checking, and fusion-based detection mechanism.

Fact Checking Hallucination +1

QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs

1 code implementation 30 Mar 2024 Saleh Ashkboos, Amirkeivan Mohtashami, Maximilian L. Croci, Bo Li, Martin Jaggi, Dan Alistarh, Torsten Hoefler, James Hensman

We introduce QuaRot, a new Quantization scheme based on Rotations, which is able to quantize LLMs end-to-end, including all weights, activations, and KV cache in 4 bits.

Quantization

Checkpoint Merging via Bayesian Optimization in LLM Pretraining

no code implementations 28 Mar 2024 Deyuan Liu, Zecheng Wang, Bingning Wang, WeiPeng Chen, Chunshan Li, Zhiying Tu, Dianhui Chu, Bo Li, Dianbo Sui

The rapid proliferation of large language models (LLMs) such as GPT-4 and Gemini underscores the intense demand for resources during their training processes, posing significant challenges due to substantial computational and environmental costs.

Bayesian Optimization

Multi-Task Dense Prediction via Mixture of Low-Rank Experts

1 code implementation 26 Mar 2024 YuQi Yang, Peng-Tao Jiang, Qibin Hou, Hao Zhang, Jinwei Chen, Bo Li

Furthermore, to control the parameters and computational cost brought by the increase in the number of experts, we take inspiration from LoRA and propose to leverage the low-rank format of a vanilla convolution in the expert network.

PPA-Game: Characterizing and Learning Competitive Dynamics Among Online Content Creators

no code implementations 22 Mar 2024 Renzhe Xu, Haotian Wang, Xingxuan Zhang, Bo Li, Peng Cui

We introduce the Proportional Payoff Allocation Game (PPA-Game) to model how agents, akin to content creators on platforms like YouTube and TikTok, compete for divisible resources and consumers' attention.

Empowering Segmentation Ability to Multi-modal Large Language Models

no code implementations 21 Mar 2024 YuQi Yang, Peng-Tao Jiang, Jing Wang, Hao Zhang, Kai Zhao, Jinwei Chen, Bo Li

Multi-modal large language models (MLLMs) can understand image-language prompts and demonstrate impressive reasoning ability.

Dialogue Generation Segmentation +1

Contrastive Balancing Representation Learning for Heterogeneous Dose-Response Curves Estimation

1 code implementation 21 Mar 2024 Minqin Zhu, Anpeng Wu, Haoxuan Li, Ruoxuan Xiong, Bo Li, Xiaoqing Yang, Xuan Qin, Peng Zhen, Jiecheng Guo, Fei Wu, Kun Kuang

Estimating the individuals' potential response to varying treatment doses is crucial for decision-making in areas such as precision medicine and management science.

counterfactual Decision Making +2

Few-shot Object Localization

1 code implementation 19 Mar 2024 Yunhan Ren, Bo Li, Chengyang Zhang, Yong Zhang, BaoCai Yin

This task achieves generalized object localization by leveraging a small number of labeled support samples to query the positional information of objects within corresponding images.

Model Optimization Object +2

RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content

no code implementations 19 Mar 2024 Zhuowen Yuan, Zidi Xiong, Yi Zeng, Ning Yu, Ruoxi Jia, Dawn Song, Bo Li

The innovative use of constrained optimization and a fusion-based guardrail approach represents a significant step forward in developing more secure and reliable LLMs, setting a new standard for content moderation frameworks in the face of evolving digital threats.

Data Augmentation

Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression

no code implementations 18 Mar 2024 Junyuan Hong, Jinhao Duan, Chenhui Zhang, Zhangheng Li, Chulin Xie, Kelsey Lieberman, James Diffenderfer, Brian Bartoldson, Ajay Jaiswal, Kaidi Xu, Bhavya Kailkhura, Dan Hendrycks, Dawn Song, Zhangyang Wang, Bo Li

While state-of-the-art (SoTA) compression methods boast impressive advancements in preserving benign task performance, the potential risks of compression in terms of safety and trustworthiness have been largely neglected.

Ethics Fairness +1

COLEP: Certifiably Robust Learning-Reasoning Conformal Prediction via Probabilistic Circuits

1 code implementation 17 Mar 2024 Mintong Kang, Nezihe Merve Gürel, Linyi Li, Bo Li

In this work, we propose a certifiably robust learning-reasoning conformal prediction framework (COLEP) via probabilistic circuits, which comprise a data-driven learning component that trains statistical models to learn different semantic concepts, and a reasoning component that encodes knowledge and characterizes the relationships among the trained models for logic reasoning.

Conformal Prediction

Shake to Leak: Fine-tuning Diffusion Models Can Amplify the Generative Privacy Risk

1 code implementation 14 Mar 2024 Zhangheng Li, Junyuan Hong, Bo Li, Zhangyang Wang

While diffusion models have recently demonstrated remarkable progress in generating realistic images, privacy risks also arise: published models or APIs could generate training images and thus leak privacy-sensitive training information.

Inference Attack Membership Inference Attack

3D-aware Image Generation and Editing with Multi-modal Conditions

no code implementations 11 Mar 2024 Bo Li, Yi-ke Li, Zhi-fen He, Bin Liu, Yun-Kun Lai

3D-consistent image generation from a single 2D semantic label is an important and challenging research topic in computer graphics and computer vision.

Attribute Disentanglement +2

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

no code implementations8 Mar 2024 Gemini Team, Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry, Lepikhin, Timothy Lillicrap, Jean-Baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser, Ioannis Antonoglou, Rohan Anil, Sebastian Borgeaud, Andrew Dai, Katie Millican, Ethan Dyer, Mia Glaese, Thibault Sottiaux, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, James Molloy, Jilin Chen, Michael Isard, Paul Barham, Tom Hennigan, Ross Mcilroy, Melvin Johnson, Johan Schalkwyk, Eli Collins, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, Clemens Meyer, Gregory Thornton, Zhen Yang, Henryk Michalewski, Zaheer Abbas, Nathan Schucher, Ankesh Anand, Richard Ives, James Keeling, Karel Lenc, Salem Haykal, Siamak Shakeri, Pranav Shyam, Aakanksha Chowdhery, Roman Ring, Stephen Spencer, Eren Sezener, Luke Vilnis, Oscar Chang, Nobuyuki Morioka, George Tucker, Ce Zheng, Oliver Woodman, Nithya Attaluri, Tomas Kocisky, Evgenii Eltyshev, Xi Chen, Timothy Chung, Vittorio Selo, Siddhartha Brahma, Petko Georgiev, Ambrose Slone, Zhenkai Zhu, James Lottes, Siyuan Qiao, Ben Caine, Sebastian Riedel, Alex Tomala, Martin Chadwick, Juliette Love, Peter Choy, Sid Mittal, Neil Houlsby, Yunhao Tang, Matthew Lamm, Libin Bai, Qiao Zhang, Luheng He, Yong Cheng, Peter Humphreys, Yujia Li, Sergey Brin, Albin Cassirer, Yingjie Miao, Lukas Zilka, Taylor Tobin, Kelvin Xu, Lev Proleev, Daniel Sohn, Alberto Magni, Lisa Anne Hendricks, Isabel Gao, Santiago Ontanon, Oskar Bunyan, Nathan Byrd, Abhanshu Sharma, Biao Zhang, Mario Pinto, Rishika Sinha, Harsh Mehta, Dawei Jia, Sergi Caelles, Albert Webson, Alex Morris, Becca Roelofs, Yifan Ding, Robin Strudel, Xuehan Xiong, Marvin Ritter, Mostafa Dehghani, Rahma Chaabouni, Abhijit Karmarkar, Guangda Lai, Fabian Mentzer, Bibo Xu, Yaguang Li, Yujing Zhang, Tom Le Paine, Alex Goldin, Behnam Neyshabur, Kate Baumli, Anselm Levskaya, Michael Laskin, Wenhao Jia, Jack W. 
Rae, Kefan Xiao, Antoine He, Skye Giordano, Lakshman Yagati, Jean-Baptiste Lespiau, Paul Natsev, Sanjay Ganapathy, Fangyu Liu, Danilo Martins, Nanxin Chen, Yunhan Xu, Megan Barnes, Rhys May, Arpi Vezer, Junhyuk Oh, Ken Franko, Sophie Bridgers, Ruizhe Zhao, Boxi Wu, Basil Mustafa, Sean Sechrist, Emilio Parisotto, Thanumalayan Sankaranarayana Pillai, Chris Larkin, Chenjie Gu, Christina Sorokin, Maxim Krikun, Alexey Guseynov, Jessica Landon, Romina Datta, Alexander Pritzel, Phoebe Thacker, Fan Yang, Kevin Hui, Anja Hauth, Chih-Kuan Yeh, David Barker, Justin Mao-Jones, Sophia Austin, Hannah Sheahan, Parker Schuh, James Svensson, Rohan Jain, Vinay Ramasesh, Anton Briukhov, Da-Woon Chung, Tamara von Glehn, Christina Butterfield, Priya Jhakra, Matthew Wiethoff, Justin Frye, Jordan Grimstad, Beer Changpinyo, Charline Le Lan, Anna Bortsova, Yonghui Wu, Paul Voigtlaender, Tara Sainath, Shane Gu, Charlotte Smith, Will Hawkins, Kris Cao, James Besley, Srivatsan Srinivasan, Mark Omernick, Colin Gaffney, Gabriela Surita, Ryan Burnell, Bogdan Damoc, Junwhan Ahn, Andrew Brock, Mantas Pajarskas, Anastasia Petrushkina, Seb Noury, Lorenzo Blanco, Kevin Swersky, Arun Ahuja, Thi Avrahami, Vedant Misra, Raoul de Liedekerke, Mariko Iinuma, Alex Polozov, Sarah York, George van den Driessche, Paul Michel, Justin Chiu, Rory Blevins, Zach Gleicher, Adrià Recasens, Alban Rrustemi, Elena Gribovskaya, Aurko Roy, Wiktor Gworek, Sébastien M. R. 
Arnold, Lisa Lee, James Lee-Thorp, Marcello Maggioni, Enrique Piqueras, Kartikeya Badola, Sharad Vikram, Lucas Gonzalez, Anirudh Baddepudi, Evan Senter, Jacob Devlin, James Qin, Michael Azzam, Maja Trebacz, Martin Polacek, Kashyap Krishnakumar, Shuo-Yiin Chang, Matthew Tung, Ivo Penchev, Rishabh Joshi, Kate Olszewska, Carrie Muir, Mateo Wirth, Ale Jakse Hartman, Josh Newlan, Sheleem Kashem, Vijay Bolina, Elahe Dabir, Joost van Amersfoort, Zafarali Ahmed, James Cobon-Kerr, Aishwarya Kamath, Arnar Mar Hrafnkelsson, Le Hou, Ian Mackinnon, Alexandre Frechette, Eric Noland, Xiance Si, Emanuel Taropa, Dong Li, Phil Crone, Anmol Gulati, Sébastien Cevey, Jonas Adler, Ada Ma, David Silver, Simon Tokumine, Richard Powell, Stephan Lee, Kiran Vodrahalli, Samer Hassan, Diana Mincu, Antoine Yang, Nir Levine, Jenny Brennan, Mingqiu Wang, Sarah Hodkinson, Jeffrey Zhao, Josh Lipschultz, Aedan Pope, Michael B. Chang, Cheng Li, Laurent El Shafey, Michela Paganini, Sholto Douglas, Bernd Bohnet, Fabio Pardo, Seth Odoom, Mihaela Rosca, Cicero Nogueira dos santos, Kedar Soparkar, Arthur Guez, Tom Hudson, Steven Hansen, Chulayuth Asawaroengchai, Ravi Addanki, Tianhe Yu, Wojciech Stokowiec, Mina Khan, Justin Gilmer, Jaehoon Lee, Carrie Grimes Bostock, Keran Rong, Jonathan Caton, Pedram Pejman, Filip Pavetic, Geoff Brown, Vivek Sharma, Mario Lučić, Rajkumar Samuel, Josip Djolonga, Amol Mandhane, Lars Lowe Sjösund, Elena Buchatskaya, Elspeth White, Natalie Clay, Jiepu Jiang, Hyeontaek Lim, Ross Hemsley, Zeyncep Cankara, Jane Labanowski, Nicola De Cao, David Steiner, Sayed Hadi Hashemi, Jacob Austin, Anita Gergely, Tim Blyth, Joe Stanton, Kaushik Shivakumar, Aditya Siddhant, Anders Andreassen, Carlos Araya, Nikhil Sethi, Rakesh Shivanna, Steven Hand, Ankur Bapna, Ali Khodaei, Antoine Miech, Garrett Tanzer, Andy Swing, Shantanu Thakoor, Lora Aroyo, Zhufeng Pan, Zachary Nado, Jakub Sygnowski, Stephanie Winkler, Dian Yu, Mohammad Saleh, Loren Maggiore, Yamini Bansal, Xavier Garcia, Mehran 
Kazemi, Piyush Patil, Ishita Dasgupta, Iain Barr, Minh Giang, Thais Kagohara, Ivo Danihelka, Amit Marathe, Vladimir Feinberg, Mohamed Elhawaty, Nimesh Ghelani, Dan Horgan, Helen Miller, Lexi Walker, Richard Tanburn, Mukarram Tariq, Disha Shrivastava, Fei Xia, Qingze Wang, Chung-Cheng Chiu, Zoe Ashwood, Khuslen Baatarsukh, Sina Samangooei, Raphaël Lopez Kaufman, Fred Alcober, Axel Stjerngren, Paul Komarek, Katerina Tsihlas, Anudhyan Boral, Ramona Comanescu, Jeremy Chen, Ruibo Liu, Chris Welty, Dawn Bloxwich, Charlie Chen, Yanhua Sun, Fangxiaoyu Feng, Matthew Mauger, Xerxes Dotiwalla, Vincent Hellendoorn, Michael Sharman, Ivy Zheng, Krishna Haridasan, Gabe Barth-Maron, Craig Swanson, Dominika Rogozińska, Alek Andreev, Paul Kishan Rubenstein, Ruoxin Sang, Dan Hurt, Gamaleldin Elsayed, Renshen Wang, Dave Lacey, Anastasija Ilić, Yao Zhao, Adam Iwanicki, Alejandro Lince, Alexander Chen, Christina Lyu, Carl Lebsack, Jordan Griffith, Meenu Gaba, Paramjit Sandhu, Phil Chen, Anna Koop, Ravi Rajwar, Soheil Hassas Yeganeh, Solomon Chang, Rui Zhu, Soroush Radpour, Elnaz Davoodi, Ving Ian Lei, Yang Xu, Daniel Toyama, Constant Segal, Martin Wicke, Hanzhao Lin, Anna Bulanova, Adrià Puigdomènech Badia, Nemanja Rakićević, Pablo Sprechmann, Angelos Filos, Shaobo Hou, Víctor Campos, Nora Kassner, Devendra Sachan, Meire Fortunato, Chimezie Iwuanyanwu, Vitaly Nikolaev, Balaji Lakshminarayanan, Sadegh Jazayeri, Mani Varadarajan, Chetan Tekur, Doug Fritz, Misha Khalman, David Reitter, Kingshuk Dasgupta, Shourya Sarcar, Tina Ornduff, Javier Snaider, Fantine Huot, Johnson Jia, Rupert Kemp, Nejc Trdin, Anitha Vijayakumar, Lucy Kim, Christof Angermueller, Li Lao, Tianqi Liu, Haibin Zhang, David Engel, Somer Greene, Anaïs White, Jessica Austin, Lilly Taylor, Shereen Ashraf, Dangyi Liu, Maria Georgaki, Irene Cai, Yana Kulizhskaya, Sonam Goenka, Brennan Saeta, Ying Xu, Christian Frank, Dario de Cesare, Brona Robenek, Harry Richardson, Mahmoud Alnahlawi, Christopher Yew, Priya Ponnapalli, Marco 
Tagliasacchi, Alex Korchemniy, Yelin Kim, Dinghua Li, Bill Rosgen, Kyle Levin, Jeremy Wiesner, Praseem Banzal, Praveen Srinivasan, Hongkun Yu, Çağlar Ünlü, David Reid, Zora Tung, Daniel Finchelstein, Ravin Kumar, Andre Elisseeff, Jin Huang, Ming Zhang, Ricardo Aguilar, Mai Giménez, Jiawei Xia, Olivier Dousse, Willi Gierke, Damion Yates, Komal Jalan, Lu Li, Eri Latorre-Chimoto, Duc Dung Nguyen, Ken Durden, Praveen Kallakuri, Yaxin Liu, Matthew Johnson, Tomy Tsai, Alice Talbert, Jasmine Liu, Alexander Neitz, Chen Elkind, Marco Selvi, Mimi Jasarevic, Livio Baldini Soares, Albert Cui, Pidong Wang, Alek Wenjiao Wang, Xinyu Ye, Krystal Kallarackal, Lucia Loher, Hoi Lam, Josef Broder, Dan Holtmann-Rice, Nina Martin, Bramandia Ramadhana, Mrinal Shukla, Sujoy Basu, Abhi Mohan, Nick Fernando, Noah Fiedel, Kim Paterson, Hui Li, Ankush Garg, Jane Park, DongHyun Choi, Diane Wu, Sankalp Singh, Zhishuai Zhang, Amir Globerson, Lily Yu, John Carpenter, Félix de Chaumont Quitry, Carey Radebaugh, Chu-Cheng Lin, Alex Tudor, Prakash Shroff, Drew Garmon, Dayou Du, Neera Vats, Han Lu, Shariq Iqbal, Alex Yakubovich, Nilesh Tripuraneni, James Manyika, Haroon Qureshi, Nan Hua, Christel Ngani, Maria Abi Raad, Hannah Forbes, Jeff Stanway, Mukund Sundararajan, Victor Ungureanu, Colton Bishop, Yunjie Li, Balaji Venkatraman, Bo Li, Chloe Thornton, Salvatore Scellato, Nishesh Gupta, Yicheng Wang, Ian Tenney, Xihui Wu, Ashish Shenoy, Gabriel Carvajal, Diana Gage Wright, Ben Bariach, Zhuyun Xiao, Peter Hawkins, Sid Dalmia, Clement Farabet, Pedro Valenzuela, Quan Yuan, Ananth Agarwal, Mia Chen, Wooyeol Kim, Brice Hulse, Nandita Dukkipati, Adam Paszke, Andrew Bolt, Kiam Choo, Jennifer Beattie, Jennifer Prendki, Harsha Vashisht, Rebeca Santamaria-Fernandez, Luis C. 
Cobo, Jarek Wilkiewicz, David Madras, Ali Elqursh, Grant Uy, Kevin Ramirez, Matt Harvey, Tyler Liechty, Heiga Zen, Jeff Seibert, Clara Huiyi Hu, Andrey Khorlin, Maigo Le, Asaf Aharoni, Megan Li, Lily Wang, Sandeep Kumar, Norman Casagrande, Jay Hoover, Dalia El Badawy, David Soergel, Denis Vnukov, Matt Miecnikowski, Jiri Simsa, Praveen Kumar, Thibault Sellam, Daniel Vlasic, Samira Daruki, Nir Shabat, John Zhang, Guolong Su, Jiageng Zhang, Jeremiah Liu, Yi Sun, Evan Palmer, Alireza Ghaffarkhah, Xi Xiong, Victor Cotruta, Michael Fink, Lucas Dixon, Ashwin Sreevatsa, Adrian Goedeckemeyer, Alek Dimitriev, Mohsen Jafari, Remi Crocker, Nicholas FitzGerald, Aviral Kumar, Sanjay Ghemawat, Ivan Philips, Frederick Liu, Yannie Liang, Rachel Sterneck, Alena Repina, Marcus Wu, Laura Knight, Marin Georgiev, Hyo Lee, Harry Askham, Abhishek Chakladar, Annie Louis, Carl Crous, Hardie Cate, Dessie Petrova, MICHAEL QUINN, Denese Owusu-Afriyie, Achintya Singhal, Nan Wei, Solomon Kim, Damien Vincent, Milad Nasr, Christopher A. Choquette-Choo, Reiko Tojo, Shawn Lu, Diego de Las Casas, Yuchung Cheng, Tolga Bolukbasi, Katherine Lee, Saaber Fatehi, Rajagopal Ananthanarayanan, Miteyan Patel, Charbel Kaed, Jing Li, Shreyas Rammohan Belle, Zhe Chen, Jaclyn Konzelmann, Siim Põder, Roopal Garg, Vinod Koverkathu, Adam Brown, Chris Dyer, Rosanne Liu, Azade Nova, Jun Xu, Alanna Walton, Alicia Parrish, Mark Epstein, Sara McCarthy, Slav Petrov, Demis Hassabis, Koray Kavukcuoglu, Jeffrey Dean, Oriol Vinyals

In this report, we present the latest model of the Gemini family, Gemini 1.5 Pro, a highly compute-efficient multimodal mixture-of-experts model capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio.

Code Generation Retrieval

BjTT: A Large-scale Multimodal Dataset for Traffic Prediction

2 code implementations 8 Mar 2024 Chengyang Zhang, Yong Zhang, Qitan Shao, Jiangtao Feng, Bo Li, Yisheng Lv, Xinglin Piao, BaoCai Yin

The key challenge of the TTG task is how to associate text with the spatial structure of the road network and traffic data for generating traffic situations.

Traffic Prediction

Testing Business Cycle Theories: Evidence from the Great Recession

no code implementations 6 Mar 2024 Bo Li

Empirical business cycle studies using cross-country data usually cannot achieve causal relationships while within-country studies mostly focus on the bust period.

COMMIT: Certifying Robustness of Multi-Sensor Fusion Systems against Semantic Attacks

no code implementations 4 Mar 2024 Zijian Huang, Wenda Chu, Linyi Li, Chejian Xu, Bo Li

In this work, we propose the first robustness certification framework, COMMIT, to certify the robustness of multi-sensor fusion systems against semantic attacks.

Autonomous Vehicles object-detection +2

Perceptive self-supervised learning network for noisy image watermark removal

1 code implementation 4 Mar 2024 Chunwei Tian, Menghua Zheng, Bo Li, Yanning Zhang, Shichao Zhang, David Zhang

Specifically, the mentioned paired watermark images are obtained in a self-supervised way, and paired noisy images (i.e., noisy and reference images) are obtained in a supervised way.

Self-Supervised Learning

Improving Adversarial Energy-Based Model via Diffusion Process

no code implementations 4 Mar 2024 Cong Geng, Tian Han, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Søren Hauberg, Bo Li

Generative models have shown strong generation ability while efficient likelihood estimation is less explored.

Denoising Density Estimation

Differentially Private Synthetic Data via Foundation Model APIs 2: Text

1 code implementation 4 Mar 2024 Chulin Xie, Zinan Lin, Arturs Backurs, Sivakanth Gopi, Da Yu, Huseyin A Inan, Harsha Nori, Haotian Jiang, Huishuai Zhang, Yin Tat Lee, Bo Li, Sergey Yekhanin

Lin et al. (2024) recently introduced the Private Evolution (PE) algorithm to generate DP synthetic images with only API access to diffusion models.

Privacy Preserving

KeNet: Knowledge-enhanced Doc-Label Attention Network for Multi-label text classification

no code implementations 4 Mar 2024 Bo Li, Yuyan Chen, Liang Zeng

It is imperative to additionally acknowledge that the significance of knowledge is substantiated in the realm of MLTC.

Information Retrieval Multi Label Text Classification +4

Boosting Box-supervised Instance Segmentation with Pseudo Depth

no code implementations 2 Mar 2024 Xinyi Yu, Ling Yan, PengTao Jiang, Hao Chen, Bo Li, Lin Yuanbo Wu, Linlin Ou

This innovative approach empowers the network to simultaneously predict masks and depth, enhancing its ability to capture nuanced depth-related information during the instance segmentation process.

Box-supervised Instance Segmentation Depth Estimation +4

HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding

1 code implementation 1 Mar 2024 Zhaorun Chen, Zhuokai Zhao, Hongyin Luo, Huaxiu Yao, Bo Li, Jiawei Zhou

While large vision-language models (LVLMs) have demonstrated impressive capabilities in interpreting multi-modal contexts, they invariably suffer from object hallucinations (OH).

Hallucination Object +1

Tree-Regularized Tabular Embeddings

1 code implementation 1 Mar 2024 Xuan Li, Yun Wang, Bo Li

Tabular neural networks (NNs) have attracted remarkable attention, and recent advances have gradually narrowed the performance gap with respect to tree-based models on many public datasets.

Binary Classification

A Novel Approach to Industrial Defect Generation through Blended Latent Diffusion Model with Online Adaptation

1 code implementation 29 Feb 2024 Hanxi Li, Zhengxun Zhang, Hao Chen, Lin Wu, Bo Li, Deyin Liu, Mingwen Wang

Effectively addressing the challenge of industrial Anomaly Detection (AD) necessitates an ample supply of defective samples, a constraint often hindered by their scarcity in industrial contexts.

Anomaly Detection Image Generation

DART: Depth-Enhanced Accurate and Real-Time Background Matting

no code implementations 24 Feb 2024 Hanxi Li, Guofeng Li, Bo Li, Lin Wu, Yan Cheng

In this paper, we leverage the rich depth information provided by the RGB-Depth (RGB-D) cameras to enhance background matting performance in real-time, dubbed DART.

Bayesian Inference Edge-computing +1

Mitigating Fine-tuning Jailbreak Attack with Backdoor Enhanced Alignment

no code implementations 22 Feb 2024 Jiongxiao Wang, Jiazhao Li, Yiquan Li, Xiangyu Qi, Junjie Hu, Yixuan Li, Patrick McDaniel, Muhao Chen, Bo Li, Chaowei Xiao

Despite the general capabilities of Large Language Models (LLMs) like GPT-4 and Llama-2, these models still require fine-tuning or adaptation with customized data when it comes to meeting the specific business demands and intricacies of tailored use cases.

ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs

1 code implementation 19 Feb 2024 Fengqing Jiang, Zhangchen Xu, Luyao Niu, Zhen Xiang, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran

In this paper, we propose a novel ASCII art-based jailbreak attack and introduce a comprehensive benchmark Vision-in-Text Challenge (ViTC) to evaluate the capabilities of LLMs in recognizing prompts that cannot be solely interpreted by semantics.

Beyond Quantities: Machine Learning-based Characterization of Inequality in Infrastructure Quality Provision in Cities

no code implementations 14 Feb 2024 Bo Li, Ali Mostafavi

While a growing body of literature has recognized the importance of characterizing infrastructure inequality in cities and provided quantified metrics to inform urban development plans, the majority of the existing approaches focus primarily on measuring the quantity of infrastructure, assuming that more infrastructure is better.

Game of Trojans: Adaptive Adversaries Against Output-based Trojaned-Model Detectors

no code implementations 12 Feb 2024 Dinuka Sahabandu, Xiaojun Xu, Arezoo Rajabi, Luyao Niu, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran

We propose and analyze an adaptive adversary that can retrain a Trojaned DNN and is also aware of SOTA output-based Trojaned model detectors.

HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal

1 code implementation 6 Feb 2024 Mantas Mazeika, Long Phan, Xuwang Yin, Andy Zou, Zifan Wang, Norman Mu, Elham Sakhaee, Nathaniel Li, Steven Basart, Bo Li, David Forsyth, Dan Hendrycks

Automated red teaming holds substantial promise for uncovering and mitigating the risks associated with the malicious use of large language models (LLMs), yet the field lacks a standardized evaluation framework to rigorously assess new methods.

C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models

1 code implementation 5 Feb 2024 Mintong Kang, Nezihe Merve Gürel, Ning Yu, Dawn Song, Bo Li

Specifically, we provide conformal risk analysis for RAG models and certify an upper confidence bound of generation risks, which we refer to as conformal generation risk.

Retrieval

Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks

1 code implementation 30 Jan 2024 Andy Zhou, Bo Li, Haohan Wang

Despite advances in AI alignment, language models (LM) remain vulnerable to adversarial attacks or jailbreaking, in which adversaries modify input prompts to induce harmful behavior.

Validating Climate Models with Spherical Convolutional Wasserstein Distance

no code implementations 26 Jan 2024 Robert C. Garrett, Trevor Harris, Bo Li, Zhuo Wang

The validation of global climate models is crucial to ensure the accuracy and efficacy of model output.

GRATH: Gradual Self-Truthifying for Large Language Models

no code implementations 22 Jan 2024 Weixin Chen, Dawn Song, Bo Li

GRATH iteratively refines truthfulness data and updates the model, leading to a gradual improvement in model truthfulness in a self-supervised manner.

Benchmarking Large Multimodal Models against Common Corruptions

1 code implementation 22 Jan 2024 Jiawei Zhang, Tianyu Pang, Chao Du, Yi Ren, Bo Li, Min Lin

This technical report aims to fill a deficiency in the assessment of large multimodal models (LMMs) by specifically examining the self-consistency of their outputs when subjected to common corruptions.

Benchmarking

BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models

1 code implementation 20 Jan 2024 Zhen Xiang, Fengqing Jiang, Zidi Xiong, Bhaskar Ramasubramanian, Radha Poovendran, Bo Li

Moreover, we show that LLMs endowed with stronger reasoning capabilities exhibit higher susceptibility to BadChain, exemplified by a high average attack success rate of 97.0% across the six benchmark tasks on GPT-4.

Backdoor Attack

Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASR

no code implementations 17 Jan 2024 Junwen Bai, Bo Li, Qiujia Li, Tara N. Sainath, Trevor Strohman

Meanwhile, the heterogeneous nature and imbalanced data abundance of different languages may cause performance degradation, leading to asynchronous peak performance for different languages during training, especially on tail ones.

Crafter: Facial Feature Crafting against Inversion-based Identity Theft on Deep Models

no code implementations 14 Jan 2024 Shiming Wang, Zhe Ji, Liyao Xiang, Hao Zhang, Xinbing Wang, Chenghu Zhou, Bo Li

However, such methods cannot defend against adaptive attacks, in which an attacker takes a countermove against a known defence strategy.

Convolutional Neural Network Ensemble Learning for Hyperspectral Imaging-based Blackberry Fruit Ripeness Detection in Uncontrolled Farm Environment

no code implementations 9 Jan 2024 Chollette C. Olisah, Ben Trewhella, Bo Li, Melvyn L. Smith, Benjamin Winstone, E. Charles Whitfield, Felicidad Fernández Fernández, Harriet Duncalfe

To address this engineering application challenge, this paper proposes a novel multi-input convolutional neural network (CNN) ensemble classifier for detecting subtle traits of ripeness in blackberry fruits.

Ensemble Learning

CaMML: Context-Aware Multimodal Learner for Large Models

no code implementations 6 Jan 2024 Yixin Chen, Shuai Zhang, Boran Han, Tong He, Bo Li

In this work, we introduce Context-Aware MultiModal Learner (CaMML), for tuning large multimodal models (LMMs).

Visual Question Answering

VOT: Revolutionizing Speaker Verification with Memory and Attention Mechanisms

no code implementations 28 Dec 2023 Hongyu Wang, Hui Li, Bo Li

Speaker verification judges the similarity of two unknown voices in an open set, where the ideal speaker embedding should condense discriminant information into a compact utterance-level representation that has small intra-speaker distances and large inter-speaker distances. We propose a novel model named Voice Transformer (VOT) for speaker verification.

Speaker Verification

Labels Need Prompts Too: Mask Matching for Natural Language Understanding Tasks

no code implementations 14 Dec 2023 Bo Li, Wei Ye, Quansen Wang, Wen Zhao, Shikun Zhang

Textual label names (descriptions) are typically semantically rich in many natural language understanding (NLU) tasks.

Natural Language Understanding

Decoupling Degradation and Content Processing for Adverse Weather Image Restoration

no code implementations 8 Dec 2023 Xi Wang, Xueyang Fu, Peng-Tao Jiang, Jie Huang, Mi Zhou, Bo Li, Zheng-Jun Zha

The former facilitates channel-dependent degradation removal operation, allowing the network to tailor responses to various adverse weather types; the latter, by integrating Fourier's global properties into channel-independent content features, enhances network capacity for consistent global content reconstruction.

Image Restoration

An explanation for the distribution characteristics of stock returns

no code implementations5 Dec 2023 Bo Li

In this work, we assume that the effects of events or information on prices obey a normal distribution, while financial markets often overreact or underreact to events or information, resulting in non-normal distributions of stock returns.

Efficient Incremental Potential Contact for Actuated Face Simulation

no code implementations3 Dec 2023 Bo Li, Lingchen Yang, Barbara Solenthaler

We present a quasi-static finite element simulator for human face animation.

Revisiting Single Image Reflection Removal In the Wild

1 code implementation29 Nov 2023 Yurui Zhu, Xueyang Fu, Peng-Tao Jiang, Hao Zhang, Qibin Sun, Jinwei Chen, Zheng-Jun Zha, Bo Li

This research focuses on the issue of single-image reflection removal (SIRR) in real-world conditions, examining it from two angles: the collection pipeline of real reflection pairs and the perception of real reflection locations.

Reflection Removal

Panoptic Video Scene Graph Generation

3 code implementations CVPR 2023 Jingkang Yang, Wenxuan Peng, Xiangtai Li, Zujin Guo, Liangyu Chen, Bo Li, Zheng Ma, Kaiyang Zhou, Wayne Zhang, Chen Change Loy, Ziwei Liu

PVSG relates to the existing video scene graph generation (VidSGG) problem, which focuses on temporal interactions between humans and objects grounded with bounding boxes in videos.

Graph Generation Panoptic Scene Graph Generation +5

ChatTraffic: Text-to-Traffic Generation via Diffusion Model

1 code implementation27 Nov 2023 Chengyang Zhang, Yong Zhang, Qitan Shao, Bo Li, Yisheng Lv, Xinglin Piao, BaoCai Yin

The key challenge of the TTG task is how to associate text with the spatial structure of the road network and traffic data for generating traffic situations.

Traffic Prediction

DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer

1 code implementation27 Nov 2023 Junyuan Hong, Jiachen T. Wang, Chenhui Zhang, Zhangheng Li, Bo Li, Zhangyang Wang

To ensure that the prompts do not leak private information, we introduce the first private prompt generation mechanism, by a differentially-private (DP) ensemble of in-context learning with private demonstrations.

In-Context Learning Language Modelling +3

TextGuard: Provable Defense against Backdoor Attacks on Text Classification

1 code implementation19 Nov 2023 Hengzhi Pei, Jinyuan Jia, Wenbo Guo, Bo Li, Dawn Song

In this work, we propose TextGuard, the first provable defense against backdoor attacks on text classification.

Sentence text-classification +1

SparseSpikformer: A Co-Design Framework for Token and Weight Pruning in Spiking Transformer

no code implementations15 Nov 2023 Yue Liu, Shanlin Xiao, Bo Li, Zhiyi Yu

As the third-generation neural network, the Spiking Neural Network (SNN) has the advantages of low power consumption and high energy efficiency, making it suitable for implementation on edge devices.

Geometry-Calibrated DRO: Combating Over-Pessimism with Free Energy Implications

no code implementations8 Nov 2023 Jiashuo Liu, Jiayun Wu, Tianyu Wang, Hao Zou, Bo Li, Peng Cui

Machine learning algorithms minimizing average risk are susceptible to distributional shifts.

OtterHD: A High-Resolution Multi-modality Model

1 code implementation7 Nov 2023 Bo Li, Peiyuan Zhang, Jingkang Yang, Yuanhan Zhang, Fanyi Pu, Ziwei Liu

In this paper, we present OtterHD-8B, an innovative multimodal model evolved from Fuyu-8B, specifically engineered to interpret high-resolution visual inputs with granular precision.

Visual Question Answering

Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications

no code implementations7 Nov 2023 Fengqing Jiang, Zhangchen Xu, Luyao Niu, Boxin Wang, Jinyuan Jia, Bo Li, Radha Poovendran

Successful exploits of the identified vulnerabilities result in the users receiving responses tailored to the intent of a threat initiator.

Code Completion

Invariant-Feature Subspace Recovery: A New Class of Provable Domain Generalization Algorithms

1 code implementation2 Nov 2023 Haoxiang Wang, Gargi Balasubramaniam, Haozhe Si, Bo Li, Han Zhao

First, in the binary classification setup of Rosenfeld et al. (2021), we show that our first algorithm, ISR-Mean, can identify the subspace spanned by invariant features from the first-order moments of the class-conditional distributions, and achieve provable domain generalization with $d_s+1$ training environments.

Binary Classification Domain Generalization +2

IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI

1 code implementation NeurIPS 2023 Bochuan Cao, Changjiang Li, Ting Wang, Jinyuan Jia, Bo Li, Jinghui Chen

IMPRESS is based on the key observation that imperceptible perturbations could lead to a perceptible inconsistency between the original image and the diffusion-reconstructed image, which can be used to devise a new optimization strategy for purifying the image, which may weaken the protection of the original image from unauthorized data usage (e.g., style mimicking, malicious editing).

Image Generation

Bipartite Graph Pre-training for Unsupervised Extractive Summarization with Graph Convolutional Auto-Encoders

1 code implementation29 Oct 2023 Qianren Mao, Shaobo Zhao, Jiarui Li, Xiaolei Gu, Shizhu He, Bo Li, JianXin Li

Pre-trained sentence representations are crucial for identifying significant sentences in unsupervised document extractive summarization.

Extractive Summarization Sentence +2

DiffAttack: Evasion Attacks Against Diffusion-Based Adversarial Purification

1 code implementation NeurIPS 2023 Mintong Kang, Dawn Song, Bo Li

In particular, we propose a deviated-reconstruction loss at intermediate diffusion steps to induce inaccurate density gradient estimation to tackle the problem of vanishing/exploding gradients.

CBD: A Certified Backdoor Detector Based on Local Dominant Probability

1 code implementation NeurIPS 2023 Zhen Xiang, Zidi Xiong, Bo Li

Notably, for backdoor attacks with random perturbation triggers bounded by $\ell_2\leq0.75$ which achieves more than 90% attack success rate, CBD achieves 100% (98%), 100% (84%), 98% (98%), and 72% (40%) empirical (certified) detection true positive rates on the four benchmark datasets GTSRB, SVHN, CIFAR-10, and TinyImageNet, respectively, with low false positive rates.

Backdoor Attack Conformal Prediction

Gradual Domain Adaptation: Theory and Algorithms

1 code implementation20 Oct 2023 Yifei He, Haoxiang Wang, Bo Li, Han Zhao

Unsupervised domain adaptation (UDA) adapts a model from a labeled source domain to an unlabeled target domain in a one-off way.

Unsupervised Domain Adaptation

Effective and Efficient Federated Tree Learning on Hybrid Data

no code implementations18 Oct 2023 Qinbin Li, Chulin Xie, Xiaojun Xu, Xiaoyuan Liu, Ce Zhang, Bo Li, Bingsheng He, Dawn Song

To address this, we propose HybridTree, a novel federated learning approach that enables federated tree learning on hybrid data.

Federated Learning

Exploring Decision-based Black-box Attacks on Face Forgery Detection

no code implementations18 Oct 2023 Zhaoyu Chen, Bo Li, Kaixun Jiang, Shuang Wu, Shouhong Ding, Wenqiang Zhang

Further, the fake faces by our method can pass face forgery detection and face recognition, which exposes the security problems of face forgery detectors.

Face Recognition

RGM: A Robust Generalizable Matching Model

1 code implementation18 Oct 2023 Songyan Zhang, Xinyu Sun, Hao Chen, Bo Li, Chunhua Shen

Finding corresponding pixels within a pair of images is a fundamental computer vision task with various applications.

Optical Flow Estimation

Towards Training-free Open-world Segmentation via Image Prompt Foundation Models

no code implementations17 Oct 2023 Lv Tang, Peng-Tao Jiang, Hao-Ke Xiao, Bo Li

The realm of computer vision has witnessed a paradigm shift with the advent of foundational models, mirroring the transformative influence of large language models in the domain of natural language processing.

Segmentation

Ring-A-Bell! How Reliable are Concept Removal Methods for Diffusion Models?

1 code implementation16 Oct 2023 Yu-Lin Tsai, Chia-Yi Hsu, Chulin Xie, Chih-Hsun Lin, Jia-You Chen, Bo Li, Pin-Yu Chen, Chia-Mu Yu, Chun-Ying Huang

While efforts have been made to mitigate such problems, either by implementing a safety filter at the evaluation stage or by fine-tuning models to eliminate undesirable concepts or styles, the effectiveness of these safety measures in dealing with a wide range of prompts remains largely unexplored.

Unraveling Fundamental Properties of Power System Resilience Curves using Unsupervised Machine Learning

no code implementations16 Oct 2023 Bo Li, Ali Mostafavi

Trapezoidal archetypes explain resilience curves based on (1) the duration of sustained function loss and (2) a constant recovery rate.

LRRU: Long-short Range Recurrent Updating Networks for Depth Completion

no code implementations ICCV 2023 YuFei Wang, Bo Li, Ge Zhang, Qi Liu, Tao Gao, Yuchao Dai

Existing deep learning-based depth completion methods generally employ massive stacked layers to predict the dense depth map from sparse input data.

Depth Completion

InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining

1 code implementation11 Oct 2023 Boxin Wang, Wei Ping, Lawrence McAfee, Peng Xu, Bo Li, Mohammad Shoeybi, Bryan Catanzaro

After instruction tuning on Retro, InstructRetro demonstrates significant improvement over the instruction tuned GPT on a wide range of zero-shot tasks.

Question Answering Reading Comprehension +2

PST: Improving Quantitative Trading via Program Sketch-based Tuning

no code implementations9 Oct 2023 Zhiming Li, Junzhe Jiang, Yushi Cao, Aixin Cui, Bozhi Wu, Bo Li, Yang Liu, Dongning Sun

Particularly, PST first proposes using a novel symbolic program sketch to embed the abstract human expert knowledge of market trends.

Program Synthesis reinforcement-learning

AI-based association analysis for medical imaging using latent-space geometric confounder correction

no code implementations3 Oct 2023 Xianjing Liu, Bo Li, Meike W. Vernooij, Eppo B. Wolvius, Gennady V. Roshchupkin, Esther E. Bron

AI has greatly enhanced medical image analysis, yet its use in epidemiological population imaging studies remains limited due to visualization challenges in non-linear models and lack of confounder control.

RLLTE: Long-Term Evolution Project of Reinforcement Learning

2 code implementations28 Sep 2023 Mingqi Yuan, Zequn Zhang, Yang Xu, Shihao Luo, Bo Li, Xin Jin, Wenjun Zeng

We present RLLTE: a long-term evolution, extremely modular, and open-source framework for reinforcement learning (RL) research and application.

Language Modelling Large Language Model +2

Massive End-to-end Models for Short Search Queries

no code implementations22 Sep 2023 Weiran Wang, Rohit Prabhavalkar, Dongseong Hwang, Qiujia Li, Khe Chai Sim, Bo Li, James Qin, Xingyu Cai, Adam Stooke, Zhong Meng, CJ Zheng, Yanzhang He, Tara Sainath, Pedro Moreno Mengibar

In this work, we investigate two popular end-to-end automatic speech recognition (ASR) models, namely Connectionist Temporal Classification (CTC) and RNN-Transducer (RNN-T), for offline recognition of voice search queries, with up to 2B model parameters.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Visible and NIR Image Fusion Algorithm Based on Information Complementarity

no code implementations19 Sep 2023 Zhuo Li, Bo Li

Second, to generate the initial visible-NIR complementarity weight map, the difference maps of visible and NIR are filtered by the extend-DoG filter.

Self-supervised Multi-view Clustering in Computer Vision: A Survey

no code implementations18 Sep 2023 Jiatai Wang, Zhiwei Xu, Xuewen Yang, Hailong Li, Bo Li, Xuying Meng

However, as contrastive learning continues to evolve within the field of computer vision, self-supervised learning has also made substantial research progress and is progressively becoming dominant in MVC methods.

Clustering Contrastive Learning +3

Zero-Shot Co-salient Object Detection Framework

1 code implementation11 Sep 2023 Haoke Xiao, Lv Tang, Bo Li, Zhiming Luo, Shaozi Li

Despite recent advancements in deep learning models, these models still rely on training with well-annotated CoSOD datasets.

Co-Salient Object Detection Object +2

DiffSmooth: Certifiably Robust Learning via Diffusion Models and Local Smoothing

1 code implementation28 Aug 2023 Jiawei Zhang, Zhongzhu Chen, huan zhang, Chaowei Xiao, Bo Li

Diffusion models have been leveraged to perform adversarial purification and thus provide both empirical and certified robustness for a standard model.

Denoising

Classification Committee for Active Deep Object Detection

no code implementations16 Aug 2023 Lei Zhao, Bo Li, Xingxing Wei

The role of the classification committee is to select the most informative images according to their uncertainty values from the view of classification, which is expected to focus more on the discrepancy and representativeness of instances.

Active Learning Classification +3

Target before Shooting: Accurate Anomaly Detection and Localization under One Millisecond via Cascade Patch Retrieval

1 code implementation13 Aug 2023 Hanxi Li, Jianfei Hu, Bo Li, Hao Chen, Yongbin Zheng, Chunhua Shen

In this framework, the anomaly detection problem is solved via a cascade patch retrieval procedure that retrieves the nearest neighbors for each test image patch in a coarse-to-fine fashion.

Supervised Anomaly Detection

LoRA-FA: Memory-efficient Low-rank Adaptation for Large Language Models Fine-tuning

no code implementations7 Aug 2023 Longteng Zhang, Lin Zhang, Shaohuai Shi, Xiaowen Chu, Bo Li

The low-rank adaptation (LoRA) method can largely reduce the number of trainable parameters for fine-tuning large language models (LLMs); however, it still requires expensive activation memory to update low-rank weights.

Eva: A General Vectorized Approximation Framework for Second-order Optimization

no code implementations4 Aug 2023 Lin Zhang, Shaohuai Shi, Bo Li

Second-order optimization algorithms exhibit excellent convergence properties for training deep learning models, but often incur significant computation and memory overheads.

Benchmarking and Analyzing Generative Data for Visual Recognition

no code implementations25 Jul 2023 Bo Li, Haotian Liu, Liangyu Chen, Yong Jae Lee, Chunyuan Li, Ziwei Liu

Advancements in large pre-trained generative models have expanded their potential as effective data generators in visual recognition.

Benchmarking Retrieval

Structured Network Pruning by Measuring Filter-wise Interactions

no code implementations3 Jul 2023 Wenting Tang, Xingxing Wei, Bo Li

Utilizing this new redundancy criterion, we propose a structured network pruning approach SNPFI (Structured Network Pruning by measuring Filter-wise Interaction).

Image Classification Network Pruning

Learning to Pan-sharpening with Memories of Spatial Details

1 code implementation28 Jun 2023 Maoxun Yuan, Tianyi Zhao, Bo Li, Xingxing Wei

To address this issue, in this paper we observe that the spatial details from PAN images are mainly high-frequency cues, i.e., the edges reflect the contour of input PAN images.

FunQA: Towards Surprising Video Comprehension

1 code implementation26 Jun 2023 Binzhu Xie, Sicheng Zhang, Zitang Zhou, Bo Li, Yuanhan Zhang, Jack Hessel, Jingkang Yang, Ziwei Liu

Surprising videos, such as funny clips, creative performances, or visual illusions, attract significant attention.

Question Answering Text Generation +3

Synthetic data shuffling accelerates the convergence of federated learning under data heterogeneity

1 code implementation23 Jun 2023 Bo Li, Yasin Esfandiari, Mikkel N. Schmidt, Tommy S. Alstrøm, Sebastian U. Stich

In this paper, we establish a precise and quantifiable correspondence between data heterogeneity and parameters in the convergence rate when a fraction of data is shuffled across clients.

Federated Learning

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

no code implementations NeurIPS 2023 Boxin Wang, Weixin Chen, Hengzhi Pei, Chulin Xie, Mintong Kang, Chenhui Zhang, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, Sang T. Truong, Simran Arora, Mantas Mazeika, Dan Hendrycks, Zinan Lin, Yu Cheng, Sanmi Koyejo, Dawn Song, Bo Li

Yet, while the literature on the trustworthiness of GPT models remains limited, practitioners have proposed employing capable GPT models for sensitive applications such as healthcare and finance -- where mistakes can be costly.

Adversarial Robustness Ethics +1

Prior-knowledge-informed deep learning for lacune detection and quantification using multi-site brain MRI

no code implementations18 Jun 2023 Bo Li, Jeroen de Bresser, Wiro Niessen, Matthias Van Osch, Wiesje M. van der Flier, Geert Jan Biessels, Meike W. Vernooij, Esther Bron

Lacunes of presumed vascular origin, also referred to as lacunar infarcts, are important to assess cerebral small vessel disease and cognitive diseases such as dementia.

Evaluation and Optimization of Gradient Compression for Distributed Deep Learning

1 code implementation15 Jun 2023 Lin Zhang, Longteng Zhang, Shaohuai Shi, Xiaowen Chu, Bo Li

To accelerate distributed training, many gradient compression methods have been proposed to alleviate the communication bottleneck in synchronous stochastic gradient descent (S-SGD), but their efficacy in real-world applications still remains unclear.

Quantization

How to Estimate Model Transferability of Pre-Trained Speech Models?

1 code implementation1 Jun 2023 Zih-Ching Chen, Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Shuo-Yiin Chang, Rohit Prabhavalkar, Hung-Yi Lee, Tara N. Sainath

In this work, we introduce a "score-based assessment" framework for estimating the transferability of pre-trained speech models (PSMs) for fine-tuning target tasks.

Competing for Shareable Arms in Multi-Player Multi-Armed Bandits

1 code implementation30 May 2023 Renzhe Xu, Haotian Wang, Xingxuan Zhang, Bo Li, Peng Cui

In reality, agents often have to learn and maximize the rewards of the resources at the same time.

Multi-Armed Bandits

UMD: Unsupervised Model Detection for X2X Backdoor Attacks

no code implementations29 May 2023 Zhen Xiang, Zidi Xiong, Bo Li

Backdoor (Trojan) attack is a common threat to deep neural networks, where samples from one or more source classes embedded with a backdoor trigger will be misclassified to adversarial target classes.

On the Tool Manipulation Capability of Open-source Large Language Models

1 code implementation25 May 2023 Qiantong Xu, Fenglu Hong, Bo Li, Changran Hu, Zhengyu Chen, Jian Zhang

In this paper, we ask whether we can enhance open-source LLMs to be competitive with leading closed LLM APIs in tool manipulation, with a practical amount of human supervision.

GrowSP: Unsupervised Semantic Segmentation of 3D Point Clouds

1 code implementation CVPR 2023 Zihui Zhang, Bo Yang, Bing Wang, Bo Li

Our method consists of three major components, 1) the feature extractor to learn per-point features from input point clouds, 2) the superpoint constructor to progressively grow the sizes of superpoints, and 3) the semantic primitive clustering module to group superpoints into semantic elements for the final semantic segmentation.

3D Semantic Segmentation Segmentation +1

Mixture-of-Expert Conformer for Streaming Multilingual ASR

no code implementations25 May 2023 Ke Hu, Bo Li, Tara N. Sainath, Yu Zhang, Francoise Beaufays

We evaluate the proposed model on a set of 12 languages, and achieve an average 11.9% relative improvement in WER over the baseline.

Automatic Speech Recognition speech-recognition +1

Reconstructive Neuron Pruning for Backdoor Defense

1 code implementation24 May 2023 Yige Li, Xixiang Lyu, Xingjun Ma, Nodens Koren, Lingjuan Lyu, Bo Li, Yu-Gang Jiang

Specifically, RNP first unlearns the neurons by maximizing the model's error on a small subset of clean samples and then recovers the neurons by minimizing the model's error on the same data.

backdoor defense

Re-thinking Data Availablity Attacks Against Deep Neural Networks

no code implementations18 May 2023 Bin Fang, Bo Li, Shuang Wu, Ran Yi, Shouhong Ding, Lizhuang Ma

The unauthorized use of personal data for commercial purposes and the clandestine acquisition of private data for training machine learning models continue to raise concerns.

Towards Generalizable Data Protection With Transferable Unlearnable Examples

no code implementations18 May 2023 Bin Fang, Bo Li, Shuang Wu, Tianyi Zheng, Shouhong Ding, Ran Yi, Lizhuang Ma

One of the crucial factors contributing to this success has been the access to an abundance of high-quality data for constructing machine learning models.

Otter: A Multi-Modal Model with In-Context Instruction Tuning

1 code implementation5 May 2023 Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Jingkang Yang, Ziwei Liu

Large language models (LLMs) have demonstrated significant universal capabilities as few/zero-shot learners in various tasks due to their pre-training on vast amounts of text data, as exemplified by GPT-3, which was boosted to InstructGPT and ChatGPT, effectively following natural language instructions to accomplish real-world tasks.

In-Context Learning Instruction Following +2

Evaluating ChatGPT's Information Extraction Capabilities: An Assessment of Performance, Explainability, Calibration, and Faithfulness

1 code implementation23 Apr 2023 Bo Li, Gexiang Fang, Yang Yang, Quansen Wang, Wei Ye, Wen Zhao, Shikun Zhang

The capability of Large Language Models (LLMs) like ChatGPT to comprehend user intent and provide reasonable responses has made them extremely popular lately.

Fast vehicle detection algorithm based on lightweight YOLO7-tiny

no code implementations12 Apr 2023 Bo Li, Yihua Chen, Hao Xu, Fei Zhong

The swift and precise detection of vehicles plays a significant role in intelligent transportation systems.

Fast Vehicle Detection

Can SAM Segment Anything? When SAM Meets Camouflaged Object Detection

1 code implementation10 Apr 2023 Lv Tang, Haoke Xiao, Bo Li

In this study, we try to ask if SAM can address the COD task and evaluate the performance of SAM on the COD benchmark by employing maximum segmentation evaluation and camouflage location evaluation.

Object object-detection +3

Predictive Heterogeneity: Measures and Applications

no code implementations1 Apr 2023 Jiashuo Liu, Jiayun Wu, Bo Li, Peng Cui

As an intrinsic and fundamental property of big data, data heterogeneity exists in a variety of real-world applications, such as precision medicine, autonomous driving, financial applications, etc.

Autonomous Driving Crop Yield Prediction +3

Invertible Convolution with Symmetric Paddings

1 code implementation30 Mar 2023 Bo Li

We show that symmetrically padded convolution can be analytically inverted via DFT.

Efficient Decision-based Black-box Patch Attacks on Video Recognition

no code implementations ICCV 2023 Kaixun Jiang, Zhaoyu Chen, Hao Huang, Jiafeng Wang, Dingkang Yang, Bo Li, Yan Wang, Wenqiang Zhang

First, STDE introduces target videos as patch textures and only adds patches on keyframes that are adaptively selected by temporal difference.

Video Recognition

Graph Transformer GANs for Graph-Constrained House Generation

no code implementations CVPR 2023 Hao Tang, Zhenyu Zhang, Humphrey Shi, Bo Li, Ling Shao, Nicu Sebe, Radu Timofte, Luc van Gool

We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations in an end-to-end fashion for the challenging graph-constrained house generation task.

Generative Adversarial Network House Generation +1

TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets

3 code implementations CVPR 2023 Weixin Chen, Dawn Song, Bo Li

To answer these questions, we propose an effective Trojan attack against diffusion models, TrojDiff, which optimizes the Trojan diffusion and generative processes during training.

Image Generation

DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining

1 code implementation24 Feb 2023 Lin Zhang, Shaohuai Shi, Xiaowen Chu, Wei Wang, Bo Li, Chengjian Liu

Communication scheduling has been shown to be effective in accelerating distributed training, which enables all-reduce communications to be overlapped with backpropagation computations.

Scheduling

Pose-Controllable 3D Facial Animation Synthesis using Hierarchical Audio-Vertex Attention

no code implementations24 Feb 2023 Bin Liu, Xiaolin Wei, Bo Li, Junjie Cao, Yu-Kun Lai

In this paper, a novel pose-controllable 3D facial animation synthesis method is proposed by utilizing hierarchical audio-vertex attention.

Attribute Face Model

UML: A Universal Monolingual Output Layer for Multilingual ASR

no code implementations22 Feb 2023 Chao Zhang, Bo Li, Tara N. Sainath, Trevor Strohman, Shuo-Yiin Chang

Consequently, the UML enables switching the interpretation of each output node depending on the language of the input speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Delving into the Adversarial Robustness of Federated Learning

no code implementations19 Feb 2023 Jie Zhang, Bo Li, Chen Chen, Lingjuan Lyu, Shuang Wu, Shouhong Ding, Chao Wu

In this work, we propose a novel algorithm called Decision Boundary based Federated Adversarial Training (DBFAT), which consists of two components (local re-weighting and global regularization) to improve both accuracy and robustness of FL systems.

Adversarial Robustness Federated Learning

JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition

no code implementations16 Feb 2023 Zhong Meng, Weiran Wang, Rohit Prabhavalkar, Tara N. Sainath, Tongzhou Chen, Ehsan Variani, Yu Zhang, Bo Li, Andrew Rosenberg, Bhuvana Ramabhadran

We propose JEIT, a joint end-to-end (E2E) model and internal language model (ILM) training method to inject large-scale unpaired text into ILM during E2E training which improves rare-word speech recognition.

Language Modelling speech-recognition +1

PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees

no code implementations13 Feb 2023 Chulin Xie, De-An Huang, Wenda Chu, Daguang Xu, Chaowei Xiao, Bo Li, Anima Anandkumar

In this paper, we propose PerAda, a parameter-efficient pFL framework that reduces communication and computational costs and exhibits superior generalization performance, especially under test-time distribution shifts.

Generalization Bounds Knowledge Distillation +2

3D Colored Shape Reconstruction from a Single RGB Image through Diffusion

no code implementations11 Feb 2023 Bo Li, Xiaolin Wei, Fengwei Chen, Bin Liu

In the shape prediction module, the reference RGB image is first encoded into a high-level shape feature, and then the shape feature is utilized as a condition to predict the reverse geometric noise in the diffusion model.

3D Reconstruction 3D Shape Generation +1

Interpolation for Robust Learning: Data Augmentation on Wasserstein Geodesics

no code implementations4 Feb 2023 Jiacheng Zhu, JieLin Qiu, Aritra Guha, Zhuolin Yang, XuanLong Nguyen, Bo Li, Ding Zhao

Our work provides a new perspective of model robustness through the lens of Wasserstein geodesic-based interpolation with a practical off-the-shelf strategy that can be combined with existing robust training methods.

Data Augmentation

Defensive ML: Defending Architectural Side-channels with Adversarial Obfuscation

no code implementations3 Feb 2023 Hyoungwook Nam, Raghavendra Pradyumna Pothukuchi, Bo Li, Nam Sung Kim, Josep Torrellas

To address this problem, this paper explores using Adversarial Machine Learning (AML) methods as a defense at the computer architecture layer to obfuscate side channels.

Computer Security

Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning

1 code implementation26 Jan 2023 Mingqi Yuan, Bo Li, Xin Jin, Wenjun Zeng

We present AIRS: Automatic Intrinsic Reward Shaping that intelligently and adaptively provides high-quality intrinsic rewards to enhance exploration in reinforcement learning (RL).

Benchmarking reinforcement-learning +1

From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition

no code implementations19 Jan 2023 Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Rohit Prabhavalkar, Tara N. Sainath, Trevor Strohman

In this work, we propose a new parameter-efficient learning framework based on neural model reprogramming for cross-lingual speech recognition, which can re-purpose well-trained English automatic speech recognition (ASR) models to recognize the other languages.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Proportional Fairness in Obnoxious Facility Location

no code implementations11 Jan 2023 Haris Aziz, Alexander Lam, Bo Li, Fahimeh Ramezani, Toby Walsh

On the other hand, in the randomized setting, we identify proportionally fair and strategyproof mechanisms that give an expected welfare within a constant factor of the optimal welfare.

Fairness

A Bertrand duopoly game with differentiated products reconsidered

no code implementations3 Jan 2023 Xiaoliang Li, Bo Li

In this paper, we explore a dynamic Bertrand duopoly game with differentiated products, where firms are boundedly rational and consumers are assumed to possess an underlying CES utility function.

PHA: Patch-Wise High-Frequency Augmentation for Transformer-Based Person Re-Identification

no code implementations CVPR 2023 Guiwei Zhang, Yongfei Zhang, Tianyu Zhang, Bo Li, ShiLiang Pu

Although recent studies empirically show that injecting Convolutional Neural Networks (CNNs) into Vision Transformers (ViTs) can improve the performance of person re-identification, the rationale behind it remains elusive.

Person Re-Identification

AREA: Adaptive Reweighting via Effective Area for Long-Tailed Classification

1 code implementation ICCV 2023 Xiaohua Chen, Yucan Zhou, Dayan Wu, Chule Yang, Bo Li, QinGhua Hu, Weiping Wang

Consequently, we estimate the size of the spanned space for each category, namely the effective area, by analyzing the distribution of its samples in detail.

Reviewing Labels: Label Graph Network with Top-k Prediction Set for Relation Extraction

no code implementations29 Dec 2022 Bo Li, Wei Ye, Jinglei Zhang, Shikun Zhang

Specifically, for a given sample, we build a label graph to review candidate labels in the Top-k prediction set and learn the connections between them.

Relation Relation Extraction

Sequence Generation with Label Augmentation for Relation Extraction

1 code implementation29 Dec 2022 Bo Li, Dingyao Yu, Wei Ye, Jinglei Zhang, Shikun Zhang

Sequence generation demonstrates promising performance in recent information extraction efforts, by incorporating large-scale pre-trained Seq2Seq models.

Relation Relation Extraction

EDoG: Adversarial Edge Detection For Graph Neural Networks

no code implementations27 Dec 2022 Xiaojun Xu, Yue Yu, Hanzhang Wang, Alok Lal, Carl A. Gunter, Bo Li

In this paper, we propose a general adversarial edge detection pipeline EDoG without requiring knowledge of the attack strategies based on graph generation.

Edge Detection Graph Generation +2

Forecasting West Nile Virus with Graph Neural Networks: Harnessing Spatial Dependence in Irregularly Sampled Geospatial Data

no code implementations21 Dec 2022 Adam Tonks, Trevor Harris, Bo Li, William Brown, Rebecca Smith

Machine learning methods have seen increased application to geospatial environmental problems, such as precipitation nowcasting, haze forecasting, and crop yield prediction.

Crop Yield Prediction regression

On the effectiveness of partial variance reduction in federated learning with heterogeneous data

2 code implementations CVPR 2023 Bo Li, Mikkel N. Schmidt, Tommy S. Alstrøm, Sebastian U. Stich

In this paper, we first revisit the widely used FedAvg algorithm in a deep neural network to understand how data heterogeneity influences the gradient updates across the neural network layers.

Federated Learning

Rethinking Disparity: A Depth Range Free Multi-View Stereo Based on Disparity

1 code implementation 30 Nov 2022 Qingsong Yan, Qiang Wang, Kaiyong Zhao, Bo Li, Xiaowen Chu, Fei Deng

Existing learning-based multi-view stereo (MVS) methods rely on the depth range to build the 3D cost volume and may fail when the range is too large or unreliable.

Logic and Commonsense-Guided Temporal Knowledge Graph Completion

1 code implementation 30 Nov 2022 Guanglin Niu, Bo Li

To address these challenges, we propose a Logic and Commonsense-Guided Embedding model (LCGE) to jointly learn the time-sensitive representation involving timeliness and causality of events, together with the time-independent representation of events from the perspective of commonsense.

Causal Inference Knowledge Graph Completion +1

Confounder Balancing for Instrumental Variable Regression with Latent Variable

no code implementations 18 Nov 2022 Anpeng Wu, Kun Kuang, Ruoxuan Xiong, Bo Li, Fei Wu

This paper studies the confounding effects from the unmeasured confounders and the imbalance of observed confounders in IV regression and aims at unbiased causal effect estimation.

regression valid

AnimeRun: 2D Animation Visual Correspondence from Open Source 3D Movies

1 code implementation 10 Nov 2022 Li SiYao, Yuhang Li, Bo Li, Chao Dong, Ziwei Liu, Chen Change Loy

Existing correspondence datasets for two-dimensional (2D) cartoon suffer from simple frame composition and monotonic movements, making them insufficient to simulate real animations.

Optical Flow Estimation

HFedMS: Heterogeneous Federated Learning with Memorable Data Semantics in Industrial Metaverse

1 code implementation 7 Nov 2022 Shenglai Zeng, Zonghang Li, Hongfang Yu, Zhihao Zhang, Long Luo, Bo Li, Dusit Niyato

Federated Learning (FL), as a rapidly evolving privacy-preserving collaborative machine learning paradigm, is a promising approach to enable edge intelligence in the emerging Industrial Metaverse.

Federated Learning Privacy Preserving

Resource-Efficient Transfer Learning From Speech Foundation Model Using Hierarchical Feature Fusion

no code implementations 4 Nov 2022 Zhouyuan Huo, Khe Chai Sim, Bo Li, Dongseong Hwang, Tara N. Sainath, Trevor Strohman

Experimental results show that the proposed method achieves better performance on the speech recognition task than existing algorithms, with fewer trainable parameters, lower computational memory cost, and faster training.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Fairness in Federated Learning via Core-Stability

no code implementations 3 Nov 2022 Bhaskar Ray Chaudhury, Linyi Li, Mintong Kang, Bo Li, Ruta Mehta

Nonetheless, the heterogeneous nature of distributed data makes it challenging to define and ensure fairness among local agents.

Decision Making Fairness +1

A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition

no code implementations 2 Nov 2022 Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Tara N. Sainath, Sabato Marco Siniscalchi, Chin-Hui Lee

We propose a quantum kernel learning (QKL) framework to address the inherent data sparsity issues often encountered in training large-scale acoustic models in low-resource scenarios.

Spoken Command Recognition

Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems

no code implementations 1 Nov 2022 Shaan Bijwadia, Shuo-Yiin Chang, Bo Li, Tara Sainath, Chao Zhang, Yanzhang He

In this work, we propose a method to jointly train the ASR and EP tasks in a single end-to-end (E2E) multitask model, improving EP quality by optionally leveraging information from the ASR audio encoder.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

DensePure: Understanding Diffusion Models towards Adversarial Robustness

no code implementations 1 Nov 2022 Chaowei Xiao, Zhongzhu Chen, Kun Jin, Jiongxiao Wang, Weili Nie, Mingyan Liu, Anima Anandkumar, Bo Li, Dawn Song

By using the highest density point in the conditional distribution as the reversed sample, we identify the robust region of a given instance under the diffusion model's reverse process.

Adversarial Robustness Denoising

Shape Matters: Deformable Patch Attack

1 code implementation European Conference on Computer Vision 2022 Zhaoyu Chen, Bo Li, Shuang Wu, Jianghe Xu, Shouhong Ding, Wenqiang Zhang

Though deep neural networks (DNNs) have demonstrated excellent performance in computer vision, they are susceptible and vulnerable to carefully crafted adversarial examples which can mislead DNNs to incorrect outputs.

CU-Net: LiDAR Depth-Only Completion With Coupled U-Net

1 code implementation 26 Oct 2022 YuFei Wang, Yuchao Dai, Qi Liu, Peng Yang, Jiadai Sun, Bo Li

We find that existing depth-only methods can obtain satisfactory results in the areas where the measurement points are almost accurate and evenly distributed (denoted as normal areas), while the performance is limited in the areas where the foreground and background points are overlapped due to occlusion (denoted as overlap areas) and the areas where there are no measurement points around (denoted as blank areas) since the methods have no reliable input information in these areas.

LOT: Layer-wise Orthogonal Training on Improving $\ell_2$ Certified Robustness

1 code implementation 20 Oct 2022 Xiaojun Xu, Linyi Li, Bo Li

On the other hand, as existing works show that semi-supervised training helps improve empirical robustness, we aim to bridge the gap and prove that semi-supervised learning also improves the certified robustness of Lipschitz-bounded models.

Adversarial Robustness

Handling Label Uncertainty for Camera Incremental Person Re-Identification

no code implementations 17 Oct 2022 Zexian Yang, Dayan Wu, Wanqian Zhang, Bo Li, Weiping Wang

Specifically, new data collected from new cameras is likely to contain an unknown proportion of identities seen before.

Incremental Learning Person Re-Identification

Product Ranking for Revenue Maximization with Multiple Purchases

1 code implementation 15 Oct 2022 Renzhe Xu, Xingxuan Zhang, Bo Li, Yafeng Zhang, Xiaolong Chen, Peng Cui

In this paper, we assume that each consumer can purchase multiple products at will.

OpenOOD: Benchmarking Generalized Out-of-Distribution Detection

3 code implementations 13 Oct 2022 Jingkang Yang, Pengyun Wang, Dejian Zou, Zitang Zhou, Kunyuan Ding, Wenxuan Peng, Haoqi Wang, Guangyao Chen, Bo Li, Yiyou Sun, Xuefeng Du, Kaiyang Zhou, Wayne Zhang, Dan Hendrycks, Yixuan Li, Ziwei Liu

Out-of-distribution (OOD) detection is vital to safety-critical machine learning applications and has thus been extensively studied, with a plethora of methods developed in the literature.

Anomaly Detection Benchmarking +3

JOIST: A Joint Speech and Text Streaming Model For ASR

no code implementations 13 Oct 2022 Tara N. Sainath, Rohit Prabhavalkar, Ankur Bapna, Yu Zhang, Zhouyuan Huo, Zhehuai Chen, Bo Li, Weiran Wang, Trevor Strohman

In addition, we explore JOIST using a streaming E2E model and an order of magnitude more data, both of which are novel compared to previous work.

Feature Reconstruction Attacks and Countermeasures of DNN training in Vertical Federated Learning

no code implementations 13 Oct 2022 Peng Ye, Zhifeng Jiang, Wei Wang, Bo Li, Baochun Li

To address this problem, we develop a novel feature protection scheme against the reconstruction attack that effectively misleads the search to some pre-specified random values.

Reconstruction Attack Vertical Federated Learning

Improving Long-tailed Object Detection with Image-Level Supervision by Multi-Task Collaborative Learning

1 code implementation 11 Oct 2022 Bo Li, Yongqiang Yao, Jingru Tan, Xin Lu, Fengwei Yu, Ye Luo, Jianwei Lu

Specifically, our framework comprises an object detection task (consisting of an instance-classification task and a localization task) and an image-classification task, which together utilize the two types of supervision.

Classification Contrastive Learning +4

Semantics-Consistent Cross-domain Summarization via Optimal Transport Alignment

no code implementations 10 Oct 2022 JieLin Qiu, Jiacheng Zhu, Mengdi Xu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Bo Li, Ding Zhao, Hailin Jin

Multimedia summarization with multimodal output (MSMO) is a recently explored application in language grounding.

Towards Stable Co-saliency Detection and Object Co-segmentation

no code implementations 25 Sep 2022 Bo Li, Lv Tang, Senyun Kuang, Mofei Song, Shouhong Ding

In this paper, we present a novel model for simultaneous stable co-saliency detection (CoSOD) and object co-segmentation (CoSEG).

Object Saliency Detection +1

Attributed Network Embedding Model for Exposing COVID-19 Spread Trajectory Archetypes

no code implementations 20 Sep 2022 Junwei Ma, Bo Li, Qingchun Li, Chao Fan, Ali Mostafavi

To this end, this study creates a network embedding model capturing cross-county visitation networks, as well as heterogeneous features to uncover clusters of counties in the United States based on their pandemic spread transmission trajectories.

Network Embedding

Rewarding Episodic Visitation Discrepancy for Exploration in Reinforcement Learning

no code implementations 19 Sep 2022 Mingqi Yuan, Bo Li, Xin Jin, Wenjun Zeng

Exploration is critical for deep reinforcement learning in complex environments with high-dimensional observations and sparse rewards.

Atari Games Benchmarking +3

Trustworthy Reinforcement Learning Against Intrinsic Vulnerabilities: Robustness, Safety, and Generalizability

no code implementations 16 Sep 2022 Mengdi Xu, Zuxin Liu, Peide Huang, Wenhao Ding, Zhepeng Cen, Bo Li, Ding Zhao

A trustworthy reinforcement learning algorithm should be competent in solving challenging real-world problems, including robustly handling uncertainties, satisfying safety constraints to avoid catastrophic failures, and generalizing to unseen scenarios during deployments.

reinforcement-learning Reinforcement Learning (RL)

Graph Contrastive Learning with Personalized Augmentation

no code implementations 14 Sep 2022 Xin Zhang, Qiaoyu Tan, Xiao Huang, Bo Li

Thus, blindly augmenting all graphs without considering their individual characteristics may undermine the performance of GCL methods. To deal with this, we propose the first principled framework, termed Graph contrastive learning with Personalized Augmentation (GPA), to advance conventional GCL by allowing each graph to choose its own suitable augmentation operations. In essence, GPA infers tailored augmentation strategies for each graph based on its topology and node attributes via a learnable augmentation selector, which is a plug-and-play module and can be effectively trained with downstream GCL models end-to-end.

Contrastive Learning Data Augmentation

Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification

no code implementations 13 Sep 2022 Chao Zhang, Bo Li, Tara Sainath, Trevor Strohman, Sepand Mavandadi, Shuo-Yiin Chang, Parisa Haghani

Language identification is critical for many downstream tasks in automatic speech recognition (ASR), and is beneficial to integrate into multilingual end-to-end ASR as an additional task.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

CARE: Certifiably Robust Learning with Reasoning via Variational Inference

1 code implementation 12 Sep 2022 Jiawei Zhang, Linyi Li, Ce Zhang, Bo Li

In particular, we propose a certifiably robust learning with reasoning pipeline (CARE), which consists of a learning component and a reasoning component.

Variational Inference

Unraveling the Connections between Privacy and Certified Robustness in Federated Learning Against Poisoning Attacks

no code implementations 8 Sep 2022 Chulin Xie, Yunhui Long, Pin-Yu Chen, Qinbin Li, Arash Nourian, Sanmi Koyejo, Bo Li

We then provide two robustness certification criteria: certified prediction and certified attack inefficacy for DPFL on both user and instance levels.

Federated Learning

Privacy of Autonomous Vehicles: Risks, Protection Methods, and Future Directions

no code implementations 8 Sep 2022 Chulin Xie, Zhong Cao, Yunhui Long, Diange Yang, Ding Zhao, Bo Li

However, training AVs usually requires a large amount of training data collected from different driving environments (e.g., cities) as well as different types of personal information (e.g., working hours and routes).

Autonomous Vehicles

Synergistic Redundancy: Towards Verifiable Safety for Autonomous Vehicles

no code implementations 4 Sep 2022 Ayoosh Bansal, Simon Yu, Hunmin Kim, Bo Li, Naira Hovakimyan, Marco Caccamo, Lui Sha

The synergistic safety layer uses only verifiable and logically analyzable software to fulfill its tasks.

Autonomous Driving

Federated Learning with Label Distribution Skew via Logits Calibration

2 code implementations 1 Sep 2022 Jie Zhang, Zhiqi Li, Bo Li, Jianghe Xu, Shuang Wu, Shouhong Ding, Chao Wu

Extensive experiments on federated datasets and real-world datasets demonstrate that FedLC leads to a more accurate global model and much improved performance.

Federated Learning

Verifiable Obstacle Detection

1 code implementation 30 Aug 2022 Ayoosh Bansal, Hunmin Kim, Simon Yu, Bo Li, Naira Hovakimyan, Marco Caccamo, Lui Sha

Perception of obstacles remains a critical safety concern for autonomous vehicles.

Autonomous Driving

Turn-Taking Prediction for Natural Conversational Speech

no code implementations 29 Aug 2022 Shuo-Yiin Chang, Bo Li, Tara N. Sainath, Chao Zhang, Trevor Strohman, Qiao Liang, Yanzhang He

This makes speech recognition on conversational speech, including speech containing multiple queries, a challenging task.

speech-recognition Speech Recognition
