no code implementations • NeurIPS 2010 • Jun Zhu, Li-Jia Li, Li Fei-Fei, Eric P. Xing
This paper presents a joint max-margin and max-likelihood learning method for upstream scene understanding models, in which latent topic discovery and prediction model estimation are closely coupled and well-balanced.
no code implementations • NeurIPS 2010 • Li-Jia Li, Hao Su, Li Fei-Fei, Eric P. Xing
Robust low-level image features have been proven to be effective representations for a variety of visual recognition tasks such as object recognition and scene classification; but pixels, or even local image patches, carry little semantic meaning.
no code implementations • CVPR 2014 • Kevin Tang, Armand Joulin, Li-Jia Li, Li Fei-Fei
In this paper, we tackle the problem of co-localization in real-world images.
no code implementations • 6 Jul 2014 • Xiangnan Kong, Zhaoming Wu, Li-Jia Li, Ruofei Zhang, Philip S. Yu, Hang Wu, Wei Fan
Unlike prior works, our method can effectively and efficiently consider missing labels and label correlations simultaneously, and it is highly scalable, with time complexity linear in the size of the data.
no code implementations • 21 Nov 2014 • Can Xu, Suleyman Cetintas, Kuang-Chih Lee, Li-Jia Li
Images have become one of the most popular types of media through which users convey their emotions within online social networks.
3 code implementations • 10 Feb 2015 • Sachin Sudhakar Farfade, Mohammad Saberian, Li-Jia Li
In this paper we propose Deep Dense Face Detector (DDFD), a method that does not require pose/landmark annotation and is able to detect faces in a wide range of orientations using a single model based on deep convolutional neural networks.
2 code implementations • 5 Mar 2015 • Bart Thomee, David A. Shamma, Gerald Friedland, Benjamin Elizalde, Karl Ni, Douglas Poland, Damian Borth, Li-Jia Li
We present the Yahoo Flickr Creative Commons 100 Million Dataset (YFCC100M), the largest public multimedia collection that has ever been released.
no code implementations • CVPR 2015 • Justin Johnson, Ranjay Krishna, Michael Stark, Li-Jia Li, David Shamma, Michael Bernstein, Li Fei-Fei
We introduce a novel dataset of 5,000 human-generated scene graphs grounded to images and use this dataset to evaluate our method for image retrieval.
no code implementations • CVPR 2015 • Olga Russakovsky, Li-Jia Li, Li Fei-Fei
This paper brings together the latest advancements in object detection and in crowd engineering into a principled framework for accurately and efficiently localizing objects in images.
1 code implementation • 23 Feb 2016 • Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David A. Shamma, Michael S. Bernstein, Fei-Fei Li
Despite progress in perceptual tasks such as image classification, computers still perform poorly on cognitive tasks such as image description and question answering.
1 code implementation • CVPR 2017 • Linjie Yang, Kevin Tang, Jianchao Yang, Li-Jia Li
The goal is to densely detect visual concepts (e.g., objects, object parts, and interactions between them) from images, labeling each with a short descriptive phrase.
no code implementations • ICCV 2017 • Yuncheng Li, Jianchao Yang, Yale Song, Liangliang Cao, Jiebo Luo, Li-Jia Li
The ability to learn from noisy labels is very useful in many visual recognition tasks, as a vast amount of data with noisy labels is relatively easy to obtain.
no code implementations • CVPR 2017 • Zhou Ren, Xiaoyu Wang, Ning Zhang, Xutao Lv, Li-Jia Li
The policy network serves as a local guidance by providing the confidence of predicting the next word according to the current state.
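The idea of a policy network scoring the next word can be sketched with a toy example: a single linear layer plus a softmax that turns the current decoding state into a confidence for each candidate word. The vocabulary, weights, and state below are made-up stand-ins, not the paper's model.

```python
import numpy as np

# Hypothetical toy "policy network": a linear layer followed by a softmax
# that maps the current decoding state to a confidence for each next word.
rng = np.random.default_rng(0)

vocab = ["a", "dog", "runs", "<eos>"]
state_dim = 8

W = rng.normal(size=(len(vocab), state_dim))  # word-scoring weights
b = np.zeros(len(vocab))

def next_word_confidence(state):
    """Return a probability (confidence) for each word in the vocabulary."""
    logits = W @ state + b
    logits -= logits.max()                 # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

state = rng.normal(size=state_dim)         # stand-in for the decoder state
conf = next_word_confidence(state)
best = vocab[int(np.argmax(conf))]
print({w: round(float(p), 3) for w, p in zip(vocab, conf)}, "->", best)
```

In the full approach this local, per-step confidence is combined with a global value signal; the sketch only shows the local scoring step.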
1 code implementation • CVPR 2018 • Zhe Li, Chong Wang, Mei Han, Yuan Xue, Wei Wei, Li-Jia Li, Li Fei-Fei
Accurate identification and localization of abnormalities from radiology images play an integral part in clinical diagnosis and treatment planning.
18 code implementations • ECCV 2018 • Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, Kevin Murphy
We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms.
Ranked #15 on Neural Architecture Search on NAS-Bench-201, ImageNet-16-120 (Accuracy (Val) metric)
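The efficiency idea — growing candidate cells progressively and pruning with a cheap surrogate score instead of fully training every candidate — can be sketched as a small beam search. The operation set and the surrogate scoring table below are illustrative assumptions, not the paper's search space or predictor.

```python
import numpy as np

# Sketch of progressive, beam-style architecture search: grow candidate
# cells one operation at a time, keep only the top-K under a cheap
# surrogate score, and never train the discarded candidates.
rng = np.random.default_rng(1)

OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]
op_value = {op: rng.uniform(0, 1) for op in OPS}   # mock surrogate table

def surrogate_score(cell):
    # Cheap proxy for accuracy: average mock value of the chosen ops.
    return sum(op_value[op] for op in cell) / len(cell)

def progressive_search(max_len=3, beam=2):
    frontier = [[op] for op in OPS]                 # cells of length 1
    for _ in range(max_len - 1):
        frontier.sort(key=surrogate_score, reverse=True)
        frontier = frontier[:beam]                  # prune to top-K
        frontier = [cell + [op] for cell in frontier for op in OPS]
    frontier.sort(key=surrogate_score, reverse=True)
    return frontier[0]

best_cell = progressive_search()
print("best cell:", best_cell, "score:", round(surrogate_score(best_cell), 3))
```

Because candidates are pruned before being extended, the number of cells ever scored grows linearly with depth rather than exponentially.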
1 code implementation • ICML 2018 • Lu Jiang, Zhengyuan Zhou, Thomas Leung, Li-Jia Li, Li Fei-Fei
Recent deep networks are capable of memorizing the entire data even when the labels are completely random.
Ranked #16 on Image Classification on WebVision-1000
no code implementations • ICLR 2018 • Wei Wei, Quoc V. Le, Andrew M. Dai, Li-Jia Li
One challenge in applying such techniques to building goal-oriented conversation models is that maximum likelihood-based models are not optimized toward accomplishing goals.
12 code implementations • ECCV 2018 • Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, Song Han
Model compression is a critical technique to efficiently deploy neural network models on mobile devices which have limited computation resources and tight power budgets.
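One common compression primitive underlying such pipelines is channel pruning; a minimal sketch is magnitude-based pruning, dropping the output channels of a conv weight tensor with the smallest L1 norms until a budget is met. (AMC itself learns per-layer budgets automatically; the weights and ratio here are made up.)

```python
import numpy as np

# Magnitude-based channel pruning: keep the output channels with the
# largest L1 norms, drop the rest to meet a target channel budget.
rng = np.random.default_rng(2)

weights = rng.normal(size=(16, 8, 3, 3))    # (out_ch, in_ch, kH, kW), made up

def prune_channels(w, keep_ratio):
    """Keep the `keep_ratio` fraction of output channels with largest L1 norm."""
    n_keep = max(1, int(round(w.shape[0] * keep_ratio)))
    norms = np.abs(w).sum(axis=(1, 2, 3))            # L1 norm per out channel
    keep = np.sort(np.argsort(norms)[-n_keep:])      # kept indices, in order
    return w[keep], keep

pruned, kept = prune_channels(weights, keep_ratio=0.5)
print("kept channels:", kept.tolist(), "new shape:", pruned.shape)
```

Pruning output channels also shrinks the next layer's input channels, so the saving compounds across the network.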
1 code implementation • ICLR 2018 • Kiran K. Thekumparampil, Chong Wang, Sewoong Oh, Li-Jia Li
Recently popularized graph neural networks achieve the state-of-the-art accuracy on a number of standard benchmark datasets for graph-based semi-supervised learning, improving significantly over existing approaches.
Ranked #10 on Graph Regression on Lipophilicity
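The core operation in such attention-based graph networks — each node aggregating its neighbors' features with learned, normalized weights — can be sketched in a few lines. The toy graph, the features, and the choice of cosine similarity as the attention score are illustrative assumptions, not the paper's exact layer.

```python
import numpy as np

# One attention-weighted propagation step on a graph: weights come from a
# softmax over feature similarities, restricted to actual edges.
rng = np.random.default_rng(3)

adj = np.array([[1, 1, 0],      # adjacency with self-loops, 3-node toy graph
                [1, 1, 1],
                [0, 1, 1]], dtype=float)
X = rng.normal(size=(3, 4))     # node features

def attention_propagate(adj, X):
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm rows
    sim = Xn @ Xn.T                                     # cosine similarities
    scores = np.where(adj > 0, sim, -np.inf)            # mask non-edges
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)       # softmax per node
    return weights @ X, weights

H, A = attention_propagate(adj, X)
print("attention rows sum to:", A.sum(axis=1))
```

Masking non-edges with `-inf` before the softmax guarantees that a node only attends to its graph neighbors.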
no code implementations • CVPR 2018 • Xinlei Chen, Li-Jia Li, Li Fei-Fei, Abhinav Gupta
The framework consists of two core modules: a local module that uses spatial memory to store previous beliefs with parallel updates; and a global graph-reasoning module.
2 code implementations • CVPR 2018 • Junwei Liang, Lu Jiang, Liangliang Cao, Li-Jia Li, Alexander Hauptmann
Recent insights on language and vision with neural networks have been successfully applied to simple single-image visual question answering.
Ranked #1 on Memex Question Answering on MemexQA
no code implementations • ICML 2018 • Zhengyuan Zhou, Panayotis Mertikopoulos, Nicholas Bambos, Peter Glynn, Yinyu Ye, Li-Jia Li, Li Fei-Fei
One of the most widely used optimization methods for large-scale machine learning problems is distributed asynchronous stochastic gradient descent (DASGD).
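The defining feature of DASGD is that updates are applied with stale gradients: a worker's gradient was computed on an older copy of the parameters. A single-threaded simulation with a fixed staleness and a simple quadratic objective (both illustrative choices) shows the scheme still converging.

```python
# Simulated asynchronous SGD: the "server" applies gradients that were
# computed on parameters from `staleness` updates ago.
def grad(x):
    return 2.0 * (x - 3.0)       # gradient of f(x) = (x - 3)^2

x = 0.0
lr = 0.05
staleness = 2                    # gradients arrive 2 updates late
history = [x]

for step in range(200):
    stale_x = history[max(0, len(history) - 1 - staleness)]
    x -= lr * grad(stale_x)      # server applies a stale gradient
    history.append(x)

print("final x:", round(x, 4))   # settles near the minimizer 3.0
```

With a small enough step size relative to the delay, the stale-gradient recurrence is still a contraction toward the optimum; too large a step size or delay can make it oscillate or diverge.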
no code implementations • ICCV 2019 • Jiyang Gao, Jiang Wang, Shengyang Dai, Li-Jia Li, Ram Nevatia
Compared to standard Faster RCNN, it contains three highlights: an ensemble of two classification heads and a distillation head to avoid overfitting on noisy labels and improve the mining precision; masking the negative-sample loss in the box predictor to avoid the harm of false-negative labels; and training the box-regression head only on seed annotations to eliminate the harm from inaccurate boundaries of mined bounding boxes.
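Two of the listed ideas — masking the negative-sample classification loss so unlabeled (possibly false-negative) regions do not punish the detector, and computing box-regression loss only on human-verified "seed" boxes — can be sketched with toy per-sample losses. The scores, labels, and errors below are stand-ins.

```python
import numpy as np

# Toy per-sample losses for a detector trained with noisy labels:
# classification loss is kept only on positives, regression loss only on
# seed (human-verified) annotations.
pred_scores = np.array([0.9, 0.2, 0.7, 0.1])     # objectness per sample
labels      = np.array([1,   0,   1,   0])       # 1 = positive, 0 = negative
is_seed     = np.array([1,   0,   0,   0])       # human-verified annotations
reg_error   = np.array([0.5, 2.0, 1.0, 3.0])     # per-sample |box offset| error

def masked_losses(pred_scores, labels, is_seed, reg_error):
    eps = 1e-9
    # Binary cross-entropy per sample.
    bce = -(labels * np.log(pred_scores + eps)
            + (1 - labels) * np.log(1 - pred_scores + eps))
    cls_loss = bce[labels == 1].mean()           # negatives masked out
    reg_loss = reg_error[is_seed == 1].mean()    # seeds only
    return cls_loss, reg_loss

cls_loss, reg_loss = masked_losses(pred_scores, labels, is_seed, reg_error)
print(round(float(cls_loss), 4), round(float(reg_loss), 4))
```

The masking is just boolean indexing before the reduction, so mined boxes still drive classification while only trusted boxes shape the regressor.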
no code implementations • 1 Dec 2018 • David Xue, Anin Sayana, Evan Darke, Kelly Shen, Jun-Ting Hsieh, Zelun Luo, Li-Jia Li, N. Lance Downing, Arnold Milstein, Li Fei-Fei
As the senior population rapidly increases, it is challenging yet crucial to provide effective long-term care for seniors who live at home or in senior care facilities.
1 code implementation • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2018 • Junwei Liang, Lu Jiang, Liangliang Cao, Yannis Kalantidis, Li-Jia Li, and Alexander Hauptmann
In addition to a text answer, a few grounding photos are also given to justify the answer.
Ranked #1 on Memex Question Answering on MemexQA
4 code implementations • CVPR 2019 • Nam Vo, Lu Jiang, Chen Sun, Kevin Murphy, Li-Jia Li, Li Fei-Fei, James Hays
In this paper, we study the task of image retrieval, where the input query is specified in the form of an image plus some text that describes desired modifications to the input image.
Ranked #2 on Image Retrieval with Multi-Modal Query on MIT-States
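The task structure — fuse a query image embedding with a modification-text embedding, then rank database images by similarity to the fused query — can be sketched directly. The gated-residual composition and all embeddings below are simple stand-ins, one of many possible choices, not the paper's learned model.

```python
import numpy as np

# Retrieval with a composed image + text query: fuse the two embeddings,
# then rank database images by cosine similarity to the fused vector.
rng = np.random.default_rng(4)

dim = 16
img_q = rng.normal(size=dim)           # query image embedding
txt_q = rng.normal(size=dim)           # modification-text embedding
db    = rng.normal(size=(5, dim))      # database image embeddings

def compose(img, txt, gate=0.5):
    """Gated residual: keep the image feature, nudge it with the text."""
    return img + gate * txt

def retrieve(query, db):
    qn = query / np.linalg.norm(query)
    dbn = db / np.linalg.norm(db, axis=1, keepdims=True)
    sims = dbn @ qn
    return np.argsort(sims)[::-1]       # database indices, best match first

ranking = retrieve(compose(img_q, txt_q), db)
print("ranking:", ranking.tolist())
```

The residual form biases the composed query toward the original image, so the text acts as a modification rather than a replacement.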
3 code implementations • ICLR 2019 • Yunbo Wang, Lu Jiang, Ming-Hsuan Yang, Li-Jia Li, Mingsheng Long, Li Fei-Fei
We first evaluate the E3D-LSTM network on widely used future video prediction datasets and achieve state-of-the-art performance.
Ranked #1 on Video Prediction on KTH (Cond metric)
no code implementations • ICLR 2020 • Alejandro Newell, Lu Jiang, Chong Wang, Li-Jia Li, Jia Deng
Multi-task learning holds the promise of requiring less data, fewer parameters, and less time than training separate models.
1 code implementation • ICCV 2019 • Lanlan Liu, Michael Muelly, Jia Deng, Tomas Pfister, Li-Jia Li
This paper explores object detection in the small data regime, where only a limited number of annotated bounding boxes are available due to data rarity and annotation expense.
1 code implementation • 19 Nov 2021 • Phoenix X. Huang, Wenze Hu, William Brendel, Manmohan Chandraker, Li-Jia Li, Xiaoyu Wang
This paper introduces an open source platform to support the rapid development of computer vision applications at scale.
1 code implementation • 27 Jul 2022 • Zhanpeng Feng, Shiliang Zhang, Rinyoichi Takezoe, Wenze Hu, Manmohan Chandraker, Li-Jia Li, Vijay K. Narayanan, Xiaoyu Wang
To facilitate research in this field, this paper contributes an active learning benchmark framework named ALBench for evaluating active learning in object detection.
1 code implementation • 26 Sep 2022 • Sophie Ostmeier, Brian Axelrod, Jeroen Bertels, Fabian Isensee, Maarten G. Lansberg, Soren Christensen, Gregory W. Albers, Li-Jia Li, Jeremy J. Heit
We study how uncertain, small, and empty reference annotations influence the value of metrics for medical image segmentation on an in-house data set regardless of the model.
1 code implementation • 24 Nov 2022 • Sophie Ostmeier, Brian Axelrod, Benjamin F. J. Verhaaren, Soren Christensen, Abdelkader Mahammedi, Yongkai Liu, Benjamin Pulli, Li-Jia Li, Greg Zaharchuk, Jeremy J. Heit
The optimized model trained on expert A was compared against test experts B and C. We used a one-sided Wilcoxon signed-rank test to test for the non-inferiority of the model-expert agreement compared to the inter-expert agreement.
no code implementations • 30 Aug 2023 • Kilichbek Haydarov, Xiaoqian Shen, Avinash Madasu, Mahmoud Salem, Li-Jia Li, Gamaleldin Elsayed, Mohamed Elhoseiny
We introduce Affective Visual Dialog, an emotion explanation and reasoning task as a testbed for research on understanding the formation of emotions in visually grounded conversations.
no code implementations • 21 Sep 2023 • Mahyar Abbasian, Elahe Khatibi, Iman Azimi, David Oniani, Zahra Shakeri Hossein Abad, Alexander Thieme, Ram Sriram, Zhongqi Yang, Yanshan Wang, Bryant Lin, Olivier Gevaert, Li-Jia Li, Ramesh Jain, Amir M. Rahmani
The purpose of this paper is to explore state-of-the-art LLM-based evaluation metrics that are specifically applicable to the assessment of interactive conversational models in healthcare.