no code implementations • 29 Jul 2024 • Ray Zhang, Zheming Zhou, Min Sun, Omid Ghasemalizadeh, Cheng-Hao Kuo, Ryan Eustice, Maani Ghaffari, Arnie Sen
This paper introduces a robust unsupervised SE(3) point cloud registration method that operates without requiring point correspondences.
no code implementations • 24 Jul 2024 • Jing Liang, Zhuo Deng, Zheming Zhou, Min Sun, Omid Ghasemalizadeh, Cheng-Hao Kuo, Arnie Sen, Dinesh Manocha
We present a new algorithm, Cross-Source-Context Place Recognition (CSCPR), for RGB-D indoor place recognition that integrates global retrieval and reranking into a single end-to-end model.
no code implementations • 21 Jul 2024 • Bolivar Solarte, Chin-Hsuan Wu, Jin-Cheng Jhang, Jonathan Lee, Yi-Hsuan Tsai, Min Sun
In this paper, we introduce a novel geometry-aware self-training framework for room layout estimation models on unseen scenes with unlabeled data.
1 code implementation • 17 Jul 2024 • Ming-Feng Li, Yueh-Feng Ku, Hong-Xuan Yen, Chi Liu, Yu-Lun Liu, Albert Y. C. Chen, Cheng-Hao Kuo, Min Sun
GenRC outperforms state-of-the-art methods under most appearance and geometric metrics on the ScanNet and ARKitScenes datasets, even though GenRC is neither trained on these datasets nor given predefined camera trajectories.
no code implementations • 17 Jun 2024 • Xuefeng Hu, Ke Zhang, Min Sun, Albert Chen, Cheng-Hao Kuo, Ram Nevatia
Large-scale pretrained vision-language models like CLIP have demonstrated remarkable zero-shot image classification capabilities across diverse domains.
no code implementations • 15 Apr 2024 • Yu-Ju Tsai, Jin-Cheng Jhang, Jingjing Zheng, Wei Wang, Albert Y. C. Chen, Min Sun, Cheng-Hao Kuo, Ming-Hsuan Yang
A unique property of our Bi-Layout model is its ability to inherently detect ambiguous regions by comparing the two predictions.
no code implementations • 3 Apr 2024 • Jing Liang, Zhuo Deng, Zheming Zhou, Omid Ghasemalizadeh, Dinesh Manocha, Min Sun, Cheng-Hao Kuo, Arnie Sen
We present a novel end-to-end algorithm (PoCo) for the indoor RGB-D place recognition task, aimed at identifying the most likely match for a given query frame within a reference database.
no code implementations • CVPR 2024 • Yun-Yun Tsai, Fu-Chen Chen, Albert Y. C. Chen, Junfeng Yang, Che-Chun Su, Min Sun, Cheng-Hao Kuo
For vision tasks, recent studies have shown that test-time adaptation employing diffusion models can achieve state-of-the-art accuracy improvements on OOD samples by generating new samples that align with the model's domain without the need to modify the model's weights.
no code implementations • 6 Mar 2024 • Zewei Tian, Min Sun, Alex Liu, Shawon Sarkar, Jing Liu
This paper explores the transformative potential of computer-assisted textual analysis in enhancing instructional quality through in-depth insights from educational artifacts.
no code implementations • CVPR 2024 • Yu-Ju Tsai, Jin-Cheng Jhang, Jingjing Zheng, Wei Wang, Albert Y. C. Chen, Min Sun, Cheng-Hao Kuo, Ming-Hsuan Yang
Specifically, on the MatterportLayout dataset it improves 3DIoU from 81.70% to 82.57% across the full test set and notably from 54.80% to 59.97% in subsets with significant ambiguity.
1 code implementation • 28 Dec 2023 • Chin-Hsuan Wu, Yen-Chun Chen, Bolivar Solarte, Lu Yuan, Min Sun
Our strategy unfolds in three steps: (1) We invert the diffusion model for camera pose estimation instead of synthesizing novel views.
no code implementations • 5 Dec 2023 • Tao Tu, Ming-Feng Li, Chieh Hubert Lin, Yen-Chi Cheng, Min Sun, Ming-Hsuan Yang
In this work, we study articulated 3D shape reconstruction from a single and casually captured internet video, where the subject's view coverage is incomplete.
no code implementations • 2 Dec 2023 • Alex Liu, Min Sun
Obtaining stakeholders' diverse experiences and opinions about current policy in a timely manner is crucial for policymakers to identify strengths and gaps in resource allocation, thereby supporting effective policy design and implementation.
no code implementations • 15 Oct 2023 • Xiaotong Chen, Zheming Zhou, Zhuo Deng, Omid Ghasemalizadeh, Min Sun, Cheng-Hao Kuo, Arnie Sen
Reconstructing transparent objects using affordable RGB-D cameras is a persistent challenge in robotic perception due to inconsistent appearances across views in the RGB domain and inaccurate depth readings in each single view.
no code implementations • 18 Sep 2023 • Ting-Ying Lin, Lin-Yung Hsieh, Fu-En Wang, Wen-Shen Wuen, Min Sun
We propose a sparse and privacy-enhanced representation for Human Pose Estimation (HPE).
no code implementations • 18 Sep 2023 • Yu-Cheng Hsieh, Cheng Sun, Suraj Dengale, Min Sun
The volume and diversity of training data are critical for modern deep learning-based methods.
no code implementations • ICCV 2023 • Tao Tu, Shun-Po Chuang, Yu-Lun Liu, Cheng Sun, Ke Zhang, Donna Roy, Cheng-Hao Kuo, Min Sun
The results demonstrate that ImGeoNet outperforms the current state-of-the-art multi-view image-based method, ImVoxelNet, on all three datasets in terms of detection accuracy.
Ranked #25 on 3D Object Detection on ScanNetV2
1 code implementation • 4 Aug 2023 • Xuefeng Hu, Ke Zhang, Lu Xia, Albert Chen, Jiajia Luo, Yuyin Sun, Ken Wang, Nan Qiao, Xiao Zeng, Min Sun, Cheng-Hao Kuo, Ram Nevatia
Large-scale pre-trained vision-language models such as CLIP have demonstrated outstanding performance in zero-shot classification, e.g., achieving 76.3% top-1 accuracy on ImageNet without seeing any examples, which offers potential benefits to many tasks that have no labeled data.
no code implementations • 22 Apr 2023 • Yuan-Fu Yang, Iuan-Kai Fang, Min Sun, Su-Chu Hsu
We find similarities of color structure and color stacking in the Impressionist paintings and the illustrations of the novel coronavirus by artists around the world.
no code implementations • 9 Apr 2023 • Ming-Feng Li, Min Sun
However, existing methods encounter exponential growth of runtime and undesirable phenomena of deadlocks and rerouting as the map size or agent density grows.
no code implementations • 22 Mar 2023 • Yi-Shan Lee, Wei-Cheng Tseng, Fu-En Wang, Min Sun
We propose a content-based system for matching video and background music.
1 code implementation • ICCV 2023 • Liqiang He, Wei Wang, Albert Chen, Min Sun, Cheng-Hao Kuo, Sinisa Todorovic
We propose a Bidirectional Alignment for domain adaptive Detection with Transformers (BiADT) to improve cross domain object detection performance.
1 code implementation • 2 Dec 2022 • Tobias Fischer, Yung-Hsu Yang, Suryansh Kumar, Min Sun, Fisher Yu
To track the 3D locations and trajectories of the other traffic participants at any given time, modern autonomous vehicles are equipped with multiple cameras that cover the vehicle's full surroundings.
1 code implementation • 1 Dec 2022 • YuanFu Yang, Min Sun
In this paper, we present a novel architecture that can perform defect classification in a more efficient way.
1 code implementation • 28 Nov 2022 • Fu-En Wang, Chien-Yi Wang, Min Sun, Shang-Hong Lai
In this paper, we propose MixFairFace framework to improve the fairness in face recognition models.
no code implementations • 7 Nov 2022 • YuanFu Yang, Min Sun
Achieving photorealistic rendering of real-world scenes poses a significant challenge with diverse applications, including mixed reality and virtual reality.
1 code implementation • 24 Oct 2022 • Bolivar Solarte, Chin-Hsuan Wu, Yueh-Cheng Liu, Yi-Hsuan Tsai, Min Sun
In addition, since ground truth annotations are available neither during training nor in testing, we leverage the entropy information in multiple layout estimations as a quantitative metric to measure the geometry consistency of the scene, allowing us to evaluate any layout estimator for hyper-parameter tuning, including model selection, without ground truth annotations.
1 code implementation • 7 Sep 2022 • Fu-En Wang, Yu-Hsuan Yeh, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun
Thus, state-of-the-art frameworks for monocular 360 depth estimation such as bi-projection fusion in BiFuse are proposed.
Ranked #12 on Depth Estimation on Stanford2D3D Panoramic
1 code implementation • CVPR 2022 • YuanFu Yang, Min Sun
However, the massive expansion of semiconductor manufacturing and the development of new technology will bring many defect wafers.
1 code implementation • 5 Apr 2022 • An-Chieh Cheng, Xueting Li, Sifei Liu, Min Sun, Ming-Hsuan Yang
With the capacity of modeling long-range dependencies in sequential data, transformers have shown remarkable performances in a variety of generative tasks such as image, audio, and text generation.
1 code implementation • 16 Mar 2022 • Ping-Chung Yu, Cheng Sun, Min Sun
In this work, we deal with the data scarcity challenge of 3D tasks by transferring knowledge from strong 2D models via RGB-D images.
no code implementations • 1 Feb 2022 • Wei-Cheng Tseng, Hung-Ju Liao, Lin Yen-Chen, Min Sun
We propose CLA-NeRF -- a Category-Level Articulated Neural Radiance Field that can perform view synthesis, part segmentation, and articulated pose estimation.
no code implementations • 14 Dec 2021 • Wei-Cheng Tseng, Wei Wei, Da-Cheng Juan, Min Sun
The number of agents can grow or an environment sometimes needs to interact with a changing number of agents in real-world scenarios.
1 code implementation • 12 Dec 2021 • Bolivar Solarte, Yueh-Cheng Liu, Chin-Hsuan Wu, Yi-Hsuan Tsai, Min Sun
We present 360-DFPE, a sequential floor plan estimation method that directly takes 360-images as input without relying on active sensors or 3D information.
no code implementations • 1 Dec 2021 • Wei-Cheng Tseng, Po-Han Chi, Jia-Hua Wu, Min Sun
In contrast, most of the existing methods delete the rare protein functions to reduce the label space.
2 code implementations • CVPR 2022 • Cheng Sun, Min Sun, Hwann-Tzong Chen
Finally, evaluation on five inward-facing benchmarks shows that our method matches, if not surpasses, NeRF's quality, yet it only takes about 15 minutes to train from scratch for a new scene.
no code implementations • 1 Nov 2021 • Yung-Hsu Yang, Thomas E. Huang, Min Sun, Samuel Rota Bulò, Peter Kontschieder, Fisher Yu
Our experiments show consistent and significant improvements on challenging semantic segmentation benchmarks, including Cityscapes, BDD100K, and Mapillary Vistas, at negligible computational and parameter overhead.
no code implementations • ICCV 2021 • Chi-Wei Hsiao, Cheng Sun, Hwann-Tzong Chen, Min Sun
We present a novel pyramidal output representation to ensure parsimony with our "specialize and fuse" process for semantic segmentation.
no code implementations • NeurIPS 2021 • An-Chieh Cheng, Xueting Li, Min Sun, Ming-Hsuan Yang, Sifei Liu
We propose a canonical point autoencoder (CPAE) that predicts dense correspondences between 3D shapes of the same category.
1 code implementation • CVPR 2021 • Cheng Sun, Chi-Wei Hsiao, Ning-Hsu Wang, Min Sun, Hwann-Tzong Chen
An indoor panorama typically consists of human-made structures parallel or perpendicular to gravity.
no code implementations • CVPR 2021 • Fu-En Wang, Yu-Hsuan Yeh, Min Sun, Wei-Chen Chiu, Yi-Hsuan Tsai
Although significant progress has been made in room layout estimation, most methods aim to reduce the loss in the 2D pixel coordinate rather than exploiting the room structure in the 3D space.
1 code implementation • 22 Apr 2021 • Bolivar Solarte, Chin-Hsuan Wu, Kuan-Wei Lu, Min Sun, Wei-Chen Chiu, Yi-Hsuan Tsai
This paper presents a novel preconditioning strategy for the classic 8-point algorithm (8-PA) for estimating an essential matrix from 360-FoV images (i.e., equirectangular images) in spherical projection.
1 code implementation • 1 Apr 2021 • Fu-En Wang, Yu-Hsuan Yeh, Min Sun, Wei-Chen Chiu, Yi-Hsuan Tsai
Although significant progress has been made in room layout estimation, most methods aim to reduce the loss in the 2D pixel coordinate rather than exploiting the room structure in the 3D space.
3D Room Layouts From A Single RGB Panorama, Depth Estimation, +2
1 code implementation • 12 Mar 2021 • Hou-Ning Hu, Yung-Hsu Yang, Tobias Fischer, Trevor Darrell, Fisher Yu, Min Sun
Experiments on our proposed simulation data and real-world benchmarks, including KITTI, nuScenes, and Waymo datasets, show that our tracking framework offers robust object association and tracking on urban-driving scenarios.
Ranked #8 on Multiple Object Tracking on KITTI Tracking test
no code implementations • 4 Mar 2021 • Wei-Cheng Tseng, Jin-Siang Lin, Yao-Min Feng, Min Sun
We also design two regularization terms to improve the diversity and utilization rate of the primitives in the pre-training phase.
no code implementations • 12 Dec 2020 • Chun-Hung Chao, Hsien-Tzu Cheng, Tsung-Ying Ho, Le Lu, Min Sun
The proposed method is evaluated on two published radiotherapy target contouring datasets of nasopharyngeal and esophageal cancer.
no code implementations • NeurIPS 2020 • Hung-Jen Chen, An-Chieh Cheng, Da-Cheng Juan, Wei Wei, Min Sun
To preserve the knowledge we learn from previous instances, we proposed a method to protect the path by restricting the gradient updates of one instance from overriding past updates calculated from previous instances if these instances are not similar.
1 code implementation • CVPR 2021 • Cheng Sun, Min Sun, Hwann-Tzong Chen
We present HoHoNet, a versatile and efficient framework for holistic understanding of an indoor 360-degree panorama using a Latent Horizontal Feature (LHFeat).
3D Room Layouts From A Single RGB Panorama, Depth Estimation, +1
no code implementations • 29 Aug 2020 • Chun-Hung Chao, Zhuotun Zhu, Dazhou Guo, Ke Yan, Tsung-Ying Ho, Jinzheng Cai, Adam P. Harrison, Xianghua Ye, Jing Xiao, Alan Yuille, Min Sun, Le Lu, Dakai Jin
Specifically, we first utilize a 3D convolutional neural network with ROI-pooling to extract the GTV$_{LN}$'s instance-wise appearance features.
no code implementations • ECCV 2020 • Yen-Chi Cheng, Hsin-Ying Lee, Min Sun, Ming-Hsuan Yang
We also apply an off-the-shelf image-to-image translation model to generate realistic RGB images to better understand the quality of the synthesized semantic maps.
1 code implementation • 30 Mar 2020 • Fu-En Wang, Yu-Hsuan Yeh, Min Sun, Wei-Chen Chiu, Yi-Hsuan Tsai
Inferring the information of 3D layout from a single equirectangular panorama is crucial for numerous applications of virtual reality or robotics (e.g., scene understanding and navigation).
no code implementations • 10 Jan 2020 • Shih-Han Chou, Wei-Lun Chao, Wei-Sheng Lai, Min Sun, Ming-Hsuan Yang
We then study two different VQA models on VQA 360, including one conventional model that takes an equirectangular image (with intrinsic distortion) as input and one dedicated model that first projects a 360 image onto cubemaps and subsequently aggregates the information from multiple spatial resolutions.
no code implementations • 18 Nov 2019 • Wen-Yen Chang, Wen-Huan Chiang, Shao-Hao Lu, Tingfan Wu, Min Sun
Last but not least, we investigate the generalization of the HAL policy learned on the MNIST dataset by directly applying it to MNIST-M. We show that the agent can generalize and outperform a directly learned policy under constrained labeled sets.
1 code implementation • 11 Nov 2019 • Ning-Hsu Wang, Bolivar Solarte, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun
Recently, end-to-end trainable deep neural networks have significantly improved stereo depth estimation for perspective images.
no code implementations • 3 Oct 2019 • Shih-Han Chou, Cheng Sun, Wen-Yen Chang, Wan-Ting Hsu, Min Sun, Jianlong Fu
In this paper, our goal is to provide a standard dataset to facilitate the vision and machine learning communities in the 360° domain.
no code implementations • 29 May 2019 • Chi-Wei Hsiao, Cheng Sun, Min Sun, Hwann-Tzong Chen
This paper also constructs a benchmark for validating the performance on general layout topologies, where Flat2Layout achieves good performance on general room types.
no code implementations • 5 Apr 2019 • Chun-Hung Chao, Yen-Chi Cheng, Hsien-Tzu Cheng, Chi-Wen Huang, Tsung-Ying Ho, Chen-Kan Tseng, Le Lu, Min Sun
Instead, inspired by the treating methodology of considering meaningful information across slices, we used Gated Graph Neural Network to frame this problem more efficiently.
1 code implementation • 5 Apr 2019 • Tsun-Hsuan Wang, Hou-Ning Hu, Chieh Hubert Lin, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun
The complementary characteristics of active and passive depth sensing techniques motivate the fusion of the LiDAR sensor and stereo camera for improved depth perception.
3 code implementations • ICCV 2019 • Tsun-Hsuan Wang, Yen-Chi Cheng, Chieh Hubert Lin, Hwann-Tzong Chen, Min Sun
We introduce point-to-point video generation that controls the generation process with two control points: the targeted start- and end-frames.
no code implementations • 25 Mar 2019 • Fang-I Hsiao, Jui-Hsuan Kuo, Min Sun
The encoder infers discrete latent factors corresponding to different behaviors from demonstrations.
1 code implementation • CVPR 2019 • Cheng Sun, Chi-Wei Hsiao, Min Sun, Hwann-Tzong Chen
We present a new approach to the problem of estimating the 3D room layout from a single panoramic image.
3D Room Layouts From A Single RGB Panorama, Data Augmentation
2 code implementations • 20 Dec 2018 • Tsun-Hsuan Wang, Fu-En Wang, Juan-Ting Lin, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun
We propose a novel plug-and-play (PnP) module for improving depth prediction by taking arbitrary patterns of sparse depths as input.
1 code implementation • CVPR 2019 • Shang-Ta Yang, Fu-En Wang, Chi-Han Peng, Peter Wonka, Min Sun, Hung-Kuo Chu
We present a deep learning framework, called DuLa-Net, to predict Manhattan-world 3D room layouts from a single RGB panorama.
2 code implementations • 26 Nov 2018 • An-Chieh Cheng, Chieh Hubert Lin, Da-Cheng Juan, Wei Wei, Min Sun
Conventional Neural Architecture Search (NAS) aims at finding a single architecture that achieves the best performance, which usually optimizes task related learning objectives such as accuracy.
1 code implementation • ICCV 2019 • Hou-Ning Hu, Qi-Zhi Cai, Dequan Wang, Ji Lin, Min Sun, Philipp Krähenbühl, Trevor Darrell, Fisher Yu
The framework can not only associate detections of vehicles in motion over time, but also estimate their complete 3D bounding box information from a sequence of 2D images captured on a moving platform.
Ranked #13 on Multiple Object Tracking on KITTI Tracking test
no code implementations • 13 Nov 2018 • Fu-En Wang, Hou-Ning Hu, Hsien-Tzu Cheng, Juan-Ting Lin, Shang-Ta Yang, Meng-Li Shih, Hung-Kuo Chu, Min Sun
We propose a novel self-supervised learning approach for predicting the omnidirectional depth and camera motion from a 360° video.
no code implementations • 11 Sep 2018 • Cheng Kuan Chen, Zhu Feng Pan, Min Sun, Ming-Yu Liu
It can learn to generate stylish image descriptions that are more related to image content and can be trained with an arbitrary monolingual corpus without collecting new paired images and stylish descriptions.
no code implementations • 29 Aug 2018 • An-Chieh Cheng, Jin-Dong Dong, Chi-Hung Hsu, Shu-Huan Chang, Min Sun, Shih-Chieh Chang, Jia-Yu Pan, Yu-Ting Chen, Wei Wei, Da-Cheng Juan
Recent breakthroughs in Neural Architectural Search (NAS) have achieved state-of-the-art performance in many tasks such as image classification and language understanding.
no code implementations • ECCV 2018 • Tz-Ying Wu, Juan-Ting Lin, Tsun-Hsuang Wang, Chan-Wei Hu, Juan Carlos Niebles, Min Sun
In the closed-loop system, the ability to monitor the state of the task via rich sensory information is important but often less studied.
no code implementations • ECCV 2018 • Yu-Ting Chen, Wen-Yen Chang, Hai-Lun Lu, Ting-Fan Wu, Min Sun
Recently, a few domain adaptation and active learning approaches have been proposed to mitigate the performance drop.
1 code implementation • ECCV 2018 • Po-Yu Huang, Wan-Ting Hsu, Chun-Yueh Chiu, Ting-Fan Wu, Min Sun
Uncertainty estimation in deep learning has recently become more important.
Ranked #19 on Semantic Segmentation on CamVid
no code implementations • ECCV 2018 • Jin-Dong Dong, An-Chieh Cheng, Da-Cheng Juan, Wei Wei, Min Sun
We propose DPP-Net: Device-aware Progressive Search for Pareto-optimal Neural Architectures, optimizing for both device-related (e.g., inference time and memory usage) and device-agnostic (e.g., accuracy and model size) objectives.
no code implementations • CVPR 2018 • Hsien-Tzu Cheng, Chun-Hung Chao, Jin-Dong Dong, Hao-Kai Wen, Tyng-Luh Liu, Min Sun
Then, we concatenate all six faces while utilizing the connectivity between faces on the cube for image padding (i.e., Cube Padding) in convolution, pooling, and convolutional LSTM layers.
1 code implementation • ACL 2018 • Wan-Ting Hsu, Chieh-Kai Lin, Ming-Ying Lee, Kerui Min, Jing Tang, Min Sun
On the one hand, a simple extractive model can obtain sentence-level attention with high ROUGE scores but is less readable.
Ranked #39 on Abstractive Text Summarization on CNN / Daily Mail
no code implementations • 12 Mar 2018 • Tsun-Hsuan Wang, Hung-Jui Huang, Juan-Ting Lin, Chan-Wei Hu, Kuo-Hao Zeng, Min Sun
Given a visual input, the task of the O-CNN is not to retrieve the matched place exemplar, but to retrieve the closest place exemplar and estimate the relative distance between the input and the closest place.
1 code implementation • 2 Dec 2017 • Yong-Siang Shih, Kai-Yueh Chang, Hsuan-Tien Lin, Min Sun
In our learned space, we introduce a novel Projected Compatibility Distance (PCD) function which is differentiable and ensures diversity by aiming for at least one prototype to be close to a compatible item, whereas none of the prototypes are close to an incompatible item.
1 code implementation • 23 Nov 2017 • Shih-Han Chou, Yi-Chun Chen, Kuo-Hao Zeng, Hou-Ning Hu, Jianlong Fu, Min Sun
The negative log reconstruction loss of the reverse sentence (referred to as "irrelevant loss") is jointly minimized to encourage the reverse sentence to be different from the given sentence.
1 code implementation • ICCV 2017 • Tz-Ying Wu, Ting-An Chien, Cheng-Sheng Chan, Chan-Wei Hu, Min Sun
The core of the system is a novel Recurrent Neural Network (RNN) and Policy Network (PN), where the RNN encodes visual and motion observation to anticipate intention, and the PN parsimoniously triggers the process of visual observation to reduce computation requirement.
2 code implementations • 2 Oct 2017 • Yen-Chen Lin, Ming-Yu Liu, Min Sun, Jia-Bin Huang
Our core idea is that the adversarial examples targeting a neural network-based policy are not effective for the frame prediction model.
no code implementations • ICCV 2017 • Kuo-Hao Zeng, William B. Shen, De-An Huang, Min Sun, Juan Carlos Niebles
This allows us to apply IRL at scale and directly imitate the dynamics in high-dimensional continuous visual sequences from the raw pixel values.
no code implementations • CVPR 2017 • Hou-Ning Hu, Yen-Chen Lin, Ming-Yu Liu, Hsien-Tzu Cheng, Yung-Ju Chang, Min Sun
Given the main object and previously selected viewing angles, our method regresses a shift in viewing angle to move to the next one.
no code implementations • CVPR 2017 • Kuo-Hao Zeng, Shih-Han Chou, Fu-Hsiang Chan, Juan Carlos Niebles, Min Sun
For survival, a living agent must have the ability to assess risk (1) by temporally anticipating accidents before they occur, and (2) by spatially localizing risky regions in the environment to move away from threats.
1 code implementation • CVPR 2017 • Hou-Ning Hu, Yen-Chen Lin, Ming-Yu Liu, Hsien-Tzu Cheng, Yung-Ju Chang, Min Sun
Watching a 360° sports video requires a viewer to continuously select a viewing angle, either through a sequence of mouse clicks or head movements.
1 code implementation • ICCV 2017 • Tseng-Hung Chen, Yuan-Hong Liao, Ching-Yao Chuang, Wan-Ting Hsu, Jianlong Fu, Min Sun
The domain critic assesses whether the generated sentences are indistinguishable from sentences in the target domain.
9 code implementations • ICCV 2017 • Yi-Hsin Chen, Wei-Yu Chen, Yu-Ting Chen, Bo-Cheng Tsai, Yu-Chiang Frank Wang, Min Sun
Despite the recent success of deep learning-based semantic segmentation, deploying a pre-trained road scene segmenter to a city whose images are not present in the training set would not achieve satisfactory performance due to dataset biases.
no code implementations • 8 Mar 2017 • Yen-Chen Lin, Zhang-Wei Hong, Yuan-Hong Liao, Meng-Li Shih, Ming-Yu Liu, Min Sun
In the strategically-timed attack, the adversary aims at minimizing the agent's reward by only attacking the agent at a small subset of time steps in an episode.
1 code implementation • 1 Feb 2017 • Yi-Ling Chen, Jan Klopp, Min Sun, Shao-Yi Chien, Kwan-Liu Ma
Photo composition is an important factor affecting the aesthetics in photography.
no code implementations • 12 Nov 2016 • Kuo-Hao Zeng, Tseng-Hung Chen, Ching-Yao Chuang, Yuan-Hong Liao, Juan Carlos Niebles, Min Sun
Then, a large number of candidate QA pairs are automatically generated from descriptions rather than manually annotated.
no code implementations • 25 Aug 2016 • Kuo-Hao Zeng, Tseng-Hung Chen, Juan Carlos Niebles, Min Sun
Finally, our sentence augmentation method also outperforms the baselines on the M-VAD dataset.
no code implementations • 7 Dec 2015 • Cheng-Sheng Chan, Shou-Zhong Chen, Pei-Xuan Xie, Chiung-Chih Chang, Min Sun
We have collected a new synchronized HandCam and HeadCam dataset with 20 videos captured in three scenes for hand state recognition.