no code implementations • 5 Jan 2018 • Yunyi Liang, Zhiyong Cui, Yu Tian, Huimiao Chen, Yinhai Wang
The GAA is able to combine traffic flow theory with neural networks and thus improve the accuracy of traffic state estimation.
1 code implementation • 28 Jun 2018 • Yu Tian, Xi Peng, Long Zhao, Shaoting Zhang, Dimitris N. Metaxas
Generating multi-view images from a single-view input is an essential yet challenging problem.
1 code implementation • ECCV 2018 • Long Zhao, Xi Peng, Yu Tian, Mubbasir Kapadia, Dimitris Metaxas
We consider the problem of image-to-video translation, where an input image is translated into an output video containing motions of a single object.
5 code implementations • CVPR 2019 • Long Zhao, Xi Peng, Yu Tian, Mubbasir Kapadia, Dimitris N. Metaxas
In this paper, we study the problem of learning Graph Convolutional Networks (GCNs) for regression.
Ranked #25 on Monocular 3D Human Pose Estimation on Human3.6M
1 code implementation • NeurIPS 2019 • Yu Tian, Long Zhao, Xi Peng, Dimitris N. Metaxas
Graph kernels are kernel methods measuring graph similarity and serve as a standard tool for graph classification.
Ranked #8 on Link Prediction on Cora
no code implementations • 23 Oct 2019 • Yuyuan Liu, Yu Tian, Gabriel Maicas, Leonardo Z. C. T. Pu, Rajvinder Singh, Johan W. Verjans, Gustavo Carneiro
We show that our proposed approach achieves the state-of-the-art result on this data set, compared with recently proposed anomaly detection systems.
no code implementations • 9 Jan 2020 • Ruigang Niu, Xian Sun, Yu Tian, Wenhui Diao, Kaiqiang Chen, Kun Fu
Semantic segmentation in very high resolution (VHR) aerial images is one of the most challenging tasks in remote sensing image understanding.
no code implementations • 18 Mar 2020 • Yu Tian, Kunbo Zhang, Leyuan Wang, Zhenan Sun
Extensive experiments demonstrate the advantages of the PAAS technique to counter diverse face spoofing attacks (print, replay, mask) in uncontrolled indoor and outdoor conditions by learning polarized face images of 33 people.
no code implementations • 10 Jun 2020 • Yu Tian, Gaofeng Pan, Mohamed-Slim Alouini
To illustrate how DL-based CV can be applied in wireless communications, an example of using a DL-based CV with a millimeter-wave (mmWave) system is given to realize optimal mmWave multiple-input and multiple-output (MIMO) beamforming in mobile scenarios.
1 code implementation • 26 Jun 2020 • Yu Tian, Gabriel Maicas, Leonardo Zorron Cheng Tao Pu, Rajvinder Singh, Johan W. Verjans, Gustavo Carneiro
Anomaly detection methods generally target the learning of a normal image distribution (i.e., inliers showing healthy cases), and during testing, samples relatively far from the learned distribution are classified as anomalies (i.e., outliers showing disease cases).
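As a rough illustration of the inlier/outlier idea in the snippet above, the following minimal sketch fits a per-feature Gaussian to normal samples only and flags test points that lie far from it. It is not the paper's method: the function names, the toy 2-D data, and the threshold of 9.0 are assumptions made for this example.

```python
from statistics import fmean, pvariance


def fit_normal_model(samples):
    """Fit a per-feature mean and variance to inlier (normal) samples."""
    columns = list(zip(*samples))
    means = [fmean(col) for col in columns]
    # A small floor keeps the score finite when a feature is constant.
    variances = [max(pvariance(col), 1e-8) for col in columns]
    return means, variances


def anomaly_score(x, model):
    """Squared distance from the learned distribution, scaled by variance."""
    means, variances = model
    return sum((xi - m) ** 2 / v for xi, m, v in zip(x, means, variances))


def is_anomaly(x, model, threshold):
    """Classify a test sample as an outlier if its score exceeds the threshold."""
    return anomaly_score(x, model) > threshold


# Hypothetical toy features standing in for healthy training images.
normal_train = [(0.0, 1.0), (0.2, 0.8), (-0.1, 1.1), (0.1, 0.9), (-0.2, 1.2)]
model = fit_normal_model(normal_train)

print(is_anomaly((0.0, 1.0), model, threshold=9.0))   # close to the training data
print(is_anomaly((3.0, -2.0), model, threshold=9.0))  # far from the training data
```

In practice the threshold would be tuned on held-out data, and the features would come from a learned image representation rather than raw coordinates.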
no code implementations • 15 Sep 2020 • Yu Tian, Gaofeng Pan, Mohamed-Slim
Two power allocation strategies are considered: the first one is a general (fixed) power allocation scheme under which we derive the OP and EC of NOMA users in closed form; the other one is an optimal power allocation scheme that can achieve the maximum sum rate for the whole system.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Xin Guo, Yu Tian, Qinghan Xue, Panos Lampropoulos, Steven Eliuk, Kenneth Barner, Xiaolong Wang
Catastrophic forgetting in neural networks refers to the performance degradation of deep learning models on previous tasks while learning new tasks.
no code implementations • IEEE 2020 • Jianyi Liu, Yu Tian, Ru Zhang, Youqiang Sun, Chan Wang
A 77.58% success rate is obtained in the transfer test attack.
no code implementations • 9 Jan 2021 • Yu Tian, Leonardo Zorron Cheng Tao Pu, Yuyuan Liu, Gabriel Maicas, Johan W. Verjans, Alastair D. Burt, Seon Ho Shin, Rajvinder Singh, Gustavo Carneiro
In this paper, we propose and analyse a system that can automatically detect, localise and classify polyps from colonoscopy videos.
2 code implementations • 25 Jan 2021 • Yuanhong Chen, Yu Tian, Guansong Pang, Gustavo Carneiro
The adversarial interpolation is enforced to consistently learn a smooth Gaussian descriptor, even when the training data is small or contaminated with anomalous samples.
Ranked #2 on Anomaly Detection on MNIST (using extra training data)
3 code implementations • ICCV 2021 • Yu Tian, Guansong Pang, Yuanhong Chen, Rajvinder Singh, Johan W. Verjans, Gustavo Carneiro
To address this issue, we introduce a novel and theoretically sound method, named Robust Temporal Feature Magnitude learning (RTFM), which trains a feature magnitude learning function to effectively recognise the positive instances, substantially improving the robustness of the MIL approach to the negative instances from abnormal videos.
Anomaly Detection In Surveillance Videos, Contrastive Learning, +2 more
1 code implementation • 5 Mar 2021 • Fengbei Liu, Yu Tian, Filipe R. Cordeiro, Vasileios Belagiannis, Ian Reid, Gustavo Carneiro
In this paper, we propose Self-supervised Mean Teacher for Semi-supervised (S$^2$MTS$^2$) learning that combines self-supervised mean-teacher pre-training with semi-supervised fine-tuning.
1 code implementation • 5 Mar 2021 • Yu Tian, Guansong Pang, Fengbei Liu, Yuanhong Chen, Seon Ho Shin, Johan W. Verjans, Rajvinder Singh, Gustavo Carneiro
Unsupervised anomaly detection (UAD) learns one-class classifiers exclusively with normal (i.e., healthy) images to detect any abnormal (i.e., unhealthy) samples that do not conform to the expected normal patterns.
Ranked #1 on Anomaly Detection on LAG
1 code implementation • 6 Mar 2021 • Fengbei Liu, Yuanhong Chen, Yu Tian, Yuyuan Liu, Chong Wang, Vasileios Belagiannis, Gustavo Carneiro
In this paper, we propose a new training module called Non-Volatile Unbiased Memory (NVUM), which stores, in a non-volatile manner, a running average of model logits for a new regularization loss on noisy multi-label problems.
Image Classification with Label Noise, Learning with noisy labels, +1 more
no code implementations • 17 Apr 2021 • Jing Xu, Yu Tian, Shuai Yuan, Naijin Liu
In this paper, a noise attention method is proposed for unsupervised spectrum anomaly detection in unauthorized bands.
no code implementations • 30 Apr 2021 • TianHao Li, Yu Tian, Shuai Yuan, Naijin Liu
In this paper, a novel bandwidth negotiation mechanism is proposed for wireless spectrum sharing among massive numbers of devices, in which each device locally negotiates bandwidth usage with neighboring devices and globally optimal spectrum utilization is achieved through distributed decision-making.
1 code implementation • ICLR 2021 • Yu Tian, Jian Ren, Menglei Chai, Kyle Olszewski, Xi Peng, Dimitris N. Metaxas, Sergey Tulyakov
We introduce a motion generator that discovers the desired trajectory, in which content and motion are disentangled.
Ranked #32 on Video Generation on UCF-101
no code implementations • 15 May 2021 • Yu Tian, Shuai Yuan, Weisheng Chen, Naijin Liu
Radio Map Prediction (RMP), which aims at estimating the coverage of radio waves, has been widely recognized as an enabling technology for improving radio spectrum efficiency.
1 code implementation • 26 May 2021 • Yasmin SarcheshmehPour, Yu Tian, Linli Zhang, Alexander Jung
Our main analytic contribution is an upper bound on the deviation between the local model parameters learnt by our algorithm and an oracle-based clustered federated learning method.
1 code implementation • ICCV 2021 • Ligong Han, Martin Renqiang Min, Anastasis Stathopoulos, Yu Tian, Ruijiang Gao, Asim Kadav, Dimitris Metaxas
We then propose an improved cGAN model with Auxiliary Classification that directly aligns the fake and real conditionals $P(\text{class}|\text{image})$ by minimizing their $f$-divergence.
2 code implementations • 3 Sep 2021 • Yu Tian, Fengbei Liu, Guansong Pang, Yuanhong Chen, Yuyuan Liu, Johan W. Verjans, Rajvinder Singh, Gustavo Carneiro
Pre-training UAD methods with self-supervised learning, based on computer vision techniques, can mitigate this challenge, but they are sub-optimal because they do not explore domain knowledge for designing the pretext tasks, and their contrastive learning losses do not try to cluster the normal training images, which may result in a sparse distribution of normal images that is ineffective for anomaly detection.
no code implementations • 22 Sep 2021 • Yu Tian, Chengguang Li, Sen Yang
In this paper, we propose a deep learning model for the Demodulation Reference Signal (DMRS)-based channel estimation task.
no code implementations • 28 Sep 2021 • Yiyu Liu, Qian Liu, Yu Tian, Changping Wang, Yanan Niu, Yang Song, Chenliang Li
In this paper, we propose a novel concept-aware denoising graph neural network (named CONDE) for micro-video recommendation.
no code implementations • 29 Sep 2021 • Yu Tian, Chenwei Wang
We investigate the problem of wireless beam tracking on mmWave bands with the assistance of camera images.
1 code implementation • 17 Oct 2021 • Ligong Han, Sri Harsha Musunuri, Martin Renqiang Min, Ruijiang Gao, Yu Tian, Dimitris Metaxas
StyleGANs have shown impressive results on data generation and manipulation in recent years, thanks to their disentangled style latent space.
3 code implementations • 24 Nov 2021 • Yu Tian, Yuyuan Liu, Guansong Pang, Fengbei Liu, Yuanhong Chen, Gustavo Carneiro
However, previous uncertainty approaches that directly associate high uncertainty with anomalies may sometimes lead to incorrect anomaly predictions, and external reconstruction models tend to be too inefficient for real-time self-driving embedded systems.
Ranked #2 on Anomaly Detection on Lost and Found (using extra training data)
1 code implementation • CVPR 2022 • Yuyuan Liu, Yu Tian, Yuanhong Chen, Fengbei Liu, Vasileios Belagiannis, Gustavo Carneiro
The accurate prediction by this model allows us to use a challenging combination of network, input data and feature perturbations to improve the consistency learning generalisation, where the feature perturbations consist of a new adversarial perturbation.
1 code implementation • CVPR 2022 • Fengbei Liu, Yu Tian, Yuanhong Chen, Yuyuan Liu, Vasileios Belagiannis, Gustavo Carneiro
Effective semi-supervised learning (SSL) in medical image analysis (MIA) must address two challenges: 1) work effectively on both multi-class (e.g., lesion classification) and multi-label (e.g., multiple-disease diagnosis) problems, and 2) handle imbalanced learning (because of the high variance in disease prevalence).
no code implementations • 28 Jan 2022 • Yu Tian, Zhangkai Ni, Baoliang Chen, Shiqi Wang, Hanli Wang, Sam Kwong
However, little work has been dedicated to automatic quality assessment of such GAN-generated face images (GFIs), and even less has been devoted to generalized and robust quality assessment of GFIs generated by unseen GAN models.
2 code implementations • ICCV 2023 • Yuanhong Chen, Fengbei Liu, Hu Wang, Chong Wang, Yu Tian, Yuyuan Liu, Gustavo Carneiro
Deep learning methods have shown outstanding classification accuracy in medical imaging problems, which is largely attributed to the availability of large-scale datasets manually annotated with clean labels.
1 code implementation • 22 Mar 2022 • Yu Tian, Guansong Pang, Yuyuan Liu, Chong Wang, Yuanhong Chen, Fengbei Liu, Rajvinder Singh, Johan W Verjans, Mengyu Wang, Gustavo Carneiro
Our UAD approach, the memory-augmented multi-level cross-attentional masked autoencoder (MemMC-MAE), is a transformer-based approach, consisting of a novel memory-augmented self-attention operator for the encoder and a new multi-level cross-attention operator for the decoder.
1 code implementation • 23 Mar 2022 • Yu Tian, Guansong Pang, Fengbei Liu, Yuyuan Liu, Chong Wang, Yuanhong Chen, Johan W Verjans, Gustavo Carneiro
Current polyp detection methods from colonoscopy videos use exclusively normal (i.e., healthy) training images, which i) ignore the importance of temporal information in consecutive video frames, and ii) lack knowledge about the polyps.
1 code implementation • 28 Mar 2022 • Yuyuan Liu, Yu Tian, Chong Wang, Yuanhong Chen, Fengbei Liu, Vasileios Belagiannis, Gustavo Carneiro
The most successful SSL approaches are based on consistency learning that minimises the distance between model responses obtained from perturbed views of the unlabelled data.
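To make the consistency-learning idea in the snippet above concrete, the following minimal sketch computes a consistency loss as the mean squared distance between a toy model's outputs on two randomly perturbed views of each unlabelled sample. The linear-softmax model, the Gaussian input noise, and all names and values here are assumptions for illustration, not the paper's architecture or perturbations.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_model(x, weights):
    """Hypothetical classifier: one linear layer followed by softmax."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(logits)


def perturb(x, rng, sigma=0.1):
    """Create a perturbed view by adding Gaussian noise to the input."""
    return [xi + rng.gauss(0.0, sigma) for xi in x]


def consistency_loss(batch, weights, rng, sigma=0.1):
    """Mean squared distance between responses to two perturbed views."""
    total = 0.0
    for x in batch:
        p1 = toy_model(perturb(x, rng, sigma), weights)
        p2 = toy_model(perturb(x, rng, sigma), weights)
        total += sum((a - b) ** 2 for a, b in zip(p1, p2)) / len(p1)
    return total / len(batch)


rng = random.Random(0)
weights = [[1.0, -1.0], [-0.5, 0.5], [0.2, 0.3]]    # 3 classes, 2 features
unlabelled = [[0.5, 1.5], [-1.0, 0.3], [2.0, -0.7]]  # no labels are needed
loss = consistency_loss(unlabelled, weights, rng)
print(loss)
```

Because the loss uses no labels, it can be added to a supervised loss on the labelled subset; training would then minimise both terms with respect to the model weights.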
1 code implementation • 3 May 2022 • Yu Tian, Jianxin Chang, Yannan Niu, Yang Song, Chenliang Li
Specifically, multi-interest methods such as ComiRec and MIMN focus on extracting different interests for a user by performing historical item clustering, while graph convolution methods including TGSRec and SURGE elect to refine user preferences based on multi-level correlations between historical items.
1 code implementation • 20 Jul 2022 • Yuxiao Chen, Long Zhao, Jianbo Yuan, Yu Tian, Zhaoyang Xia, Shijie Geng, Ligong Han, Dimitris N. Metaxas
Despite the success of fully-supervised human skeleton sequence modeling, utilizing self-supervised pre-training for skeleton sequence representation learning has been an active field because acquiring task-specific skeleton annotations at large scales is difficult.
no code implementations • COLING 2022 • Zhaoye Fei, Yu Tian, Yongkang Wu, Xinyu Zhang, Yutao Zhu, Zheng Liu, Jiawen Wu, Dejiang Kong, Ruofei Lai, Zhao Cao, Zhicheng Dou, Xipeng Qiu
Our experiments on 13 benchmark datasets across five natural language understanding tasks demonstrate the superiority of our method.
1 code implementation • 2 Sep 2022 • Min Shi, Anagha Lokhande, Mojtaba S. Fazli, Vishal Sharma, Yu Tian, Yan Luo, Louis R. Pasquale, Tobias Elze, Michael V. Boland, Nazlee Zebardast, David S. Friedman, Lucy Q. Shen, Mengyu Wang
Ophthalmic images and derivatives such as the retinal nerve fiber layer (RNFL) thickness map are crucial for detecting and monitoring ophthalmic diseases (e.g., glaucoma).
no code implementations • 13 Sep 2022 • Yu Tian, Zhangkai Ni, Baoliang Chen, Shurun Wang, Shiqi Wang, Hanli Wang, Sam Kwong
In particular, in order to maximize redundancy removal without impairing robust identity information, we apply the encoder with multiple feature extraction and attention-based feature decomposition modules to progressively decompose face features into two uncorrelated components, i.e., identity and residual features, via self-supervised learning.
1 code implementation • 21 Sep 2022 • Yuanhong Chen, Hu Wang, Chong Wang, Yu Tian, Fengbei Liu, Michael Elliott, Davis J. McCarthy, Helen Frazer, Gustavo Carneiro
When analysing screening mammograms, radiologists can naturally process information across two ipsilateral views of each breast, namely the cranio-caudal (CC) and mediolateral-oblique (MLO) views.
no code implementations • 26 Sep 2022 • Chong Wang, Yuanhong Chen, Yuyuan Liu, Yu Tian, Fengbei Liu, Davis J. McCarthy, Michael Elliott, Helen Frazer, Gustavo Carneiro
On the other hand, prototype-based models improve interpretability by associating predictions with training image prototypes, but they are less accurate than global models and their prototypes tend to have poor diversity.
1 code implementation • ICCV 2023 • Yuyuan Liu, Choubo Ding, Yu Tian, Guansong Pang, Vasileios Belagiannis, Ian Reid, Gustavo Carneiro
Semantic segmentation models classify pixels into a set of known ("in-distribution") visual classes.
Ranked #1 on Anomaly Detection on Fishyscapes (using extra training data)
1 code implementation • ICCV 2023 • Chong Wang, Yuyuan Liu, Yuanhong Chen, Fengbei Liu, Yu Tian, Davis J. McCarthy, Helen Frazer, Gustavo Carneiro
Prototypical part network (ProtoPNet) methods have been designed to achieve interpretable classification by associating predictions with a set of training prototypes, which we refer to as trivial prototypes because they are trained to lie far from the classification boundary in the feature space.
Explainable Artificial Intelligence (XAI), Image Classification, +1 more
no code implementations • 31 Jan 2023 • Yuanhong Chen, Yuyuan Liu, Chong Wang, Michael Elliott, Chun Fung Kwok, Carlos Pena-Solorzano, Yu Tian, Fengbei Liu, Helen Frazer, Davis J. McCarthy, Gustavo Carneiro
Given the large size of such datasets, researchers usually face a dilemma with the weakly annotated subset: to not use it or to fully annotate it.
1 code implementation • 6 Mar 2023 • Shijie Geng, Jianbo Yuan, Yu Tian, Yuxiao Chen, Yongfeng Zhang
The success of large-scale contrastive vision-language pretraining (CLIP) has benefited both visual recognition and multimodal content understanding.
1 code implementation • CVPR 2023 • Yuxiao Chen, Jianbo Yuan, Yu Tian, Shijie Geng, Xinyu Li, Ding Zhou, Dimitris N. Metaxas, Hongxia Yang
However, directly aligning cross-modal information using such representations is challenging, as visual patches and text tokens differ in semantic levels and granularities.
no code implementations • 6 Apr 2023 • Xin Hu, Yu Tian, Keisuke Nagato, Masayuki Nakao, Ang Liu
Recent advancements in Natural Language Processing have opened up new possibilities for the development of large language models like ChatGPT, which can facilitate knowledge management in the design process by providing designers with access to a vast array of relevant information.
no code implementations • 8 May 2023 • Jiguang He, Aymen Fakhreddine, Arthur S. de Sena, Yu Tian, Merouane Debbah
Reconfigurable intelligent surfaces (RISs) bring various benefits to the current and upcoming wireless networks, including enhanced spectrum and energy efficiency, soft handover, transmission reliability, and even localization accuracy.
1 code implementation • 29 May 2023 • Jinan Zou, Maihao Guo, Yu Tian, YuHao Lin, Haiyao Cao, Lingqiao Liu, Ehsan Abbasnejad, Javen Qinfeng Shi
Identifying unexpected domain-shifted instances in natural language processing is crucial in real-world applications.
1 code implementation • 15 Jun 2023 • Yan Luo, Yu Tian, Min Shi, Louis R. Pasquale, Lucy Q. Shen, Nazlee Zebardast, Tobias Elze, Mengyu Wang
To address this gap, we introduce Harvard Glaucoma Fairness (Harvard-GF), a retinal nerve disease dataset with both 2D and 3D imaging data and balanced racial groups for glaucoma detection.
no code implementations • 17 Jun 2023 • Lina Bariah, Qiyang Zhao, Hang Zou, Yu Tian, Faouzi Bader, Merouane Debbah
To be specific, large GenAI models are envisioned to open up a new era of autonomous wireless networks, in which multi-modal GenAI models trained over various Telecom data can be fine-tuned to perform several downstream tasks, eliminating the need for building and training dedicated AI models for each specific task and paving the way for the realization of artificial general intelligence (AGI)-empowered wireless networks.
no code implementations • 29 Jun 2023 • Yu Tian, Bofang Li, Si Chen, Xubin Li, Hongbo Deng, Jian Xu, Bo Zheng, Qian Wang, Chenliang Li
Recently, Multi-Scenario Learning (MSL) has been widely used in industrial recommendation and retrieval systems because it facilitates transfer learning across different scenarios, mitigating data sparsity and reducing maintenance cost.
no code implementations • 4 Jul 2023 • Yu Tian, Renaud Lambiotte
Here, we focus on the case when the weight matrix is Hermitian, a reasonable assumption in many applications, and investigate both structural and dynamical properties of the complex-weighted networks.
no code implementations • 19 Jul 2023 • Yu Tian, Zachary Lubberts, Melanie Weber
We consider several discrete curvature notions and analyze the utility of the resulting algorithms.
1 code implementation • ICCV 2023 • Cheng-En Wu, Yu Tian, Haichao Yu, Heng Wang, Pedro Morgado, Yu Hen Hu, Linjie Yang
Vision-language models such as CLIP learn a generic text-image embedding from large-scale training data.
no code implementations • ICCV 2023 • Yan Luo, Min Shi, Yu Tian, Tobias Elze, Mengyu Wang
This is the largest glaucoma detection dataset with 3D OCT imaging data and the first glaucoma progression forecasting dataset that is publicly available.
1 code implementation • 21 Sep 2023 • Yu Tian, Qiyang Zhao, Zine el abidine Kherroubi, Fouzi Boukhalfa, Kebin Wu, Faouzi Bader
Wireless communications at high-frequency bands with large antenna arrays face challenges in beam management, which can potentially be improved by multimodality sensing information from cameras, LiDAR, radar, and GPS.
1 code implementation • 21 Sep 2023 • Yinpeng Dong, Huanran Chen, Jiawei Chen, Zhengwei Fang, Xiao Yang, Yichi Zhang, Yu Tian, Hang Su, Jun Zhu
By attacking white-box surrogate vision encoders or MLLMs, the generated adversarial examples can mislead Bard to output wrong image descriptions with a 22% success rate based solely on the transferability.
no code implementations • 27 Sep 2023 • Haichao Yu, Yu Tian, Sateesh Kumar, Linjie Yang, Heng Wang
DataComp is a new benchmark dedicated to evaluating different methods for data filtering.
no code implementations • 3 Oct 2023 • Yan Luo, Yu Tian, Min Shi, Tobias Elze, Mengyu Wang
Our FIS approach is compared with various state-of-the-art fairness learning methods and achieves superior performance in the racial, gender, and ethnicity fairness tasks with 2D and 3D imaging data, which demonstrates the utility of our Harvard-EF dataset for fairness learning.
1 code implementation • 19 Oct 2023 • Jiawen Zhu, Choubo Ding, Yu Tian, Guansong Pang
Extensive experiments on nine real-world anomaly detection datasets show that AHL can 1) substantially enhance different state-of-the-art OSAD models in detecting seen and unseen anomalies, and 2) effectively generalize to unseen anomalies in new domains.
3 code implementations • 29 Oct 2023 • Qihang Zhou, Guansong Pang, Yu Tian, Shibo He, Jiming Chen
It is a crucial task when training data is not accessible due to various concerns, e.g., data privacy, yet it is challenging since the models need to generalize to anomalies across different domains where the appearance of foreground objects, abnormal regions, and background features, such as defects/tumors on different products/organs, can vary significantly.
1 code implementation • 3 Nov 2023 • Yu Tian, Min Shi, Yan Luo, Ava Kouhana, Tobias Elze, Mengyu Wang
Existing medical fairness datasets are all for classification tasks, and no fairness datasets are available for medical segmentation, even though medical segmentation is a clinical task as important as classification and can provide detailed spatial information on organ abnormalities ready to be assessed by clinicians.
1 code implementation • 16 Nov 2023 • Chris Kelly, Luhui Hu, Cindy Yang, Yu Tian, Deshun Yang, Bang Yang, Zaoshan Huang, Zihao Li, Yuexian Zou
In the current landscape of artificial intelligence, foundation models serve as the bedrock for advancements in both language and vision domains.
1 code implementation • 20 Nov 2023 • Yu Tian, Xiao Yang, Jingyuan Zhang, Yinpeng Dong, Hang Su
Rapid advancements in large language models (LLMs) have revitalized interest in LLM-based agents, which exhibit impressive human-like behaviors and cooperative capabilities in various scenarios.
no code implementations • 4 Feb 2024 • Linnéa Gyllingberg, Yu Tian, David J. T. Sumpter
We then show that in an oscillatory environment our model builds efficient solutions, provided the environmental oscillations are sufficiently out of phase.
no code implementations • 8 Feb 2024 • Yu Tian, Ahmed Alhammadi, Abdullah Quran, Abubakar Sani Ali
In this paper, we address the intricate issue of RF signal separation by presenting a novel adaptation of the WaveNet architecture that introduces learnable dilation parameters, significantly enhancing signal separation in dense RF spectrums.
no code implementations • 23 Feb 2024 • Yu Tian, Xiao Yang, Yinpeng Dong, Heming Yang, Hang Su, Jun Zhu
It allows users to design specific prompts to generate realistic images through some black-box APIs.
no code implementations • 26 Feb 2024 • Hang Zou, Qiyang Zhao, Lina Bariah, Yu Tian, Mehdi Bennis, Samson Lasaulce, Merouane Debbah, Faouzi Bader
Connecting GenAI agents over a wireless network can potentially unleash the power of collective intelligence and pave the way for artificial general intelligence (AGI).
no code implementations • 5 Mar 2024 • Weizhi Wang, Khalil Mrini, Linjie Yang, Sateesh Kumar, Yu Tian, Xifeng Yan, Heng Wang
Our MLM filter can generalize to different models and tasks, and be used as a drop-in replacement for CLIPScore.
no code implementations • 10 Mar 2024 • Deshun Yang, Luhui Hu, Yu Tian, Zihao Li, Chris Kelly, Bang Yang, Cindy Yang, Yuexian Zou
Several text-to-video diffusion models have demonstrated commendable capabilities in synthesizing high-quality video content.
no code implementations • 14 Mar 2024 • Chris Kelly, Luhui Hu, Jiayin Hu, Yu Tian, Deshun Yang, Bang Yang, Cindy Yang, Zihao Li, Zaoshan Huang, Yuexian Zou
It seamlessly integrates various SOTA vision models, automates their selection, identifies the suitable 3D mesh creation algorithms corresponding to 2D depth map analysis, and generates optimal results based on diverse multimodal inputs such as text prompts.
no code implementations • 14 Mar 2024 • Chris Kelly, Luhui Hu, Bang Yang, Yu Tian, Deshun Yang, Cindy Yang, Zaoshan Huang, Zihao Li, Jiayin Hu, Yuexian Zou
With the emergence of large language models (LLMs) and vision foundation models, how to combine the intelligence and capacity of these open-sourced or API-available models to achieve open-world visual perception remains an open question.