In Chinese, the derivation may be marked either with the standard adverbial marker DI, or the non-standard marker DE.
Weakly supervised semantic segmentation with image-level labels has attracted a lot of attention recently because these labels are already available in most datasets.
As the fundamental basis of sponsored search, relevance modeling has attracted increasing attention due to its tremendous practical value.
First, to solve the problem of codec inconsistency caused by the uncertainty of floating-point calculations across platforms, we design a calibration transmitting system to guarantee consistent quantization of entropy parameters between the encoding and decoding stages.
A practical solution to this problem would be to utilize the available multimodal large language models (MLLMs) to generate instruction data for vision-language tasks.
We thus introduce a dynamic gating network on top of the low-rank adaptation method to decide which decoder layers should employ adaptation.
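The gated low-rank adaptation idea above can be sketched numerically. The following is a minimal numpy toy, not the paper's implementation: the layer sizes, the per-layer linear scorer `g_w`, and the sigmoid gate are all assumptions for illustration; the point is only that each decoder layer's low-rank update `B @ A` is scaled (or effectively switched off) by a learned gate.

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, n_layers = 32, 4, 6   # hidden size, low-rank rank, decoder layers (assumed)

# Frozen base weights plus a low-rank update (B @ A) per decoder layer.
W = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(n_layers)]
A = [rng.normal(size=(r, d)) * 0.01 for _ in range(n_layers)]
B = [np.zeros((d, r)) for _ in range(n_layers)]  # zero-init, standard for LoRA

# Hypothetical gating network: one linear scorer per layer; a sigmoid maps
# its score to a gate in (0, 1) that scales that layer's adaptation.
g_w = rng.normal(size=(n_layers, d)) * 0.1

def forward(h):
    for l in range(n_layers):
        gate = 1.0 / (1.0 + np.exp(-(g_w[l] @ h)))   # scalar gate in (0, 1)
        h = W[l] @ h + gate * (B[l] @ (A[l] @ h))    # gated low-rank update
    return h

out = forward(rng.normal(size=d))
```

In a real model the gate would be trained jointly with `A` and `B`, and a hard (e.g. thresholded or top-k) gate would skip the adaptation entirely on layers where it is not needed.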
Despite the simplicity of our method, an IP-Adapter with only 22M parameters can achieve performance comparable to or even better than a fully fine-tuned image prompt model.
To address these issues, we propose RLTF, i.e., Reinforcement Learning from Unit Test Feedback, a novel online RL framework that uses multi-granularity unit test feedback to refine code LLMs.
Specifically, we first equip the diffusion model with 3D awareness by leveraging landmark-based control and a learned textual embedding representing the back-view appearance of heads, enabling 3D-consistent head avatar generation.
Ensuring fairness in anomaly detection models has received much attention recently as many anomaly detection applications involve human beings.
In the fashion domain, there exists a variety of vision-and-language (V+L) tasks, including cross-modal retrieval, text-guided image retrieval, multi-modal classification, and image captioning.
Therefore, the ranking stage is still essential for most applications to provide a high-quality candidate set for the re-ranking stage.
Finally, FCL brings a robust, accurate, low-cost AI training model to biomedical research, effectively protecting medical data privacy.
In this paper, we propose SofaNet, a novel cross-center collaborative learning framework guided by medical knowledge, to achieve early recognition of sepsis, a common disease among ICU patients.
To improve the information-sharing capability and innovation of various healthcare-related institutions, and thereby establish a next-generation open medical collaboration network, we propose VFedTrans, a unified vertical federated knowledge transfer framework based on a novel cross-hospital representation distillation component.
The OSrisk model for the prediction of 5-year survival status achieved an AUC of 0.784 (0.746-0.819) in the TCGA cohort, which was further verified in the independent General cohort and the CPTAC cohort, with AUCs of 0.774 (0.723-0.820) and 0.702 (0.632-0.765), respectively.
Intuitively, the poor performance of these clients may stem from the biased universal information shared with others.
We propose KnowledgeDA, a unified domain language model development service to enhance the task-specific training procedure with domain knowledge graphs.
After that, we further propose an anomaly mitigation approach that recommends mitigation actions on abnormal features to revert abnormal outcomes, such that the counterfactuals guided by the causal mechanism are normal.
Most of the existing methods rely on a multiple instance learning framework that requires densely sampling local patches at high magnification.
Nuclear magnetic resonance (NMR) spectroscopy has become a formidable tool for biochemistry and medicine.
LiDAR can capture accurate depth information in large-scale scenarios unaffected by lighting conditions, and the captured point cloud contains gait-related 3D geometric properties and dynamic motion characteristics.
Bilingual lexicon induction derives word translations by aligning independently trained word embeddings in two languages.
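A standard way to perform such an alignment is orthogonal Procrustes: given seed pairs with source vectors X and target vectors Y, find the orthogonal map W minimizing ||XW − Y||_F, which has a closed-form solution via the SVD of XᵀY. The toy data below is synthetic (a random rotation), an assumption for illustration only; real inputs would be pretrained monolingual embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))               # source-language seed-word vectors
Q_true, _ = np.linalg.qr(rng.normal(size=(50, 50)))
Y = X @ Q_true                               # toy target vectors: exact rotation of X

# Orthogonal Procrustes: W = argmin ||XW - Y||_F  subject to  W^T W = I,
# solved in closed form from the SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# Word translation is then nearest-neighbour search of X @ W against Y.
aligned = X @ W
err = np.linalg.norm(aligned - Y)
```

On this noiseless toy problem the recovered `W` matches the true rotation, so the alignment error is essentially zero; with real embeddings the mapping is only approximate and translations are read off by cosine nearest neighbours.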
High-content image-based assays are commonly leveraged to identify the phenotypic impact of genetic perturbations in biology.
By simulating the attack mechanism as the safety test, SafeCompress can automatically compress a big model to a small one following the dynamic sparse training paradigm.
A new deep neural network based on the WaveNet architecture (WNN) is presented, designed to capture specific patterns in NMR spectra.
Large-scale weakly supervised product retrieval is a practically useful yet computationally challenging problem.
We thus propose a Multi-View Contrastive Learning task that pulls the visual representation of one image closer to the compositional multimodal representation of another image-text pair.
Basically, we propose to perturb the original network by adding or removing links, expecting that the embedding generated on the perturbed network leaks little information about private links while retaining high utility for various downstream tasks.
The MC-TMB algorithm also exhibited good generalization on the external validation cohort, with an AUC of 0.732 (0.683-0.761), and outperformed other methods.
In this paper, we propose a hierarchical global-to-local clustering strategy to build a Node-Aligned GCN (NAGCN) to represent WSI with rich local structural information as well as global distribution.
Federated learning (FL) is a promising machine learning paradigm that enables cross-party data collaboration for real-world AI applications in a privacy-preserving and law-regulated way.
We then design a novel hybrid protection mechanism called HyObscure to cross-iteratively optimize the generalization and obfuscation operations for maximum privacy protection under a given utility guarantee.
In this paper, we propose Label Hierarchy Transition, a unified probabilistic framework based on deep learning, to address hierarchical classification.
Labels are costly and sometimes unreliable.
First, to fully utilize the existing small-scale benchmarking datasets for more discriminative feature learning, we introduce a cross-modal momentum contrastive learning framework to enrich the training data for a given mini-batch.
LCM learns label confusion to capture semantic overlap among labels by calculating the similarity between instances and labels during training, and generates a better label distribution to replace the original one-hot label vector, thus improving the final classification performance.
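The label-distribution construction described above can be sketched as follows. This is a minimal numpy illustration, not the paper's exact formulation: the label embedding matrix, the softmax over instance-label similarities, and the mixing weight `alpha` are assumed ingredients that stand in for the learned components.

```python
import numpy as np

def label_confusion_distribution(instance_vec, label_embs, onehot, alpha=4.0):
    """Hypothetical sketch of the LCM idea: mix the one-hot target with a
    similarity-derived distribution over labels."""
    sims = label_embs @ instance_vec          # instance-label similarity scores
    conf = np.exp(sims - sims.max())
    conf /= conf.sum()                        # softmax -> simulated label distribution
    mixed = onehot * alpha + conf             # alpha keeps the true label dominant
    return mixed / mixed.sum()                # renormalize to a distribution

rng = np.random.default_rng(2)
label_embs = rng.normal(size=(5, 16))         # 5 labels, 16-dim embeddings (toy)
x = rng.normal(size=16)                       # instance representation (toy)
y = np.zeros(5); y[2] = 1.0                   # one-hot ground truth, class 2
dist = label_confusion_distribution(x, label_embs, y)
```

Training then minimizes a divergence (e.g. KL) between the model's predicted distribution and `dist` instead of the hard one-hot target, so semantically close labels are not penalized as harshly as unrelated ones.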
In particular, we first propose a federated crowdsensing framework, which analyzes the privacy concerns of each crowdsensing stage (i.e., task creation, task assignment, task execution, and data aggregation) and discusses how federated learning techniques may take effect.
Recently, artificial intelligence (AI) has been used in the diagnosis of various diseases to improve diagnostic accuracy and reliability, but the interpretation of diagnosis results remains an open problem.
Due to their stringent requirements, supporting these applications over wireless local area networks (WLANs) is far beyond the capabilities of the new WLAN standard, IEEE 802.11ax.
We present Covidex, a search engine that exploits the latest neural ranking models to provide information access to the COVID-19 Open Research Dataset curated by the Allen Institute for AI.
Deep learning demands a huge amount of well-labeled data to train the network parameters.
Both tests are Hotelling-type statistics based on the rows of empirical eigenvectors or their ratios, whose asymptotic covariance matrices are very challenging to derive and estimate.
In one stage, the user and item are represented from multiple perspectives, and in each perspective the representations of the user and item attend to each other.
Liver lesion segmentation is an important step for liver cancer diagnosis, treatment planning and treatment evaluation.
Applying a trained model to generate a complete sCT volume for each new patient MR image only took 9 s, which was much faster than the atlas-based approach.