In this paper, we propose a novel DEC model, which we named the deep embedded clustering model with cluster-level representation learning (DECCRL) to jointly learn cluster and instance level representations.
Recent advancements in data-driven task-oriented dialogue systems (ToDs) struggle with incremental learning due to computational constraints and time-consuming issues.
Our method is verified on these datasets, and experimental results exhibit that the VIS-Net can significantly outperform existing state-of-the-art referring segmentation methods.
Recent studies have demonstrated promising performance of ChatGPT and GPT-4 on several medical domain tasks.
no code implementations • 11 Jul 2023 • Yinghao Ma, Ruibin Yuan, Yizhi Li, Ge Zhang, Xingran Chen, Hanzhi Yin, Chenghua Lin, Emmanouil Benetos, Anton Ragni, Norbert Gyenge, Ruibo Liu, Gus Xia, Roger Dannenberg, Yike Guo, Jie Fu
Our findings suggest that training with music data can generally improve performance on MIR tasks, even when models are trained using paradigms designed for speech.
no code implementations • 29 Jun 2023 • Le Zhuo, Ruibin Yuan, Jiahao Pan, Yinghao Ma, Yizhi Li, Ge Zhang, Si Liu, Roger Dannenberg, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenhu Chen, Wei Xue, Yike Guo
We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic lyrics transcription method achieving state-of-the-art performance on various lyrics transcription datasets, even in challenging genres such as rock and metal.
no code implementations • 18 Jun 2023 • Ruibin Yuan, Yinghao Ma, Yizhi Li, Ge Zhang, Xingran Chen, Hanzhi Yin, Le Zhuo, Yiqi Liu, Jiawen Huang, Zeyue Tian, Binyue Deng, Ningzhi Wang, Chenghua Lin, Emmanouil Benetos, Anton Ragni, Norbert Gyenge, Roger Dannenberg, Wenhu Chen, Gus Xia, Wei Xue, Si Liu, Shi Wang, Ruibo Liu, Yike Guo, Jie Fu
This is evident in the limited work on deep music representations, the scarcity of large-scale datasets, and the absence of a universal and community-driven benchmark.
1 code implementation • 31 May 2023 • Yizhi Li, Ruibin Yuan, Ge Zhang, Yinghao Ma, Xingran Chen, Hanzhi Yin, Chenghua Lin, Anton Ragni, Emmanouil Benetos, Norbert Gyenge, Roger Dannenberg, Ruibo Liu, Wenhu Chen, Gus Xia, Yemin Shi, Wenhao Huang, Yike Guo, Jie Fu
To address this research gap, we propose an acoustic Music undERstanding model with large-scale self-supervised Training (MERT), which incorporates teacher models to provide pseudo labels in the masked language modelling (MLM) style acoustic pre-training.
no code implementations • 22 May 2023 • Zekun Wang, Ge Zhang, Kexin Yang, Ning Shi, Wangchunshu Zhou, Shaochun Hao, Guangzheng Xiong, Yizhi Li, Mong Yuan Sim, Xiuying Chen, Qingqing Zhu, Zhenzhu Yang, Adam Nik, Qi Liu, Chenghua Lin, Shi Wang, Ruibo Liu, Wenhu Chen, Ke Xu, Dayiheng Liu, Yike Guo, Jie Fu
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP, aimed at addressing limitations in existing frameworks while aligning with the ultimate goals of artificial intelligence.
Since expert knowledge is hard to acquire, it hinders the flexibility to quickly design and tune digital synthesizers for diverse sounds.
Stagnant weather condition is one of the major contributors to air pollution as it is favorable for the formation and accumulation of pollutants.
In this paper, we propose a "Co"nsistency "Mo"del-based "Speech" synthesis method, CoMoSpeech, which achieve speech synthesis through a single diffusion sampling step while achieving high audio quality.
no code implementations • 18 Mar 2023 • Sibo Cheng, Cesar Quilodran-Casas, Said Ouala, Alban Farchi, Che Liu, Pierre Tandeo, Ronan Fablet, Didier Lucor, Bertrand Iooss, Julien Brajard, Dunhui Xiao, Tijana Janjic, Weiping Ding, Yike Guo, Alberto Carrassi, Marc Bocquet, Rossella Arcucci
Data Assimilation (DA) and Uncertainty quantification (UQ) are extensively used in analysing and reducing error propagation in high-dimensional spatial-temporal dynamics.
Recent research is revealing how cognitive processes are supported by a complex interplay between the brain and the rest of the body, which can be investigated by the analysis of physiological features such as breathing rhythms, heart rate, and skin conductance.
This report presents a comprehensive view of our vision on the development path of the human-machine symbiotic art creation.
In order to demonstrate the proposed model's capability in dealing with severe data loss scenarios, we contribute a high-accuracy and challenging motion capture dataset of multi-person interactions with severe occlusion.
We evaluate the framework on two different brain image analysis tasks, namely brain tumour segmentation and whole brain segmentation.
Structured (tabular) data in the preclinical and clinical domains contains valuable information about individuals and an efficient table-to-text summarization system can drastically reduce manual efforts to condense this data into reports.
This paper proposes a scalable workflow which leverages both structured data and unstructured textual notes from EHRs with techniques including NLP, AutoML and Clinician-in-the-Loop mechanism to build machine learning classifiers to identify patients at scale with given diseases, especially those who might currently be miscoded or missed by ICD codes.
Extracting phenotypes from clinical text has been shown to be useful for a variety of clinical use cases such as identifying patients with rare diseases.
To tackle this issue, we introduce a simple BatchNorm variation with bounded scaling parameters, based on which we design a novel regularisation term that suppresses only neurons with low importance.
Thirdly, both label-dependent and event-guided representations are integrated to make a robust prediction, in which the interpretability is enabled by the attention weights over words from medical notes.
In this paper, we propose a label dependent attention model LDAM to 1) improve the interpretability by exploiting Clinical-BERT (a biomedical language model pre-trained on a large clinical corpus) to encode biomedically meaningful features and labels jointly; 2) extend the idea of joint embedding to the processing of time-series data, and develop a multi-modal learning framework for integrating heterogeneous information from medical notes and time-series health status indicators.
1 code implementation • 19 Dec 2021 • Raghav Mehta, Angelos Filos, Ujjwal Baid, Chiharu Sako, Richard McKinley, Michael Rebsamen, Katrin Datwyler, Raphael Meier, Piotr Radojewski, Gowtham Krishnan Murugesan, Sahil Nalawade, Chandan Ganesh, Ben Wagner, Fang F. Yu, Baowei Fei, Ananth J. Madhuranthakam, Joseph A. Maldjian, Laura Daza, Catalina Gomez, Pablo Arbelaez, Chengliang Dai, Shuo Wang, Hadrien Reynaud, Yuan-han Mo, Elsa Angelini, Yike Guo, Wenjia Bai, Subhashis Banerjee, Lin-min Pei, Murat AK, Sarahi Rosas-Gonzalez, Ilyess Zemmoura, Clovis Tauber, Minh H. Vu, Tufve Nyholm, Tommy Lofstedt, Laura Mora Ballestar, Veronica Vilaplana, Hugh McHugh, Gonzalo Maso Talou, Alan Wang, Jay Patel, Ken Chang, Katharina Hoebel, Mishka Gidwani, Nishanth Arun, Sharut Gupta, Mehak Aggarwal, Praveer Singh, Elizabeth R. Gerstner, Jayashree Kalpathy-Cramer, Nicolas Boutry, Alexis Huard, Lasitha Vidyaratne, Md Monibor Rahman, Khan M. Iftekharuddin, Joseph Chazalon, Elodie Puybareau, Guillaume Tochon, Jun Ma, Mariano Cabezas, Xavier Llado, Arnau Oliver, Liliana Valencia, Sergi Valverde, Mehdi Amian, Mohammadreza Soltaninejad, Andriy Myronenko, Ali Hatamizadeh, Xue Feng, Quan Dou, Nicholas Tustison, Craig Meyer, Nisarg A. Shah, Sanjay Talbar, Marc-Andre Weber, Abhishek Mahajan, Andras Jakab, Roland Wiest, Hassan M. Fathallah-Shaykh, Arash Nazeri, Mikhail Milchenko1, Daniel Marcus, Aikaterini Kotrotsou, Rivka Colen, John Freymann, Justin Kirby, Christos Davatzikos, Bjoern Menze, Spyridon Bakas, Yarin Gal, Tal Arbel
In this study, we explore and evaluate a score developed during the BraTS 2019 and BraTS 2020 task on uncertainty quantification (QU-BraTS) and designed to assess and rank uncertainty estimates for brain tumor multi-compartment segmentation.
With the rapid development of high-throughput experimental technologies, different types of omics (e. g., genomics, epigenomics, transcriptomics, proteomics, and metabolomics) data can be produced from clinical samples.
Contextualised word embeddings is a powerful tool to detect contextual synonyms.
We propose the automatic annotation of phenotypes from clinical notes as a method to capture essential information, which is complementary to typically used vital signs and laboratory test results, to predict outcomes in the Intensive Care Unit (ICU).
In cardiac magnetic resonance (CMR) imaging, a 3D high-resolution segmentation of the heart is essential for detailed description of its anatomical structures.
To the best of our knowledge, XOmiVAE is one of the first activation level-based interpretable deep learning models explaining novel clusters generated by VAE.
Our two-step method integrates a Principal Components Analysis (PCA) based adversarial autoencoder (PC-AAE) with adversarial Long short-term memory (LSTM) networks.
Epidemiology models play a key role in understanding and responding to the COVID-19 pandemic.
To modify a design semantic of a given product from personalised brain activity via adversarial learning, in this work, we propose a deep generative transformation model to modify product semantics from the brain signal.
A recurrent neural network is used as the encoder to learn latent representation from electroencephalogram (EEG) signals, recorded while subjects looked at 50 categories of images.
For this reason, we propose an end-to-end brain decoding framework which translates brain activity into an image by latent space alignment.
Here we introduce two digital twins of a SEIRS model applied to an idealised town.
To tackle this problem and pave the way for machine learning aided precision medicine, we proposed a unified multi-task deep learning framework named OmiEmbed to capture biomedical information from high-dimensional omics data with the deep embedding and downstream task modules.
Blockchain technology has been envisaged to commence an era of decentralised applications and services (DApps) without the need for a trusted intermediary.
Cryptography and Security Distributed, Parallel, and Cluster Computing
This adversarially trained LSTM-based approach is used on the ROM in order to produce faster forecasts of the air pollution tracer.
Furthermore, in the era of the Internet of Things and big data in which data is essentially distributed, transferring a vast amount of data to a data centre for processing seems to be a cumbersome solution.
Machine learning has been widely adopted for medical image analysis in recent years given its promising performance in image segmentation and classification tasks.
Our approach provides a real-time and model-agnostic quality control for cardiac MRI segmentation, which has the potential to be integrated into clinical image analysis workflows.
Supervised deep learning requires a large amount of training samples with annotations (e. g. label class for classification task, pixel- or voxel-wised label map for segmentation tasks), which are expensive and time-consuming to obtain.
Supervised deep learning for medical imaging analysis requires a large amount of training samples with annotations (e. g. label class for classification task, pixel- or voxel-wised label map for medical segmentation tasks), which are expensive and time-consuming to obtain.
Deep neural networks are a promising approach towards multi-task learning because of their capability to leverage knowledge across domains and learn general purpose representations.
Deep reinforcement learning requires a heavy price in terms of sample efficiency and overparameterization in the neural networks used for function approximation.
Gliomas are the most common malignant brain tumourswith intrinsic heterogeneity.
The extraction of phenotype information which is naturally contained in electronic health records (EHRs) has been found to be useful in various clinical informatics applications such as disease diagnosis.
Deep reinforcement learning requires a heavy price in terms of sample efficiency and overparameterization in the neural networks used for function approximation.
Brain MR image segmentation is a key task in neuroimaging studies.
The training procedure of OmiVAE is comprised of an unsupervised phase without the classifier and a supervised phase with the classifier.
1 code implementation • • Yimin Wang, Qi Li, Li-Juan Liu, Zhi Zhou, Zongcai Ruan, Lingsheng Kong, Yaoyao Li, Yun Wang, Ning Zhong, Renjie Chai, Xiangfeng Luo, Yike Guo, Michael Hawrylycz, Qingming Luo, Zhongze Gu, Wei Xie, Hongkui Zeng, Hanchuan Peng
Neuron morphology is recognized as a key determinant of cell type, yet the quantitative profiling of a mammalian neuron’s complete three-dimensional (3-D) morphology remains arduous when the neuron has complex arborization and long projection.
In the recent years, convolutional neural networks have transformed the field of medical image analysis due to their capacity to learn discriminative image features for a variety of classification and regression tasks.
Recent seminal work at the intersection of deep neural networks practice and random matrix theory has linked the convergence speed and robustness of these networks with the combination of random weight initialization and nonlinear activation function in use.
Insufficient or even unavailable training data of emerging classes is a big challenge of many classification tasks, including text classification.
The role of $L^2$ regularization, in the specific case of deep neural networks rather than more traditional machine learning models, is still not fully elucidated.
Predicting traffic conditions from online route queries is a challenging task as there are many complicated interactions over the roads and crowds involved.
Ranked #1 on Traffic Prediction on Q-Traffic
Bionic design refers to an approach of generative creativity in which a target object (e. g. a floor lamp) is designed to contain features of biological source objects (e. g. flowers), resulting in creative biologically-inspired design.
However, current architecture of deep networks suffers the privacy issue that users need to give out their data to the model (typically hosted in a server or a cluster on Cloud) for training or prediction.
Deep learning has enabled major advances in the fields of computer vision, natural language processing, and multimedia among many others.
In this paper, we propose a way of synthesizing realistic images directly with natural language description, which has many useful applications, e. g. intelligent image manipulation.
Fast Magnetic Resonance Imaging (MRI) is highly in demand for many clinical applications in order to reduce the scanning cost and improve the patient experience.
In this context, a reliable fully automatic segmentation method for the brain tumor segmentation is necessary for an efficient measurement of the tumor extent.
Our method is evaluated on two datasets, namely the Sunnybrook Cardiac Dataset (SCD) and data from the STACOM 2011 LV segmentation challenge.
We demonstrate that %the capability of our method to understand the sentence descriptions, so as to I2T2I can generate better multi-categories images using MSCOCO than the state-of-the-art.
This demonstrated that, without changing the model architecture and the training algorithm, our model could automatically learn features for sleep stage scoring from different raw single-channel EEGs from different datasets without utilizing any hand-engineered features.
Ranked #4 on Sleep Stage Detection on MASS SS3
It's useful to automatically transform an image from its original form to some synthetic form (style, partial contents, etc.
Use of this recording configuration with neural network deconvolution promises to make clinically indicated home sleep studies practical.
We used convolutional neural networks (CNNs) for automatic sleep stage scoring based on single-channel electroencephalography (EEG) to learn task-specific filters for classification without using prior domain knowledge.