Although deep learning techniques have greatly improved face recognition, unconstrained surveillance face recognition (FR) remains an unsolved challenge, owing to limited training data and the domain-distribution gap.
The Transformer has become ubiquitous owing to its dominant performance in various NLP and image-processing tasks.
Two recent methods, Balancing GAN (BAGAN) and its improved variant BAGAN-GP, have been proposed as augmentation tools to address this problem and restore balance to the data.
The ubiquity of offensive and hateful content on online fora necessitates automatic solutions that detect such content competently across target groups.
As an effective method for intellectual property (IP) protection, model watermarking technology has been applied on a wide variety of deep neural networks (DNN), including speech classification models.
As more and more people begin to wear masks due to the current COVID-19 pandemic, existing face recognition systems may suffer severe performance degradation when recognizing masked faces.
To address such limitations, we propose a novel end-to-end training architecture, which utilizes a Mini-Batch of Real and Simulated JPEG compression (MBRS) to enhance JPEG robustness.
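The core idea of MBRS, as described here, is to switch the distortion layer between real JPEG compression, a differentiable simulated JPEG, and an identity (noise-free) path on a per-mini-batch basis. The sketch below illustrates only that switching logic; the three stand-in layer functions are placeholders I introduce for illustration, not the paper's actual implementations.

```python
import random

# Hypothetical stand-ins for MBRS's noise layers. In the real method these
# would be actual JPEG compression (non-differentiable), a differentiable
# JPEG approximation, and an identity layer.
def real_jpeg(batch):       # stand-in: lossy quantization
    return [round(x, 1) for x in batch]

def simulated_jpeg(batch):  # stand-in: differentiable approximation
    return [x + 0.01 for x in batch]

def identity(batch):        # noise-free path
    return list(batch)

NOISE_LAYERS = [real_jpeg, simulated_jpeg, identity]

def mbrs_forward(batch, rng):
    """Pick ONE noise layer at random and apply it to the whole mini-batch,
    so successive mini-batches see different (real or simulated) distortions."""
    layer = rng.choice(NOISE_LAYERS)
    return layer.__name__, layer(batch)
```

A usage example: `name, out = mbrs_forward([0.53, 0.12], random.Random(0))` applies one randomly chosen layer to the whole batch; over many mini-batches the encoder/decoder is exposed to both real and simulated compression.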
However, little attention has been devoted to the protection of DNNs in image processing tasks.
We present the CLIP2Video network, which transfers a pre-trained image-language model to video-text retrieval in an end-to-end manner.
Large pre-trained language models (LMs) have demonstrated remarkable ability as few-shot learners.
Many critical policy decisions, from strategic investments to the allocation of humanitarian aid, rely on data about the geographic distribution of wealth and poverty.
Large transformer models have shown extraordinary success in achieving state-of-the-art results in many natural language processing applications.
In this way, when the attacker trains a surrogate model on the input-output pairs of the target model, the hidden watermark is learned by the surrogate and can be extracted from it afterward.
We propose a Denoiser and UPsampler Network (DUP-Net) structure as a defense for 3D adversarial point-cloud classification, where the two modules reconstruct surface smoothness by dropping or adding points, respectively.
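A minimal sketch of the "dropping points" half of such a defense, assuming the denoiser works like statistical outlier removal: each point's mean distance to its k nearest neighbors is computed, and points whose mean distance exceeds the cloud-wide mean by more than alpha standard deviations are discarded (the function names and parameters here are illustrative, not the paper's API).

```python
import math
import statistics

def knn_mean_dist(points, k):
    """Mean distance from each point to its k nearest neighbors (brute force)."""
    out = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(sum(d[:k]) / k)
    return out

def sor_denoise(points, k=2, alpha=1.0):
    """Statistical outlier removal: drop points whose mean k-NN distance
    exceeds mean + alpha * std over the whole cloud."""
    d = knn_mean_dist(points, k)
    thresh = statistics.mean(d) + alpha * statistics.pstdev(d)
    return [p for p, di in zip(points, d) if di <= thresh]
```

On a tight cluster with one far-away adversarial point, the isolated point's k-NN distance dominates the statistics and it is dropped, while the cluster survives; an upsampler would then densify the cleaned cloud.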