Trending Research

LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

mcgill-nlp/llm2vec • 9 Apr 2024

We outperform encoder-only models by a large margin on word-level tasks and reach a new unsupervised state-of-the-art performance on the Massive Text Embeddings Benchmark (MTEB).

Contrastive Learning

347

0.60 stars / hour

LAFS: Landmark-based Facial Self-supervised Learning for Face Recognition

Recognito-Vision/Face-SDK-Linux-Demos • 13 Mar 2024

This enables our method, namely LAndmark-based Facial Self-supervised learning (LAFS), to learn key representations that are more critical for face recognition.

Face Recognition Self-Supervised Learning

201

0.58 stars / hour

Multi-domain Learning for Updating Face Anti-spoofing Models

Recognito-Vision/Linux-FaceRecognition-FaceLivenessDetection • 23 Aug 2022

In this work, we study multi-domain learning for face anti-spoofing (MD-FAS), where a pre-trained FAS model needs to be updated to perform equally well on both source and target domains while only using target domain data for updating.

Face Anti-Spoofing

201

0.57 stars / hour

Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

dvlab-research/minigemini • 27 Mar 2024

We try to narrow the gap by mining the potential of VLMs for better performance and any-to-any workflow from three aspects, i.e., high-resolution visual tokens, high-quality data, and VLM-guided generation.

Ranked #9 on Visual Question Answering on MM-Vet

Image Comprehension Visual Dialog +1

2,892

0.57 stars / hour

The Unreasonable Ineffectiveness of the Deeper Layers

arcee-ai/PruneMe • 26 Mar 2024

We empirically study a simple layer-pruning strategy for popular families of open-weight pretrained LLMs, finding minimal degradation of performance on different question-answering benchmarks until after a large fraction (up to half) of the layers are removed.

Quantization Question Answering

0.56 stars / hour

UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition

opendatalab/unimernet • 23 Apr 2024

This paper presents the UniMER dataset to provide the first study on Mathematical Expression Recognition (MER) towards complex real-world scenarios.

Image Augmentation

0.54 stars / hour

emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

alibaba-damo-academy/FunASR • 23 Dec 2023

To the best of our knowledge, emotion2vec is the first universal representation model in various emotion-related tasks, filling a gap in the field.

Self-Supervised Learning Sentiment Analysis +1

3,321

0.54 stars / hour

MoVA: Adapting Mixture of Vision Experts to Multimodal Context

templex98/mova • 19 Apr 2024

Although some large-scale pretrained vision encoders, such as those in CLIP and DINOv2, have brought promising performance, we found that there is still no single vision encoder that can dominate various image content understanding, e.g., the CLIP vision encoder leads to outstanding results on general image understanding but poor performance on document or chart content.

Language Modelling Large Language Model

0.53 stars / hour

STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases

snap-stanford/stark • 19 Apr 2024

Answering real-world user queries, such as product search, often requires accurate retrieval of information from semi-structured knowledge bases or databases that involve a blend of unstructured (e.g., textual descriptions of products) and structured (e.g., entity relations of products) information.

Benchmarking Retrieval

0.51 stars / hour

ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback

liming-ai/ControlNet_Plus_Plus • 11 Apr 2024

To this end, we propose ControlNet++, a novel approach that improves controllable generation by explicitly optimizing pixel-level cycle consistency between generated images and conditional controls.

SSIM

162

0.50 stars / hour
