Cross-sentence attention has been widely applied in text matching, in which a model learns the alignment between two intermediate sequence representations to capture their semantic relationship.
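As a rough illustration of how one sequence can be aligned against another, the following minimal sketch computes scaled dot-product cross-attention in plain Python. The function names, the toy vectors, and the choice of scaled dot-product scoring are assumptions made for illustration; they are not taken from the paper described above.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(a, b):
    """For each token vector in `a`, return a soft mixture of the vectors in `b`.

    Hypothetical helper: scores are scaled dot products, which is one
    common choice and not necessarily the one used in the paper.
    """
    dim = len(a[0])
    aligned = []
    for q in a:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in b]
        weights = softmax(scores)
        aligned.append([sum(w * k[d] for w, k in zip(weights, b)) for d in range(dim)])
    return aligned


# Toy token representations for two sentences (illustrative values only).
sent_a = [[1.0, 0.0], [0.0, 1.0]]
sent_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
aligned_a = cross_attention(sent_a, sent_b)
```

Each row of `aligned_a` is the view of `sent_b` that is most relevant to the corresponding token of `sent_a`; a matching model would typically compare the two to decide whether the sentences are related.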
In the next stage, we train the skill router using task-specific downstream data and use this router to integrate the acquired skills with LLMs during inference.
Argument generation is a challenging task in natural language processing, which requires rigorous reasoning and proper content organization.
Sufficient conditions are derived to guarantee both the minimal-time deadbeat consensus and the instant individual disagreement degree prediction.
Despite recent progress in generating fluent text with pre-trained language models, existing methods still suffer from incoherence in long-form text generation tasks, which require proper content control and planning to form a coherent high-level logical flow.
Controllable text generation is an appealing but challenging task, which allows users to specify particular attributes of the generated outputs.
Impressive milestones have been achieved in text matching by adopting a cross-attention mechanism to capture pertinent semantic connections between two sentence representations.
In this paper, we propose a robust and efficient end-to-end non-local spatial propagation network for depth completion.
Ranked #1 on Depth Completion on NYU-Depth V2
To address the issue of preserving spatial information in the U-Net architecture, we design a dense feature fusion module using the back-projection feedback scheme.
Ranked #9 on Image Dehazing on Haze4k
To address this problem, we propose a dual-branch convolutional neural network to extract base features and recovered features separately.
Human judges further rate our system summaries as more informative and coherent than those by popular summarization models.
Although binarized neural networks significantly reduce model size and computation time, they have so far been shown to excel only on semantic-level tasks such as image classification and recognition.
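To make the size and speed argument concrete, the sketch below shows one common way binarization is exploited: values are reduced to signs, packed into bits, and dot products are computed with XNOR and a bit count. The helper names and the sign-based binarization rule are assumptions for illustration, not details from the paper.

```python
def binarize(values):
    """Map real values to +1 / -1 by sign (zero maps to +1 by assumption)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack +1 / -1 signs into an integer, one bit per value (+1 -> 1, -1 -> 0)."""
    bits = 0
    for i, s in enumerate(signs):
        if s == 1:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1 / -1 vectors of length n via XNOR + popcount."""
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")
    # Each matching position contributes +1, each mismatch contributes -1.
    return 2 * matches - n


# Illustrative real-valued weights and activations.
weights = [0.7, -1.2, 0.1, -0.3]
activations = [0.5, 0.4, -0.9, -0.2]

w_signs = binarize(weights)
a_signs = binarize(activations)
reference = sum(w * a for w, a in zip(w_signs, a_signs))
fast = binary_dot(pack_bits(w_signs), pack_bits(a_signs), len(weights))
```

Here `fast` and `reference` agree, while the packed form stores each value in a single bit and replaces multiplications with bitwise operations.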
In this paper, we present a robotic navigation algorithm with natural language interfaces, which enables a robot to safely walk through a changing environment with moving persons by following human instructions such as "go to the restaurant and keep away from people".
Single-image super-resolution is a fundamental task for vision applications to enhance the image quality with respect to spatial resolution.
Traditional approaches to interpolate/extrapolate frames in a video sequence require accurate pixel correspondences between images, e.g., using optical flow.
Using these datasets, we conduct a large-scale user study to quantify the performance of several representative state-of-the-art blind deblurring algorithms.
Removing image blur caused by camera shake is an ill-posed problem, as both the latent image and the point spread function (PSF) are unknown.
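The ill-posedness can be seen in a toy one-dimensional example, assuming the usual model in which the blurred signal is the latent signal convolved with the PSF (noise is omitted here). The function name and the sample values are hypothetical and chosen only to illustrate the ambiguity.

```python
def convolve_same(signal, kernel):
    """1D convolution with zero padding, returning an output the size of `signal`."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, k in enumerate(kernel):
            idx = i + half - j  # true convolution flips the kernel
            if 0 <= idx < len(signal):
                acc += k * signal[idx]
        out.append(acc)
    return out


# Explanation 1: a sharp latent signal blurred by a spread-out PSF.
latent_sharp = [0.0, 0.0, 1.0, 0.0, 0.0]
psf_blur = [0.25, 0.5, 0.25]
observed = convolve_same(latent_sharp, psf_blur)

# Explanation 2: an already-soft latent signal and a delta PSF (no blur at all).
latent_soft = [0.0, 0.25, 0.5, 0.25, 0.0]
psf_delta = [0.0, 1.0, 0.0]
observed_alt = convolve_same(latent_soft, psf_delta)
```

Both pairs produce the same observation, so the observation alone cannot determine the latent signal and the PSF; blind deblurring methods therefore rely on priors that favor one explanation over the other.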
Images taken in low-light conditions with handheld cameras are often blurry due to the required long exposure time.
Ranked #9 on Deblurring on RealBlur-R (trained on GoPro)
We propose a simple yet effective L_0-regularized prior based on intensity and gradient for text image deblurring.
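A minimal one-dimensional sketch of such a prior is given below, assuming it counts non-zero intensities and non-zero gradients and combines the two counts with a weight. The weight `sigma`, the threshold `eps`, the function names, and the toy signals are all assumptions for illustration; the exact formulation in the paper may differ.

```python
def l0_count(values, eps=1e-8):
    """Count entries whose magnitude exceeds a small threshold (an L0 'norm')."""
    return sum(1 for v in values if abs(v) > eps)


def l0_intensity_gradient_prior(pixels, sigma=1.0):
    """Assumed form: sigma * ||x||_0 + ||grad x||_0 for a 1D signal."""
    gradients = [b - a for a, b in zip(pixels, pixels[1:])]
    return sigma * l0_count(pixels) + l0_count(gradients)


# Toy scanlines: a clean two-tone stroke and a smeared version of it.
sharp_text = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
blurred_text = [0.0, 0.3, 0.7, 0.7, 0.3, 0.0]

sharp_cost = l0_intensity_gradient_prior(sharp_text)
blurred_cost = l0_intensity_gradient_prior(blurred_text)
```

Under these assumptions the sharp scanline receives a lower cost than the blurred one, which is the behavior a deblurring prior needs in order to favor clean text over smeared text.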