no code implementations • 12 Mar 2025 • Songlin Yang, Tao Yang, Bo Hu
Inductive spatio-temporal kriging with an increment training strategy has demonstrated its effectiveness by using virtual nodes to simulate unobserved nodes.
no code implementations • 20 Feb 2025 • Songlin Yang, Yushi Lan, Honghua Chen, Xingang Pan
Unlike previous methods that depend on explicit correspondences and deformations, our method eliminates the need to obtain correspondences and uses a 3D diffusion prior to generate morphing.
no code implementations • 27 Jan 2025 • Mude Hui, Rui-Jie Zhu, Songlin Yang, Yu Zhang, ZiRui Wang, Yuyin Zhou, Jason Eshraghian, Cihang Xie
Flow models are effective at progressively generating realistic images, but they generally struggle to capture long-range dependencies during the generation process as they compress all the information from previous time steps into a single corrupted image.
3 code implementations • 9 Dec 2024 • Songlin Yang, Jan Kautz, Ali Hatamizadeh
Linear Transformers have gained attention as efficient alternatives to standard Transformers, but their performance in retrieval and long-context tasks has been limited.
2 code implementations • 23 Oct 2024 • Shawn Tan, Yikang Shen, Songlin Yang, Aaron Courville, Rameswar Panda
We propose an alternative attention mechanism based on the stick-breaking process: For each token before the current one, we determine a break point $\beta_{i, j}$, which represents the proportion of the remaining stick to allocate to the current token.
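A minimal sketch of how such stick-breaking weights could be computed, assuming single-head inputs of shape (T, d), sigmoid break fractions over scaled dot-product logits, and explicit loops in place of the paper's efficient kernel; it is meant to make the allocation rule concrete, not to reproduce the actual implementation.

```python
import torch

def stick_breaking_attention(q, k, v):
    """Sketch of stick-breaking attention for one sequence (shapes (T, d)).

    For each query position i, every earlier position j < i gets a break
    fraction beta[i, j] in (0, 1); walking backward from the most recent
    token, each token takes its fraction of whatever stick remains:
        A[i, j] = beta[i, j] * prod_{j < m < i} (1 - beta[i, m]).
    The weights need not sum to 1, so no softmax is applied.
    """
    T, d = q.shape
    beta = torch.sigmoid(q @ k.T / d ** 0.5)   # (T, T) break fractions
    A = torch.zeros(T, T)
    for i in range(T):
        remaining = 1.0
        for j in range(i - 1, -1, -1):         # most recent token breaks first
            A[i, j] = beta[i, j] * remaining
            remaining = remaining * (1.0 - beta[i, j])
    return A @ v                               # (T, d) outputs
```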
1 code implementation • 18 Sep 2024 • Yi Lu, Jing Nathan Yan, Songlin Yang, Justin T. Chiu, Siyu Ren, Fei Yuan, Wenting Zhao, Zhiyong Wu, Alexander M. Rush
Broad textual understanding and in-context learning require language models that utilize full document contexts.
2 code implementations • 11 Sep 2024 • Yu Zhang, Songlin Yang, Ruijie Zhu, Yue Zhang, Leyang Cui, Yiqiao Wang, Bolun Wang, Freda Shi, Bailin Wang, Wei Bi, Peng Zhou, Guohong Fu
Linear attention Transformers and their gated variants, celebrated for enabling parallel training and efficient recurrent inference, still fall short in recall-intensive tasks compared to traditional Transformers and demand significant resources for training from scratch.
2 code implementations • 10 Jun 2024 • Songlin Yang, Bailin Wang, Yu Zhang, Yikang Shen, Yoon Kim
Transformers with linear attention (i.e., linear transformers) and state-space models have recently been suggested as viable linear-time alternatives to transformers with softmax attention.
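To make the linear-time claim concrete, here is a rough recurrent-form sketch of (unnormalized) causal linear attention; the single-head (T, d) shapes and the omission of feature maps and normalization are simplifying assumptions rather than the paper's exact formulation.

```python
import torch

def linear_attention_recurrent(q, k, v):
    """Sketch of causal linear attention in recurrent form (shapes (T, d)).

    Dropping the softmax lets o_t = sum_{s<=t} (q_t . k_s) v_s be maintained
    with a single (d, d) state S_t = S_{t-1} + k_t v_t^T, giving O(T) time
    and a constant-size state, versus the O(T^2) pairwise scores of softmax
    attention.
    """
    T, d = q.shape
    S = torch.zeros(d, d)
    out = []
    for t in range(T):
        S = S + torch.outer(k[t], v[t])   # accumulate key-value outer products
        out.append(q[t] @ S)              # read out with the current query
    return torch.stack(out)               # (T, d)
```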
1 code implementation • 30 Apr 2024 • Xiaoxuan Han, Songlin Yang, Wei Wang, Yang Li, Jing Dong
Specifically, we employ an adversarial search strategy to search for the adversarial embedding which can transfer across different unlearned models.
1 code implementation • 12 Apr 2024 • Yang Li, Songlin Yang, Wei Wang, Ziwen He, Bo Peng, Jing Dong
We verify the effectiveness of the proposed explanations from two aspects: (1) Counterfactual Trace Visualization: the enhanced forgery images are useful to reveal artifacts by visually contrasting the original images and two different visualization methods; (2) Transferable Adversarial Attacks: the adversarial forgery images generated by attacking the detection model are able to mislead other detection models, implying the removed artifacts are general.
3 code implementations • 11 Apr 2024 • Zhen Qin, Songlin Yang, Weixuan Sun, Xuyang Shen, Dong Li, Weigao Sun, Yiran Zhong
Hierarchically gated linear RNN (HGRN, Qin et al., 2023) has demonstrated competitive training speed and performance in language modeling while offering efficient inference.
no code implementations • 31 Jan 2024 • Yang Li, Songlin Yang, Wei Wang, Jing Dong
The previous methods either failed to accurately fit the face region or lost the interactive generative ability with other existing concepts in T2I models.
no code implementations • 31 Dec 2023 • Xiaoxuan Han, Songlin Yang, Wei Wang, Ziwen He, Jing Dong
To further investigate natural triggers, we propose a novel analysis-by-synthesis backdoor attack against face forgery detection models, which embeds natural triggers in the latent space.
no code implementations • 16 Dec 2023 • Songlin Yang, Wei Wang, Yushi Lan, Xiangyu Fan, Bo Peng, Lei Yang, Jing Dong
Therefore, we are inspired to ask: Can we learn the dense correspondence between different NeRF-based face representations without a 3D parametric model prior?
5 code implementations • 11 Dec 2023 • Songlin Yang, Bailin Wang, Yikang Shen, Rameswar Panda, Yoon Kim
When used as a replacement for the standard attention layer in Transformers, the resulting gated linear attention (GLA) Transformer is found to perform competitively against the LLaMA-architecture Transformer (Touvron et al., 2023) as well as recent linear-time-inference baselines such as RetNet (Sun et al., 2023a) and Mamba (Gu & Dao, 2023) on moderate-scale language modeling experiments.
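As a rough sketch of what the gated recurrence adds over plain linear attention, the snippet below decays the (d, d) state with a data-dependent per-dimension gate before each update; how the gate alpha is produced (e.g., a sigmoid-activated projection of the input) and the paper's chunkwise hardware-efficient form are not shown.

```python
import torch

def gated_linear_attention_recurrent(q, k, v, alpha):
    """Sketch of a gated linear-attention recurrence (all inputs (T, d)).

    alpha[t] in (0, 1) is a data-dependent forget gate applied along the
    key dimension before the new key-value outer product is added:
        S_t = diag(alpha_t) S_{t-1} + k_t v_t^T,   o_t = q_t S_t.
    """
    T, d = q.shape
    S = torch.zeros(d, d)
    out = []
    for t in range(T):
        S = alpha[t].unsqueeze(1) * S + torch.outer(k[t], v[t])
        out.append(q[t] @ S)
    return torch.stack(out)               # (T, d)
```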
1 code implementation • 26 Oct 2023 • Zhaohui Yan, Songlin Yang, Wei Liu, Kewei Tu
Also, most current ERE models do not take into account higher-order interactions between multiple entities and relations, even though higher-order modeling could be beneficial. In this work, we propose a HyperGraph neural network for ERE, which is built upon PL-marker (a state-of-the-art marker-based pipeline model).
1 code implementation • 23 Oct 2023 • Wei Liu, Songlin Yang, Yoon Kim, Kewei Tu
Scaling dense PCFGs to thousands of nonterminals via a low-rank parameterization of the rule probability tensor has been shown to be beneficial for unsupervised parsing.
no code implementations • 19 Feb 2023 • Songlin Yang, Wei Wang, Bo Peng, Jing Dong
For more flexible face manipulation, we then design a dual-branch StyleFlow module to transfer the StyleNeRF codes with disentangled geometry and texture flows.
1 code implementation • 18 Dec 2022 • Songlin Yang, Roger P. Levy, Yoon Kim
We study grammar induction with mildly context-sensitive grammars for unsupervised discontinuous parsing.
no code implementations • 30 May 2022 • Songlin Yang, Wei Wang, Chenye Xu, Ziwen He, Bo Peng, Jing Dong
These fine-grained adversarial examples can be used for selecting robust backbone networks and auxiliary features.
2 code implementations • NAACL 2022 • Songlin Yang, Wei Liu, Kewei Tu
Recent research found it beneficial to use large state spaces for HMMs and PCFGs.
no code implementations • 7 Apr 2022 • Songlin Yang, Kewei Tu
Second-order semantic parsing with end-to-end mean-field inference has been shown to achieve good performance.
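As a very rough illustration of mean-field inference in this setting (not the paper's exact formulation), one can treat each arc as a binary variable and repeatedly refresh its posterior using the current posteriors of related arcs; the sibling-style factor tensor `sib` and the fixed iteration count below are illustrative assumptions.

```python
import torch

def mean_field_arc_probs(unary, sib, iters=3):
    """Sketch of mean-field updates over arc variables.

    unary[i, j] scores arc i->j; sib[i, j, k] scores the second-order event
    that arcs i->j and i->k co-occur (with sib[i, j, j] assumed zero).
    Labels, masking, and other factor types are ignored. Each iteration
    refines the arc posteriors q; the final q can be thresholded at 0.5.
    """
    q = torch.sigmoid(unary)
    for _ in range(iters):
        second_order = torch.einsum('ijk,ik->ij', sib, q)
        q = torch.sigmoid(unary + second_order)
    return q
```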
1 code implementation • ACL 2022 • Chao Lou, Songlin Yang, Kewei Tu
They treat nested entities as partially-observed constituency trees and propose the masked inside algorithm for partial marginalization.
1 code implementation • ACL 2022 • Songlin Yang, Kewei Tu
Constituency parsing and nested named entity recognition (NER) are similar tasks since they both aim to predict a collection of nested and non-crossing spans.
1 code implementation • Findings (ACL) 2022 • Songlin Yang, Kewei Tu
Graph-based methods, which decompose the score of a dependency tree into the scores of its dependency arcs, have been popular in dependency parsing for decades.
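For concreteness, a minimal sketch of the arc-factored decomposition, assuming a precomputed (n+1) x (n+1) score matrix with index 0 for the root; decoding (e.g., Eisner's algorithm or MST) is not shown.

```python
def arc_factored_tree_score(arc_scores, heads):
    """Sketch of first-order (arc-factored) tree scoring.

    arc_scores[h][m] is the score of an arc from head h to modifier m
    (index 0 is the root); heads[m] is the head of word m for m = 1..n.
    The score of a tree is simply the sum of its arc scores.
    """
    return sum(arc_scores[heads[m]][m] for m in range(1, len(heads)))

# e.g. heads = [0, 2, 0, 2] scores the arcs 0->2, 2->1 and 2->3.
```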
1 code implementation • ACL 2022 • Songlin Yang, Kewei Tu
In a projective dependency tree, the largest subtree rooted at each word covers a contiguous sequence (i.e., a span) in the surface order.
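A small sketch of this headed-span property, under the assumption that the tree is given as a head array with 0 denoting the root; it simply collects, for every word, the leftmost and rightmost positions covered by its subtree.

```python
def headed_spans(heads):
    """Return the (leftmost, rightmost) span headed by each word.

    heads[m] is the head of word m (1-indexed words, 0 = root; heads[0] is
    a dummy entry). In a projective tree the subtree rooted at a word
    covers a contiguous span, so the span is just the min/max position
    over the word and all of its descendants.
    """
    n = len(heads) - 1
    children = {h: [] for h in range(n + 1)}
    for m in range(1, n + 1):
        children[heads[m]].append(m)

    spans = {}
    def visit(w):
        lo = hi = w
        for c in children[w]:
            clo, chi = visit(c)
            lo, hi = min(lo, clo), max(hi, chi)
        spans[w] = (lo, hi)
        return lo, hi

    for root in children[0]:
        visit(root)
    return spans

# e.g. headed_spans([0, 2, 0, 2]) == {1: (1, 1), 3: (3, 3), 2: (1, 3)}
```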
no code implementations • 19 Jul 2021 • Songlin Yang, Wei Wang, Yuehua Cheng, Jing Dong
Through this, we can construct unrestricted adversarial images that decrease the ID similarity recognized by the model.
1 code implementation • ACL 2021 • Songlin Yang, Yanpeng Zhao, Kewei Tu
Neural lexicalized PCFGs (L-PCFGs) have been shown effective in grammar induction.
1 code implementation • NAACL 2021 • Songlin Yang, Yanpeng Zhao, Kewei Tu
In this work, we present a new parameterization form of PCFGs based on tensor decomposition, which has at most quadratic computational complexity in the symbol number and therefore allows us to use a much larger number of symbols.
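A toy sketch of the idea, in which the binary-rule tensor is kept in CP-decomposed form and never materialized; the factor names U, V, W, the rank r, and the normalization below are illustrative assumptions rather than the paper's exact parameterization.

```python
import torch

m, r = 512, 64                       # number of symbols, decomposition rank
U = torch.randn(m, r).softmax(-1)    # parent factors (rows sum to 1)
V = torch.randn(m, r).softmax(0)     # left-child factors (columns sum to 1)
W = torch.randn(m, r).softmax(0)     # right-child factors (columns sum to 1)
# Implicit rule tensor: T[A, B, C] = sum_r U[A, r] * V[B, r] * W[C, r],
# which sums to 1 over (B, C) for every parent A.

def merge(left, right):
    """Combine two child inside-score vectors (length m) into a parent vector.

    The contraction factors through the rank dimension, so each merge costs
    O(m * r) instead of the O(m^3) needed with an explicit rule tensor.
    """
    return U @ ((V.T @ left) * (W.T @ right))

parent_inside = merge(torch.rand(m), torch.rand(m))   # shape (m,)
```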
1 code implementation • COLING 2020 • Songlin Yang, Yong Jiang, Wenjuan Han, Kewei Tu
Inspired by second-order supervised dependency parsing, we propose a second-order extension of unsupervised neural dependency models that incorporates grandparent-child or sibling information.
Ranked #1 on Dependency Grammar Induction on WSJ10