Data augmentation, which refers to manipulating the inputs (e. g., adding random noise, masking specific parts) to enlarge the dataset, has been widely adopted in machine learning.
The diversity of tables makes table detection a great challenge, leading to existing models becoming more tedious and complex.
2 code implementations • 28 Sep 2023 • Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, Tianhang Zhu
Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans.
The use of deep learning methods for automatic detection of students' classroom behavior is a promising approach to analyze their class performance and enhance teaching effectiveness.
As one of the challenging NLP tasks, designing math word problem (MWP) solvers has attracted increasing research attention for the past few years.
Structure-based drug design is drawing growing attentions in computer-aided drug discovery.
We regard the DTI triplets as a sequence and use a Transformer-based model to directly generate them without using the detailed annotations of entities and relations.
In commercial dialogue systems, the Spoken Language Understanding (SLU) component tends to have numerous domains thus context is needed to help resolve ambiguities.
For this task, it is obvious that external knowledge, such as Knowledge graph, can help the model understand commonsense in natural language statements.
Recently, the concept of teaching has been introduced into machine learning, in which a teacher model is used to guide the training of a student model (which will be used in real tasks) through data selection, loss function design, etc.
While the multi-branch architecture is one of the key ingredients to the success of computer vision tasks, it has not been well investigated in natural language processing, especially sequence learning tasks.
Ranked #4 on Machine Translation on WMT2014 English-German (SacreBLEU metric)
We Microsoft Research Asia made submissions to 11 language directions in the WMT19 news translation tasks.
Different from typical learning settings in which the loss function of a machine learning model is predefined and fixed, in our framework, the loss function of a machine learning model (we call it student) is defined by another machine learning model (we call it teacher).