no code implementations • 18 Dec 2024 • Bowen Chen, Namgi Han, Yusuke Miyao
The lack of data transparency in Large Language Models (LLMs) has highlighted the importance of Membership Inference Attacks (MIAs), which differentiate between trained (member) and untrained (non-member) data.
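A minimal sketch of one common MIA baseline, loss-based scoring, is shown below; the model name and the idea of a calibration threshold are illustrative assumptions, not details taken from the paper.

```python
# Sketch of a loss-based membership inference score: a sample that the model
# assigns unusually low loss to is treated as more likely to be a member.
# Model name and threshold usage are assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def membership_score(text: str) -> float:
    """Return the negated language-model loss; larger means 'more member-like'."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return -loss.item()

# A sample would be flagged as a member if its score exceeds a calibrated threshold.
print(membership_score("The quick brown fox jumps over the lazy dog."))
```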
no code implementations • 4 Jul 2024 • LLM-jp: Akiko Aizawa, Eiji Aramaki, Bowen Chen, Fei Cheng, Hiroyuki Deguchi, Rintaro Enomoto, Kazuki Fujii, Kensuke Fukumoto, Takuya Fukushima, Namgi Han, Yuto Harada, Chikara Hashimoto, Tatsuya Hiraoka, Shohei Hisada, Sosuke Hosokawa, Lu Jie, Keisuke Kamata, Teruhito Kanazawa, Hiroki Kanezashi, Hiroshi Kataoka, Satoru Katsumata, Daisuke Kawahara, Seiya Kawano, Atsushi Keyaki, Keisuke Kiryu, Hirokazu Kiyomaru, Takashi Kodama, Takahiro Kubo, Yohei Kuga, Ryoma Kumon, Shuhei Kurita, Sadao Kurohashi, Conglong Li, Taiki Maekawa, Hiroshi Matsuda, Yusuke Miyao, Kentaro Mizuki, Sakae Mizuki, Yugo Murawaki, Akim Mousterou, Ryo Nakamura, Taishi Nakamura, Kouta Nakayama, Tomoka Nakazato, Takuro Niitsuma, Jiro Nishitoba, Yusuke Oda, Hayato Ogawa, Takumi Okamoto, Naoaki Okazaki, Yohei Oseki, Shintaro Ozaki, Koki Ryu, Rafal Rzepka, Keisuke Sakaguchi, Shota Sasaki, Satoshi Sekine, Kohei Suda, Saku Sugawara, Issa Sugiura, Hiroaki Sugiyama, Hisami Suzuki, Jun Suzuki, Toyotaro Suzumura, Kensuke Tachibana, Yu Takagi, Kyosuke Takami, Koichi Takeda, Masashi Takeshita, Masahiro Tanaka, Kenjiro Taura, Arseny Tolmachev, Nobuhiro Ueda, Zhen Wan, Shuntaro Yada, Sakiko Yahata, Yuya Yamamoto, Yusuke Yamauchi, Hitomi Yanaka, Rio Yokota, Koichiro Yoshino
This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs).
1 code implementation • 4 Jun 2024 • Hitomi Yanaka, Namgi Han, Ryoma Kumon, Jie Lu, Masashi Takeshita, Ryo Sekizawa, Taisei Kato, Hiromi Arai
With the development of Large Language Models (LLMs), social biases in LLMs have become a crucial issue.
no code implementations • 19 May 2024 • Bowen Chen, Namgi Han, Yusuke Miyao
One such behavior is memorization, in which LLMs can generate the same content that was used to train them.
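A minimal sketch of a verbatim-memorization check is shown below: prompt the model with a prefix of a candidate training sample and test whether greedy decoding reproduces the continuation. The model name, prefix length, and suffix length are assumptions for illustration, not the paper's setup.

```python
# Sketch of a verbatim memorization check under greedy decoding.
# Model name and prefix/suffix lengths are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def is_memorized(sample: str, prefix_tokens: int = 32, suffix_tokens: int = 32) -> bool:
    """True if greedy decoding from the prefix exactly reproduces the suffix."""
    ids = tokenizer(sample, return_tensors="pt").input_ids[0]
    prefix = ids[:prefix_tokens]
    target = ids[prefix_tokens:prefix_tokens + suffix_tokens]
    generated = model.generate(prefix.unsqueeze(0),
                               max_new_tokens=suffix_tokens,
                               do_sample=False)[0][prefix_tokens:]
    return generated.tolist() == target.tolist()
```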
1 code implementation • COLING 2020 • Namgi Han, Goran Topic, Hiroshi Noji, Hiroya Takamura, Yusuke Miyao
Our analysis, which includes shifting the training and test datasets and training on a union of the datasets, suggests that our progress in solving the SimpleQuestions dataset does not indicate success on more general simple question answering.
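The sketch below illustrates the shape of such a shift-and-union analysis with a toy nearest-question matcher; the datasets and the matcher are stand-ins, not the paper's actual models or data.

```python
# Toy illustration of a "shift and union" analysis: train on one QA dataset,
# evaluate on another, and also train on the union. Data and the fuzzy matcher
# are placeholders, not the paper's experimental setup.
from difflib import get_close_matches

datasets = {
    "A": {"train": {"who wrote hamlet": "shakespeare"},
          "test": {"who wrote macbeth": "shakespeare"}},
    "B": {"train": {"capital of france": "paris"},
          "test": {"capital of japan": "tokyo"}},
}

def train(pairs):
    # "Training" here is simply memorizing question -> answer pairs.
    return dict(pairs)

def evaluate(model, test):
    correct = 0
    for question, gold in test.items():
        match = get_close_matches(question, list(model), n=1)
        if match and model[match[0]] == gold:
            correct += 1
    return correct / len(test)

# Shift: every (train, test) pairing, including mismatched ones.
for tr in datasets:
    for te in datasets:
        acc = evaluate(train(datasets[tr]["train"]), datasets[te]["test"])
        print(f"train={tr} test={te} acc={acc:.2f}")

# Union: train on all training sets combined, evaluate on each test set.
union = {q: a for d in datasets.values() for q, a in d["train"].items()}
print({te: evaluate(union, datasets[te]["test"]) for te in datasets})
```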
no code implementations • LREC 2020 • Namgi Han, Katsuhiko Hayashi, Yusuke Miyao
Many researchers have tried to predict the accuracy of extrinsic evaluations by using intrinsic evaluations of word embeddings.
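A minimal sketch of the kind of analysis involved is shown below: measuring how well an intrinsic score correlates with an extrinsic score across embedding models. All numbers are made up for illustration.

```python
# Sketch of correlating intrinsic and extrinsic evaluation scores across
# word-embedding models. The scores below are fabricated placeholders.
from scipy.stats import pearsonr, spearmanr

# Hypothetical scores for five embedding models.
intrinsic = [0.62, 0.55, 0.71, 0.48, 0.66]   # e.g., word-similarity correlation
extrinsic = [0.84, 0.80, 0.86, 0.78, 0.83]   # e.g., downstream task accuracy

print("Pearson :", pearsonr(intrinsic, extrinsic)[0])
print("Spearman:", spearmanr(intrinsic, extrinsic)[0])
```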