Search Results for author: Da-Shan Shiu

Found 30 papers, 9 papers with code

How does BERT process disfluency?

no code implementations · SIGDIAL (ACL) 2021 · Ye Tian, Tim Nieradzik, Sepehr Jalali, Da-Shan Shiu

Analysis of sentence embeddings of disfluent and fluent sentence pairs reveals that the deeper the layer, the more similar their representations (exp2).

Sentence · Sentence Embeddings · +1
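The layer-wise comparison described in this snippet can be sketched with the HuggingFace Transformers library. This is a minimal illustration, not the paper's code; mean pooling over tokens and the example sentence pair are assumptions:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

fluent = "I want a flight to Boston."
disfluent = "I want a flight to Denver uh I mean to Boston."

def embeddings_per_layer(text):
    """Mean-pooled sentence embedding at each layer (input embeddings + 12 encoder layers)."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden_states = model(**inputs).hidden_states  # tuple of (1, seq_len, 768) tensors
    return [h.mean(dim=1).squeeze(0) for h in hidden_states]

for layer, (h_f, h_d) in enumerate(zip(embeddings_per_layer(fluent), embeddings_per_layer(disfluent))):
    sim = torch.nn.functional.cosine_similarity(h_f, h_d, dim=0).item()
    print(f"layer {layer:2d}: cosine similarity = {sim:.3f}")
```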

Bayesian Optimization from Human Feedback: Near-Optimal Regret Bounds

no code implementations · 29 May 2025 · Aya Kayal, Sattar Vakili, Laura Toni, Da-Shan Shiu, Alberto Bernacchia

Existing work, which adopts the Bradley-Terry-Luce (BTL) feedback model, provides regret bounds for the performance of several algorithms.

Bayesian Optimization
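For reference, the Bradley-Terry-Luce model ties pairwise human feedback to a latent utility $f$: the probability that a point $x$ is preferred to a point $y$ is

$$P(x \succ y) = \frac{e^{f(x)}}{e^{f(x)} + e^{f(y)}} = \sigma\big(f(x) - f(y)\big), \qquad \sigma(z) = \frac{1}{1 + e^{-z}},$$

so each binary comparison is a logistic-noise observation of the utility difference.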

Towards a Foundation Model for Communication Systems

no code implementations · 20 May 2025 · Davide Buffelli, Sowmen Das, Yu-Wei Lin, Sattar Vakili, Chien-Yi Wang, Masoud Attarifar, Pritthijit Nath, Da-Shan Shiu

Artificial Intelligence (AI) has demonstrated unprecedented performance across various domains, and its application to communication systems is an active area of research.

model

Latent Flow Transformer

1 code implementation · 20 May 2025 · Yen-chen Wu, Feng-Ting Liao, Meng-Hsi Chen, Pei-Chen Ho, Farhang Nabiei, Da-Shan Shiu

Transformers, the standard implementation for large language models (LLMs), typically consist of tens to hundreds of discrete layers.

Image Generation

Group Think: Multiple Concurrent Reasoning Agents Collaborating at Token Level Granularity

no code implementations · 16 May 2025 · Chan-Jan Hsu, Davide Buffelli, Jamie McGowan, Feng-Ting Liao, Yi-Chang Chen, Sattar Vakili, Da-Shan Shiu

Recent advances in large language models (LLMs) have demonstrated the power of reasoning through self-generated chains of thought.

TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling

2 code implementations · 9 Apr 2025 · Liang-Hsuan Tseng, Yi-Chang Chen, Kuan-Yi Lee, Da-Shan Shiu, Hung-Yi Lee

To our knowledge, TASTE is the first end-to-end approach that utilizes a reconstruction objective to automatically learn a text-aligned speech tokenization and embedding suitable for spoken language modeling.

Language Modeling · Language Modelling · +2

BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights

no code implementations · 29 Jan 2025 · Chan-Jan Hsu, Yi-Cheng Lin, Chia-Chun Lin, Wei-Chih Chen, Ho Lam Chung, Chen-An Li, Yi-Chang Chen, Chien-Yu Yu, Ming-Ji Lee, Chien-Cheng Chen, Ru-Heng Huang, Hung-Yi Lee, Da-Shan Shiu

We present BreezyVoice, a Text-to-Speech (TTS) system specifically adapted for Taiwanese Mandarin, highlighting phonetic control abilities to address the unique challenges of polyphone disambiguation in the language.

Language Modeling · Language Modelling · +4

Enhancing Function-Calling Capabilities in LLMs: Strategies for Prompt Formats, Data Integration, and Multilingual Translation

no code implementations · 2 Dec 2024 · Yi-Chang Chen, Po-chun Hsu, Chan-Jan Hsu, Da-Shan Shiu

This research delves into enhancing the function-calling capabilities of LLMs by exploring different approaches, including prompt formats for integrating function descriptions, blending function-calling and instruction-following data, introducing a novel Decision Token for conditional prompts, leveraging chain-of-thought reasoning, and overcoming multilingual challenges with a translation pipeline.

Data Integration · Instruction Following · +2

FineWeb-zhtw: Scalable Curation of Traditional Chinese Text Data from the Web

no code implementations · 25 Nov 2024 · Cheng-Wei Lin, Wan-Hsuan Hsieh, Kai-Xin Guan, Chan-Jan Hsu, Chia-Chen Kuo, Chuan-Lin Lai, Chung-Wei Chung, Ming-Jen Wang, Da-Shan Shiu

The quality and size of a pretraining dataset significantly influence the performance of large language models (LLMs).

RAD-Bench: Evaluating Large Language Models Capabilities in Retrieval Augmented Dialogues

1 code implementation · 19 Sep 2024 · Tzu-Lin Kuo, Feng-Ting Liao, Mu-Wei Hsieh, Fu-Chieh Chang, Po-chun Hsu, Da-Shan Shiu

In real-world applications with Large Language Models (LLMs), external retrieval mechanisms - such as Search-Augmented Generation (SAG), tool utilization, and Retrieval-Augmented Generation (RAG) - are often employed to enhance the quality of augmented generations in dialogues.

RAG · Retrieval · +1

Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition

1 code implementation · 23 May 2024 · Chan-Jan Hsu, Yi-Chang Chen, Feng-Ting Liao, Pei-Chen Ho, Yu-Hsiang Wang, Po-chun Hsu, Da-Shan Shiu

We introduce "Generative Fusion Decoding" (GFD), a novel shallow fusion framework, utilized to integrate Large Language Models (LLMs) into multi-modal text recognition systems such as automatic speech recognition (ASR) and optical character recognition (OCR).

Automatic Speech Recognition · Automatic Speech Recognition (ASR) · +4
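GFD's specifics are in the paper; for orientation, the classic shallow fusion it builds on reranks recognizer hypotheses by mixing log-probabilities from the recognizer and an external LM. A minimal sketch with made-up scores and an assumed weight:

```python
def shallow_fusion_score(asr_logprob, lm_logprob, lam=0.3):
    """Classic shallow fusion: score = log p_ASR + lam * log p_LM.

    lam is a tuning weight; 0.3 is an arbitrary placeholder here.
    """
    return asr_logprob + lam * lm_logprob

# Two hypothetical beam candidates: (ASR log-prob, LM log-prob).
candidates = {
    "recognize speech": (-1.2, -0.8),
    "wreck a nice beach": (-1.1, -3.5),
}
best = max(candidates, key=lambda h: shallow_fusion_score(*candidates[h]))
print(best)  # "recognize speech": the LM penalizes the less plausible string
```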

Breeze-7B Technical Report

no code implementations · 5 Mar 2024 · Chan-Jan Hsu, Chang-Le Liu, Feng-Ting Liao, Po-chun Hsu, Yi-Chang Chen, Da-Shan Shiu

Breeze-7B is an open-source language model based on Mistral-7B, designed to address the need for improved language comprehension and chatbot-oriented capabilities in Traditional Chinese.

Chatbot · Language Modeling · +1

Advancing the Evaluation of Traditional Chinese Language Models: Towards a Comprehensive Benchmark Suite

1 code implementation · 15 Sep 2023 · Chan-Jan Hsu, Chang-Le Liu, Feng-Ting Liao, Po-chun Hsu, Yi-Chang Chen, Da-Shan Shiu

In an effort to advance the evaluation of language models in Traditional Chinese and stimulate further research in this field, we have open-sourced our benchmark and opened the model for trial.

Question Answering

Generative Diffusion Models for Radio Wireless Channel Modelling and Sampling

no code implementations · 10 Aug 2023 · Ushnish Sengupta, Chinkuo Jao, Alberto Bernacchia, Sattar Vakili, Da-Shan Shiu

In this paper, we propose a diffusion model based channel sampling approach for rapidly synthesizing channel realizations from limited data.

Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning

1 code implementation · 18 Jul 2023 · Feng-Ting Liao, Yung-Chieh Chan, Yi-Chang Chen, Chan-Jan Hsu, Da-Shan Shiu

In this work, we propose a method to create domain-sensitive speech recognition models that utilize textual domain information by conditioning their generation on a given text prompt.

Domain Adaptation · speech-recognition · +1
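The paper's method is prompt-conditioned fine-tuning; as a loose inference-time analogue only, off-the-shelf Whisper accepts a text prompt that biases decoding toward domain vocabulary. The audio file and prompt below are hypothetical:

```python
import whisper  # pip install openai-whisper

model = whisper.load_model("base")
# initial_prompt seeds the decoder's context, nudging it toward domain terms.
result = model.transcribe(
    "clinic_visit.wav",
    initial_prompt="A medical dictation discussing hypertension and dosage.",
)
print(result["text"])
```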

Image generation with shortest path diffusion

1 code implementation · 1 Jun 2023 · Ayan Das, Stathi Fotiadis, Anil Batra, Farhang Nabiei, FengTing Liao, Sattar Vakili, Da-Shan Shiu, Alberto Bernacchia

We compute the shortest path according to this metric, and we show that it corresponds to a combination of image sharpening, rather than blurring, and noise deblurring.

Deblurring · Image Generation

Improved Convergence Rates for Sparse Approximation Methods in Kernel-Based Learning

no code implementations · 8 Feb 2022 · Sattar Vakili, Jonathan Scarlett, Da-Shan Shiu, Alberto Bernacchia

Kernel-based models such as kernel ridge regression and Gaussian processes are ubiquitous in machine learning applications for regression and optimization.

Gaussian Processes · regression
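For context, the kernel ridge regression predictor mentioned in the snippet has the standard closed form, with kernel $k$, training inputs $x_1, \dots, x_n$, targets $y$, and regularizer $\lambda$:

$$\hat{f}(x) = k_n(x)^\top (K_n + \lambda I)^{-1} y, \qquad k_n(x) = [k(x, x_i)]_{i=1}^n, \quad [K_n]_{ij} = k(x_i, x_j).$$

Sparse approximation methods replace the $O(n^3)$ matrix inverse with a low-rank surrogate; the paper's contribution is improved convergence rates for such approximations.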

Uniform Generalization Bounds for Overparameterized Neural Networks

no code implementations · 13 Sep 2021 · Sattar Vakili, Michael Bromberg, Jezabel Garcia, Da-Shan Shiu, Alberto Bernacchia

As a byproduct of our results, we show the equivalence between the RKHS corresponding to the NT kernel and its counterpart corresponding to the Matérn family of kernels, showing that NT kernels induce a very general class of models.

Generalization Bounds
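For reference, the Matérn family mentioned above is the standard kernel with smoothness parameter $\nu$ and length-scale $\ell$, where $K_\nu$ is the modified Bessel function of the second kind:

$$k_\nu(x, x') = \frac{2^{1-\nu}}{\Gamma(\nu)} \left( \frac{\sqrt{2\nu}\, \lVert x - x' \rVert}{\ell} \right)^{\nu} K_\nu\!\left( \frac{\sqrt{2\nu}\, \lVert x - x' \rVert}{\ell} \right).$$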

Optimal Order Simple Regret for Gaussian Process Bandits

no code implementations · NeurIPS 2021 · Sattar Vakili, Nacime Bouziani, Sepehr Jalali, Alberto Bernacchia, Da-Shan Shiu

Consider the sequential optimization of a continuous, possibly non-convex, and expensive to evaluate objective function $f$.

Art Analysis
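For orientation, under one common convention the simple regret after $T$ evaluations is the gap between the optimum and the algorithm's recommendation $\hat{x}_T$:

$$r_T = \max_{x \in \mathcal{X}} f(x) - f(\hat{x}_T).$$

The paper's exact definition may differ in detail.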

Towards a Universal NLG for Dialogue Systems and Simulators with Future Bridging

no code implementations · 21 May 2021 · Philipp Ennen, Yen-Ting Lin, Ali Girayhan Ozbay, Ferdinando Insalata, Maolin Li, Ye Tian, Sepehr Jalali, Da-Shan Shiu

In light of the recent success of data-driven approaches, we propose the novel future bridging NLG (FBNLG) concept for dialogue systems and simulators.

Text Generation

How to distribute data across tasks for meta-learning?

no code implementations · 15 Mar 2021 · Alexandru Cioba, Michael Bromberg, Qian Wang, Ritwik Niyogi, Georgios Batzolis, Jezabel Garcia, Da-Shan Shiu, Alberto Bernacchia

We show that: 1) If tasks are homogeneous, there is a uniform optimal allocation, whereby all tasks get the same amount of data; 2) At a fixed budget, there is a trade-off between the number of tasks and the number of data points per task, with a unique solution for the optimum; 3) When trained separately, harder tasks should get more data, at the cost of a smaller number of tasks; 4) When training on a mixture of easy and hard tasks, more data should be allocated to easy tasks.

Few-Shot Image Classification · image-classification · +1

Model agnostic meta-learning on trees

no code implementations · 1 Jan 2021 · Jezabel Garcia, Federica Freddi, Jamie McGowan, Tim Nieradzik, Da-Shan Shiu, Ye Tian, Alberto Bernacchia

In meta-learning, the knowledge learned from previous tasks is transferred to new ones, but this transfer only works if tasks are related, and sharing information between unrelated tasks might hurt performance.

Meta-Learning · model

Optimal allocation of data across training tasks in meta-learning

no code implementations · 1 Jan 2021 · Georgios Batzolis, Alberto Bernacchia, Da-Shan Shiu, Michael Bromberg, Alexandru Cioba

Meta-learning models are tested on benchmarks with a fixed number of data points for each training task, and this number is usually arbitrary, for example, 5 instances per class in few-shot classification.

Few-Shot Image Classification · image-classification · +2
