Search Results for author: Jianyu Wang

Found 47 papers, 18 papers with code

SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages

1 code implementation · 29 Jul 2024 · Wenxuan Zhang, Hou Pong Chan, Yiran Zhao, Mahani Aljunied, Jianyu Wang, Chaoqun Liu, Yue Deng, Zhiqiang Hu, Weiwen Xu, Yew Ken Chia, Xin Li, Lidong Bing

Large Language Models (LLMs) have shown remarkable abilities across various tasks, yet their development has predominantly centered on high-resource languages like English and Chinese, leaving low-resource languages underserved.

Diversity, Instruction Following, +2

Apple Intelligence Foundation Language Models

no code implementations · 29 Jul 2024 · Tom Gunter, Zirui Wang, Chong Wang, Ruoming Pang, Aonan Zhang, Bowen Zhang, Chen Chen, Chung-Cheng Chiu, David Qiu, Deepak Gopinath, Dian Ang Yap, Dong Yin, Feng Nan, Floris Weers, Guoli Yin, Haoshuo Huang, Jianyu Wang, Jiarui Lu, John Peebles, Ke Ye, Mark Lee, Nan Du, Qibin Chen, Quentin Keunebroek, Sam Wiseman, Syd Evans, Tao Lei, Vivek Rathod, Xiang Kong, Xianzhi Du, Yanghao Li, Yongqiang Wang, Yuan Gao, Zaid Ahmed, Zhaoyang Xu, Zhiyun Lu, Al Rashid, Albin Madappally Jose, Alec Doane, Alfredo Bencomo, Allison Vanderby, Andrew Hansen, Ankur Jain, Anupama Mann Anupama, Areeba Kamal, Bugu Wu, Carolina Brum, Charlie Maalouf, Chinguun Erdenebileg, Chris Dulhanty, Dominik Moritz, Doug Kang, Eduardo Jimenez, Evan Ladd, Fangping Shi, Felix Bai, Frank Chu, Fred Hohman, Hadas Kotek, Hannah Gillis Coleman, Jane Li, Jeffrey Bigham, Jeffery Cao, Jeff Lai, Jessica Cheung, Jiulong Shan, Joe Zhou, John Li, Jun Qin, Karanjeet Singh, Karla Vega, Kelvin Zou, Laura Heckman, Lauren Gardiner, Margit Bowler, Maria Cordell, Meng Cao, Nicole Hay, Nilesh Shahdadpuri, Otto Godwin, Pranay Dighe, Pushyami Rachapudi, Ramsey Tantawi, Roman Frigg, Sam Davarnia, Sanskruti Shah, Saptarshi Guha, Sasha Sirovica, Shen Ma, Shuang Ma, Simon Wang, Sulgi Kim, Suma Jayaram, Vaishaal Shankar, Varsha Paidi, Vivek Kumar, Xin Wang, Xin Zheng, Walker Cheng, Yael Shrager, Yang Ye, Yasu Tanaka, Yihao Guo, Yunsong Meng, Zhao Tang Luo, Zhi Ouyang, Alp Aygar, Alvin Wan, Andrew Walkingshaw, Andy Narayanan, Antonie Lin, Arsalan Farooq, Brent Ramerth, Colorado Reed, Chris Bartels, Chris Chaney, David Riazati, Eric Liang Yang, Erin Feldman, Gabriel Hochstrasser, Guillaume Seguin, Irina Belousova, Joris Pelemans, Karen Yang, Keivan Alizadeh Vahid, Liangliang Cao, Mahyar Najibi, Marco Zuliani, Max Horton, Minsik Cho, Nikhil Bhendawade, Patrick Dong, Piotr Maj, Pulkit Agrawal, Qi Shan, Qichen Fu, Regan Poston, Sam Xu, Shuangning Liu, Sushma Rao, Tashweena Heeramun, Thomas Merth, Uday Rayala, Victor Cui, Vivek Rangarajan Sridhar, Wencong Zhang, Wenqi Zhang, Wentao Wu, Xingyu Zhou, Xinwen Liu, Yang Zhao, Yin Xia, Zhile Ren, Zhongzheng Ren

We present foundation language models developed to power Apple Intelligence features, including a ~3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute.

Language Modelling

Virtual Full-Duplex Wireless Communications with Zero-Interval Modulation and Sampling

no code implementations · 24 Jul 2024 · Jianyu Wang, Wenchi Cheng, Wei Zhang, Hailin Zhang

In ZIMS-VFD, the transceiver inserts a zero-interval after each symbol in the transmit signal, providing itself with self-interference (SI)-free intervals.
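
A minimal NumPy sketch of the zero-interval idea described above, assuming complex baseband symbols; the guard length and function name are illustrative, not the paper's implementation:

```python
import numpy as np

def insert_zero_intervals(symbols: np.ndarray, guard_len: int) -> np.ndarray:
    """Append a zero-interval of `guard_len` samples after each symbol.

    During these silent intervals the transceiver hears only the remote
    signal, giving it self-interference (SI)-free listening windows.
    """
    out = []
    for s in symbols:
        out.append(np.array([s], dtype=complex))
        out.append(np.zeros(guard_len, dtype=complex))
    return np.concatenate(out)

# Example: QPSK symbols with a 2-sample zero-interval after each.
qpsk = np.exp(1j * np.pi / 4 * np.array([1, 3, 5, 7]))
tx = insert_zero_intervals(qpsk, guard_len=2)
```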

Multi-Frequency Resonant Circuit Based Multi-User Emergency Through-the-Earth Communication with Magnetic Induction

no code implementations · 24 Jul 2024 · Zhenyu Wang, Jianyu Wang, Wenchi Cheng

Magnetic induction (MI) is an effective technique for emergency through-the-earth communications due to its higher penetration efficiency and lower propagation loss compared with electromagnetic wave communication.

RIS-Based Self-Interference Cancellation for Full-Duplex Broadband Transmission

no code implementations · 17 Jul 2024 · Jiayan Wu, Wenchi Cheng, Jianyu Wang, Jingqing Wang, Wei Zhang

The problem is solved with an alternating optimization (AO) algorithm in three cases: the ideal case, where both the amplitude and phase of each RIS unit cell can be controlled independently and continuously; the continuous-phase case, where the phase of each RIS unit cell can be controlled independently while the amplitude is fixed to one; and the discrete-phase case, where the reflection coefficient (RC) of each RIS unit cell can only take discrete values that are equally spaced on the unit circle.
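
For the discrete-phase case, a hedged sketch of the projection step such an AO algorithm would need, assuming unit-modulus reflection coefficients; the function is illustrative, not the paper's algorithm:

```python
import numpy as np

def project_discrete_phases(rc_ideal: np.ndarray, n_levels: int) -> np.ndarray:
    """Quantize each RIS reflection coefficient to the nearest of
    `n_levels` unit-modulus values equally spaced on the unit circle."""
    grid = np.exp(1j * 2 * np.pi * np.arange(n_levels) / n_levels)
    # For |g| = 1, minimizing |rc - g|^2 is equivalent to maximizing
    # Re(rc * conj(g)), so pick the grid point with the largest real part.
    idx = np.argmax(np.real(rc_ideal[:, None] * grid.conj()[None, :]), axis=1)
    return grid[idx]
```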

Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding

1 code implementation · 18 Jun 2024 · Kaiyan Zhang, Jianyu Wang, Ning Ding, Biqing Qi, Ermo Hua, Xingtai Lv, Bowen Zhou

Our research underscores that the fundamental distinction between System 1 and System 2 lies in the uncertainty of next-token predictions, and that interventions by System 2 are crucial for supporting System 1.

Hallucination
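
A hedged sketch of uncertainty-gated collaboration along these lines, where the small model decodes until its next-token entropy crosses a threshold; the model interfaces and the threshold value are assumptions, not the paper's method:

```python
import math

def entropy(probs):
    """Shannon entropy of a next-token distribution (in nats)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def collaborative_step(small_lm, large_lm, context, threshold=2.0):
    """One decoding step: the small model ("System 1") proposes a
    distribution; if it is too uncertain, the large model ("System 2")
    intervenes.  `small_lm`/`large_lm` are assumed to map a context to a
    next-token probability distribution (dict: token -> prob)."""
    probs = small_lm(context)
    if entropy(list(probs.values())) > threshold:
        probs = large_lm(context)  # System 2 takes over on hard tokens
    return max(probs, key=probs.get)
```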

CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction Following

no code implementations · 5 Mar 2024 · Kaiyan Zhang, Jianyu Wang, Ermo Hua, Biqing Qi, Ning Ding, Bowen Zhou

With the advancement of language models (LMs), their exposure to private data is increasingly inevitable, and their deployment (especially of smaller models) on personal devices such as PCs and smartphones has become a prevailing trend.

Instruction Following

Momentum Approximation in Asynchronous Private Federated Learning

no code implementations · 14 Feb 2024 · Tao Yu, Congzheng Song, Jianyu Wang, Mona Chitnis

Asynchronous protocols have been shown to improve the scalability of federated learning (FL) with a massive number of clients.

Federated Learning

Wasserstein Nonnegative Tensor Factorization with Manifold Regularization

no code implementations · 3 Jan 2024 · Jianyu Wang, Linruize Tang

Nonnegative tensor factorization (NTF) has become an important tool for feature extraction and part-based representation that preserves the intrinsic structural information of nonnegative high-order data.

PLGSLAM: Progressive Neural Scene Representation with Local to Global Bundle Adjustment

no code implementations · CVPR 2024 · Tianchen Deng, Guole Shen, Tong Qin, Jianyu Wang, Wentao Zhao, Jingchuan Wang, Danwei Wang, Weidong Chen

To this end, we introduce PLGSLAM, a neural visual SLAM system capable of high-fidelity surface reconstruction and robust camera tracking in real-time.

Surface Reconstruction

SeaLLMs -- Large Language Models for Southeast Asia

1 code implementation · 1 Dec 2023 · Xuan-Phi Nguyen, Wenxuan Zhang, Xin Li, Mahani Aljunied, Zhiqiang Hu, Chenhui Shen, Yew Ken Chia, Xingxuan Li, Jianyu Wang, Qingyu Tan, Liying Cheng, Guanzheng Chen, Yue Deng, Sen Yang, Chaoqun Liu, Hang Zhang, Lidong Bing

Despite the remarkable achievements of large language models (LLMs) in various tasks, there remains a linguistic bias that favors high-resource languages, such as English, often at the expense of low-resource and regional languages.

Instruction Following

FedHyper: A Universal and Robust Learning Rate Scheduler for Federated Learning with Hypergradient Descent

no code implementations · 4 Oct 2023 · Ziyao Wang, Jianyu Wang, Ang Li

The theory of federated learning (FL) is evolving rapidly, but its practical application still faces a series of intricate challenges, among which hyperparameter optimization is one of the most critical.

Federated Learning, Hyperparameter Optimization
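
A minimal sketch of hypergradient descent on a learning rate, the general mechanism the title refers to; the update shown is the classic hypergradient step (increase the rate when successive gradients align), not necessarily FedHyper's exact scheduler:

```python
import numpy as np

def hypergradient_lr_update(lr, grad_t, grad_prev, beta=1e-3):
    """One hypergradient-descent step on the learning rate:
    d(loss)/d(lr) is approximately -<grad_t, grad_prev>, so the rate
    grows when successive (pseudo-)gradients point the same way and
    shrinks when they oppose.  `beta` is an illustrative meta step size."""
    return lr + beta * float(np.dot(grad_t, grad_prev))
```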

FedDisco: Federated Learning with Discrepancy-Aware Collaboration

1 code implementation · 30 May 2023 · Rui Ye, Mingkai Xu, Jianyu Wang, Chenxin Xu, Siheng Chen, Yanfeng Wang

However, based on our empirical observations and theoretical analysis, we find that weighting aggregation by dataset size alone is not optimal, and that the discrepancy between local and global category distributions can serve as a beneficial, complementary indicator for determining aggregation weights.

Federated Learning
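
A hedged NumPy sketch of discrepancy-aware aggregation weights; the exact combination of dataset size and distribution discrepancy here is illustrative, not FedDisco's published rule:

```python
import numpy as np

def disco_weights(sizes, local_dists, global_dist, a=0.5):
    """Aggregation weights that discount clients whose local category
    distribution deviates from the global one (L2 discrepancy).
    `sizes[k]` is client k's dataset size; each distribution is an array
    summing to 1.  The trade-off coefficient `a` is illustrative."""
    sizes = np.asarray(sizes, dtype=float)
    d = np.array([np.linalg.norm(p - global_dist) for p in local_dists])
    w = np.maximum(sizes / sizes.sum() - a * d, 0.0)  # clip at zero
    return w / w.sum()
```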

Non-Line-of-Sight Imaging With Signal Superresolution Network

no code implementations · CVPR 2023 · Jianyu Wang, Xintong Liu, Leping Xiao, Zuoqiang Shi, Lingyun Qiu, Xing Fu

This paper proposes a general learning-based pipeline for increasing imaging quality with only a few scanning points.

Few-shot Non-line-of-sight Imaging with Signal-surface Collaborative Regularization

no code implementations · CVPR 2023 · Xintong Liu, Jianyu Wang, Leping Xiao, Xing Fu, Lingyun Qiu, Zuoqiang Shi

In this work, we propose a signal-surface collaborative regularization (SSCR) framework that provides noise-robust reconstructions with a minimal number of measurements.

Autonomous Driving, Bayesian Inference

Robust Manifold Nonnegative Tucker Factorization for Tensor Data Representation

no code implementations · 8 Nov 2022 · Jianyu Wang, Linruize Tang, Jie Chen, Jingdong Chen

Nonnegative Tucker factorization (NTF) minimizes the Euclidean distance or Kullback-Leibler divergence between the original data and its low-rank approximation; this objective often suffers from gross corruptions or outliers and neglects the manifold structure of the data.
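
For context, the standard third-order nonnegative Tucker factorization with a Euclidean fit solves the problem below; the paper's robust, manifold-regularized variant adds further terms to this baseline:

```latex
\min_{\mathcal{G},\,U^{(1)},\,U^{(2)},\,U^{(3)} \,\ge\, 0}
\left\lVert \mathcal{X}
  - \mathcal{G} \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)}
\right\rVert_F^2
```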

Non-line-of-sight imaging with arbitrary illumination and detection pattern

no code implementations · 1 Nov 2022 · Xintong Liu, Jianyu Wang, Leping Xiao, Zuoqiang Shi, Xing Fu, Lingyun Qiu

Non-line-of-sight (NLOS) imaging aims at reconstructing targets obscured from the direct line of sight.

Autonomous Driving

Where to Begin? On the Impact of Pre-Training and Initialization in Federated Learning

1 code implementation · 14 Oct 2022 · John Nguyen, Jianyu Wang, Kshitiz Malik, Maziar Sanjabi, Michael Rabbat

Surprisingly, we also find that starting federated learning from a pre-trained initialization reduces the effect of both data and system heterogeneity.

Federated Learning

FedFM: Anchor-based Feature Matching for Data Heterogeneity in Federated Learning

no code implementations · 14 Oct 2022 · Rui Ye, Zhenyang Ni, Chenxin Xu, Jianyu Wang, Siheng Chen, Yonina C. Eldar

This method attempts to mitigate the negative effects of data heterogeneity in FL by aligning each client's feature space.

Federated Learning
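
A minimal sketch of anchor-based feature matching, assuming server-shared per-class anchors; the squared-distance penalty is one natural choice, not necessarily FedFM's exact loss:

```python
import numpy as np

def anchor_matching_loss(features, labels, anchors):
    """Mean squared distance between each sample's feature and the shared
    anchor of its class, pulling every client's feature space toward
    common per-class anchors.  `features` is (n, d), `labels` is (n,),
    and `anchors[c]` is the server-shared anchor of class c."""
    diffs = features - anchors[labels]  # broadcast per-sample anchors
    return float(np.mean(np.sum(diffs ** 2, axis=1)))
```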

On the Unreasonable Effectiveness of Federated Averaging with Heterogeneous Data

no code implementations · 9 Jun 2022 · Jianyu Wang, Rudrajit Das, Gauri Joshi, Satyen Kale, Zheng Xu, Tong Zhang

Motivated by this observation, we propose a new quantity, average drift at optimum, to measure the effects of data heterogeneity, and explicitly use it to present a new theoretical analysis of FedAvg.

Federated Learning
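
One hedged way to render such a quantity (the paper's formal definition may normalize differently): run each of the M clients' local update procedure from the global optimum x* and measure the norm of the averaged displacement,

```latex
\rho \;=\; \Bigl\lVert \frac{1}{M} \sum_{i=1}^{M}
  \bigl( x^{\ast} - \mathrm{LocalUpdate}_i(x^{\ast}) \bigr) \Bigr\rVert ,
```

which is zero when local updates leave the global optimum fixed on average and grows with data heterogeneity.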

FedLite: A Scalable Approach for Federated Learning on Resource-constrained Clients

no code implementations · 28 Jan 2022 · Jianyu Wang, Hang Qi, Ankit Singh Rawat, Sashank Reddi, Sagar Waghmare, Felix X. Yu, Gauri Joshi

In classical federated learning, the clients contribute to the overall training by communicating local updates for the underlying model on their private data to a coordinating server.

Federated Learning

Personalized Federated Learning for Heterogeneous Clients with Clustered Knowledge Transfer

no code implementations · 16 Sep 2021 · Yae Jee Cho, Jianyu Wang, Tarun Chiruvolu, Gauri Joshi

Personalized federated learning (FL) aims to train models that perform well for individual clients whose data and systems are highly heterogeneous.

Personalized Federated Learning, Transfer Learning

DualVGR: A Dual-Visual Graph Reasoning Unit for Video Question Answering

1 code implementation · 10 Jul 2021 · Jianyu Wang, Bing-Kun Bao, Changsheng Xu

However, existing graph-based methods fail to perform multi-step reasoning well, neglecting two properties of VideoQA: (1) even for the same video, different questions may require different amounts of video clips or objects to infer the answer with relational reasoning; (2) during reasoning, appearance and motion features exhibit a complicated interdependence, being both correlated and complementary to each other.

Graph Attention, Question Answering, +3

Local Adaptivity in Federated Learning: Convergence and Consistency

no code implementations · 4 Jun 2021 · Jianyu Wang, Zheng Xu, Zachary Garrett, Zachary Charles, Luyang Liu, Gauri Joshi

Popular optimization algorithms of FL use vanilla (stochastic) gradient descent for both local updates at clients and global updates at the aggregating server.

Federated Learning

Deep NMF Topic Modeling

no code implementations · 24 Feb 2021 · Jianyu Wang, Xiao-Lei Zhang

In this paper, we propose a deep NMF (DNMF) topic modeling framework to alleviate the aforementioned problems.

Client Selection in Federated Learning: Convergence Analysis and Power-of-Choice Selection Strategies

no code implementations · 3 Oct 2020 · Yae Jee Cho, Jianyu Wang, Gauri Joshi

Federated learning is a distributed optimization paradigm that enables a large number of resource-limited client nodes to cooperatively train a model without data sharing.

Distributed Optimization, Federated Learning, +1
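
A minimal sketch of a power-of-choice selection rule of the kind the title names, assuming the server tracks per-client loss estimates; parameter names are illustrative:

```python
import random

def power_of_choice(client_losses, d, m, rng=random):
    """Power-of-choice client selection: draw a candidate set of `d`
    clients uniformly at random, then keep the `m` with the highest
    current local loss (a biased selection that can speed up convergence).
    `client_losses` maps client id -> local loss estimate; requires d >= m."""
    candidates = rng.sample(list(client_losses), d)
    return sorted(candidates, key=client_losses.get, reverse=True)[:m]
```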

Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization

1 code implementation · NeurIPS 2020 · Jianyu Wang, Qinghua Liu, Hao Liang, Gauri Joshi, H. Vincent Poor

In federated optimization, heterogeneity in the clients' local datasets and computation speeds results in large variations in the number of local updates performed by each client in each communication round.
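
A hedged sketch of normalized averaging in the spirit of this paper, where each client's cumulative change is rescaled by its local step count before aggregation so fast clients do not dominate; the exact recombination in the paper may differ:

```python
import numpy as np

def normalized_aggregate(global_model, deltas, steps, weights):
    """Aggregate local changes with per-client step normalization.
    `deltas[k]` is client k's cumulative local change (end - start),
    `steps[k]` its number of local steps, `weights[k]` its data weight.
    Dividing by `steps[k]` removes the bias toward clients that ran
    more local updates; `tau_eff` restores a common effective step count."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    normalized = [d / s for d, s in zip(deltas, steps)]
    tau_eff = float(np.dot(weights, steps))
    update = tau_eff * sum(w * d for w, d in zip(weights, normalized))
    return global_model + update
```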

Slow and Stale Gradients Can Win the Race

no code implementations · 23 Mar 2020 · Sanghamitra Dutta, Jianyu Wang, Gauri Joshi

Distributed stochastic gradient descent (SGD), when run in a synchronous manner, suffers from runtime delays as it waits for the slowest workers (stragglers).

Overlap Local-SGD: An Algorithmic Approach to Hide Communication Delays in Distributed SGD

1 code implementation · 21 Feb 2020 · Jianyu Wang, Hao Liang, Gauri Joshi

In this paper, we propose an algorithmic approach named Overlap-Local-SGD (and its momentum variant) that overlaps communication with computation so as to speed up the distributed training procedure.

Deep topic modeling by multilayer bootstrap network and lasso

no code implementations · 24 Oct 2019 · Jianyu Wang, Xiao-Lei Zhang

Specifically, we first apply the multilayer bootstrap network (MBN), an unsupervised deep model, to reduce the dimensionality of documents, and then use the low-dimensional representations or their clustering results as the target of a supervised Lasso for topic word discovery.

Clustering, Dimensionality Reduction, +1
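
A minimal sketch of the supervised-Lasso stage described here, assuming scikit-learn and a one-hot clustering result; hyperparameters and the positivity constraint are illustrative:

```python
import numpy as np
from sklearn.linear_model import Lasso

def topic_words(doc_term, cluster_onehot, vocab, alpha=0.01, top_k=10):
    """Regress the low-dimensional clustering result onto raw term
    counts; the terms with the largest Lasso coefficients per topic are
    read off as its topic words.  `doc_term` is (n_docs, n_terms) and
    `cluster_onehot` is (n_docs, n_topics)."""
    words = []
    for t in range(cluster_onehot.shape[1]):
        model = Lasso(alpha=alpha, positive=True).fit(doc_term, cluster_onehot[:, t])
        top = np.argsort(model.coef_)[::-1][:top_k]
        words.append([vocab[i] for i in top])
    return words
```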

SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum

1 code implementation · ICLR 2020 · Jianyu Wang, Vinayak Tantia, Nicolas Ballas, Michael Rabbat

We provide theoretical convergence guarantees showing that SlowMo converges to a stationary point of smooth non-convex losses.

Blocking, Distributed Optimization, +3
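
A minimal sketch of a slow-momentum step applied to the averaged model once per communication round; coefficient values are illustrative, and the paper's exact update may differ in scaling:

```python
import numpy as np

def slowmo_step(x_avg, x_prev, momentum_buf, beta=0.7, slow_lr=1.0):
    """Slow momentum update (sketch): after workers average their models,
    treat the round's averaged change as a pseudo-gradient and apply one
    momentum step from the previous anchor `x_prev`.  With beta=0 and
    slow_lr=1 this reduces to plain model averaging."""
    pseudo_grad = x_prev - x_avg                  # direction of progress
    momentum_buf = beta * momentum_buf + pseudo_grad
    x_new = x_prev - slow_lr * momentum_buf
    return x_new, momentum_buf
```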

MATCHA: Speeding Up Decentralized SGD via Matching Decomposition Sampling

4 code implementations · 23 May 2019 · Jianyu Wang, Anit Kumar Sahu, Zhouyi Yang, Gauri Joshi, Soummya Kar

This paper studies the problem of error-runtime trade-off, typically encountered in decentralized training based on stochastic gradient descent (SGD) using a given network.

Bilateral Adversarial Training: Towards Fast Training of More Robust Models Against Adversarial Attacks

1 code implementation · ICCV 2019 · Jianyu Wang, Haichao Zhang

To generate the adversarial image, we use one-step targeted attack with the target label being the most confusing class.
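
A hedged PyTorch sketch of the attack-generation step described here: pick the most confusing class (the highest-probability incorrect class) and take one targeted FGSM step toward it; the epsilon value and [0, 1] clamping are assumptions, not the paper's exact settings:

```python
import torch
import torch.nn.functional as F

def one_step_targeted_attack(model, x, y, eps=8 / 255):
    """One-step targeted attack toward the most confusing class.
    `x` is a batch of images in [0, 1], `y` the true labels."""
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)
    # Mask out the true class, then the argmax is the most confusing one.
    probs = logits.softmax(dim=1).scatter(1, y.unsqueeze(1), -1.0)
    target = probs.argmax(dim=1)
    loss = F.cross_entropy(logits, target)
    loss.backward()
    # Descend the targeted loss: move the input toward the target class.
    x_adv = (x - eps * x.grad.sign()).clamp(0, 1).detach()
    return x_adv
```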

Adaptive Communication Strategies to Achieve the Best Error-Runtime Trade-off in Local-Update SGD

no code implementations · 19 Oct 2018 · Jianyu Wang, Gauri Joshi

Large-scale machine learning training, in particular distributed stochastic gradient descent, needs to be robust to inherent system variability such as node straggling and random communication delays.

Cooperative SGD: A Unified Framework for the Design and Analysis of Communication-Efficient SGD Algorithms

no code implementations · 22 Aug 2018 · Jianyu Wang, Gauri Joshi

Communication-efficient SGD algorithms, which allow nodes to perform local updates and periodically synchronize local models, are highly effective in improving the speed and scalability of distributed SGD.

Visual Concepts and Compositional Voting

no code implementations · 13 Nov 2017 · Jianyu Wang, Zhishuai Zhang, Cihang Xie, Yuyin Zhou, Vittal Premachandran, Jun Zhu, Lingxi Xie, Alan Yuille

We use clustering algorithms to study the population activities of the features and extract a set of visual concepts which we show are visually tight and correspond to semantic parts of vehicles.

Clustering, Semantic Part Detection

Unsupervised learning of object semantic parts from internal states of CNNs by population encoding

1 code implementation · 21 Nov 2015 · Jianyu Wang, Zhishuai Zhang, Cihang Xie, Vittal Premachandran, Alan Yuille

We address the key question of how object part representations can be found from the internal states of CNNs that are trained for high-level tasks, such as object classification.

Clustering, Keypoint Detection, +1

Semantic Part Segmentation using Compositional Model combining Shape and Appearance

no code implementations · CVPR 2015 · Jianyu Wang, Alan Yuille

This is more challenging than standard object detection, object segmentation and pose estimation tasks because semantic parts of animals often have similar appearance and highly varying shapes.

Object, object-detection, +4
