Search Results for author: Jeff M. Phillips

Found 28 papers, 4 papers with code

No Dimensional Sampling Coresets for Classification

no code implementations • 7 Feb 2024 • Meysam Alishahi, Jeff M. Phillips

We refine and generalize what is known about coresets for classification problems via the sensitivity sampling framework.

Classification

On Mergable Coresets for Polytope Distance

no code implementations • 8 Nov 2023 • Benwei Shi, Aditya Bhaskara, Wai Ming Tai, Jeff M. Phillips

We show that a constant-size constant-error coreset for polytope distance is simple to maintain under merges of coresets.

Sketching Multidimensional Time Series for Fast Discord Mining

no code implementations • 5 Nov 2023 • Chin-Chia Michael Yeh, Yan Zheng, Menghai Pan, Huiyuan Chen, Zhongfang Zhuang, Junpeng Wang, Liang Wang, Wei Zhang, Jeff M. Phillips, Eamonn Keogh

In this work, we propose a sketch for discord mining among multi-dimensional time series.

Anomaly Detection Time Series +1

An Efficient Content-based Time Series Retrieval System

no code implementations • 5 Oct 2023 • Chin-Chia Michael Yeh, Huiyuan Chen, Xin Dai, Yan Zheng, Junpeng Wang, Vivian Lai, Yujie Fan, Audrey Der, Zhongfang Zhuang, Liang Wang, Wei Zhang, Jeff M. Phillips

A Content-based Time Series Retrieval (CTSR) system is an information retrieval system for users to interact with time series emerged from multiple domains, such as finance, healthcare, and manufacturing.

Information Retrieval Retrieval +1

For Kernel Range Spaces a Constant Number of Queries Are Sufficient

no code implementations • 28 Jun 2023 • Jeff M. Phillips, Hasan Pourmahmood-Aghababa

For a point set $X$ of size $n$, a query returns a vector of values $R_p \in \mathbb{R}^n$, where the $i$th coordinate $(R_p)_i = K(p, x_i)$ for $x_i \in X$.
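
The query described above can be illustrated with a minimal sketch. The abstract does not fix the kernel $K$, so the Gaussian kernel and its bandwidth `sigma` below are assumptions, and the names `gaussian_kernel` and `kernel_query` are hypothetical helpers chosen for illustration.

```python
import math

def gaussian_kernel(p, x, sigma=1.0):
    """Gaussian kernel K(p, x) = exp(-||p - x||^2 / (2 sigma^2)) (assumed choice of K)."""
    sq_dist = sum((pi - xi) ** 2 for pi, xi in zip(p, x))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_query(p, X, kernel=gaussian_kernel):
    """Return the vector R_p whose i-th coordinate is K(p, x_i) for x_i in X."""
    return [kernel(p, x) for x in X]

# A point set X of size n = 3 and a single query point p.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
R_p = kernel_query((0.0, 0.0), X)
```

Each query returns one value per point of $X$, so `R_p` has length $n$ regardless of the dimension of the points.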

Linear Distance Metric Learning with Noisy Labels

no code implementations • 5 Jun 2023 • Meysam Alishahi, Anna Little, Jeff M. Phillips

In linear distance metric learning, we are given data in one Euclidean metric space and the goal is to find an appropriate linear map to another Euclidean metric space which respects certain distance conditions as much as possible.

Learning with noisy labels Metric Learning

Mitigating Exploitation Bias in Learning to Rank with an Uncertainty-aware Empirical Bayes Approach

no code implementations • 26 May 2023 • Tao Yang, Cuize Han, Chen Luo, Parth Gupta, Jeff M. Phillips, Qingyao Ai

While previous studies have demonstrated the effectiveness of using user behavior signals (e.g., clicks) as both features and labels of LTR algorithms, we argue that existing LTR algorithms that indiscriminately treat behavior and non-behavior signals in input features could lead to suboptimal performance in practice.

Learning-To-Rank Recommendation Systems

Batch Multi-Fidelity Active Learning with Budget Constraints

no code implementations • 23 Oct 2022 • Shibo Li, Jeff M. Phillips, Xin Yu, Robert M. Kirby, Shandian Zhe

However, this method only queries at one pair of fidelity and input at a time, and hence has a risk to bring in strongly correlated examples to reduce the learning efficiency.

Active Learning

Classifying Spatial Trajectories

1 code implementation • 3 Sep 2022 • Hasan Pourmahmood-Aghababa, Jeff M. Phillips

We provide the first comprehensive study on how to classify trajectories using only their spatial representations, measured on 5 real-world data sets.

Self-Adaptable Point Processes with Nonparametric Time Decays

no code implementations • NeurIPS 2021 • Zhimeng Pan, Zheng Wang, Jeff M. Phillips, Shandian Zhe

Specifically, we use an embedding to represent each event type and model the event influence as an unknown function of the embeddings and time span.

Point Processes

Practical and Configurable Network Traffic Classification Using Probabilistic Machine Learning

no code implementations • 10 Jul 2021 • Jiahui Chen, Joe Breen, Jeff M. Phillips, Jacobus Van der Merwe

Network traffic classification that is widely applicable and highly accurate is valuable for many network security and management tasks.

BIG-bench Machine Learning Classification +2

Approximate Maximum Halfspace Discrepancy

no code implementations • 25 Jun 2021 • Michael Matheny, Jeff M. Phillips

For different classes of $\Phi$ we can either provide a $\Omega(|X|^{3/2 - o(1)})$ time lower bound for the exact solution with a reduction to APSP, or an $\Omega(|X| + 1/\varepsilon^{2-o(1)})$ lower bound for the approximate solution with a reduction to 3SUM.

Anomaly Detection

VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations

1 code implementation • 6 Apr 2021 • Archit Rathore, Sunipa Dev, Jeff M. Phillips, Vivek Srikumar, Yan Zheng, Chin-Chia Michael Yeh, Junpeng Wang, Wei Zhang, Bei Wang

To aid this, we present Visualization of Embedding Representations for deBiasing system ("VERB"), an open-source web-based visualization tool that helps the users gain a technical understanding and visual intuition of the inner workings of debiasing techniques, with a focus on their geometric properties.

Decision Making Dimensionality Reduction +3

OSCaR: Orthogonal Subspace Correction and Rectification of Biases in Word Embeddings

1 code implementation • EMNLP 2021 • Sunipa Dev, Tao Li, Jeff M. Phillips, Vivek Srikumar

Language representations are known to carry stereotypical biases and, as a result, lead to biased predictions in downstream tasks.

Word Embeddings

A Deterministic Streaming Sketch for Ridge Regression

1 code implementation • 5 Feb 2020 • Benwei Shi, Jeff M. Phillips

We provide a deterministic space-efficient algorithm for estimating ridge regression.

regression

Constrained Non-Affine Alignment of Embeddings

no code implementations • 13 Oct 2019 • Yuwei Wang, Yan Zheng, Yanqing Peng, Chin-Chia Michael Yeh, Zhongfang Zhuang, Das Mahashweta, Bendre Mangesh, Feifei Li, Wei Zhang, Jeff M. Phillips

Embeddings are already essential tools for large language models and image analysis, and their use is being extended to many other research domains.

The Kernel Spatial Scan Statistic

no code implementations • 13 Jun 2019 • Mingxuan Han, Michael Matheny, Jeff M. Phillips

Kulldorff's (1997) seminal paper on spatial scan statistics (SSS) has led to many methods considering different regions of interest, different statistical models, and different approximations while also having numerous applications in epidemiology, environmental monitoring, and homeland security.

Epidemiology

The GaussianSketch for Almost Relative Error Kernel Distance

no code implementations • 9 Nov 2018 • Jeff M. Phillips, Wai Ming Tai

We introduce two versions of a new sketch for approximately embedding the Gaussian kernel into Euclidean inner product space.

Closed Form Word Embedding Alignment

no code implementations • 4 Jun 2018 • Sunipa Dev, Safia Hassan, Jeff M. Phillips

We develop a family of techniques to align word embeddings which are derived from different source datasets or created using different mechanisms (e.g., GloVe or word2vec).

Word Embeddings

Simple Distances for Trajectories via Landmarks

no code implementations • 30 Apr 2018 • Jeff M. Phillips, Pingfan Tang

We develop a new class of distances for objects including lines, hyperplanes, and trajectories, based on the distance to a set of landmarks.
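
One simple way to instantiate a landmark-based distance for 2D trajectories is sketched below. It is an illustrative assumption, not necessarily the paper's exact definition: each trajectory (a polyline with at least two waypoints) is mapped to the vector of its minimum distances to a fixed set of landmarks, and two trajectories are compared by the root-mean-square difference of those vectors. The names `point_segment_distance`, `landmark_features`, and `landmark_distance` are hypothetical.

```python
import math

def point_segment_distance(q, a, b):
    """Euclidean distance from point q to the segment from a to b (all 2D)."""
    (qx, qy), (ax, ay), (bx, by) = q, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(qx - ax, qy - ay)
    # Project q onto the line through a and b, clamped to the segment.
    t = ((qx - ax) * dx + (qy - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(qx - (ax + t * dx), qy - (ay + t * dy))

def landmark_features(trajectory, landmarks):
    """Map a polyline to the vector of its minimum distances to each landmark."""
    if len(trajectory) < 2:
        raise ValueError("trajectory needs at least two waypoints")
    segments = list(zip(trajectory, trajectory[1:]))
    return [min(point_segment_distance(q, a, b) for a, b in segments)
            for q in landmarks]

def landmark_distance(traj1, traj2, landmarks):
    """Root-mean-square difference of the two landmark feature vectors."""
    f1 = landmark_features(traj1, landmarks)
    f2 = landmark_features(traj2, landmarks)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(f1, f2)) / len(landmarks))

landmarks = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
traj_a = [(0.0, 0.0), (2.0, 0.0)]   # runs along the x-axis
traj_b = [(0.0, 1.0), (2.0, 1.0)]   # parallel, one unit higher
d_ab = landmark_distance(traj_a, traj_b, landmarks)
```

Because each object is reduced to a fixed-length vector, any standard Euclidean tool (for example, a clustering routine) can then be applied to the feature vectors directly.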

Clustering

Near-Optimal Coresets of Kernel Density Estimates

no code implementations • 6 Feb 2018 • Jeff M. Phillips, Wai Ming Tai

When $d\geq 1/\varepsilon^2$, it is known that the size of coreset can be $O(1/\varepsilon^2)$.

Improved Coresets for Kernel Density Estimates

no code implementations • 11 Oct 2017 • Jeff M. Phillips, Wai Ming Tai

When the dimension $d$ is constant, we demonstrate much tighter bounds on the size of the coreset specifically for Gaussian kernels, showing that it is bounded by the size of the coreset for axis-aligned rectangles.

Coresets for Kernel Regression

no code implementations • 13 Feb 2017 • Yan Zheng, Jeff M. Phillips

Kernel regression is an essential and ubiquitous tool for non-parametric data analysis, particularly popular among time series and spatial data.

regression Time Series +1

The Robustness of Estimator Composition

no code implementations • NeurIPS 2016 • Pingfan Tang, Jeff M. Phillips

And so on, if the composition is of more than two estimators.

Relative Error Embeddings for the Gaussian Kernel Distance

no code implementations • 17 Feb 2016 • Di Chen, Jeff M. Phillips

A reproducing kernel can define an embedding of a data point into an infinite dimensional reproducing kernel Hilbert space (RKHS).

Streaming Kernel Principal Component Analysis

no code implementations • 16 Dec 2015 • Mina Ghashami, Daniel Perry, Jeff M. Phillips

Kernel principal component analysis (KPCA) provides a concise set of basis vectors which capture non-linear structures within large data sets, and is a central tool in data analysis and learning.

Subsampling in Smoothed Range Spaces

no code implementations • 30 Oct 2015 • Jeff M. Phillips, Yan Zheng

We consider smoothed versions of geometric range spaces, so an element of the ground set (e.g., a point) can be contained in a range with a non-binary value in $[0, 1]$.

Frequent Directions : Simple and Deterministic Matrix Sketching

no code implementations • 8 Jan 2015 • Mina Ghashami, Edo Liberty, Jeff M. Phillips, David P. Woodruff

It performs $O(d \times \ell)$ operations per row and maintains a sketch matrix $B \in \mathbb{R}^{\ell \times d}$ such that for any $k < \ell$, $\|A^TA - B^TB \|_2 \leq \|A - A_k\|_F^2 / (\ell-k)$ and $\|A - \pi_{B_k}(A)\|_F^2 \leq \big(1 + \frac{k}{\ell-k}\big) \|A-A_k\|_F^2$.

Data Structures and Algorithms 68W40 (Primary)
