Interpretable Online Network Dictionary Learning for Inferring Long-Range Chromatin Interactions

16 Dec 2023 · Vishal Rana, Jianhao Peng, Chao Pan, Hanbaek Lyu, Albert Cheng, Minji Kim, Olgica Milenkovic

Dictionary learning (DL) is commonly used in computational biology to tackle ubiquitous clustering problems due to its conceptual simplicity and relatively low computational complexity. However, DL algorithms produce results that lack interpretability and are not optimized for large-scale graph-structured data. We propose a novel DL algorithm called online convex network dictionary learning (online cvxNDL) that can handle extremely large datasets and enables the interpretation of dictionary elements, which serve as cluster representatives, through convex combinations of real measurements. Moreover, the algorithm can be applied to network-structured data via specialized subnetwork sampling techniques. To demonstrate the utility of our approach, we apply online cvxNDL to 3D-genome RNAPII ChIA-Drop data to identify important long-range interaction patterns. ChIA-Drop probes higher-order interactions and produces hypergraphs whose nodes represent genomic fragments and whose hyperedges represent observed physical contacts. Our hypergraph model analysis creates an interpretable dictionary of long-range interaction patterns that accurately represent global chromatin physical contact maps. Using dictionary information, one can also associate the contact maps with RNA transcripts and infer cellular functions. Our results offer two key insights. First, we demonstrate that online cvxNDL retains the accuracy of classical DL methods while simultaneously ensuring unique interpretability and scalability. Second, we identify distinct collections of proximal and distal interaction patterns involving chromatin elements shared by related processes across different chromosomes, as well as patterns unique to specific chromosomes. To associate the dictionary elements with biological properties of the corresponding chromatin regions, we employ Gene Ontology enrichment analysis and perform RNA coexpression studies.
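
The idea that dictionary elements are convex combinations of real measurements can be illustrated with a minimal sketch. This is not the authors' implementation of online cvxNDL: the function names, array shapes, and random toy data below are assumptions made for illustration, and the sketch covers only the convexity constraint, not the online updates or subnetwork sampling. Each dictionary atom is formed as a weighted average of observed samples, with the weights projected onto the probability simplex (nonnegative and summing to one).

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of a 1-D vector onto the probability simplex."""
    u = np.sort(v)[::-1]                 # entries in decreasing order
    css = np.cumsum(u) - 1.0             # shifted cumulative sums
    idx = np.arange(1, v.size + 1)
    rho = idx[u - css / idx > 0][-1]     # number of entries that stay positive
    theta = css[rho - 1] / rho           # shift that makes the result sum to 1
    return np.maximum(v - theta, 0.0)


def convex_dictionary(representatives, raw_weights):
    """Build atoms as convex combinations of real measurements.

    representatives: (d, m) array whose columns are observed samples
                     (hypothetical input for this sketch).
    raw_weights:     (m, k) array of unconstrained weights, one column per atom.
    """
    # Project every column of the weight matrix onto the simplex.
    weights = np.apply_along_axis(project_to_simplex, 0, raw_weights)
    return representatives @ weights, weights


rng = np.random.default_rng(0)
representatives = rng.random((6, 4))     # 4 toy "measurements" of dimension 6
raw_weights = rng.normal(size=(4, 3))    # 3 atoms, weights not yet constrained
atoms, weights = convex_dictionary(representatives, raw_weights)

# Each column of `weights` says how much each real sample contributes to an atom.
print(np.round(weights, 3))
```

Because every atom lies in the convex hull of the representative samples, an atom can be read directly as a mixture of observed data points, which is the sense of interpretability described in the abstract.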
