An Algorithm for Mining Frequent Approximate Subgraphs with Structural and Label Variations in Graph Collections
Using graphs as a data structure is a simple way to represent relationships between objects. Consequently, it has raised the need for algorithms to process, analyze, and extract meaningful information from graphs. Therefore, frequent subgraph mining (FSM) algorithms have been reported in the literat...
Saved in:
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: |
MDPI AG
2025-07-01
|
Series: | Applied Sciences |
Subjects: | |
Online Access: | https://www.mdpi.com/2076-3417/15/14/7880 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Using graphs as a data structure is a simple way to represent relationships between objects. Consequently, it has raised the need for algorithms to process, analyze, and extract meaningful information from graphs. Therefore, frequent subgraph mining (FSM) algorithms have been reported in the literature to discover interesting, unexpected, and useful patterns in graph databases. Frequent subgraph mining involves discovering subgraphs that appear no less than a user-specified threshold; this can be performed exactly or approximately. Although several algorithms for mining frequent approximate subgraphs exist, mining this type of subgraph in graph collections has scarcely been addressed. Thus, we propose AGCM-SLV, an algorithm for mining frequent approximate subgraphs within a graph collection that allows structural and label variations. Unlike other FSM approaches, our proposed algorithm tracks subgraph occurrences and their structural dissimilarities, allowing user-defined partial similarities between node and edge labels, and captures frequent approximate subgraphs (patterns) that would otherwise be overlooked. Experiments on real-world datasets demonstrate that our algorithm identifies more patterns than the most similar state-of-the-art algorithm with a shorter runtime. We also present experiments in which we add white noise to the graph collection at different levels, revealing that over 99% of the patterns extracted without noise are preserved under noisy conditions, making the proposed algorithm noise-tolerant. |
---|---|
ISSN: | 2076-3417 |