Same Stats, Different Graphs: Exploring the Space of Graphs in Terms of Graph Properties

IEEE Trans Vis Comput Graph. 2021 Mar;27(3):2056-2072. doi: 10.1109/TVCG.2019.2946558. Epub 2021 Jan 28.

Abstract

Data analysts commonly utilize statistics to summarize large datasets. While it is often sufficient to explore only the summary statistics of a dataset (e.g., min/mean/max), Anscombe's Quartet demonstrates how such statistics can be misleading. We consider a similar problem in the context of graph mining. To study the relationships between different graph properties, we examine low-order non-isomorphic graphs and provide a simple visual analytics system to explore correlations across multiple graph properties. However, for larger graphs, studying the entire space quickly becomes intractable. We use different random graph generation methods to further look into the distribution of graph properties for higher order graphs and investigate the impact of various sampling methodologies. We also describe a method for generating many graphs that are identical over a number of graph properties and statistics yet are clearly different and identifiably distinct.