Worldwide co-occurrence analysis of 17 species of the genus Brachypodium using data mining

PeerJ. 2019 Jan 14:6:e6193. doi: 10.7717/peerj.6193. eCollection 2019.

Abstract

The co-occurrence of plant species is a fundamental aspect of plant ecology that contributes to understanding ecological processes, including the establishment of ecological communities, and has applications in biological conservation. Apriori algorithms can be used to measure the co-occurrence of species in a spatial distribution given by coordinates. We used occurrence records for 17 species of the genus Brachypodium, downloaded from the Global Biodiversity Information Facility data repository or obtained from bibliographical sources, to test an algorithm with the spatial point process technique used by Silva et al. (2016), generating association rules for co-occurrence analysis. Brachypodium spp. have emerged as an effective model for monocot species and grow in different environments, latitudes, and elevations, thereby representing a wide range of biotic and abiotic conditions that may be associated with adaptive natural genetic variation. We created seven datasets of two, three, four, six, seven, 15, and 17 species in order to test the algorithm with four different distances (1, 5, 10, and 20 km). Several measurements (support, confidence, lift, Chi-square, and p-value) were used to evaluate the quality of the results generated by the algorithm. No negative association rules were created in the datasets, while 95 positive co-occurrence rules were found for datasets with six, seven, 15, and 17 species. Using 20 km in the dataset with 17 species, we found 16 positive co-occurrences involving five species, suggesting that these species are coexisting. These findings are corroborated by the results obtained in the dataset with 15 species, from which two species with broad range distributions present in the previous dataset were removed, yielding seven positive co-occurrences. We found that B. sylvaticum has co-occurrence relations with several species, such as B. pinnatum, B. rupestre, B. retusum, and B. phoenicoides, due to its wide distribution in Europe, Asia, and North Africa.
We demonstrate the utility of the implemented algorithm for analyzing the co-occurrence of 17 species of the genus Brachypodium, with results that agree with distributions observed in nature. Data mining has been applied in the biological sciences, where large amounts of complex and noisy data of unprecedented scale have been generated in recent years. In particular, ecological data analysis represents an opportunity to explore and understand biological systems with data mining and bioinformatics tools.
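As an illustration of the rule measures named in the abstract, the following minimal sketch computes support, confidence, and lift for a rule "species A implies species B" from georeferenced records. It is not the algorithm of Silva et al. (2016) or the authors' implementation. It assumes that each record defines one neighbourhood containing every species recorded within a given radius, that distances are great-circle (haversine) distances, and that support is the fraction of neighbourhoods containing both species, confidence is that count divided by the number of neighbourhoods containing A, and lift is confidence divided by the overall frequency of B. The function names and the toy coordinates are hypothetical, and the Chi-square test and p-value are omitted.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumption of this sketch


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def neighbourhoods(records, radius_km):
    """Build one species set per record: all species within radius_km of it."""
    return [
        {species for species, point in records
         if haversine_km(centre, point) <= radius_km}
        for _, centre in records
    ]


def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    n_a = sum(antecedent in t for t in transactions)
    n_c = sum(consequent in t for t in transactions)
    n_ac = sum(antecedent in t and consequent in t for t in transactions)
    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical toy records: (species label, (latitude, longitude)).
records = [
    ("A", (40.00, -3.0)),
    ("B", (40.01, -3.0)),  # roughly 1.1 km north of the first record
    ("A", (41.00, -3.0)),
    ("C", (45.00, 5.0)),
]

for radius in (1, 5, 10, 20):  # the four distances tested in the study
    support, confidence, lift = rule_metrics(
        neighbourhoods(records, radius), "A", "B")
    print(radius, round(support, 3), round(confidence, 3), round(lift, 3))
```

Under these assumptions, a lift above 1 indicates that the two species share neighbourhoods more often than expected if they were distributed independently (a positive association), while a lift below 1 would indicate a negative association.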

Keywords: Association rules; Bioinformatics; Brachypodium; Co-occurrence analysis; Data mining.

Grants and funding

The authors received no funding for this work.