Whole-genome sequence data of the proteolytic and bacteriocin producing strain Enterococcus faecalis PK23 isolated from the traditional Halitzia cheese produced in Cyprus

Data Brief. 2021 Sep 30:38:107437. doi: 10.1016/j.dib.2021.107437. eCollection 2021 Oct.

Abstract

Halitzia is a traditional white-brined cheese produced by a limited number of producers in Cyprus. During a survey of the microbiome of several Halitzia samples, we identified a bacterial strain that exhibited enhanced proteolytic activity compared to the other isolates. The strain was studied further and identified as Enterococcus faecalis PK23. We sequenced its whole genome using Illumina technology. Initial sequencing and assembly produced 116 scaffolds with a total length of 3,149,036 bp. Comparison with the available E. faecalis genomes revealed that strain PK23 exhibited high identity to the genome sequence of E. faecalis isolate 26975_2#180, deposited in GenBank as a single complete contig. Of the 116 scaffolds, 106 could be aligned to the genome of isolate 26975_2#180, yielding a chromosomal length of 3,132,784 bp with a GC content of 37.3%. Of the remaining 10 scaffolds, five showed similarity to plasmid sequences. More specifically, scaffold 54 showed high identity with most of plasmid pEF1071 of E. faecalis strain BFE 1071, which carries the gene cluster involved in the biosynthesis of enterocins 1071A and 1071B, while scaffold 77 showed high identity with the entire sequence of the unnamed_5 cryptic plasmid of Enterococcus faecium strain PR05720-3. The other three scaffolds corresponded only to short parts of larger plasmids. The remaining five scaffolds, which could not be related to any plasmid sequence, most probably constitute chromosomal sequences present in strain PK23 but absent from isolate 26975_2#180. Their total length was around 2.7 kb, which does not substantially affect the sequence of the PK23 pseudochromosome. Annotation of the whole-genome sequence of strain PK23 identified 3161 coding sequences and 62 RNA sequences. The results from the Rapid Annotation using Subsystem Technology (RAST) version 2.0 server indicated the presence of seven putative genes related to the Protein Degradation subsystem. This dataset provides a first overview of the proteolytic and bacteriocin-producing properties of E. faecalis PK23. The dataset may also be used in future experiments that could shed light on the adaptation of the strain to the dairy environment and its role in cheese production.
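
The summary figures reported above (number of scaffolds, total length, GC content) are the kind of statistics that can be recomputed from an assembly in FASTA format. The following is a minimal sketch in Python and is not the pipeline used to generate this dataset. The function names `parse_fasta` and `assembly_stats` and the toy sequences are hypothetical, and the sketch assumes that GC content is calculated over unambiguous bases (A, C, G, T) only.

```python
from typing import Dict, Iterable, Iterator, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""  # first token of the header
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def assembly_stats(records: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return scaffold count, total length (bp) and GC percentage."""
    n_scaffolds = 0
    total_length = 0
    gc_bases = 0
    acgt_bases = 0
    for _, seq in records:
        seq = seq.upper()
        n_scaffolds += 1
        total_length += len(seq)  # length includes ambiguous bases such as N
        g_c = seq.count("G") + seq.count("C")
        a_t = seq.count("A") + seq.count("T")
        gc_bases += g_c
        acgt_bases += g_c + a_t  # assumption: N and other codes are excluded
    gc_percent = 100.0 * gc_bases / acgt_bases if acgt_bases else 0.0
    return {
        "scaffolds": n_scaffolds,
        "total_length_bp": total_length,
        "gc_percent": gc_percent,
    }


# Toy input for illustration only; these are not PK23 sequences.
demo = """>scaffold_1
ATGCGC
NNAT
>scaffold_2
GGCCAATT
""".splitlines()

stats = assembly_stats(parse_fasta(demo))
print(stats)
```

For a real assembly, the same functions could be applied to an open file handle, for example `assembly_stats(parse_fasta(open("scaffolds.fasta")))`, where `scaffolds.fasta` is a placeholder file name.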

Keywords: Adaptation; Bacteriocin; Cheese; Enterococcus; Genomics; Lactic acid bacteria; Plasmid; Proteolysis.