Metagenomic datasets of air samples collected during episodes of severe smoke-haze in Malaysia

Data Brief. 2021 May 9:36:107124. doi: 10.1016/j.dib.2021.107124. eCollection 2021 Jun.

Abstract

Transboundary smoke-haze emissions from land and forest fires have recurred annually in Southeast Asia during the dry period (June to October) over the past few decades. Hazardous air quality has been recorded in Malaysia during these episodes. Agricultural practices such as slash-and-burn of biomass, together with peat fires, particularly in Sumatera and Kalimantan, Indonesia, have been implicated as the major causes of the haze. Past findings have shown that a diversity of microbes can thrive in air, including air polluted by smoke-haze. In this study, metagenomic data were generated to reveal the diversity of microorganisms in air on days with and without haze. Air samples were collected during one non-haze period (2013A01) and two haze periods (2013A04 and 2013A05) in June 2013. DNA extracted from the samples was subjected to Multiple Displacement Amplification and then to whole genome sequencing (Next Generation Sequencing) on the HiSeq 2000 platform. Extensive bioinformatic analyses of the raw sequence data then followed. Raw reads from these six air samples were deposited in the NCBI SRA database under BioProject PRJNA662021 with accession numbers SRX9087478, SRX9087479 and SRX9087480.

Keywords: Forest fires; Haze samples; Multiple Displacement Amplification; NGS.