Portuguese public procurement data for construction (2015-2022)

Data Brief. 2023 Mar 16;48:109063. doi: 10.1016/j.dib.2023.109063. eCollection 2023 Jun.

Abstract

The Architecture, Engineering and Construction (AEC) sector currently suffers from a significant scarcity of systematised information in databases (DB). This scarcity is a substantial obstacle to implementing, in the sector, new methodologies that have proven highly successful in other industries. It also contrasts with the intrinsic workflow of the AEC sector, which generates a high volume of documentation throughout the construction process. To help address this issue, the present work systematises the data related to the contracting and public tendering procedure in Portugal, summarising the steps to obtain and process this information using scraping algorithms, as well as the subsequent translation of the gathered data into English. The contracting and public tendering procedure is one of the best-documented procedures at the national level, with all its data available as open access. The resulting DB comprises 5214 unique contracts, characterised by 37 distinct properties. This paper identifies future development opportunities that this DB can support, such as the application of descriptive statistical analysis techniques and/or Artificial Intelligence (AI) algorithms, namely Machine Learning (ML) and Natural Language Processing (NLP), to improve construction tendering.

Keywords: Artificial intelligence; Construction; Contract awarding; Database; Machine learning; Natural language processing; Public procurement; Scraping algorithm.