Dataset for WWW landing pages webobject retrieval performance evaluation

Data Brief. 2020 Mar 14;30:105429. doi: 10.1016/j.dib.2020.105429. eCollection 2020 Jun.

Abstract

This dataset describes data obtained from a multi-day World Wide Web (WWW) measurement campaign distributed internationally across multiple Amazon Web Services (AWS) datacentres. The Chrome web browser was controlled by the Selenium framework to make repeated requests to several popular websites; the resulting webobjects were captured by a proxy server, and details about them were stored in the provided SQLite3 databases. A Python script is provided to evaluate the webobjects with respect to their configured and actual expiration times, as part of the more detailed analysis we provide in [1]. Because we provide additional information about each webobject and each website visit over the measurement campaign, researchers and practitioners can readily use this dataset with little effort in their own research, including for avenues of inquiry beyond the webobject expiration times described in [1].
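To illustrate what comparing configured and actual expiration times can involve, the following is a minimal Python sketch and not the script shipped with the dataset. It assumes, purely for illustration, that the configured lifetime is taken from the `max-age` directive of an HTTP `Cache-Control` header and that the actual lifetime is the time until a webobject's content hash first changes across repeated visits. The function names and the `(timestamp, content_hash)` input layout are hypothetical and do not reflect the schema of the provided databases.

```python
import re
from typing import List, Optional, Tuple


def configured_lifetime_seconds(cache_control: str) -> Optional[int]:
    """Return the max-age value (seconds) from a Cache-Control header.

    Returns None when no max-age directive is present. Other sources of
    freshness information (e.g. the Expires header) are ignored here.
    """
    match = re.search(
        r"(?:^|[,\s])max-age\s*=\s*\"?(\d+)", cache_control, re.IGNORECASE
    )
    return int(match.group(1)) if match else None


def actual_lifetime_seconds(
    observations: List[Tuple[int, str]]
) -> Optional[int]:
    """Return seconds from the first observation until the content changes.

    `observations` is a hypothetical list of (timestamp_seconds, content_hash)
    pairs sorted by time. Returns None if the list is empty or the content
    never changed during the observed period.
    """
    if not observations:
        return None
    first_ts, first_hash = observations[0]
    for ts, content_hash in observations[1:]:
        if content_hash != first_hash:
            return ts - first_ts
    return None


if __name__ == "__main__":
    configured = configured_lifetime_seconds("public, max-age=3600")
    actual = actual_lifetime_seconds([(0, "a"), (600, "a"), (1200, "b")])
    print("configured:", configured, "actual:", actual)
```

Under these assumptions, a webobject whose content changes earlier than its configured lifetime would be served stale from a cache, while one that changes much later could have been cached for longer.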

Keywords: Information retrieval; Performance evaluation; Web caching; World Wide Web.