A group of scientists has developed an open-source dataset containing three years of data from Hong Kong’s largest behind-the-meter rooftop solar project. Power generation was recorded at 5-minute intervals and meteorological data at 1-minute intervals.
Scientists from the Hong Kong University of Science and Technology have created a new three-year high-resolution dataset of rooftop PV generation in urban environments.
The open-source dataset includes measured PV generation and associated weather data. Generation data was collected from 60 grid-connected rooftop PV stations on the university campus, while weather data was collected from an on-site station.
“The potential use cases for the dataset could be as follows: comparing the generation efficiency of PV modules with different capacities, module models, optimization types and connection time; calibrating PV generation and forecasting models developed from data-driven or physics-based approaches; developing automatic fault detection algorithms for PV modules; and longitudinal performance degradation analysis for PV systems,” the academics said.
The metered rooftop solar project, managed by the university’s Sustainability/Net-Zero Office, is the largest behind-the-meter solar project in Hong Kong. It has a combined capacity of 2,230.8 kW and produces 3 million kWh of electricity per year.
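For readers who want to put the two headline figures in context, the arithmetic below derives a specific yield and a capacity factor from them. It is only a rough sketch: the function names are illustrative, the annual output is the rounded figure quoted above, a non-leap year is assumed, and the article does not say whether the 2,230.8 kW rating is DC or AC.

```python
# Figures quoted in the article; the annual output is a rounded value.
CAPACITY_KW = 2230.8
ANNUAL_KWH = 3_000_000
HOURS_PER_YEAR = 8760  # assumption: non-leap year


def specific_yield(annual_kwh: float, capacity_kw: float) -> float:
    """Annual energy per unit of installed capacity, in kWh per kW."""
    return annual_kwh / capacity_kw


def capacity_factor(annual_kwh: float, capacity_kw: float,
                    hours: float = HOURS_PER_YEAR) -> float:
    """Actual output divided by the output at full rated power all year."""
    return annual_kwh / (capacity_kw * hours)


if __name__ == "__main__":
    print(f"Specific yield: {specific_yield(ANNUAL_KWH, CAPACITY_KW):.0f} kWh/kW")
    print(f"Capacity factor: {capacity_factor(ANNUAL_KWH, CAPACITY_KW):.1%}")
```

With the quoted numbers this works out to roughly 1,345 kWh per installed kW and a capacity factor of about 15%.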
The electricity is generated by 6,085 PV modules, all of which are covered by the dataset, which spans 2021 to 2023. Some 61.7% of the stations are equipped with panel-level optimizers, and their measurements are collected by both the inverter and the panel-level optimizer. The remaining stations have no panel-level optimizers, and their data is collected only at the inverter level. Generation data was collected at 5-minute intervals.
The meteorological data was collected at 1-minute intervals from a weather station on the east side of campus. The station consists of a 10-meter-high automatic weather tower and an outdoor area with six monitoring sensors. Recorded parameters include irradiance, temperature, relative humidity, sea level pressure, visibility, wind and rainfall.
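Because generation is logged every 5 minutes and weather every minute, anyone combining the two series first has to bring them onto a common time grid. The sketch below shows one possible approach, averaging 1-minute readings into 5-minute bins using only the Python standard library. The function names and the sample values are hypothetical, and it assumes each 5-minute timestamp labels the start of its interval, which the article does not specify.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def floor_to_5min(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 5-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)


def average_to_5min(readings):
    """Average (timestamp, value) pairs into 5-minute bins.

    Readings whose value is None are treated as missing and skipped.
    Returns a dict mapping each bin start to the mean of its values.
    """
    bins = defaultdict(list)
    for ts, value in readings:
        if value is not None:
            bins[floor_to_5min(ts)].append(value)
    return {start: mean(values) for start, values in sorted(bins.items())}


if __name__ == "__main__":
    # Made-up irradiance values for ten consecutive minutes.
    t0 = datetime(2021, 1, 1, 12, 0)
    sample = [(t0 + timedelta(minutes=i), 100.0 + i) for i in range(10)]
    for start, avg in average_to_5min(sample).items():
        print(start.isoformat(), avg)
```

Averaging suits quantities such as irradiance or temperature. Rainfall totals would more plausibly be summed, and wind direction needs circular averaging, so the aggregation should be chosen per parameter.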
“This region has a subtropical climate, with humidity averaging more than 75% and temperatures ranging from 10 degrees Celsius in winter to above 30 degrees Celsius in summer,” the researchers said. “These climate conditions significantly impact the performance of PV systems, leading to variations in efficiency. Higher temperatures can reduce the efficiency of PV panels, while high humidity can lead to dust build-up, further affecting performance. Because the meteorological and solar PV data are captured at this specific location, this may limit the generalizability of models trained on this dataset.”
The PV generation measurements had an accuracy of about 2.5%, based on the device specifications, while the uncertainty of the weather station ranged from about 1% to about 10%, depending on the sensor. The data is classified as grade A, as the missing rate is less than 10%. Missing data can result from communications failures, equipment failures, and data capture errors, among other causes.
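The missing rate mentioned above can be checked for any station by comparing the timestamps actually present against those expected on a regular grid. The following is a minimal sketch under stated assumptions: the function name is hypothetical, a record is expected every 5 minutes around the clock, and the 10% threshold is the only grading criterion taken from the article. If inverters do not report at night, the expected grid would need to be restricted to daylight hours.

```python
from datetime import datetime, timedelta


def missing_rate(observed, start, end, step=timedelta(minutes=5)):
    """Fraction of expected timestamps in [start, end) absent from observed."""
    present = set(observed)
    expected = 0
    missing = 0
    t = start
    while t < end:
        expected += 1
        if t not in present:
            missing += 1
        t += step
    return missing / expected if expected else 0.0


if __name__ == "__main__":
    # One made-up day of 5-minute records with a one-hour outage.
    day_start = datetime(2021, 6, 1)
    day_end = day_start + timedelta(days=1)
    grid = [day_start + timedelta(minutes=5 * i) for i in range(288)]
    observed = grid[:100] + grid[112:]  # 12 records dropped
    rate = missing_rate(observed, day_start, day_end)
    print(f"Missing rate: {rate:.2%}, below 10% threshold: {rate < 0.10}")
```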
“The open-source dataset is divided into two categories: time series data and metadata,” the scientists explained, saying the former includes all PV generation and meteorological data points. “To improve data understanding and enable efficient querying, we developed a Brick model that represents the location, equipment and temporal metadata for PV systems.”
The new dataset was presented in “A three-year high-resolution dataset to support analyses of rooftop solar photovoltaic (PV) power generation,” published in Scientific Data.