
Data: Cleaning up the grid

Our source. SkyHarvester uses the daily gridded precipitation dataset from the IRI/LDEO Climate Data Library. The dataset is based on global NOAA precipitation gauge stations and has a grid size of 0.5° × 0.5°. It spans longitudes from -165.5° to -50.5° and latitudes from 9.5° to 72.5°, with dates running from January 1, 1979 to December 31, 2014 (36 years).
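Sizing the grid. To get a feel for how much data this is, the sketch below counts grid points and days from the bounds quoted above. It assumes that the stated longitude and latitude limits are themselves grid points and that both end dates are included; the names (`axis_points`, `n_lon`, and so on) are ours, for illustration only.

```python
from datetime import date

# Bounds and spacing as quoted in the text (degrees).
LON_MIN, LON_MAX = -165.5, -50.5
LAT_MIN, LAT_MAX = 9.5, 72.5
STEP = 0.5


def axis_points(lo, hi, step):
    """Count grid points from lo to hi, assuming both endpoints are grid points."""
    return round((hi - lo) / step) + 1


n_lon = axis_points(LON_MIN, LON_MAX, STEP)
n_lat = axis_points(LAT_MIN, LAT_MAX, STEP)

# Inclusive day count, leap days included.
n_days = (date(2014, 12, 31) - date(1979, 1, 1)).days + 1

print(f"{n_lon} x {n_lat} grid points, {n_days} daily values per point")
```

Under those assumptions, each grid point carries one precipitation value per day for the whole 36-year record.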

 

Reason to "clean" it. While gridded precipitation data sets allow for an unprecedented spatial outlook into the viability of rainwater harvesting (RWH) in the US, they also have a number of disadvantages. Because gridded precipitation is interpolated from real station data, it loses many of the inherent characteristics of rainfall patterns. In general, gridding increases the number of wet days by a factor of 1.5 to 3 within the United States, with most of the increase made up of insignificant rainfall amounts. Using the raw gridded precipitation data would therefore lead to spurious reliability values when evaluating RWH systems.

The solution. We came up with a deconvolution mechanism that corrects the data to reflect inherent weather patterns observed on the ground. We used a trial-and-error process to optimize the deconvolution algorithm. The algorithm removes insignificant rainfall amounts that aren't detected by normal weather stations and uses spatial patterns to make decisions on the removal of spurious rainfall values. Our process reduced error to within 10% in most cases.
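What such a filter might look like. The sketch below is not the SkyHarvester algorithm; it is a minimal illustration of the two ideas named above, removing amounts too small for a gauge to register and using neighboring grid cells to decide whether a light value is spurious. The thresholds (`trace_mm`, `light_mm`, `min_wet_neighbors`) and the 8-neighbor rule are hypothetical placeholders, not values from our optimization.

```python
def clean_day(grid, trace_mm=0.25, light_mm=1.0, min_wet_neighbors=3):
    """Return a cleaned copy of one day's precipitation grid (mm/day).

    Illustrative only: all three thresholds are hypothetical, not tuned values.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for i in range(rows):
        for j in range(cols):
            value = grid[i][j]
            # Rule 1: drop amounts assumed too small for a gauge to register.
            if value < trace_mm:
                out[i][j] = 0.0
                continue
            # Rule 2: keep light rain only if enough neighbors are clearly wet.
            if value < light_mm:
                wet_neighbors = 0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        if di == 0 and dj == 0:
                            continue
                        ni, nj = i + di, j + dj
                        inside = 0 <= ni < rows and 0 <= nj < cols
                        if inside and grid[ni][nj] >= light_mm:
                            wet_neighbors += 1
                if wet_neighbors < min_wet_neighbors:
                    out[i][j] = 0.0
    return out


# Light rain next to a wet band is kept; the same value in isolation is removed.
near_storm = clean_day([[5.0, 5.0, 5.0], [0.0, 0.6, 0.0], [0.0, 0.0, 0.0]])
isolated = clean_day([[0.0, 0.0, 0.0], [0.0, 0.6, 0.0], [0.0, 0.0, 0.0]])
```

Neighbor counts are taken from the original grid rather than the partially cleaned one, so the result does not depend on the order in which cells are visited.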

Fraction of wet days (a) pre-deconvolution and (b) post-deconvolution. Note that most of the regional patterns were preserved. See the table below for more detailed results.

Representative results of deconvolution. “wet/total grid” is the percentage of wet days in the original data set; “wet/total real” is the percentage of wet days in weather station data downloaded from KNMI explorer; “wet/total deconvolution” is the percentage of wet days in the deconvoluted data set; “Error Explained” is the share of the gridding error removed by the deconvolution mechanism, calculated as: (wet/total grid − wet/total deconvolution) / (wet/total grid − wet/total real) × 100%.
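The calculation in code. The sketch below implements the "Error Explained" formula exactly as defined in the caption, plus a helper for the wet-day percentage. The function names, the `wet_threshold_mm` parameter (any amount above zero counts as wet by default), and the numbers in the example are our own illustrative assumptions, not entries from the table.

```python
def wet_day_percentage(daily_mm, wet_threshold_mm=0.0):
    """Percentage of days with precipitation above the threshold (assumed wet-day definition)."""
    wet_days = sum(1 for value in daily_mm if value > wet_threshold_mm)
    return 100.0 * wet_days / len(daily_mm)


def error_explained(wet_grid, wet_real, wet_deconv):
    """Error Explained (%) from three wet-day percentages, as defined in the caption."""
    if wet_grid == wet_real:
        raise ValueError("grid and station wet-day percentages are equal; nothing to explain")
    return 100.0 * (wet_grid - wet_deconv) / (wet_grid - wet_real)


# Hypothetical location: 60% wet days in the grid, 30% at the station, 33% after cleaning.
example = error_explained(wet_grid=60.0, wet_real=30.0, wet_deconv=33.0)
print(f"Error explained: {example:.0f}%")
```

A value of 100% means the cleaned data matches the station's wet-day percentage exactly; values above 100% mean the cleaning removed more wet days than the station record supports.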
