New paper! An active learning cloud detection tool to generate reference cloud masks for Sentinel-2. Application to the validation of MAJA, Sen2cor and FMask cloud masks

Olivier Hagolle, 20 February 2019

Example of a reference cloud mask generated by ALCD, compared with the cloud masks generated by three operational processors (Sen2cor, FMask and MAJA). True positive invalid pixels appear in blue, true negatives in green, false negatives in red and false positives in purple.

It is not often that the work of a trainee ends up as a peer-reviewed publication, but Louis Baetens was a brilliant trainee. During a six-month traineeship at CESBIO, funded by CNES, here is what Louis did:

developed an active learning method to generate reference cloud masks for Sentinel-2, using multi-temporal data as input
validated the quality of the produced masks (around 99% overall accuracy)
generated cloud and shadow masks covering 32 entire Sentinel-2 images
produced these same scenes with Sen2cor 2.5.5, FMask 4.0 and MAJA 3.3
evaluated the results using ALCD masks
wrote a report and a user manual for ALCD
released the masks and tools on open access platforms
wrote (with Camille and myself) a scientific publication

The publication has just been released in Remote Sensing:

Baetens, L.; Desjardins, C.; Hagolle, O. Validation of Copernicus Sentinel-2 Cloud Masks Obtained from MAJA, Sen2Cor, and FMask Processors Using Reference Cloud Masks Generated with a Supervised Active Learning Procedure. Remote Sens. 2019, 11, 433.

The remainder of this post provides a plain-language summary (but it is better to read the paper!).

Active learning

Building reference cloud masks manually is a lot of work, so we tried to implement an efficient workflow. We wanted fully classified images, not just a few polygons scattered across the image. We therefore decided to use a machine learning classification method and to build the training database iteratively. This is what "active learning" means. You first provide a few training samples and run a first classification, which comes with a confidence map. In the following iterations, you only have to add samples where the confidence is low or the classification is wrong. This way, most of the samples you provide are meaningful, and fewer samples are needed overall.

With the ALCD framework, and QGIS to select the samples, it took me only two hours per image, and Louis, after some training, was able to produce one mask in an hour (although his eyes were blinking afterwards and he had to rest a little before starting a new one).
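The iterative loop described above can be sketched in a few lines of Python. This is only a toy illustration, not the ALCD code: ALCD relies on OTB for the classification and on QGIS for the sampling, whereas the sketch below swaps in a nearest-centroid classifier on a single made-up feature, and replaces the human operator with a hypothetical `operator` function. All names (`train_centroids`, `classify`, `active_learning`) and the 0.6 threshold are assumptions made for the example.

```python
def train_centroids(samples):
    """Mean feature value per class, from (value, label) training samples."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classify(centroids, value):
    """Return (label, confidence) for one pixel.

    Confidence is the margin between the distances to the two nearest
    centroids: a margin close to zero means the classifier hesitates.
    """
    ranked = sorted(centroids, key=lambda label: abs(value - centroids[label]))
    best, second = ranked[0], ranked[1]
    margin = abs(value - centroids[second]) - abs(value - centroids[best])
    return best, margin


def active_learning(pixels, oracle, seed_indices, n_iterations=5, batch_size=3):
    """Grow the training set with the least confident pixels at each iteration."""
    # The seed samples must contain at least one pixel of each class.
    labelled = {i: oracle(pixels[i]) for i in seed_indices}
    for _ in range(n_iterations):
        centroids = train_centroids([(pixels[i], lab) for i, lab in labelled.items()])
        # Confidence map over the pixels that have not been labelled yet.
        confidence = {i: classify(centroids, v)[1]
                      for i, v in enumerate(pixels) if i not in labelled}
        # Ask the operator to label only the pixels where confidence is lowest.
        for i in sorted(confidence, key=confidence.get)[:batch_size]:
            labelled[i] = oracle(pixels[i])
    centroids = train_centroids([(pixels[i], lab) for i, lab in labelled.items()])
    return centroids, labelled


# Toy "image": 200 pixels with one brightness-like feature between 0 and 1.
pixels = [i / 199 for i in range(200)]


def operator(value):
    """Stand-in for the human operator (hypothetical ground truth)."""
    return "cloud" if value > 0.6 else "clear"


centroids, labelled = active_learning(pixels, operator, seed_indices=[0, 199])
predicted = [classify(centroids, v)[0] for v in pixels]
accuracy = sum(p == operator(v) for p, v in zip(predicted, pixels)) / len(pixels)
print(f"{len(labelled)} labelled samples, accuracy {accuracy:.3f}")
```

In this toy case the decision boundary settles close to the operator's threshold with fewer than twenty labelled pixels, because every new sample is taken where the classifier is least sure.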

Validation

A portion of the samples was kept aside for validation, showing an average accuracy of around 99%. But we also wanted to compare our results with those of other authors, so Louis classified the same images as they did. Our reference cloud masks were compared with the manually drawn polygon samples obtained by Hollstein et al., with an accuracy of 99%, except for a few cases in which our definitions differed (Hollstein's shadows included terrain shadows, while we only wanted to provide cloud shadows).

Hollstein, A.; Segl, K.; Guanter, L.; Brell, M.; Enesco, M. Ready-to-Use Methods for the Detection of Clouds, Cirrus, Snow, Shadow, Water and Clear Sky Pixels in Sentinel-2 MSI Images. Remote Sens. 2016, 8, 666.

Reference cloud masks

A data set of 32 reference cloud masks has been built by Louis and myself (well, 29 by Louis and only 3 by myself, privilege of the supervisor). Here are the 32 masks:

Mosaic of cloud masks generated by ALCD.

Validation of operational cloud masks

For all the reference masks, we computed the performance of three operational processors: Sen2cor, the processor developed and selected by ESA to provide the official Sentinel-2 L2A products; FMask, which is used by USGS to produce Landsat data but is also applicable to Sentinel-2; and of course MAJA, used by THEIA to provide high-quality L2A products. We have long claimed that MAJA performs well for cloud and shadow detection, and we are now happy to prove it. However, while Sen2cor performs much worse, FMask 4.0 is not far behind MAJA. On average, the optimal accuracy obtained when comparing dilated masks of invalid pixels (clouds or shadows) is 91% for MAJA, 90% for FMask and 84% for Sen2cor. Moreover, MAJA, and FMask to a lesser extent, are quite robust and their performance is fairly constant, whereas the performance of Sen2cor varies widely from one case to another, as can be seen in the next figure.

	MAJA	Sen2cor	FMask
Average overall accuracy for invalid/valid pixel classification	91%	84%	90%

91% may seem quite a low number: it corresponds to 9% of errors, and this should push us to do more research on improving the tuning of the processors' parameters, or on introducing new methods such as machine learning. But these 9% of pixels often correspond to pixels for which the operator also had difficulty deciding whether there was really a cloud or not. Moreover, the selected scenes contain a mixture of cloud-free and cloudy pixels, and are therefore more complex than fully clear or fully cloudy images. The real performance should therefore be a little better.
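For readers who want to see what "overall accuracy for invalid/valid pixel classification" means in practice, here is a minimal sketch that compares a reference mask with a processor mask, pixel by pixel. It is not the evaluation code used for the paper: the function names are made up for the example, the tiny 4 x 4 masks are invented, and the dilation (a square window with a radius of one pixel, applied to both masks) is only an assumption about how masks could be dilated.

```python
def dilate(mask, radius=1):
    """Binary dilation of a 2-D mask with a square (2 * radius + 1) window."""
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                # Mark every neighbour inside the window, clipped to the image.
                for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                        out[rr][cc] = True
    return out


def confusion_counts(reference, predicted):
    """Count TP, TN, FP and FN, where "positive" means invalid (cloud or shadow)."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for ref_row, pred_row in zip(reference, predicted):
        for ref, pred in zip(ref_row, pred_row):
            if ref and pred:
                counts["TP"] += 1   # invalid in both masks
            elif not ref and not pred:
                counts["TN"] += 1   # valid in both masks
            elif pred:
                counts["FP"] += 1   # processor flags a valid pixel
            else:
                counts["FN"] += 1   # processor misses an invalid pixel
    return counts


def overall_accuracy(counts):
    """Share of pixels on which the two masks agree."""
    return (counts["TP"] + counts["TN"]) / sum(counts.values())


# Invented 4 x 4 masks: 1 = invalid (cloud or shadow), 0 = valid.
reference = [[1, 1, 0, 0],
             [1, 1, 0, 0],
             [0, 0, 0, 0],
             [0, 0, 0, 1]]
processor = [[1, 1, 0, 0],
             [1, 0, 0, 0],
             [0, 0, 0, 0],
             [0, 0, 1, 1]]

counts = confusion_counts(reference, processor)
print(counts, overall_accuracy(counts))

# Same comparison after dilating both masks (assumed radius of one pixel).
dilated_counts = confusion_counts(dilate(reference), dilate(processor))
print(dilated_counts, overall_accuracy(dilated_counts))
```

The four counts are the same categories as the colours of the figure at the top of this post: true positives, true negatives, false negatives and false positives.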

Access to data sets and ALCD tools

The 32 cloud masks generated so far with ALCD are available on the Zenodo DOI server:

Louis Baetens, & Olivier Hagolle. (2018). Sentinel-2 reference cloud masks generated by an active learning method [Data set]. Zenodo. http://doi.org/10.5281/zenodo.1460961

Access to ALCD tool

Using ALCD requires installing OTB, as well as QGIS to select the samples. The ALCD software can be downloaded from a GitHub repository. ALCD has already been reused by CNES colleagues to detect water surfaces and to demonstrate active learning procedures.
