Fmask (Function of mask) is an automated algorithm for detecting clouds and cloud shadows in Landsat 4–9 (4, 5, 7, 8, and 9) and Sentinel-2 imagery (Figure 1). It processes Landsat 4–9 Collection 2 Level-1 imagery (Digital Number) and Sentinel-2 baseline 4 Level-1C imagery (Top Of Atmosphere reflectance).
Figure 1: Example of Fmask-UPL for Sentinel-2. Image ID: S2B_MSIL1C_20190803T202849_N0208_R114_T10XEF_20190803T221046.
Version 5.0 includes seven cloud detection models that either integrate physical rules with machine learning or use each approach independently (Table 1). In particular, it offers a Physics-Informed Machine Learning (PIML) framework (Figure 2) to enhance cloud detection accuracy, while cloud shadow detection relies on the physical geometric relationship between identified clouds and their corresponding shadows.
Table 1: Cloud detection models. UPL is recommended for Landsat 8-9 and Sentinel-2, while LPL is recommended for Landsat 4-7 (4, 5, and 7).
Category | Model Name | Command Option | Pre-trained ML | Physical Rules | Fine-tuned ML | Post-processing |
---|---|---|---|---|---|---|
Baseline | Physical | PHY | No | Yes | No | Spatial morphology |
Baseline | LightGBM | GBM | LightGBM | No | No | No |
Baseline | UNet | UNet | UNet | No | No | No |
Simple Combination | LPL | LPL | LightGBM | Yes | LightGBM (5% replacement rate) | Spatial morphology |
Simple Combination | UPU | UPU | UNet | Yes | UNet (10 epochs) | No |
Hybrid Combination | LPU | LPU | LightGBM | Yes | UNet (10 epochs) | No |
Hybrid Combination | UPL | UPL | UNet | Yes | LightGBM (5% replacement rate) | Spatial morphology & UNet-overlap |
Note: The machine learning model can be pre-trained using global reference datasets and fine-tuned using “localized” training data generated by physical rules. UNet-overlap means that cloud objects in which no pixel was classified as cloud by the pre-trained UNet model are excluded. Abbreviations: ML: Machine learning.
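The UNet-overlap rule in the note above can be illustrated with a minimal sketch. It is not the Fmask implementation: the function name `unet_overlap_filter`, the plain nested-list masks, and the 8-connected definition of a "cloud object" are all assumptions made for illustration.

```python
from collections import deque


def unet_overlap_filter(candidate, unet_cloud):
    """Drop candidate cloud objects that contain no UNet cloud pixel.

    candidate, unet_cloud: equally sized 2-D lists of 0/1 values.
    Objects are 8-connected components (an assumption of this sketch).
    """
    rows, cols = len(candidate), len(candidate[0])
    seen = [[False] * cols for _ in range(rows)]
    kept = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not candidate[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected cloud object.
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and candidate[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Keep the object only if the UNet flagged at least one of its pixels.
            if any(unet_cloud[y][x] for y, x in pixels):
                for y, x in pixels:
                    kept[y][x] = 1
    return kept


# Two candidate objects; only the upper-left one overlaps a UNet cloud pixel.
candidate = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
unet_cloud = [
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
filtered = unet_overlap_filter(candidate, unet_cloud)
```

In this toy input the lower-right object is removed because the pre-trained UNet marked none of its pixels as cloud, while the upper-left object survives intact.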
Figure 2: Flowchart of physics-informed machine learning (PIML) for cloud detection. The approach utilizes pixel-based LightGBM and CNN-based UNet models. Arrows indicate the processing sequence, which proceeds from the gray arrows to the black arrows. Abbreviations: HOT: Haze Optimized Transformation.
This repository only provides the source code and does not include the integrated global auxiliary datasets or pre-trained machine learning models. To access the complete Fmask package (~3 GB), including all necessary auxiliary data and model files, please download it from the link(s) below:
Version | Download |
---|---|
5.0.0 | Link to Fmask 5.0.0 package |
TBD
To apply Fmask-UPL on a single Landsat 8-9 image (default cloud dilation: 3 pixels):
python fmask.py --imagepath /path/to/image_directory_landsat8-9 --model UPL
To apply Fmask-UPL on a single Sentinel-2 image (recommended cloud dilation: 0 pixels):
python fmask.py --imagepath /path/to/image_directory_Sentinel-2.SAFE --model UPL --dcloud 0
To apply Fmask-LPL on a single Landsat 4-7 image (default cloud dilation: 3 pixels):
python fmask.py --imagepath /path/to/image_directory_landsat4-7 --model LPL
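To process several images in one go, the three commands above can be wrapped in a small driver. The following is a sketch rather than part of Fmask: `choose_settings`, `build_command`, and `run_batch` are hypothetical helpers, and the sketch assumes each image directory is named after its product ID (Sentinel-2 directories end in `.SAFE`; Landsat product IDs carry the satellite number in characters 3-4, e.g. `LC08`, `LE07`). It only uses the `--imagepath`, `--model`, and `--dcloud` options shown above.

```python
import subprocess
import sys
from pathlib import Path


def choose_settings(image_dir):
    """Pick (model, cloud dilation) from the directory name (an assumption)."""
    name = Path(image_dir).name
    if name.endswith(".SAFE"):
        return "UPL", 0          # Sentinel-2: recommended cloud dilation of 0
    if name[2:4] in {"04", "05", "07"}:
        return "LPL", None       # Landsat 4-7: LPL with the default dilation
    return "UPL", None           # Landsat 8-9: UPL with the default dilation


def build_command(image_dir, script="fmask.py"):
    """Assemble the fmask.py call for one image directory."""
    model, dcloud = choose_settings(image_dir)
    cmd = [sys.executable, script, "--imagepath", str(image_dir), "--model", model]
    if dcloud is not None:
        cmd += ["--dcloud", str(dcloud)]
    return cmd


def run_batch(image_dirs, dry_run=True):
    """Print (dry run) or execute one Fmask call per image directory."""
    commands = [build_command(d) for d in image_dirs]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # stop at the first failing image
    return commands


if __name__ == "__main__":
    run_batch([
        "/path/to/LC08_L1TP_048022_20230713_20230724_02_T1",
        "/path/to/S2B_MSIL1C_20190803T202849_N0208_R114_T10XEF_20190803T221046.SAFE",
    ])
```

Running with `dry_run=True` only prints the commands, which is a cheap way to check the model choice before committing to a long batch.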
Option | Short | Description | Default |
---|---|---|---|
`--imagepath` | `-i` | Path to input image directory (Landsat/Sentinel-2). | required |
`--model` | `-m` | Cloud detection model to use (Options shown in Table 1). | UPL |
`--dcloud` | `-c` | Dilation size (in pixels) for cloud mask. | 3 |
`--dshadow` | `-s` | Dilation size (in pixels) for cloud shadow mask. | 5 |
`--dsnow` | `-n` | Dilation size (in pixels) for snow/ice mask. | 0 |
`--output` | `-o` | Directory for saving output. If not provided, results go into the input image directory. | None |
`--skip_existing` | `-s` | Skip processing if results already exist (`yes` or `no`). | no |
`--save_metadata` | `-md` | Save model metadata as CSV. | no |
`--display_fmask` | `-df` | Save and display the Fmask result as a PNG. | no |
`--display_image` | `-di` | Save and display the color composite figure (NGR: NIR-Green-Red and SNG: SWIR1-NIR-Red), cirrus band, and thermal band (if available). | no |
`--print_summary` | `-ps` | Print cloud, shadow, snow, and clear percentage summary. | no |
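To make the dilation sizes more concrete, the sketch below grows a binary mask by a given number of pixels. It is an illustration only: the helper `dilate` is hypothetical, and the square neighborhood it uses is an assumption, since the structuring element Fmask applies is not stated here.

```python
def dilate(mask, size):
    """Grow the 1-pixels of a 2-D 0/1 mask by `size` pixels.

    Uses a (2 * size + 1) square window, which is an assumption of this sketch.
    """
    if size <= 0:
        return [row[:] for row in mask]  # a size of 0 leaves the mask unchanged
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                # Mark every pixel within `size` rows and columns of (r, c).
                for y in range(max(0, r - size), min(rows, r + size + 1)):
                    for x in range(max(0, c - size), min(cols, c + size + 1)):
                        out[y][x] = 1
    return out


cloud = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
print(sum(map(sum, dilate(cloud, 0))))  # 1 pixel: no dilation
print(sum(map(sum, dilate(cloud, 1))))  # 9 pixels: one-pixel buffer around the cloud
```

A larger dilation trades more masked clear pixels for a lower chance of leaving cloud edges unmasked.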
If the tool runs successfully, you will see progress information as shown below:
************************************************
Starting Fmask 5.0.0 with dilating 3 for cloud, 5 for shadow, and 0 for snow
Processing /gpfs/sharedfs1/zhulab/Shi/ProjectCloudDetectionFmask5/HLSDataset/Landsat/LC08_L1TP_048022_20230713_20230724_02_T1 with Fmask-UPL model
>>> loading solar_zenith in radian
>>> loading coastal in toa
>>> loading blue in toa
>>> loading green in toa
>>> loading red in toa
>>> loading nir in toa
>>> loading swir1 in toa
>>> loading swir2 in toa
>>> loading tirs1 in bt
>>> loading tirs2 in bt
>>> loading cirrus in toa
>>> calculating hot
>>> calculating whiteness
>>> calculating ndvi
>>> calculating ndsi
>>> calculating ndbi
>>> calculating sfdi
>>> calculating var_nir
>>> loading dem from gtopo30
>>> loading gswo
>>> loading ['unet'] as base machine learning model
>>> loading lightgbm as tune machine learning model
>>> normalizing the datacube to [-1, 1] with percentiles [1, 99]
>>> classifying image by unet
>>> adjusting physical rules 01/01
>>> cloud probability (TTT) | overlap: 0.198981859 | optimal threshold: 0.025000000
>>> cloud probability (FTT) | overlap: 0.248279496 | optimal threshold: 0.000000000
>>> cloud probability (TFT) | overlap: 0.110536988 | optimal threshold: 0.025000000
>>> cloud probability (FFT) | overlap: 0.248274458 | optimal threshold: 0.000000000
>>> cloud probability (TTF) | overlap: 0.212598233 | optimal threshold: 0.025000000
>>> cloud probability (FTF) | overlap: 0.248252342 | optimal threshold: 0.000000000
>>> optimal cloud probability (TFT) | optimal threshold: 0.03
>>> tuning machine learning model 01/01
>>> training lightgbm 100 tree based on 10001 samples
>>> using 19 predictors: ['coastal', 'blue', 'green', 'red', 'nir', 'swir1', 'swir2', 'tirs1', 'tirs2', 'cirrus', 'hot', 'whiteness', 'ndvi', 'ndsi', 'ndbi', 'sfdi', 'var_nir', 'dem', 'swo']
>>> classifying the image by lightgbm model
>>> stop iterating at the end
>>> postprocessing with morphology&unet-based elimination
>>> masking potential cloud shadow by flood-fill
>>> loading gtopo30-slope
>>> loading gtopo30-aspect
>>> loading sensor_zenith in degree
>>> loading sensor_azimuth in degree
>>> matching cloud shadows
>>> saved fmask layer as geotiff to /gpfs/sharedfs1/zhulab/Shi/ProjectCloudDetectionFmask5/HLSDataset/Landsat/LC08_L1TP_048022_20230713_20230724_02_T1
Finished with 11.90 mins
The tool generates a uint8 GeoTIFF file named after the selected cloud detection model, such as '<image folder name>_UPL.tif'.
Each pixel is classified with one of the following values:
Value | Class | Description |
---|---|---|
0 | Land | Clear land surface |
1 | Water | Clear water surface |
2 | Cloud Shadow | Shadow matched with the detected cloud |
3 | Snow/Ice | Snow- or ice-covered surface |
4 | Cloud | Detected cloud |
255 | Filled | No-data fill (e.g., due to missing input band(s)) |
Note: Water and snow/ice pixels are labeled solely to enhance cloud detection. Their detection accuracy has not been evaluated.
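The class values above are enough to compute a simple per-class summary from the output mask. The sketch below is an assumption-laden illustration, not the tool's `--print_summary` logic: `summarize` is a hypothetical helper, and reporting percentages relative to non-fill pixels is a choice made here.

```python
from collections import Counter

# Class values as listed in the table above.
FMASK_CLASSES = {
    0: "Land",
    1: "Water",
    2: "Cloud Shadow",
    3: "Snow/Ice",
    4: "Cloud",
    255: "Filled",
}
FILL_VALUE = 255


def summarize(pixels):
    """Return the percentage of each class among valid (non-fill) pixels."""
    counts = Counter(int(v) for v in pixels)
    valid = sum(n for value, n in counts.items() if value != FILL_VALUE)
    if valid == 0:
        return {}  # the image contains only fill pixels
    return {
        name: 100.0 * counts[value] / valid
        for value, name in FMASK_CLASSES.items()
        if value != FILL_VALUE
    }


# Toy mask: 10 valid pixels plus 2 fill pixels.
pixels = [0, 0, 0, 0, 1, 4, 4, 4, 2, 3, 255, 255]
summary = summarize(pixels)
print(summary)
```

`summarize` accepts any flat iterable of integers. With a GeoTIFF reader such as rasterio (not required by this sketch), the first band could be passed as `rasterio.open(path).read(1).ravel()`.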
TBD
- Applied Physics-Informed Machine Learning (PIML) framework for cloud detection, as described in Qiu et al., 2025.
- Adapted cloud shadow detection from MATLAB Fmask 4.6 with minor improvements described on this page.
Earlier versions of the Fmask tools offered only a physical-rule-based cloud detection module, programmed in MATLAB. See this page for more details.
We welcome and encourage contributions to Fmask! There are two primary ways to contribute:
- If you run into issues or have suggestions for improving Fmask, we encourage you to open an issue or submit a pull request.
- We are actively collecting examples of images that the current version of Fmask has not processed accurately. If you come across such images, please share the image ID with us on this page. The collected images will be used to refine the underlying machine learning models, improving their accuracy and reliability in future versions.
- False positive errors in cloud detection over bright surfaces. Although the most recent version of Fmask has addressed most of these issues, challenges remain in highly reflective areas, such as high-mountain snow and ice.
- Artifacts in cloud detection under very thin clouds. Thin (cirrus) clouds are more easily identified over bright surfaces, such as buildings and cropland, because their features are more pronounced there.
- Potentially omitted cloud shadows at the image boundary, where the associated clouds lie outside the image extent and are therefore either not identified or difficult to match.

Note: Our team is collecting images with cloud detection issues and will continuously update the machine learning model to make improvements.
Qiu, S., Zhu, Z., Yang, X., Ju, J., Zhou, Q., Neigh, C., Physics-Informed Machine Learning for Cloud Detection in Landsat and Sentinel-2 Imagery, Under review
Qiu, S., et al., Fmask 4.0: Improved cloud and cloud shadow detection in Landsats 4-8 and Sentinel-2 imagery, Remote Sensing of Environment, (2019), doi:10.1016/j.rse.2019.05.024 (paper for version 4.0).
Zhu, Z., Wang, S., and Woodcock, C. E., Improvement and Expansion of the Fmask Algorithm: Cloud, Cloud Shadow, and Snow Detection for Landsats 4-7, 8, and Sentinel 2 images, Remote Sensing of Environment, (2015), doi:10.1016/j.rse.2014.12.014 (paper for version 3.2).
Zhu, Z. and Woodcock, C. E., Object-based cloud and cloud shadow detection in Landsat imagery, Remote Sensing of Environment, (2012), doi:10.1016/j.rse.2011.10.028 (paper for version 1.6).
Qiu, S., et al., Improving Fmask cloud and cloud shadow detection in mountainous areas for Landsats 4–8 images, Remote Sensing of Environment, (2017), doi:10.1016/j.rse.2017.07.002 (paper for Mountainous Fmask (MFmask), which has been integrated into the current Fmask).
Shi Qiu (shi.qiu@uconn.edu) and Zhe Zhu (zhe@uconn.edu)
Global Environmental Remote Sensing Laboratory (GERSL), University of Connecticut, Storrs, USA