Leveraging Vision Transformers for Enhanced Wildfire Detection and Characterization
In this project, we use the active fire dataset from https://github.com/pereira-gha/activefire (data link) and aim to improve on its published results. We train two Vision Transformer (ViT) networks, Swin-Unet and TransUNet, and one CNN-based UNet. We show that ViTs can outperform well-trained, specialized CNNs at detecting wildfires on a previously published dataset of Landsat-8 imagery (Pereira et al.): one of our ViTs beats the baseline CNN by 0.92%. However, our own implementation of a CNN-based UNet performs best in every category, demonstrating the sustained utility of CNNs for image tasks. Overall, ViTs are comparably capable of detecting wildfires, but a well-tuned CNN remains the strongest technique: our UNet achieves an IoU of 93.58%, roughly 4.58% better than the baseline UNet.
File descriptions
UNet.py: Contains the PyTorch code for the UNet model (a rough sketch of the architecture follows this list).
evaluate.py: Takes a model name and evaluates the saved checkpoint on four metrics: precision, recall, F-score, and IoU (see the metrics sketch below).
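
UNet.py holds the full model; as a rough illustration only, here is a minimal sketch of a UNet-style encoder-decoder in PyTorch. The channel widths, depth, and the 10-band Landsat-8 input assumption are ours and not necessarily the repository's exact configuration.

```python
# Minimal UNet-style encoder-decoder sketch (illustrative, not the repo's exact model).
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    # Two 3x3 convolutions with batch norm and ReLU: the basic UNet block.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class UNet(nn.Module):
    def __init__(self, in_channels=10, out_channels=1):  # 10 bands assumed; adjust to the data used
        super().__init__()
        self.enc1 = double_conv(in_channels, 64)
        self.enc2 = double_conv(64, 128)
        self.enc3 = double_conv(128, 256)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(256, 128, 2, stride=2)
        self.dec2 = double_conv(256, 128)
        self.up1 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec1 = double_conv(128, 64)
        self.head = nn.Conv2d(64, out_channels, 1)

    def forward(self, x):
        # Encoder path with skip connections saved at each resolution.
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        e3 = self.enc3(self.pool(e2))
        # Decoder path: upsample, concatenate the skip, then convolve.
        d2 = self.dec2(torch.cat([self.up2(e3), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)  # raw logits; apply sigmoid + threshold to get a fire mask
```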
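All four metrics reduce to pixel-level true-positive, false-positive, and false-negative counts between the predicted and ground-truth fire masks (e.g. IoU = TP / (TP + FP + FN)). Below is a minimal sketch of how they could be computed; the function name and epsilon smoothing are illustrative, and evaluate.py may differ in detail.

```python
# Illustrative metric computation for binary segmentation masks.
import numpy as np

def segmentation_metrics(pred, target, eps=1e-7):
    """Precision, recall, F-score, and IoU from boolean mask arrays."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    tp = np.logical_and(pred, target).sum()   # fire pixels correctly predicted
    fp = np.logical_and(pred, ~target).sum()  # predicted fire, actually background
    fn = np.logical_and(~pred, target).sum()  # missed fire pixels
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    fscore = 2 * precision * recall / (precision + recall + eps)
    iou = tp / (tp + fp + fn + eps)
    return precision, recall, fscore, iou
```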