SATIN: A Multi-Task Metadataset for Classifying Satellite Imagery using Vision-Language Models

Roberts, Jonathan; Han, Kai; Albanie, Samuel

arXiv:2304.11619 (cs)

[Submitted on 23 Apr 2023]

Abstract: Interpreting remote sensing imagery enables numerous downstream applications ranging from land-use planning to deforestation monitoring. Robustly classifying this data is challenging due to the Earth's geographic diversity. While many distinct satellite and aerial image classification datasets exist, no curated benchmark yet suitably covers this diversity. In this work, we introduce SATellite ImageNet (SATIN), a metadataset curated from 27 existing remotely sensed datasets, and comprehensively evaluate the zero-shot transfer classification capabilities of a broad range of vision-language (VL) models on SATIN. We find SATIN to be a challenging benchmark: the strongest method we evaluate achieves a classification accuracy of 52.0%. We provide a public leaderboard to guide and track the progress of VL models in this important domain.
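To make "zero-shot transfer classification" concrete, here is a minimal sketch of how such an evaluation is commonly scored. It is not the authors' evaluation code. It assumes each image and each class prompt has already been mapped to an embedding vector by some VL model, and it predicts the class whose text embedding has the highest cosine similarity with the image embedding. The function names and the toy vectors are hypothetical and chosen only for illustration.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def zero_shot_predict(image_embs, text_embs):
    """For each image, return the index of the most similar class prompt."""
    text_unit = [normalize(t) for t in text_embs]
    preds = []
    for img in image_embs:
        unit = normalize(img)
        sims = [sum(a * b for a, b in zip(unit, t)) for t in text_unit]
        preds.append(max(range(len(sims)), key=sims.__getitem__))
    return preds

def accuracy(preds, labels):
    """Fraction of images whose predicted class matches the label."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Toy, hand-made embeddings (assumption): three class prompts, four images.
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_embs = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.6],
    [0.5, 0.4, 0.1],  # true class 1, but closest to class 0
]
labels = [0, 1, 2, 1]

preds = zero_shot_predict(image_embs, text_embs)
acc = accuracy(preds, labels)
print(preds, acc)
```

No training on the target classes takes place in this setup; only the class prompts change from one dataset to the next, which is what allows a single model to be scored across many constituent datasets.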

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2304.11619 [cs.CV]
	(or arXiv:2304.11619v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2304.11619

Submission history

From: Jonathan Roberts
[v1] Sun, 23 Apr 2023 11:23:05 UTC (8,314 KB)
