Part of Advances in Neural Information Processing Systems 37 (NeurIPS 2024) Datasets and Benchmarks Track
Nikolaos Ioannis Bountos, Maria Sdraka, Angelos Zavras, Andreas Karavias, Ilektra Karasante, Themistocles Herekakis, Angeliki Thanasou, Dimitrios Michail, Ioannis Papoutsis
Global flash floods, exacerbated by climate change, pose severe threats to human life, infrastructure, and the environment. Recent catastrophic events in Pakistan and New Zealand underscore the urgent need for precise flood mapping to guide restoration efforts, understand vulnerabilities, and prepare for future occurrences. While Synthetic Aperture Radar (SAR) remote sensing offers day-and-night, all-weather imaging capabilities, its application in deep learning for flood segmentation is limited by the lack of large annotated datasets. To address this, we introduce Kuro Siwo, a manually annotated multi-temporal dataset, spanning 43 flood events globally. Our dataset maps more than 338 billion $m^2$ of land, with 33 billion designated as either flooded areas or permanent water bodies. Kuro Siwo includes a highly processed product optimized for flash flood mapping based on SAR Ground Range Detected, and a primal SAR Single Look Complex product with minimal preprocessing, designed to promote research on the exploitation of both the phase and amplitude information and to offer maximum flexibility for downstream task preprocessing. To leverage advances in large-scale self-supervised pretraining methods for remote sensing data, we augment Kuro Siwo with a large unlabeled set of SAR samples. Finally, we provide an extensive benchmark, namely BlackBench, offering strong baselines for a diverse set of flood events globally. All data and code are published in our GitHub repository: https://github.com/Orion-AI-Lab/KuroSiwo.