Brick Kiln Classification (Bangladesh)
If you have any questions about this dataset, please reach out to Jihyeon Lee (jihyeon@cs.stanford.edu). If you have any questions about the dataloader, please reach out to Erik Rozi (erikrozi@stanford.edu).
Monitoring compliance with environmental regulations is a key step to combat climate change. In South Asia, brick manufacturing is a major source of pollution and carbon emissions, but because it is an informal industry, it is difficult to monitor and regulate, especially for low-income governments [1]. Automatically identifying brick kilns from satellite imagery is a low-cost, scalable approach to monitor sources of pollution and study their effect on nearby populations.
Details
Lee, Brooks, et al. [1] developed a model to classify high-resolution imagery as containing a brick kiln or not (and used gradient attribution to identify the exact location of the kiln). We focus on the classification task, where “no kiln” (class 0) means no kiln is present in the image and “yes kiln” (class 1) means a kiln is present. The high-resolution imagery used in [1] is proprietary and was not released publicly, so we provide a lower-resolution alternative using Sentinel-2 imagery from Google Earth Engine. We retrieve 13 bands, B1 through B12 (13 rather than 12 because B8A is included alongside B8), as documented in the Earth Engine catalog. The imagery is from October 2018 to May 2019, matching the period over which the ground-truth kiln locations were found [1]. There are 6,329 positive examples and 67,284 negative examples in total. We provide an 80-10-10 train, validation, and test split that preserves the relative class proportions across splits. We evaluate models using overall accuracy, precision, recall, and AUC. The data preprocessing pipeline can be found here.
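The class counts above imply a strongly imbalanced task, which is worth keeping in mind when reading accuracy numbers. The following is a minimal sketch of that arithmetic and of what a class-preserving 80-10-10 split looks like in terms of counts. The function name and the rounding rule (floor for train and validation, remainder to test) are assumptions made for illustration; the actual split files shipped with the dataset may round differently.

```python
from fractions import Fraction

N_POS = 6_329    # "yes kiln" (class 1), as stated in the text
N_NEG = 67_284   # "no kiln" (class 0), as stated in the text

# Exact fractions avoid floating-point surprises when flooring.
SPLITS = {
    "train": Fraction(8, 10),
    "val": Fraction(1, 10),
    "test": Fraction(1, 10),
}


def stratified_counts(n_pos, n_neg, splits):
    """Split each class separately so every split keeps the overall class ratio.

    Hypothetical helper: all splits but the last are floored, and the last
    split receives whatever remains so that nothing is dropped.
    """
    names = list(splits)
    counts = {}
    pos_left, neg_left = n_pos, n_neg
    for name in names[:-1]:
        pos = int(n_pos * splits[name])
        neg = int(n_neg * splits[name])
        counts[name] = (pos, neg)
        pos_left -= pos
        neg_left -= neg
    counts[names[-1]] = (pos_left, neg_left)
    return counts


total = N_POS + N_NEG
positive_rate = N_POS / total
# Accuracy of a trivial model that always predicts "no kiln".
majority_baseline_accuracy = N_NEG / total
split_counts = stratified_counts(N_POS, N_NEG, SPLITS)

print(f"positive rate: {positive_rate:.4f}")
print(f"always-negative accuracy: {majority_baseline_accuracy:.4f}")
for name, (pos, neg) in split_counts.items():
    print(f"{name}: {pos} positive, {neg} negative")
```

Because fewer than one image in ten contains a kiln, a model that always predicts class 0 already scores high on overall accuracy, which is one reason precision, recall, and AUC are reported alongside it.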
The images have been stored as hdf5 files, each of which has four keys: bounds, images, indices, and labels.
bounds, size (4,): longitude and latitude of top left corner, longitude and latitude of bottom right corner
images, size (num_images, 64, 64, 13): image data with channels B1 through B12 from Google Earth Engine
indices, size (2,): global x, y location of this image tile
labels, size (1,): 1 if image contains kiln, 0 if not
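To check that a downloaded file matches the layout described above, it can help to list the keys and shapes before writing any training code. The sketch below uses h5py and, since the real files are not bundled here, first writes a small synthetic file with the four documented keys and the documented shapes (with num_images set to 1). The file name, the dtypes, and the placeholder values are all assumptions for illustration, not properties of the real data.

```python
import os
import tempfile

import h5py
import numpy as np


def describe_hdf5(path):
    """Return {key: (shape, dtype)} for each top-level dataset in the file.

    Assumes every top-level key is a dataset (not a group), which matches
    the four-key layout described in the text.
    """
    with h5py.File(path, "r") as f:
        return {key: (f[key].shape, str(f[key].dtype)) for key in f.keys()}


def write_synthetic_file(path):
    """Write a stand-in file with the documented keys; values are placeholders."""
    with h5py.File(path, "w") as f:
        # lon/lat of top-left corner, then lon/lat of bottom-right corner
        f.create_dataset("bounds", data=np.array([90.00, 24.00, 90.01, 23.99]))
        # (num_images, 64, 64, 13) with num_images = 1; float32 is an assumption
        f.create_dataset("images", data=np.zeros((1, 64, 64, 13), dtype=np.float32))
        # global x, y location of the tile
        f.create_dataset("indices", data=np.array([0, 0]))
        # 1 if the image contains a kiln, 0 if not
        f.create_dataset("labels", data=np.array([0]))


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "synthetic_tile.h5")  # hypothetical file name
    write_synthetic_file(demo_path)
    summary = describe_hdf5(demo_path)

for key, (shape, dtype) in summary.items():
    print(f"{key}: shape={shape}, dtype={dtype}")
```

Running describe_hdf5 on a real file from the download is a quick way to confirm the actual shapes and dtypes, which may differ from the placeholders used here.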
In the included CSV file, each row represents an image and records its metadata.
Data Format
Input
64x64 crops of Sentinel-2 imagery from October 2018 - May 2019 covering Bangladesh.
Output
A predicted class of 0 (no kiln) or 1 (yes kiln).
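The four evaluation metrics named in the Details section (overall accuracy, precision, recall, and AUC) can be computed from true labels and model scores as sketched below. This is a dependency-free illustration, not the benchmark's official evaluation code. The 0.5 threshold, the function name, and the toy labels and scores are assumptions. AUC is computed here as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half.

```python
def classification_metrics(y_true, y_score, threshold=0.5):
    """Compute accuracy, precision, recall, and AUC for a binary task.

    y_true: iterable of 0/1 labels (1 = "yes kiln").
    y_score: iterable of scores, higher meaning more likely class 1.
    threshold: cutoff that turns a score into a predicted class (assumed 0.5).
    """
    y_true = list(y_true)
    y_score = list(y_score)
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # Pairwise (Mann-Whitney) form of AUC; quadratic in size, fine for a demo.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if pos and neg:
        wins = sum(
            1.0 if p > n else 0.5 if p == n else 0.0
            for p in pos
            for n in neg
        )
        auc = wins / (len(pos) * len(neg))
    else:
        auc = float("nan")  # undefined when only one class is present

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "auc": auc}


# Toy example: four "no kiln" tiles and two "yes kiln" tiles.
toy_labels = [0, 0, 0, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.6, 0.3, 0.8, 0.4]
metrics = classification_metrics(toy_labels, toy_scores)
print(metrics)
```

For the full dataset, a library implementation that sorts scores instead of comparing all pairs would be the practical choice, since the pairwise loop grows with the product of the two class sizes.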
Dataloader Configuration
To load the Brick Kiln Classification dataset, use brick_kiln in the SustainBench dataloader.
Baseline Model
Documentation about the baseline model can be found here.
Download
The data can be downloaded here.
Notes
Sentinel-2 imagery (10 m resolution) is used as input for this task. We retrieve 13 bands, B1 through B12, as documented in the Google Earth Engine catalog. To use blue, green, and red data, refer to bands B2, B3, and B4, respectively. Labels have been generated automatically from the brick kiln coordinate locations provided in [1].
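As a sketch of how the blue, green, and red bands might be pulled out of a 13-channel tile for viewing, the snippet below selects B4, B3, and B2 by name. It assumes the last axis is ordered B1, B2, ..., B8, B8A, B9, ..., B12, following the catalog listing; that ordering is not stated in this document and should be checked against the preprocessing pipeline. The function names and the percentile stretch are illustrative choices for display only.

```python
import numpy as np

# ASSUMPTION: channel order along the last axis follows the catalog listing.
BAND_NAMES = [
    "B1", "B2", "B3", "B4", "B5", "B6", "B7",
    "B8", "B8A", "B9", "B10", "B11", "B12",
]


def select_rgb(image, band_names=BAND_NAMES):
    """Return an (..., 3) array ordered red, green, blue (B4, B3, B2)."""
    idx = [band_names.index(name) for name in ("B4", "B3", "B2")]
    return image[..., idx]


def stretch_for_display(rgb, low=2, high=98):
    """Rescale to [0, 1] with a percentile stretch; for visualization only."""
    rgb = rgb.astype(np.float32)
    lo, hi = np.percentile(rgb, (low, high))
    return np.clip((rgb - lo) / max(hi - lo, 1e-6), 0.0, 1.0)


# Synthetic tile in which channel k holds the constant value k,
# so the selected channels are easy to verify by eye.
demo_tile = np.broadcast_to(np.arange(13, dtype=np.float32), (64, 64, 13))
rgb = select_rgb(demo_tile)
display = stretch_for_display(rgb)

print(rgb.shape)          # (64, 64, 3)
print(rgb[0, 0].tolist())  # [3.0, 2.0, 1.0] under the assumed band order
```

If the stored channel order turns out to differ, only BAND_NAMES needs to change; the selection logic looks bands up by name instead of hard-coding indices.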
Citation
@article{lee2021scalable,
author = {Lee, Jihyeon and Brooks, Nina R. and Tajwar, Fahim and Burke, Marshall and Ermon, Stefano and Lobell, David B. and Biswas, Debashish and Luby, Stephen P.},
title = {Scalable deep learning to identify brick kilns and aid regulatory capacity},
volume = {118},
number = {17},
elocation-id = {e2018863118},
year = {2021},
doi = {10.1073/pnas.2018863118},
publisher = {National Academy of Sciences},
issn = {0027-8424},
URL = {https://www.pnas.org/content/118/17/e2018863118},
eprint = {https://www.pnas.org/content/118/17/e2018863118.full.pdf},
journal = {Proceedings of the National Academy of Sciences}
}
References
[1] J. Lee, N. R. Brooks, F. Tajwar, M. Burke, S. Ermon, D. B. Lobell, D. Biswas, and S. P. Luby. Scalable deep learning to identify brick kilns and aid regulatory capacity. Proceedings of the National Academy of Sciences, 118(17), 2021. ISSN 0027-8424. doi: 10.1073/pnas.2018863118. URL https://www.pnas.org/content/118/17/e2018863118.