Putting machine learning to work for development
In their effort to create a machine-learning tool they hope will encourage more and better monitoring of the United Nations’ 17 Sustainable Development Goals (SDGs), a team of researchers led in part by Stanford University computer science PhD candidate Chenlin Meng, BS ’20, PhD ’25, collaborated with the Sweden-based mapping platform Mapillary to download street-level images from 48 countries.
It was, Meng explains, “a huge” amount of data.
The images were used by Meng and her team in combination with demographic and health surveys to help measure—using novel algorithms and machine-learning models—poverty levels for more than two million households. Poverty is one of seven SDGs the researchers set out to measure with their tool, which they named SustainBench after the SDGs and the benchmarking their datasets facilitate.
SustainBench is a collection of 15 benchmark tasks spread across the seven SDGs that will provide consistent evaluation metrics for the machine learning and development communities. The work is part of the Stanford King Center on Global Development’s Data for Development initiative (DDI), which explores how to use data to address global poverty. USAID and the Global Innovation Fund are also funders.
Meng’s advisor, Assistant Professor of Computer Science Stefano Ermon, compares SustainBench to ImageNet, a well-known benchmark dataset that has been used to train computers to recognize objects like cats and cars.
“Recent progress in machine learning has been driven by the availability of challenging and comprehensive benchmarks,” says Ermon, who is also a faculty leader of DDI. “These benchmarks allow researchers to test their algorithms and approaches on realistic tasks and compare results in a standardized manner. Sustainability applications, however, are rarely considered by machine learning researchers because they often require access to specialized datasets. We hope this will drive innovation on the machine learning modeling side and increase performance of the models.”
SustainBench is designed to lower the barrier to entry for researchers by providing—in one place—the datasets necessary to measure progress on the identified SDGs and to encourage the machine learning community to develop new methods for monitoring the SDGs. For 11 of its 15 tasks, SustainBench is releasing datasets that have not previously been made public.
The point of SustainBench is to allow other researchers to train their own models and validate their findings. On the SustainBench website, Meng and her co-researchers have created a “leaderboard,” which they will use to highlight machine-learning tools that seek to improve upon their model.
“If they can come up with some algorithm that is better, we will add their results,” Meng says. “We cannot solve this problem by ourselves. We hope this can encourage the community to focus more on making progress towards the SDGs.”
In addition to the datasets and website, the researchers have written a paper, “SustainBench: Benchmarks for Monitoring the Sustainable Development Goals with Machine Learning,” that will soon be published as part of a leading machine learning and artificial intelligence conference, the Conference on Neural Information Processing Systems (NeurIPS).
“It’s a great outcome from the publication side,” says Earth System Science Associate Professor Marshall Burke, a DDI faculty leader who is also one of the authors. SustainBench “is a huge step toward organizing machine learning efforts toward development-related outcomes, and, in some sense, a culmination of a lot of the work that DDI has funded. Our hope is that it opens the doors to machine learning and artificial intelligence researchers from around the world to get involved in development work.”
The seven SDGs SustainBench benchmarks are: SDG 1 (No Poverty), SDG 2 (Zero Hunger), SDG 3 (Good Health and Well-Being), SDG 4 (Quality Education), SDG 6 (Clean Water and Sanitation), SDG 13 (Climate Action), and SDG 15 (Life on Land). In the future, the team hopes to add datasets to help researchers monitor the 10 other SDGs.
“Progress towards the SDGs is traditionally monitored by ground surveys and censuses,” Meng explains. “This kind of data collection is very expensive, and many countries go decades between measurements on key SDG indicators. Machine learning provides new tools to help plug those data gaps, including through satellite imagery, social media posts, and mobile phone activity.”
The potential impact of such tools in the fight against global poverty is huge. In the paper, Meng and her coauthors point to recent efforts by governments in Togo, Uganda, and Bangladesh to use machine learning to target economic aid to vulnerable populations during the COVID-19 pandemic.
SustainBench aims to fill a void between academic disciplines: Most machine-learning researchers don’t focus on development issues, and most development researchers don’t understand machine learning.
The King Center’s DDI was the perfect place to develop the idea, given its interdisciplinary approach to tackling global poverty.
“The SDGs are arguably the most urgent challenge facing the world today,” Meng says. “They require domain-specific knowledge from machine learning, computer science, environmental science, and economics. It makes sense that we are working together to figure out ways to provide better insight into monitoring the SDGs.”
In addition to Meng, there are two other first authors on the paper: Sherrie Wang, PhD ’21 (now a postdoctoral fellow at UC Berkeley), and Christopher T. Yeh, MS ’19 (now a PhD candidate at the California Institute of Technology). There are also four joint second authors from Stanford: Center on Food Security and the Environment research data analyst Anne Driscoll; computer science student and research assistant Erik Rozi, ’24; computer science student and researcher Patrick Liu, ’24; and Jihyeon Lee, ’19, MS ’20 (now at Google). Faculty authors include Burke, Ermon, and Earth System Science Professor David Lobell. Like Burke and Ermon, Lobell is a faculty leader of DDI.
Lobell says SustainBench was the “ultimate student-led project.”
“Three veteran students decided to do it, organized a team, and made it happen,” he says. “They all believed in the value of what they were doing. By improving models in this space, the hope is that society can make more informed and better decisions on poverty, hunger, and other sustainability issues.”