Curating machine learning datasets in international collaborations – case study on the Island of Bali
2021-09-30, 14:30–15:00, Microsoft AI for Earth

State-of-the-art environmental datasets often combine satellite-based remote sensing information with data collected by humans in the field. This poses unique challenges for data collection and curation, especially if these materials are to be made amenable to machine learning processes. The task becomes even more challenging in international collaborations that span language differences, cultural barriers and economic gradients.

This talk will present an overview of ongoing work on the Island of Bali that seeks to build a machine-learning-compatible dataset combining ethnobotanical data collected on the ground with land use data collected via satellite. The project is a collaborative effort between scholars from the US and Indonesia, as well as data collectors on the Island of Bali. Its goal is to use the synergies between remote sensing data and field data to better understand how local communities actually use their land, and how tourism is affecting already limited resources on the island.


Authors and Affiliations

Marc Böhlen, University at Buffalo
Jianqiao Liu, University at Buffalo
Wawan Sujarwo, Indonesian Institute of Sciences
Rajif Iryadi, Indonesian Institute of Sciences

Track

Use cases & applications

Topic

Data collection, data sharing, data science, open data, big data, data exploitation platforms

Level

1 - Beginners. No specific knowledge is required.

Language of the Presentation

English