Open Source Science
2021-09-30, 15:00–15:30, Microsoft AI for Earth

The core tools of geospatial science (data, software, and computers) are undergoing a rapid and historic evolution, changing what questions scientists ask and how they find answers. This shift is fueled by developments in the open source software community. The open source ecosystem now supports and deeply influences how science is done and how scientists think about it. Advanced open source software tools are enabling new data formats that are optimized for cloud storage, which in turn allow rapid analysis of multi-petabyte datasets. Open source cloud-based data science platforms, accessed through a web browser, are enabling advanced, collaborative, interdisciplinary science to be performed wherever scientists can connect to the internet. Increasing amounts of data and computational power in the cloud are unlocking new approaches to data-driven discovery. For the first time, it is truly feasible for geospatial and other scientists to bring their analysis to the data without specialized cloud computing knowledge. In practice, these changes vastly shrink the time scientists spend acquiring and processing data, freeing up more time for science. This paradigm shift is lowering the barrier to entry, expanding the science community, and increasing opportunities for collaboration, while promoting scientific innovation, transparency, and reproducibility. Together, these changes are increasing the speed of science, broadening the range of questions science can answer, and expanding participation in science.



Authors and Affiliations

Gentemann, C. L. (1), Holdgraf, C. (2,3), Abernathey, R. (2,4), Crichton, D. (5), Colliander, J. (2,6,7), Kearns, E. J. (8), Panda, Y. (2), Signell, R. P. (9)

1 Farallon Institute, Petaluma, CA
2 2i2c, Berkeley, CA
3 International Computer Science Institute, Berkeley, CA
4 Lamont Doherty Earth Observatory of Columbia University, Palisades, NY
5 Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA
6 Pacific Institute for the Mathematical Sciences, Vancouver, BC, Canada
7 University of British Columbia, Vancouver, BC, Canada
8 First Street Foundation, Brooklyn, NY
9 US Geological Survey, Woods Hole, MA

Track

Use cases & applications

Topic

Open and Reproducible Science

Level

1 - Beginners. No specific knowledge is required.

Language of the Presentation

English

I have worked for over 25 years on retrievals of ocean temperature from space and on using that data to understand how the ocean impacts our lives. I have served on NOAA’s Science Advisory Board and as co-chair of a standing committee for the National Academy of Sciences.