For over 20 years, ESIP meetings have brought together the most innovative thinkers and leaders around Earth observation data, thus forming a community dedicated to making Earth observations more discoverable, accessible and useful to researchers, practitioners, policy makers, and the public. The theme of the meeting is Putting Data to Work: Building Public-Private Partnerships to Increase Resilience & Enhance the Socioeconomic Value of Data.

The meeting has now ended. Check out the ESIP Summer Meeting Highlights Webinar and learn how to access session materials at https://www.esipfed.org/collaboration-updates/esip-summer-meeting-2020-recap.
Wednesday, July 22 • 6:00pm - 7:30pm
Plenary: Proliferation of Vocabularies in Solid Earth, Space and Environmental sciences: Which one should I use and which ones can I trust?
The widely accepted FAIR principles require that data be both human- and machine-readable. Using the web to provide access to structured data is now common practice, and the mass-market web is beginning to make use of open standards for structured data, which are folding back into the technical community through initiatives such as ‘Science on Schema.org’. To maximise semantic interoperability, particularly across different domains, we depend on shared terminology that can be understood by both humans and machines and used by multiple communities, ideally in multiple languages.
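
As a rough illustration of what a term that is readable by both humans and machines can look like, the sketch below builds a Schema.org `DefinedTerm` description as JSON-LD. It is only a minimal sketch: the `example.org` URLs, the rock-type vocabulary, and the `describe_term` helper are assumptions made for illustration and do not come from the session or from any real vocabulary service.

```python
import json

# Hypothetical identifier: example.org is a placeholder, not a real vocabulary endpoint.
VOCAB_URL = "https://example.org/vocab/rock-types"


def describe_term(code: str, label: str, definition: str) -> dict:
    """Build a Schema.org DefinedTerm description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "DefinedTerm",
        "@id": f"{VOCAB_URL}/{code}",      # machine-resolvable identifier for the term
        "termCode": code,                   # short code used in datasets
        "name": label,                      # human-readable label
        "description": definition,          # human-readable definition
        "inDefinedTermSet": {
            "@type": "DefinedTermSet",
            "@id": VOCAB_URL,
            "name": "Example rock type vocabulary",
        },
    }


term = describe_term("basalt", "Basalt", "A fine-grained mafic volcanic rock.")
term_jsonld = json.dumps(term, indent=2)
print(term_jsonld)
```

The same record carries a label and definition for people and a stable identifier for software, which is the property the paragraph above asks of shared terminology.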

As more structured data goes online, the codelists and reference vocabularies that support it are proliferating. A lack of mechanisms for discovering existing vocabularies, and limited support for their shared development and governance, mean there is limited motivation for vocabulary reuse. This leads to duplication of effort, resulting in multiple vocabularies with similar scope. The quality of these online vocabularies is highly variable. In this context, how does the user determine which vocabularies are trustworthy and fit for their purpose? Important considerations are:
  1. is the vocabulary developed by a recognised organisation?
  2. does it have scientifically valid definitions?
  3. are the definitions provided in a useful form?
  4. is the vocabulary aligned with related vocabularies?
  5. is there a plan for sustaining the vocabulary, both its semantics and its hosting arrangements, that will persist as long as the data connected to it?
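
The five considerations above could be recorded as a simple checklist. The sketch below is one possible encoding, not an agreed assessment method: the `VocabularyAssessment` class, its field names, and the `unmet_criteria` helper are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class VocabularyAssessment:
    """One yes/no answer per consideration in the list above (hypothetical field names)."""
    recognised_organisation: bool   # 1. developed by a recognised organisation
    valid_definitions: bool         # 2. scientifically valid definitions
    useful_definition_form: bool    # 3. definitions provided in a useful form
    aligned_with_related: bool      # 4. aligned with related vocabularies
    sustainability_plan: bool       # 5. plan for sustaining semantics and hosting


def unmet_criteria(assessment: VocabularyAssessment) -> List[str]:
    """Return the names of the considerations answered 'no', in list order."""
    return [f.name for f in fields(assessment) if not getattr(assessment, f.name)]


# Illustrative answers for an imaginary vocabulary, not a real evaluation.
candidate = VocabularyAssessment(
    recognised_organisation=True,
    valid_definitions=True,
    useful_definition_form=False,
    aligned_with_related=True,
    sustainability_plan=False,
)
print(unmet_criteria(candidate))
```

Listing the unmet considerations, rather than collapsing them into a single score, keeps the judgement of fitness for purpose with the user.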

Nevertheless, there are cases where it is necessary to manage a vocabulary locally, even if it has the same scope as existing vocabularies. Under these circumstances we need strategies and tools for vocabulary harmonization, in order to support interoperability between applications using them.
Guidelines are required both for users, on which vocabularies are best to use, and for communities and terminology providers, on when to develop a new vocabulary and how to govern its development.
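
One common harmonization strategy is a crosswalk that maps locally managed terms to a reference vocabulary. The sketch below assumes a small in-memory table labelled with SKOS mapping property names (`skos:exactMatch`, `skos:closeMatch`, `skos:broadMatch`); the `local:` terms, the `example.org` URIs, and the `harmonize` function are hypothetical and stand in for whatever tooling a community actually adopts.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# Hypothetical crosswalk: local term -> (SKOS mapping relation, reference term URI).
# The example.org URIs are placeholders, not real vocabulary entries.
CROSSWALK: Dict[str, Tuple[str, str]] = {
    "local:granite": ("skos:exactMatch", "https://example.org/ref/granite"),
    "local:lava-rock": ("skos:closeMatch", "https://example.org/ref/basalt"),
    "local:pumice": ("skos:broadMatch", "https://example.org/ref/volcanic-rock"),
}


def harmonize(
    local_term: str,
    accepted: FrozenSet[str] = frozenset({"skos:exactMatch"}),
) -> Optional[str]:
    """Return the reference URI for a local term if its mapping relation is accepted.

    Returns None when the term is unmapped or mapped only by a relation
    that the caller has not accepted.
    """
    mapping = CROSSWALK.get(local_term)
    if mapping is None:
        return None
    relation, reference_uri = mapping
    return reference_uri if relation in accepted else None


print(harmonize("local:granite"))
print(harmonize("local:lava-rock"))
print(harmonize("local:lava-rock", frozenset({"skos:exactMatch", "skos:closeMatch"})))
```

Making the accepted relations an explicit parameter lets each application decide how loose a match it will tolerate, rather than hiding that decision in the mapping table.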

Note: For this session, we are using the word ‘vocabulary’ to mean any semantic asset containing terms and (usually) information about those terms. This includes value sets (aka: bag of terms or term list), concept sets, topics, vocabularies, glossaries, thesauri, concept maps, taxonomies, ontologies, and now of course knowledge graphs…

The Session will start with three short presentations to set the scene:
  1. Proliferation of “controlled” vocabularies: feature or bug: Simon Cox
  2. Vocabulary Pick'em: The Definitive List of Vocabulary Selection Criteria (Now What?): John Graybeal
  3. Perspectives from Vocabulary Services project: Adrian Burton

These presentations will be followed by breakout room discussions - pick which one you want or suggest another one.
  1. Can we develop a “5-star vocab” ranking similar to Tim Berners-Lee's 5-star Open Data scheme? (Lesley Wyborn)
  2. Can we develop guidelines for vocabulary mapping and harmonization? (Pier Luigi Buttigieg)
  3. Can we better utilise and extend the ESIP Community Ontology Repository? (Lewis McGibbney)
  4. Can we learn the essentials of vocabulary governance from the “Big Three” in Earth and environmental science: GCMD, CGI-IUGS and NERC? (Rowan Brownlee, Tyler Stevens, Natalia Atkins and Mark Rattenbury)
  5. Can we improve online vocab services? (Adrian Burton)
  6. Can we create a multi-disciplinary vocabulary space for physical samples? (Jens Klump, Kerstin Lehnert)

The breakouts will be followed by a report back and a discussion of potential next steps within ESIP.

Desirable outcomes from this session:
  1. Greater awareness of the issue of the current proliferation of vocabularies.
  2. Better coordination of work on vocabularies/ontologies across ESIP and greater pull from the ESIP Community Ontology Repository.
  3. Communities forming to develop best practice guidelines.



Simon Cox

Research Scientist, CSIRO
Lesley Wyborn

Honorary Professor, Australian National University
Jens Klump

Team Leader Geoscience Analytics, CSIRO
“The really exciting part is not about putting labels on things, but about what you can do when you put machine learning to work on the labelled data.” (https://www.auscope.org.au/posts/2020/12/18/introducing-jens). Vice President of the International Geo Sample Number Implementation...

Wednesday July 22, 2020 6:00pm - 7:30pm EDT