Google makes it easier to discover datasets

Google said its approach is based on an open standard for describing this information

Dataset Search lets you find datasets wherever they are hosted

Google has developed a search engine to allow researchers, scientists and journalists to find the data required for their work more easily.

Dataset Search aims to provide access to “millions of datasets” from many thousands of data repositories on the web in addition to the information published by local and national governments around the world.

The new tool works in a similar way to Google Scholar and is outlined in a blog post published to coincide with the launch. “Dataset Search lets you find datasets wherever they’re hosted, whether it’s a publisher’s site, a digital library, or an author’s personal web page,” writes Natasha Noy, a research scientist at Google AI, who was involved in the tool’s development.

Guidelines for dataset providers

To create Dataset Search, Google developed guidelines for dataset providers to describe their data in a way that Google (and other search engines) can better understand the content of their pages. These guidelines cover “salient” information about datasets: who created the dataset, when it was published, how the data was collected, and what the terms are for using the data.

Google then collects and links this information, analyses where different versions of the same dataset might be, and finds publications that may describe or discuss the dataset. Google said its approach is based on an open standard for describing this information (schema.org) and anybody who publishes data can describe their dataset this way.
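
As a rough illustration of what such a description might look like, the Python sketch below builds a schema.org Dataset record as JSON-LD covering the four points the guidelines mention: creator, publication date, collection method and terms of use. The helper name build_dataset_jsonld, the dataset, the organisation and every value shown are hypothetical, and the choice of properties (creator, datePublished, measurementTechnique, license) is an assumption about how a publisher might map those points onto schema.org, not Google's own example.

```python
import json


def build_dataset_jsonld(name, description, creator, date_published,
                         license_url, collection_method):
    """Return a schema.org Dataset description as a JSON-LD string.

    Hypothetical helper for illustration only.
    """
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # Who created the dataset
        "creator": {"@type": "Organization", "name": creator},
        # When it was published
        "datePublished": date_published,
        # How the data was collected
        "measurementTechnique": collection_method,
        # Terms for using the data
        "license": license_url,
    }
    return json.dumps(record, indent=2)


# All values below are made up for the example.
example = build_dataset_jsonld(
    name="City air quality readings",
    description="Hourly particulate readings from roadside sensors.",
    creator="Example Weather Institute",
    date_published="2018-09-05",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    collection_method="Roadside optical particle counters",
)

# A publisher would typically embed the record in the dataset's web page.
print('<script type="application/ld+json">')
print(example)
print("</script>")
```

Embedding the record in a script element of type application/ld+json is one common way to publish schema.org markup; a provider could equally generate it from an existing catalogue rather than by hand.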

“We encourage dataset providers, large and small, to adopt this common standard so that all datasets are part of this robust ecosystem,” continued Noy.

In this new release, users can find references to most datasets in the environmental and social sciences, as well as data from other disciplines, including government data and data provided by news organisations such as ProPublica.

Data from Nasa and NOAA, as well as from academic repositories such as Harvard’s Dataverse and the Inter-university Consortium for Political and Social Research (ICPSR), can also be accessed.

As more data repositories use the schema.org standard to describe their datasets, the variety and coverage of datasets that users will find in Dataset Search will continue to grow.

“A search tool like this one is only as good as the metadata that data publishers are willing to provide,” Noy concluded. “We hope to see many of you use the open standards to describe your data, enabling our users to find the data that they are looking for.”
