The FAIR Guiding Principles for Scientific Data Management and Stewardship

Principles of standardised data documentation, publication, sharing and preservation have been formalised in the FAIR Guiding Principles for scientific data management and stewardship. FAIR stands for Findable, Accessible, Interoperable and Reusable.

The FAIR principles help to establish a common approach to data management within MET Norway and across institutions, leading to a more unified data management regime with improved ability to serve the data users through:

  • Ease of data discovery, visualization and access
  • Standard interfaces without the need for special customization on the user side
  • Reduced storage needs (data can be streamed)
  • Improved ability to compare and combine data across domains
  • Ability to apply common data transformations, like spatial, temporal and variable subsetting and reprojection, before download
  • Possibility to build specialized metadata catalogues and data portals targeting a specific user community

Dataset definition

We define a dataset as the combination of data records and the associated information content required to find and use the data.

  • The information content shall accompany the data records, following agreed standards
    • Climate and Forecast (CF) Conventions for use metadata (e.g., variable definitions and units)
    • Attribute Convention for Data Discovery (ACDD) for discovery metadata (e.g., time and place of observation, data creator, title, keywords, etc.)
  • Dataset granularity is a major concern, because it determines how flexibly the data can be searched, accessed and reused
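
As a rough illustration of what "information content" means in practice, the sketch below checks a set of global attributes for some ACDD discovery attributes, and a variable's attributes for CF use metadata. The attribute names come from ACDD and CF, but the choice of which ones to require, the helper name missing_attributes, and the sample values are assumptions made for this example; they are not MET Norway's actual requirements.

```python
# Illustrative subset of ACDD discovery attributes (assumed selection).
ACDD_DISCOVERY = (
    "title", "summary", "keywords", "creator_name",
    "time_coverage_start", "time_coverage_end",
    "geospatial_lat_min", "geospatial_lat_max",
    "geospatial_lon_min", "geospatial_lon_max",
)
# CF use metadata expected on a data variable (assumed selection).
CF_USE = ("standard_name", "units")


def missing_attributes(attrs, required):
    """Return the required attribute names that are absent or empty in attrs."""
    return [name for name in required if attrs.get(name) in (None, "")]


# Hypothetical metadata for one dataset; "keywords" is left out on purpose.
global_attrs = {
    "title": "Example temperature observations",
    "summary": "Made-up dataset used to illustrate discovery metadata.",
    "creator_name": "Example Creator",
    "time_coverage_start": "2023-08-20T00:00:00Z",
    "time_coverage_end": "2023-08-20T21:00:00Z",
    "geospatial_lat_min": 59.0,
    "geospatial_lat_max": 60.0,
    "geospatial_lon_min": 4.0,
    "geospatial_lon_max": 5.0,
}
variable_attrs = {"standard_name": "air_temperature", "units": "K"}

print("Missing discovery metadata:", missing_attributes(global_attrs, ACDD_DISCOVERY))
print("Missing use metadata:", missing_attributes(variable_attrs, CF_USE))
```

A check like this only confirms that the attributes are present; it does not validate their values against the conventions.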

More information

MET Norway's Data Management Handbook is available to internal users: https://s-enda.pages.met.no/data-management-handbook-met/.

External users can check the open-source version at https://metno.github.io/data-management-handbook/

Human Search Interface


The web search interface can be accessed from the Data Catalog menu item. The search interface consists of a central map and several filters.

The map shows one page of the datasets available in the metadata catalog, each drawn as a bounding rectangle (minimum and maximum longitude and latitude), with the latest additions first.

Several map controls help to display the results more clearly and to search for data:

  • Select projection: the projection of the map and the bounding boxes can be changed using this button
  • Create bounding box filter: click this button to create a bounding box and define a geographical filter on the results
  • Reset search: clear the filter and start a new search
  • Reset map: reset the map

Map widgets:

  • +/- buttons for zooming
  • E: zoom to the extent of the displayed datasets
  • Switch map layers and features, or change the opacity of the overlays
  • Search location names
  • ">>": Show location in a world map
  • Full screen mode

The search results are updated dynamically when filters are selected. The following filters are available:

  • Full text search block
  • Start and end date of the datasets
  • Has children (i.e., a collection or dataset series)
  • ISO topic categories: The general subjects for which the geospatial data may be relevant, as defined by the ISO standard

To start a new search and remove all filters, click the "Reset" button.

Machine Search Interface

Automatic search and retrieval of data for use in downstream processing systems requires machine-to-machine interfaces. The OGC Catalogue Service for the Web (CSW) is a catalog standard that enables data search through an API.

MET Norway's OGC CSW API is available at https://data.csw.met.no. Some usage examples follow.

The pycsw OpenSearch interface supports only bounding-box queries for geographical searches. More advanced geographical searches require a CSW GetRecords request written in XML. For example, to find all datasets containing a point:

<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>     
<csw:GetRecords     
   xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"     
   xmlns:ogc="http://www.opengis.net/ogc"     
   xmlns:gml="http://www.opengis.net/gml"     
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"     
   service="CSW"     
   version="2.0.2"     
   resultType="results"     
   maxRecords="10"     
   outputFormat="application/xml"     
   outputSchema="http://www.opengis.net/cat/csw/2.0.2"     
   xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd" >     
 <csw:Query typeNames="csw:Record">     
   <csw:ElementSetName>full</csw:ElementSetName>     
   <csw:Constraint version="1.1.0">     
     <ogc:Filter>     
       <ogc:Contains>     
         <ogc:PropertyName>ows:BoundingBox</ogc:PropertyName>     
         <gml:Point>     
           <gml:pos srsDimension="2">59.0 4.0</gml:pos>     
         </gml:Point>     
       </ogc:Contains>     
     </ogc:Filter>     
   </csw:Constraint>     
 </csw:Query>     
</csw:GetRecords>

To find all datasets intersecting a polygon:

<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>     
<csw:GetRecords     
   xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"     
   xmlns:gml="http://www.opengis.net/gml"     
   xmlns:ogc="http://www.opengis.net/ogc"     
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"     
   service="CSW"     
   version="2.0.2"     
   resultType="results"     
   maxRecords="10"     
   outputFormat="application/xml"     
   outputSchema="http://www.opengis.net/cat/csw/2.0.2"     
   xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd" >     
 <csw:Query typeNames="csw:Record">     
   <csw:ElementSetName>full</csw:ElementSetName>     
   <csw:Constraint version="1.1.0">     
     <ogc:Filter>     
       <ogc:Intersects>     
         <ogc:PropertyName>ows:BoundingBox</ogc:PropertyName>     
         <gml:Polygon>     
           <gml:exterior>     
             <gml:LinearRing>     
               <gml:posList>     
                 47.00 -5.00 55.00 -5.00 55.00 20.00 47.00 20.00 47.00 -5.00     
               </gml:posList>     
             </gml:LinearRing>     
           </gml:exterior>     
         </gml:Polygon>     
       </ogc:Intersects>     
     </ogc:Filter>     
   </csw:Constraint>     
 </csw:Query>     
</csw:GetRecords>

To find all datasets intersecting a polygon within a given time span:

<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>     
<csw:GetRecords     
   xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"     
   xmlns:gml="http://www.opengis.net/gml"     
   xmlns:ogc="http://www.opengis.net/ogc"     
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"     
   service="CSW"     
   version="2.0.2"     
   resultType="results"     
   maxRecords="100"     
   outputFormat="application/xml"     
   outputSchema="http://www.opengis.net/cat/csw/2.0.2"     
   xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd" >     
 <csw:Query typeNames="csw:Record">     
   <csw:ElementSetName>summary</csw:ElementSetName>     
   <csw:Constraint version="1.1.0">     
     <ogc:Filter>     
       <ogc:And>     
         <ogc:Intersects>     
           <ogc:PropertyName>ows:BoundingBox</ogc:PropertyName>     
           <gml:Polygon>     
             <gml:exterior>     
               <gml:LinearRing>     
                 <gml:posList>     
                   63.3984 7.65173 60.7546 5.0449 59.0639 10.187 62.9065 12.4944 63.3984 7.65173     
                 </gml:posList>     
               </gml:LinearRing>     
             </gml:exterior>     
           </gml:Polygon>     
         </ogc:Intersects>     
         <ogc:PropertyIsGreaterThanOrEqualTo>     
           <ogc:PropertyName>apiso:TempExtent_begin</ogc:PropertyName>     
           <ogc:Literal>2022-03-01 00:00</ogc:Literal>     
         </ogc:PropertyIsGreaterThanOrEqualTo>     
         <ogc:PropertyIsLessThanOrEqualTo>     
           <ogc:PropertyName>apiso:TempExtent_end</ogc:PropertyName>     
           <ogc:Literal>2023-03-08 00:00</ogc:Literal>     
         </ogc:PropertyIsLessThanOrEqualTo>     
       </ogc:And>     
     </ogc:Filter>     
   </csw:Constraint>     
 </csw:Query>     
</csw:GetRecords>
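
Writing these requests by hand is error-prone, so it can help to generate them. The following is a minimal sketch, using only the Python standard library, that builds a request equivalent to the polygon and time span example above. The function name build_get_records and its parameters are hypothetical, the coordinate order (latitude, then longitude) is inferred from the examples, and the xsi:schemaLocation attribute is left out for brevity. The output has not been verified against the server.

```python
import xml.etree.ElementTree as ET

NS = {
    "csw": "http://www.opengis.net/cat/csw/2.0.2",
    "ogc": "http://www.opengis.net/ogc",
    "gml": "http://www.opengis.net/gml",
}
# Keep the familiar prefixes in the serialized request.
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def _q(prefix, tag):
    """Return the namespace-qualified tag name used by ElementTree."""
    return "{%s}%s" % (NS[prefix], tag)


def build_get_records(ring, begin, end, max_records=100):
    """Build a GetRecords request: polygon intersection AND time span.

    ring  -- list of (lat, lon) vertices; the ring is closed if needed
    begin -- lower limit for apiso:TempExtent_begin, e.g. "2022-03-01 00:00"
    end   -- upper limit for apiso:TempExtent_end
    """
    if ring[0] != ring[-1]:
        ring = list(ring) + [ring[0]]
    root = ET.Element(_q("csw", "GetRecords"), {
        "service": "CSW", "version": "2.0.2", "resultType": "results",
        "maxRecords": str(max_records),
        "outputFormat": "application/xml",
        "outputSchema": NS["csw"],
    })
    query = ET.SubElement(root, _q("csw", "Query"), {"typeNames": "csw:Record"})
    ET.SubElement(query, _q("csw", "ElementSetName")).text = "summary"
    constraint = ET.SubElement(query, _q("csw", "Constraint"), {"version": "1.1.0"})
    filter_el = ET.SubElement(constraint, _q("ogc", "Filter"))
    both = ET.SubElement(filter_el, _q("ogc", "And"))

    # Spatial part: bounding boxes intersecting the polygon.
    intersects = ET.SubElement(both, _q("ogc", "Intersects"))
    ET.SubElement(intersects, _q("ogc", "PropertyName")).text = "ows:BoundingBox"
    polygon = ET.SubElement(intersects, _q("gml", "Polygon"))
    exterior = ET.SubElement(polygon, _q("gml", "exterior"))
    linear_ring = ET.SubElement(exterior, _q("gml", "LinearRing"))
    ET.SubElement(linear_ring, _q("gml", "posList")).text = " ".join(
        "%s %s" % (lat, lon) for lat, lon in ring)

    # Temporal part: two comparisons on the temporal extent.
    for op, prop, value in (
        ("PropertyIsGreaterThanOrEqualTo", "apiso:TempExtent_begin", begin),
        ("PropertyIsLessThanOrEqualTo", "apiso:TempExtent_end", end),
    ):
        comparison = ET.SubElement(both, _q("ogc", op))
        ET.SubElement(comparison, _q("ogc", "PropertyName")).text = prop
        ET.SubElement(comparison, _q("ogc", "Literal")).text = value

    return ET.tostring(root, encoding="unicode")


# Same polygon and time span as in the example above.
request_xml = build_get_records(
    [(63.3984, 7.65173), (60.7546, 5.0449), (59.0639, 10.187), (62.9065, 12.4944)],
    "2022-03-01 00:00",
    "2023-03-08 00:00",
)
print(request_xml)
```

The resulting string can be saved to a file or posted directly to the CSW endpoint.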

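Then, you can query the CSW endpoint with, e.g., Python:

import requests
my_xml_request = "my_request.xml"  # placeholder: path to a file holding one of the XML requests above
with open(my_xml_request, "rb") as f:
    response_text = requests.post("https://data.csw.met.no", data=f.read()).text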
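
The text returned by the request is an XML document. A minimal sketch of extracting identifiers and titles from it with the standard library is given here. It assumes the csw:Record output schema requested in the examples above, in which each record carries Dublin Core dc:identifier and dc:title elements. The function name list_records and the sample response are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "csw": "http://www.opengis.net/cat/csw/2.0.2",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records(response_text):
    """Return (identifier, title) pairs for the records in a GetRecords response."""
    root = ET.fromstring(response_text)
    results = root.find("csw:SearchResults", NS)
    if results is None:
        return []
    # The children are csw:Record, csw:SummaryRecord or csw:BriefRecord,
    # depending on the requested ElementSetName.
    return [
        (record.findtext("dc:identifier", default="", namespaces=NS),
         record.findtext("dc:title", default="", namespaces=NS))
        for record in results
    ]


# Made-up response, trimmed to the elements used here.
SAMPLE_RESPONSE = """
<csw:GetRecordsResponse
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <csw:SearchResults numberOfRecordsMatched="1" numberOfRecordsReturned="1">
    <csw:SummaryRecord>
      <dc:identifier>example-id-1</dc:identifier>
      <dc:title>Example dataset</dc:title>
    </csw:SummaryRecord>
  </csw:SearchResults>
</csw:GetRecordsResponse>
"""

for identifier, title in list_records(SAMPLE_RESPONSE):
    print(identifier, "-", title)
```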

QGIS

MET Norway’s S-ENDA CSW catalog service can also be used from QGIS as follows:

  1. Select Web > MetaSearch > MetaSearch menu item
  2. Select Services > New
  3. Type, e.g., data.csw.met.no for the name
  4. Type https://data.csw.met.no for the URL

Under the Search tab, you can then add search parameters, click Search, and get a list of available datasets.

Data Usage

If a dataset is accessible for reuse, the search result will contain an XML element like <atom:link href="https://thredds.met.no/thredds/dodsC/aromearcticarchive/2023/08/20/arome_arctic_det_2_5km_20230820T21Z.nc" title="Open-source Project for a Network Data Access Protocol" type="OPENDAP:OPENDAP"/>. This is an OPeNDAP link, which allows the data to be streamed. In Python, the data can be opened with xarray, netCDF4 or other software packages:

import netCDF4
import xarray as xr

url = "https://thredds.met.no/thredds/dodsC/aromearcticarchive/2023/08/20/arome_arctic_det_2_5km_20230820T21Z.nc"
ds = netCDF4.Dataset(url)   # open the remote dataset with netCDF4
xds = xr.open_dataset(url)  # or open it with xarray
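
To go from a search result to a streamable URL, the atom:link elements can be filtered on their type attribute. The sketch below assumes that the result is Atom XML using the standard Atom namespace and that the type value "OPENDAP:OPENDAP" marks OPeNDAP links, as in the example above. The function name opendap_links is hypothetical, and the sample feed is made up apart from the OPeNDAP link itself.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def opendap_links(result_xml):
    """Return the href of every atom:link whose type marks an OPeNDAP endpoint."""
    root = ET.fromstring(result_xml)
    return [
        link.get("href")
        for link in root.iter(ATOM + "link")
        if link.get("type") == "OPENDAP:OPENDAP"
    ]


# Trimmed, made-up search result containing the link from the text above.
SAMPLE_RESULT = """
<atom:feed xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:entry>
    <atom:link href="https://example.invalid/landing-page" type="text/html"/>
    <atom:link href="https://thredds.met.no/thredds/dodsC/aromearcticarchive/2023/08/20/arome_arctic_det_2_5km_20230820T21Z.nc" title="Open-source Project for a Network Data Access Protocol" type="OPENDAP:OPENDAP"/>
  </atom:entry>
</atom:feed>
"""

for url in opendap_links(SAMPLE_RESULT):
    print(url)
```

Each returned URL can then be opened with netCDF4 or xarray in the same way as the URL above.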