XML format for dataset descriptions
This page describes the internal metadata format of METAMOD. It evolved from "How to manage different input metadata formats?".
For each dataset we will need two XML files:
Dataset XMD File
In addition to an XML file containing the metadata (in one of several possible formats), we will need an XML file that describes identification etc. for the dataset.
Such XML files will be given an “.xmd” extension, and have a format like this:
<dataset xmlns="" ...>            // Ref. to XML namespace etc.
  <info status="..."              // Value: active/deleted
        ownertag="..."
        creationDate="..."        // YYYY-MM-DDTHH:TT:SSZ
        datestamp="..."           // YYYY-MM-DDTHH:TT:SSZ
        metadataFormat="..."      // Example: DIF, MM2
        name="..." />             // Unique identification
  <quadtree>
  ...
  </quadtree>
</dataset>
The important consequence of the XMD file is that METAMOD can support several metadata formats, as long as the translations (XSLT) and the proper interpretation of the XMD metadataFormat attribute are implemented in the version of METAMOD being operated.
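As an illustration of how the metadataFormat attribute could be read before choosing a translation, here is a minimal Python sketch. It is not METAMOD's own code: the function names, the sample attribute values, and the namespace URI `http://example.invalid/dataset` are placeholders invented for this example, since the real namespace is not given above.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def read_xmd_info(xmd_text):
    """Return the attributes of the <info> element of an .xmd document."""
    root = ET.fromstring(xmd_text)
    if local_name(root.tag) != "dataset":
        raise ValueError("root element is not <dataset>")
    for child in root:
        if local_name(child.tag) == "info":
            return dict(child.attrib)
    raise ValueError("no <info> element found")


# Hypothetical sample; the namespace URI and all values are placeholders.
SAMPLE_XMD = """<dataset xmlns="http://example.invalid/dataset">
  <info status="active"
        ownertag="EXAMPLE"
        creationDate="2009-02-17T12:00:00Z"
        datestamp="2009-02-17T12:00:00Z"
        metadataFormat="MM2"
        name="APPLICATION/DIRECTORY/FILE" />
</dataset>"""

info = read_xmd_info(SAMPLE_XMD)
print(info["metadataFormat"], info["status"])
```

A caller would then pick the matching XSLT translation based on `info["metadataFormat"]`, and reject formats it has no translation for.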
Restrictions
- 2009-02-17: The name field must contain one or two / characters: one / denotes a parent (APPLICATION/DIRECTORY), two / denote a file (APPLICATION/DIRECTORY/FILE). See Introducing two levels in the DataSet table.
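The restriction on the name field can be checked with a short regular expression. The sketch below is in Python and makes one assumption not stated above: each segment is non-empty and may contain any character except /. The function name `dataset_level` is hypothetical.

```python
import re

# Two or three non-empty segments separated by '/'.
# Assumption: a segment may contain any character except '/'.
NAME_PATTERN = re.compile(r"[^/]+/[^/]+(?:/[^/]+)?")


def dataset_level(name):
    """Return 'parent' for one '/', 'file' for two; raise ValueError otherwise."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid dataset name: {name!r}")
    return "parent" if name.count("/") == 1 else "file"


print(dataset_level("APPLICATION/DIRECTORY"))
print(dataset_level("APPLICATION/DIRECTORY/FILE"))
```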
Metadata XML File
The XML files containing metadata come in varying formats. They will all be given an “.xml” extension.
One of these formats is defined by METAMOD2 and is used only as an internal format within the METAMOD2 system. This format (MM2) is for metadata produced by the UPLOAD module or the QUEST module:
<MM2 xmlns="" ...>                          // Ref. to XML namespace etc. (if needed)
  <metadata name="...">value</metadata>     // To be repeated. One element
                                            // for each name,value pair.
</MM2>
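Because MM2 is a flat list of name,value pairs, it can be produced and read with a few lines of code. The following Python sketch assumes no namespace when writing (the namespace is optional and not specified above) and tolerates any namespace when reading. The function names and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_mm2(pairs):
    """Serialize (name, value) pairs as one <metadata> element per pair."""
    root = ET.Element("MM2")
    for name, value in pairs:
        element = ET.SubElement(root, "metadata", {"name": name})
        element.text = value
    return ET.tostring(root, encoding="unicode")


def read_mm2(xml_text):
    """Return {name: [values]}; a name may occur more than once."""
    result = {}
    for element in ET.fromstring(xml_text).iter():
        # Compare the local name only, so a namespaced document also works.
        if element.tag.rsplit("}", 1)[-1] == "metadata":
            result.setdefault(element.get("name"), []).append(element.text or "")
    return result


document = build_mm2([
    ("variable", "sea_surface_temperature"),
    ("area", "Fram Strait"),
    ("variable", "Agriculture > Agricultural Chemicals > Fertilizers > HIDDEN"),
])
parsed = read_mm2(document)
print(parsed["area"])
```

Values are collected into lists because the format repeats the metadata element once per pair, so the same name can appear several times.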
Metadata names with special meaning
Metamod is generally ignorant of metadata-names, and most names can be configured. But some names have a special meaning and cannot be changed:
- area: geographic area, given either as a detailed area name (for example Fram Strait) or as a gcmd-list (for example Continent > Europe > Northern Europe > Scandinavia > Norway)
- bounding_box: used to extract quadtree nodes for geographic search
- datacollection_period_from
- datacollection_period_to
- datacollection_period: deprecated since metamod 2.1
- topic: used in quest as a gcmd-list without '> HIDDEN'; see variable and bug#13
- variable: the parameter, given either as a CF-1.0 standard_name (for example sea_surface_temperature) or as a gcmd-list with '> HIDDEN' (for example Agriculture > Agricultural Chemicals > Fertilizers > HIDDEN); see also bug#13
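The two forms of a variable value can be told apart by the trailing HIDDEN level. The Python sketch below shows one way to do that. It is an assumption that a value whose last level is HIDDEN is always a gcmd-list and that everything else is a standard_name; the function names are hypothetical, and this is not taken from METAMOD's source.

```python
HIDDEN_MARKER = "HIDDEN"


def split_gcmd_list(value):
    """Split 'A > B > C' into its levels, trimming surrounding whitespace."""
    return [level.strip() for level in value.split(">")]


def parse_variable(value):
    """Return (kind, levels) for a variable value.

    kind is 'gcmd' when the value is a gcmd-list ending in HIDDEN (the marker
    is dropped from the returned levels), otherwise 'standard_name'.
    """
    levels = split_gcmd_list(value)
    if len(levels) > 1 and levels[-1] == HIDDEN_MARKER:
        return "gcmd", levels[:-1]
    return "standard_name", [value.strip()]


print(parse_variable("sea_surface_temperature"))
print(parse_variable("Agriculture > Agricultural Chemicals > Fertilizers > HIDDEN"))
```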
For translation to and from other metadata formats, please see the metadata names in the Frequently Asked Questions entry on DIF-mapping. An extensive list of currently configured names can be found in searchdata.xml.
Hints on coding
Both metadata files should always be written as an atomic operation. To support this, flock should be used for reading and writing to the file-system. flock should first lock the .xmd file, then the .xml file. Then, both files should be written. Afterwards, unlock first the .xml file, then the .xmd file. Not using this order might lead to a deadlock.
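The locking order described above can be sketched in Python with `fcntl.flock`, which is available on Unix-like systems only. This is an illustration rather than METAMOD's implementation: the helper names are hypothetical, writers are assumed to take an exclusive lock and readers a shared lock, and the files are opened without truncation so that no content is destroyed before the locks are held.

```python
import fcntl
import os
import tempfile


def _open_rw(path):
    """Open for reading and writing, creating the file if missing, never truncating."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    return os.fdopen(fd, "r+", encoding="utf-8")


def _with_both_locked(basename, lock_type, action):
    """Run action(xmd, xml) while holding both locks in the documented order."""
    with _open_rw(basename + ".xmd") as xmd, _open_rw(basename + ".xml") as xml:
        fcntl.flock(xmd, lock_type)              # 1. lock the .xmd file
        try:
            fcntl.flock(xml, lock_type)          # 2. lock the .xml file
            try:
                return action(xmd, xml)
            finally:
                fcntl.flock(xml, fcntl.LOCK_UN)  # 3. unlock the .xml file
        finally:
            fcntl.flock(xmd, fcntl.LOCK_UN)      # 4. unlock the .xmd file


def write_dataset(basename, xmd_text, xml_text):
    """Replace the content of both files under exclusive locks."""
    def action(xmd, xml):
        for handle, text in ((xmd, xmd_text), (xml, xml_text)):
            handle.seek(0)
            handle.truncate(0)
            handle.write(text)
            handle.flush()  # push data to the OS before the lock is released
    _with_both_locked(basename, fcntl.LOCK_EX, action)


def read_dataset(basename):
    """Read both files under shared locks; returns (xmd_text, xml_text)."""
    return _with_both_locked(
        basename, fcntl.LOCK_SH, lambda xmd, xml: (xmd.read(), xml.read())
    )


# Usage in a temporary directory; the second write is shorter than the first.
base = os.path.join(tempfile.mkdtemp(), "FILE")
write_dataset(base, "<dataset/>", "<MM2/>")
write_dataset(base, "<d/>", "<M/>")
contents = read_dataset(base)
print(contents)
```

Since every reader and writer takes the .xmd lock first, no process can hold the .xml lock while waiting for the .xmd lock, which is the situation that would produce a deadlock. Note that `flock` locks are advisory: they protect only against other processes that follow the same protocol.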