Gridded data indexer (Grindex)

Results from discussion in Bergen, 08.09.2010

Diana is now starting to read gridded data via Fimex. Fimex provides access to a single information unit, i.e. one model run or one NetCDF file. Fimex does not provide the functionality to efficiently extract that information unit from an archive, e.g. a directory of model files (/opdata ?, /starc/DNMI_HIRLAM4) or a database (WDB).

The general indexer should be an indexer API with user-configurable, exchangeable index types. It should enable us to extract a set of information from a given data source quickly. Grindex should be API-driven and easy to integrate with Diana and Fimex.
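One way to read "user-configurable, exchangeable index types" is a small abstract interface with one implementation per index type, selected by name from configuration. The sketch below only illustrates that idea; `IndexBackend`, `InMemoryBackend` and `IndexRegistry` are hypothetical names that do not exist in Fimex or Diana, and `std::shared_ptr` stands in for `boost::shared_ptr`.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical interface: one implementation per index type
// (e.g. directory scan, WDB query). Not an existing Grindex or Fimex class.
class IndexBackend {
public:
    virtual ~IndexBackend() {}
    // Returns identifiers (e.g. file paths) of the data sources for a model.
    virtual std::vector<std::string> find(const std::string& model) const = 0;
};

// Toy backend holding a fixed model -> sources table, for illustration only.
class InMemoryBackend : public IndexBackend {
public:
    void add(const std::string& model, const std::string& source) {
        table_[model].push_back(source);
    }
    std::vector<std::string> find(const std::string& model) const override {
        auto it = table_.find(model);
        return it == table_.end() ? std::vector<std::string>() : it->second;
    }
private:
    std::map<std::string, std::vector<std::string> > table_;
};

// Registry mapping a user-configured index-type name to a backend.
class IndexRegistry {
public:
    void registerBackend(const std::string& type, std::shared_ptr<IndexBackend> backend) {
        assert(backend && "backend must not be null");
        backends_[type] = backend;
    }
    // Returns a null pointer when the index type is unknown.
    std::shared_ptr<IndexBackend> get(const std::string& type) const {
        auto it = backends_.find(type);
        return it == backends_.end() ? std::shared_ptr<IndexBackend>() : it->second;
    }
private:
    std::map<std::string, std::shared_ptr<IndexBackend> > backends_;
};
```

A caller would register each backend once at start-up and then look it up by the index-type name found in the configuration.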

  • filter on model
  • filter on referenceTime (from/to)
  • additional restrictions
  • all given as ascii-strings (boost::regex?)
  • extra input required: datatype (grib, felt, nc, wdb), config-file for input
  1. catalog (ftp, filesystem, http), filenames (patterns)
  2. fimex enabled files (parameters in fimex)
  3. wdb
  4. CSW (long term)
  1. count of results
  2. fimex vector<boost::shared_ptr<CDMReader> >
  3. dump to netcdf-files (via fimex/NetCDFWriter) for testing (command-line tool)
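The filters on model and referenceTime, together with the "count of results" output, could be sketched roughly as follows. `Record`, `Filter`, `matches` and `countMatches` are hypothetical names; `std::regex` is used here in place of the `boost::regex` mentioned above, and reference times are assumed to be ISO 8601 strings in one fixed format so that they can be compared as plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical index record: only the two fields the filters need.
struct Record {
    std::string model;
    std::string referenceTime;  // ISO 8601, e.g. "2010-09-15T00:00:00Z"
};

// Hypothetical filter; all criteria are plain strings, as in the notes.
struct Filter {
    std::string modelPattern;  // regular expression matched against the model name
    std::string refTimeFrom;   // inclusive lower bound, empty = unbounded
    std::string refTimeTo;     // inclusive upper bound, empty = unbounded
};

// ISO 8601 timestamps in one fixed format sort lexicographically in time
// order, so string comparison is sufficient for the from/to bounds.
bool matches(const Filter& f, const Record& r) {
    assert(f.refTimeFrom.empty() || f.refTimeTo.empty() || f.refTimeFrom <= f.refTimeTo);
    if (!std::regex_match(r.model, std::regex(f.modelPattern))) return false;
    if (!f.refTimeFrom.empty() && r.referenceTime < f.refTimeFrom) return false;
    if (!f.refTimeTo.empty() && r.referenceTime > f.refTimeTo) return false;
    return true;
}

// Output option 1: the number of matching results.
std::size_t countMatches(const Filter& f, const std::vector<Record>& records) {
    std::size_t n = 0;
    for (const Record& r : records) {
        if (matches(f, r)) ++n;
    }
    return n;
}
```

The same `matches` predicate could feed the other two outputs, for example by opening a reader per matching record instead of counting.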
DNMI_HIRLAM4/2010/09/15/grdqh00.dat_20100915: HIRLAM4 = model, 2010/09/15/00 = reference time
arctic_mfc = model, 2010091500 = reference time
/opdata/hirlam4/grdqh00.dat: hirlam4 = model, reference time from data content
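Extracting model and reference time from an archive path like the first example could look roughly like the sketch below. `PathInfo` and `parseArchivePath` are hypothetical names, the regular expression is one possible hand-written translation of the layout MODEL/YYYY/MM/DD/grdqhHH.dat_*, and treating "DNMI_" as an optional directory prefix is an assumption drawn only from that example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical result type, not part of Fimex or Diana.
struct PathInfo {
    bool ok;                    // false when the path does not follow the layout
    std::string model;
    std::string referenceTime;  // "YYYY/MM/DD/HH", as written in the notes
};

// Matches MODEL/YYYY/MM/DD/grdqhHH.dat_<anything>; the optional "DNMI_"
// prefix is an assumption based on the single example path.
PathInfo parseArchivePath(const std::string& path) {
    static const std::regex layout(
        "(?:DNMI_)?([A-Za-z0-9]+)/(\\d{4})/(\\d{2})/(\\d{2})/grdqh(\\d{2})\\.dat_.*");
    std::smatch m;
    PathInfo info = {false, "", ""};
    if (std::regex_match(path, m, layout)) {
        assert(m.size() == 6);  // whole match plus five capture groups
        info.ok = true;
        info.model = m[1].str();
        info.referenceTime =
            m[2].str() + "/" + m[3].str() + "/" + m[4].str() + "/" + m[5].str();
    }
    return info;
}
```

A path such as /opdata/hirlam4/grdqh00.dat does not match this layout, which corresponds to the case where the reference time has to be read from the data content instead.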
std::string searchCriteria = "MODEL,REFERENCETIME,FILENAMEMATCH:\"*=MODEL/YY/MM/DD/grdqhHH.dat_*\"";

Grindex* gr = new Grindex(uri, searchCriteria, dataformat, config);

std::string searchDSL = "model=*;referenceTime < 2007-08-09";
GrindexFind* found = gr->find(searchDSL);

size_t count = found->count();
std::vector<boost::shared_ptr<CDMReader> > readers = found->cdmReaders();
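The search string passed to `find` is not specified further in these notes. As a minimal sketch, assuming clauses are separated by `;` and each clause has the form `field operator value` with one of the operators `=`, `<`, `>`, `<=`, `>=`, it could be split like this. `Clause` and `parseSearch` are hypothetical names, not part of any agreed Grindex API.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One parsed clause of the search string, e.g. {"model", "=", "*"}.
struct Clause {
    std::string field;
    std::string op;
    std::string value;
};

// Strips leading and trailing spaces and tabs.
static std::string trim(const std::string& s) {
    const char* ws = " \t";
    std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Splits the search string on ';' and each clause at its first operator.
std::vector<Clause> parseSearch(const std::string& dsl) {
    std::vector<Clause> clauses;
    std::istringstream in(dsl);
    std::string part;
    while (std::getline(in, part, ';')) {
        std::size_t pos = part.find_first_of("<>=");
        if (pos == std::string::npos) continue;  // skip clauses without an operator
        // Two-character operators such as "<=" and ">=" end in '='.
        std::size_t len = (pos + 1 < part.size() && part[pos + 1] == '=') ? 2 : 1;
        assert(pos + len <= part.size());
        Clause c = {trim(part.substr(0, pos)), part.substr(pos, len),
                    trim(part.substr(pos + len))};
        clauses.push_back(c);
    }
    return clauses;
}
```

For the search string from the notes, "model=*;referenceTime < 2007-08-09", this yields two clauses: one on `model` with operator `=` and value `*`, and one on `referenceTime` with operator `<` and value `2007-08-09`.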

(Maybe this should be part of Fimex, but it is closely related to Grindex?)

  1. dataprovider
  2. shape-name (grid-information, proj-string (also on latlong), proj-units (required for m), axes in m/degree)
  3. ref-time
  4. valid-time (from, to) [bounds]
  5. parameter-name (no convention yet)
  6. level-names (not level-numbers)
  7. level (from-to) (no level2-numbers)
  8. dataversion (primarily for EPS; different versions of the same data, e.g. a new model run with the same ref-time)
  9. reference to field
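The nine items above could map onto a per-field index record along these lines. This is only a sketch: the notes do not fix a schema, so every field name and type in `FieldIndexEntry` is an assumption, times are assumed to be ISO 8601 strings, and `latestVersion` reflects one possible reading of "dataversion" (prefer the highest version for the same parameter and ref-time).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record mirroring the nine items above.
struct FieldIndexEntry {
    std::string dataProvider;
    std::string shapeName;       // grid information, e.g. a proj string
    std::string refTime;         // ISO 8601
    std::string validFrom;       // valid-time bounds, ISO 8601
    std::string validTo;
    std::string parameterName;   // no naming convention yet
    std::string levelName;       // level name, not a level number
    double levelFrom;
    double levelTo;
    int dataVersion;             // distinguishes runs with the same refTime
    std::string fieldReference;  // e.g. a file path or a database key
};

// Returns the entry with the highest dataVersion for the given parameter
// and reference time, or a null pointer if none matches.
const FieldIndexEntry* latestVersion(const std::vector<FieldIndexEntry>& entries,
                                     const std::string& parameterName,
                                     const std::string& refTime) {
    const FieldIndexEntry* best = 0;
    for (std::size_t i = 0; i < entries.size(); ++i) {
        const FieldIndexEntry& e = entries[i];
        assert(e.validFrom <= e.validTo);  // valid-time bounds must be ordered
        if (e.parameterName != parameterName || e.refTime != refTime) continue;
        if (!best || e.dataVersion > best->dataVersion) best = &e;
    }
    return best;
}
```

The returned pointer refers into the vector passed in, so it is only valid as long as that vector is not modified.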
