Roadmap for 2.8

due Jan/Feb 2011

The roadmap is currently a work in progress.

The requirements specification for the release can be found at https://svn.met.no/metamod/catalyst-testing/docs/requirement_specification.ods (work in progress).

Current open questions are listed on the page "Open questions for version 2.8".

Planned changes

  • Rewrite the web frontend from PHP to the Catalyst Perl framework.
  • Implement collection basket.

Of the planned changes, the conversion to Catalyst is expected to take most of the time.

Conversion to Catalyst

Functionality

The functionality is expected to be the same for most parts of the application except that the use of sessions will be removed from the search page.

Technology

The conversion will be based on the following Perl modules:

  • Moose
  • Template::Toolkit
  • DBIx::Class

Installation and deployment

Installation of dependencies will be based on .deb packages. Because we cannot restrict ourselves to the packages provided by Debian, we need to create our own packages for all CPAN modules that are used.

When deploying an application, update_target.pl will still be used. It will still be able to replace any file when copying from the source directory to the target directory. Over time this functionality is expected to become unnecessary and to be replaced by a different system; what that system will be is not known at this time.

To override the CSS styles for the application, replace the file 'catalyst/root/static/css/custom.css'. This file is empty by default, but it is still loaded on all pages.

Roadmap to 2.9

This version is due in August 2011.

Priority tasks

Simplify Installation (ØT)

  • Goals:
    • All scripts (Perl and bash) run from the source directory.
    • Source can be packaged as .deb and installed on server, e.g. in /usr/share/metamod.
    • Several instances share the same installed METAMOD source.
  • Why:
    • Simplified development.
    • Simplified deployment.
  • Required changes:
    • Bash scripts must read master_config.txt at runtime. No extra generation step can be used.
      • We have a prototype for this, but we must also check that all scripts use the variables correctly.
    • All scripts (Perl and bash) must take the path to master_config.txt as a commandline parameter and/or an environment variable.
      • I am unsure how this shall be done for scripts installed as services in /etc/init.d/. Perhaps configuration in /etc/default?
        • We must have separate scripts per site as today (due to limitations in System V init scripts); ergo these still must be generated, e.g. by install_jobs. Putting the config in /etc/default is a bit more elegant, but the init script still needs to be generated (so it knows which defaults file to read). — Geir Aalberg 2011/05/20 08:04
    • Path to the custom directory not hardcoded, but moved to master_config.txt.
    • Path to staticdata/ not hardcoded, but moved to master_config.txt.
      • Disagree to both. Instead of pointing to the master_config file, we should instead point to the applic directory. This should have a fixed structure so that master_config, staticdata and custom can be found automatically. — Geir Aalberg 2011/05/20 08:27
  • Limitations:
    • Only one version of METAMOD installed per server when installed via Debian package.
      • Presumably this means one copy of the code per server. We can still allow different versions by including the version number in the path, e.g. /usr/share/metamod29 (otherwise we will have a problem with automatic upgrades and backwards compatibility). — Geir Aalberg 2011/05/20 08:19
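
The runtime configuration requirement above could look roughly like this in a bash script. It is a minimal sketch, not the existing prototype. The function names resolve_master_config and config_get, the environment variable METAMOD_MASTER_CONFIG, and the assumption that master_config.txt holds simple single-line "NAME = value" entries are all illustrative. The resolver accepts either the file itself or an application directory containing it, so it fits both proposals discussed above.

```shell
# Sketch only: METAMOD_MASTER_CONFIG and the "NAME = value" layout are assumptions.

# Print the path to master_config.txt.
# A --metamod-config argument takes precedence over the environment variable.
# The given path may be the file itself or a directory that contains it.
resolve_master_config() {
    local path="${METAMOD_MASTER_CONFIG:-}"
    local prev=""
    local arg
    for arg in "$@"; do
        if [ "$prev" = "--metamod-config" ]; then
            path="$arg"
        fi
        prev="$arg"
    done
    if [ -d "$path" ]; then
        path="${path%/}/master_config.txt"
    fi
    if [ ! -f "$path" ]; then
        echo "master_config.txt not found: '$path'" >&2
        return 1
    fi
    printf '%s\n' "$path"
}

# Print the value of one variable, read from the file at call time
# (no generation step). Prints nothing if the variable is absent or empty.
# $1 = config file, $2 = variable name
config_get() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*\$/\1/p" "$1" | head -n 1
}
```

A script would then start with something like CONFIG=$(resolve_master_config "$@") || exit 1 and look up each variable with config_get "$CONFIG" NAME when it is needed.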

These changes will remove the need for the target directory, but they will also remove the possibility of overwriting any file in METAMOD. If this feature is still needed we need to rethink the solution. This feature is not needed.

After the changes METAMOD will be started like this:

From source:

catalyst/scripts/metamod_webserver.pl -r --metamod-config <path to application dir>

On deployment server (for testing):

metamod_server.pl --metamod-config <path to application dir>

On deployment server (inside init script):

start-stop-daemon --pidfile $PIDFILE ... \
      --startas $BINDIR/metamod_server.pl -- --port $PORT \
      --metamod-config $APPLICATION_DIR ...

where PIDFILE, BINDIR, PORT and APPLICATION_DIR are all specified in /etc/default/xxx.
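
A generated init script would need to load those four variables before calling start-stop-daemon. The sketch below shows one way to do that and to fail early when a variable is missing. The function name load_site_defaults is hypothetical, and the sketch assumes the defaults file is plain shell with NAME=value lines; the per-site file name is left to the caller, as in the text above.

```shell
# Sketch only: load_site_defaults is a hypothetical helper, and the defaults
# file is assumed to contain plain shell assignments (NAME=value).

# Source the per-site defaults file and verify the variables the init script needs.
# $1 = path to the defaults file
load_site_defaults() {
    local defaults_file="$1"
    local name
    local value
    if [ ! -r "$defaults_file" ]; then
        echo "cannot read defaults file: $defaults_file" >&2
        return 1
    fi
    . "$defaults_file"
    for name in PIDFILE BINDIR PORT APPLICATION_DIR; do
        # Indirect lookup: copy the variable named by $name into value.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set in $defaults_file" >&2
            return 1
        fi
    done
}
```

Checking the variables up front keeps a half-configured site from starting with an empty port or application directory.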

Near Real-time data

Changing XML files often gives the user wrong feedback, because the files are changed in the file database but read from the SQL database, which is generally updated 10 minutes later.

The changes listed here will also improve testability, since mmtime will no longer be required.

  • import_dataset should not poll datestamp changes on XML-files, but instead run from a queue/directly on the database.
    • Everything that changes the XML files should trigger an import_dataset directly in the database. import_dataset should not be called as a system command, but as a library function.
    • Everything that changes the XML files currently goes through Metamod::ForeignDataset::writeToFile, so this should be very simple now.
    • direct connection to database, Metamod::ForeignDataset::writeDataset..
    • TASKS
      • Remove daemon option from import_dataset.pl
      • Add better validation to questionnaire data as it is now inserted directly into the database.
      • Clean up Metamod::DatasetImporter, as it is now a mix of object-oriented and procedural code.
      • Add proper error handling to Metamod::DatasetImporter::write_to_database()
      • Remove QuadTrees support. Also includes removing tables from the database creation script and creating an update script from 2.8 to 2.9 that deletes the tables.
      • Refactor Metamod::DatasetImporter into smaller functions instead of one huge one.
      • Remove the use of shared metadata to ensure proper cleanup of the database on re-import.
      • Find out where to add call to subscription service as it should not be the responsibility of Metamod::DatasetImporter.
      • Fix rights for database web user.
  • upload_monitor should only poll the ftpupload directory. Other uploads (webupload or osisaf-trigger) should be triggered and run from a queue. (Actually, upload_monitor now only processes web uploads via event queue; FTP is handled by a separate daemon ftp_monitor. Not sure how osisaf-triggers work. — Geir Aalberg 2011/07/01 08:37)
    • This will reduce the upload time from 1h10m to a few seconds.
    • Syntheses run multiple times, problem?
    • evaluate use of Inotify module for ftp (IN_CLOSE_WRITE on files; problems: linux-only, system event - does not work with remote NFS changes, may work well as addition to current impl.)
  • harvester: configuration option on how often to run harvesting. (Currently: once a day?).
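
As an aid to the inotify evaluation mentioned above, the event-driven idea can be tried from the shell before committing to a Perl module. This sketch swaps in the inotifywait command from inotify-tools for the Perl Inotify module named in the list, and the function name handle_closed_files is hypothetical. It reads one completed file path per line and passes each to a command supplied by the caller, so the handler can be exercised without inotify at all.

```shell
# Sketch only: handle_closed_files is a hypothetical helper.
# It reads one file path per line on stdin and runs the given command on each.
handle_closed_files() {
    local file
    while IFS= read -r file; do
        # Skip empty lines.
        [ -n "$file" ] || continue
        # Detach stdin so the command cannot swallow the remaining events.
        "$@" "$file" </dev/null || echo "processing failed: $file" >&2
    done
}

# Possible use with inotify-tools (Linux only; does not see remote NFS changes):
#   inotifywait -m -q -e close_write --format '%w%f' "$FTP_DIR" \
#       | handle_closed_files some_import_command
# FTP_DIR and some_import_command are placeholders, not METAMOD names.
```

Because of the limitations noted in the list, a handler like this would supplement the current polling rather than replace it.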

ncdigest independent of datasource

  • ncdigest should only know about the actual file to digest and older xml-files – it should not try to read all files in the repository
    • the 'history'-merging function should merge from the old xml-files
    • old data-repository files may have moved to archive

Improve Metadata editor

This feature is on hold until more information is available.

  • more info, please.. Geir
  • NETMAR requirements?
  • special editor for xmd-files (based upon ForeignDataset, without converting to Dataset/MM2)
  • Improving admin/xmledit
    • Additional admin/xmledit to edit through editor
    • Improving the current admin/xmledit to be usable with other formats (i.e. without validation or with different validation, e.g. DIF/ISO19115)

Smaller changes / Reviews

  • RSS feeds with changeable content (same content as in 'second-level' view)
  • Memory analysis of perl/catalyst
  • SRU2JDBC: full-text column should be without XML tags
  • upload-monitor: allow user to “flag” datasets as deleted. (Geir investigates)
  • review: remove all intrinsic / unneeded config variables
  • review logger categories
  • more tests
    • catalyst-test runs full tests through http? FIXME (clarify, please .. robot?)
  • remove all 'quadtree' parts (deprecated since 2.3)

Low priority

  • allow 'boolean queries' (AND OR)
  • allow phrase searches (“norwegian meteorological institute”)
  • test cases for above, test against lucid database (pg 8.4) if possible
  • search in all metadata-elements per dataset at once!
    • currently 'norwegian model' does not find any data, since 'model' is 'activity type' while 'norwegian' is 'institute'.

Future changes

  • change of configuration format: https://wiki.met.no/metamod/better_config
  • change import_searchdata
    • use libxml instead of XML::Simple
    • make searchdata.xml using xml-structures, not flat structure with id-references
    • use searchdata.xml only for search-interface (e.g. using xslt) – don't restrict imported data by searchdata.
  • change code structure
  • join databases: move metabase to a schema of userbase and only drop schema
    • reduce number of database-handles
    • no need to reimport functions, i.e. tsearch/postgis
    • no need to drop complete database (which requires stopping all db-connections)
  • evaluate oai-pmh in perl
  • digest_nc
    • don't try to express CF-1.X in digest_nc.xml
    • simplify digest_nc.xml: only include general translations of metadata
    • allow ncml per dataset to make specializations
    • allow digest_nc to read from Opendap (PDL::NetCDF 4.08 + nc4)
metamod/roadmap.txt · Last modified: 2011-07-20 12:29:50 by geira