Unified Listing - requirements

Unified Listing Overview

The Unified Listing contains information about solutions (applications), both assistive technology (AT) and built-in.

A range of GPII components will use (and/or update) the data in the Unified Listing. These are:

  • The Solutions registry used by the Real-Time Components (auto-personalization)
  • The C4A semantic framework
  • The Federated Repository of solutions/EASTIN (federation of solution databases)
  • Federation of mainstream/built-in solutions
  • The GPII Marketplace
  • The Shopping aid
  • Solutions Ontology (potentially via the semantic framework)

As can be seen from the variety of uses above, the Unified Listing will contain a wide range of information related to each solution. For example, the Solutions registry needs information about how to start and stop the application, what settings it has, and so on. The GPII Marketplace, in turn, needs information on the producer, prices and available vendors. For details about the data held in the Unified Listing, see below.

Basic Registry Requirements/Expectations

As most of the services/components using the Unified Listing aren't fully defined yet, it is very hard to predict the final requirements, dataset and general expectations.

But given the wide variety of usages, the database will be quite complex in terms of the data it needs to store. It needs to accommodate several different types of users/uses and to support permissions accordingly, based on categories of data, user types, etc. Furthermore, there will be several 'layers' of data: since the data for each solution is likely to depend on version/OS, several fields will be duplicated. Translations are another layer that needs to be supported. Even for a specific version/OS, there are potentially multiple versions of each field, because the federated repository stores e.g. multiple descriptions.

Data held in Unified Listing:

As described above, the Unified Listing needs to be able to contain a large amount of metadata about each solution. But the amount and type of information available will vary widely from solution to solution.

A very rough/abstract list of fields:

  • Unique solution ID
  • Each solution is likely to have different versions, and consequently multiple sets of metadata, one for each version/platform combination. Therefore each solution should be able to have a set of instances. Most of the data below would be unique to each instance, and hence would be multiplied accordingly.
  • Solution name and company
  • Basic solution description
  • Needed by the solutions registry:
    • Declarations on how to start and stop the application, check whether it's running and installed, settings location and format, settings transformations, etc.
  • Marketplace:
    • Prices, vendors, vendor metadata, description, rating system, system requirements, etc.
  • Federated Repository:
    • Everything covered by EASTIN, original data from all federated DBs
  • Potentially ontologies related to solutions/Needs
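The field list above can be sketched as a small data model: one solution with a unique ID and a set of instances, one per version/platform combination. This is only an illustrative sketch. All class, field and value names below are assumptions, not a decided schema, and the grouping of data into `lifecycle`, `marketplace` and `federated` simply mirrors the bullets above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SolutionInstance:
    """Metadata for one version/platform combination of a solution."""
    version: str
    platform: str
    description: str = ""
    # Solutions registry data (start/stop declarations, settings location, ...).
    lifecycle: dict = field(default_factory=dict)
    # Marketplace data (prices, vendors, system requirements, ...).
    marketplace: dict = field(default_factory=dict)
    # Original records from federated databases, keyed by source name.
    federated: dict = field(default_factory=dict)


@dataclass
class Solution:
    """A solution with a unique ID and one instance per version/platform."""
    solution_id: str
    name: str
    company: str
    instances: list = field(default_factory=list)

    def find_instance(self, version: str, platform: str) -> Optional[SolutionInstance]:
        """Return the matching instance, or None if there is none."""
        for inst in self.instances:
            if inst.version == version and inst.platform == platform:
                return inst
        return None


# Hypothetical example data, for illustration only.
reader = Solution("example.screenreader", "Example Screen Reader", "Example Co.")
reader.instances.append(
    SolutionInstance("1.0", "windows",
                     lifecycle={"start": "reader.exe", "stop": "reader.exe --quit"})
)
reader.instances.append(SolutionInstance("1.0", "linux"))
```

Keeping the per-instance data separate from the solution-level name and company makes the duplication across versions and platforms explicit.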

Requirements for the DB:

  • Versioning (dated history of edits to each entry)
  • Multiple levels of permissions:
    • potentially on field level
    • moderation of some or all changes
    • permissions particular to translation only
    • different tiers of users (e.g. manufacturers, contributors, general users, moderators, etc.)
    • access to only parts of the data due to user tier
  • Should be able to handle a wide variety of entries (see Data held in Unified Listing above)
  • Several APIs, each optimized for one of the purposes the database is used for.
  • Should be able to handle the load as described below
  • A series of tools will be used to maintain the registry. An API should allow for this.
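As a rough illustration of field-level permissions combined with moderation, the sketch below filters an entry by user tier and decides whether an edit is rejected, queued for moderation, or applied directly. The tier names follow the examples in the list above. The rule tables, field names and return values are assumptions made for this example, since the real permission model is not yet decided.

```python
# Hypothetical tiers and field rules; the real permission model is undecided.
READ_RULES = {
    "name": {"general", "contributor", "manufacturer", "moderator"},
    "description": {"general", "contributor", "manufacturer", "moderator"},
    "prices": {"general", "contributor", "manufacturer", "moderator"},
    "tracking_notes": {"moderator"},
}
WRITE_RULES = {
    "description": {"contributor", "manufacturer", "moderator"},
    "prices": {"manufacturer", "moderator"},
    "tracking_notes": {"moderator"},
}
# Tiers whose edits are applied without moderation (an assumption).
UNMODERATED_TIERS = {"moderator"}


def visible_fields(entry: dict, tier: str) -> dict:
    """Return only the fields that the given tier is allowed to read."""
    return {k: v for k, v in entry.items() if tier in READ_RULES.get(k, set())}


def submit_edit(tier: str, field_name: str) -> str:
    """Classify an edit as rejected, pending moderation, or applied."""
    if tier not in WRITE_RULES.get(field_name, set()):
        return "rejected"
    return "applied" if tier in UNMODERATED_TIERS else "pending_moderation"


# Hypothetical entry, for illustration only.
entry = {"name": "Example Magnifier", "tracking_notes": "checked by maintenance"}
public_view = visible_fields(entry, "general")
```

Fields without an explicit rule are hidden and read-only by default, so a newly added field is not exposed by accident.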

Expected load/usage:

  • The registry is expected to contain 500 - 2,500 solutions. The amount of data about each solution can be significant, e.g.:
    • descriptions of all settings/mappings
    • Each application can have several versions (different releases/OS)
    • Each application version can have multiple versions of most fields
    • The data for each application version can have multiple translations, etc.
    • A rating system and commenting system might be implemented
  • Generally more reads than writes are expected. In the beginning the load will be heavier, but after a time it will become constant as usage reduces to maintenance, translation and updates/additions.
  • Bursts of updates are expected (when pulling data from, or being pushed data by, various sources)
  • The types and number of reads will depend on the final implementation. Parts of the usage of this database will likely be offloaded to optimized servers that occasionally pull the relevant data, meaning big bursts of reads (and/or writes). Because the Unified Listing is fundamental to several services, it is generally expected to see relatively light direct usage. If some service (say, the Shopping aid) becomes too big a load factor, it is probably preferable to create a new optimized server that pulls from the Unified Listing and takes the load.
  • It's hard to predict the usage of the Unified Listing, as most of the services using it do not yet exist or are not yet well defined (for example, if we want a rating tool for the GPII Marketplace, the number of reads and writes might be significantly higher)
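The idea of an optimized server that occasionally pulls from the Unified Listing and takes the read load can be sketched as a snapshot cache. This is a minimal sketch under stated assumptions: `SnapshotCache`, `pull_all` and the refresh interval are hypothetical names and values, and a real service would pull over whatever API the Unified Listing ends up exposing.

```python
import time
from typing import Callable


class SnapshotCache:
    """Serves reads from a local copy and refreshes it in one bulk pull."""

    def __init__(self, pull_all: Callable[[], dict], max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._pull_all = pull_all
        self._max_age = max_age_seconds
        self._clock = clock
        self._snapshot: dict = {}
        self._pulled_at = None  # None means nothing has been pulled yet
        self.pull_count = 0

    def get(self, solution_id: str):
        """Return the cached entry, refreshing the snapshot if it is stale."""
        now = self._clock()
        if self._pulled_at is None or now - self._pulled_at >= self._max_age:
            self._snapshot = dict(self._pull_all())
            self._pulled_at = now
            self.pull_count += 1
        return self._snapshot.get(solution_id)


# Hypothetical stand-in for a bulk read from the Unified Listing.
def pull_all() -> dict:
    return {"example.magnifier": {"name": "Example Magnifier"}}


fake_time = [0.0]  # controllable clock so the example is deterministic
cache = SnapshotCache(pull_all, max_age_seconds=3600, clock=lambda: fake_time[0])
first = cache.get("example.magnifier")
second = cache.get("example.magnifier")  # served locally, no second pull
```

With this arrangement the Unified Listing sees one burst of reads per refresh interval, regardless of how many requests the downstream service handles in between.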

Expected Usage and User Interfaces

Users/Uses

  • Browsers of the Registry - usually done through an ontology of some type
    • A simple interface for listing, browsing and searching the terms of the database, and for reading their descriptions, definitions, value ranges, etc. It should support displaying the terms in multiple languages. Terms should be sortable by field
  • Registry maintenance workers (have extra maintenance and tracking fields visible)
  • New item entry / item edit interface (moderated)
    • interface makes it easy to look for related or similar (or same meaning) terms
    • provides instructions for what goes where
    • interface for adding new terms along with descriptions and translations. Also moderated editing of some fields in existing terms. 
  • Translators - side-by-side view of original and translation (and maybe other translations as well), with a Google Translate button for a first pass and a Wiktionary button for word meanings
  • Moderator interface
    • provides side by side with old version
    • shows differences
    • shows data on track record of submitter (and if they are banned)
    • allows scoring of submissions along very broad lines (e.g. this was spam, this was a duplicate item, this was poorly/fairly/well/excellently written)
  • Ontology creators using the API (they usually work via the API and their ontology tool, but also need to search; it is still open whether they would use the same tools as for new item entry for searching, or something else)
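The "shows differences" part of the moderator interface could be backed by a per-field text diff between the current entry and the submitted one. The sketch below uses Python's standard `difflib` module. The function names and the example entries are hypothetical, and entries are assumed to be flat dictionaries of text fields.

```python
import difflib


def field_diff(old: str, new: str) -> list:
    """Return unified-diff lines between the current and the submitted text."""
    return list(difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile="current", tofile="submitted", lineterm="",
    ))


def changed_fields(old_entry: dict, new_entry: dict) -> dict:
    """Map each changed field name to its diff lines."""
    result = {}
    for key in sorted(set(old_entry) | set(new_entry)):
        old, new = old_entry.get(key, ""), new_entry.get(key, "")
        if old != new:
            result[key] = field_diff(old, new)
    return result


# Hypothetical entry versions, for illustration only.
current = {"name": "Example Magnifier",
           "description": "Magnifies the screen."}
submitted = {"name": "Example Magnifier",
             "description": "Magnifies the screen up to 16x."}
diffs = changed_fields(current, submitted)
```

Only fields that actually changed appear in the result, so a moderator reviewing a submission is not shown untouched fields.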

Tools

  • Browser Tools
    • many different browsers
    • mostly ontology based
    • READ ONLY 
    • Allows passing of comments to editors
    • Allows access to different translation layers
  • Data entry/edit Tools
    • used to enter and maintain the database
    • different levels of authority
    • Moderated
  • Translation/Edit tools
    • used to create translation layers for the database
    • different levels of authority
    • language/region specificity possible
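One possible way to combine translation layers with language/region specificity is a fallback lookup: try the region-specific layer first, then the plain language, then a default language. The fallback order, the default of `en`, and all names below are assumptions made for this sketch, not something the requirements prescribe.

```python
def fallback_chain(locale: str, default: str = "en") -> list:
    """Build the lookup order: region first, then language, then default."""
    chain = [locale]
    if "-" in locale:
        chain.append(locale.split("-", 1)[0])
    if default not in chain:
        chain.append(default)
    return chain


def translate(layers: dict, field_name: str, locale: str, default: str = "en"):
    """Return (text, locale_used) from the first layer that has the field."""
    for candidate in fallback_chain(locale, default):
        text = layers.get(candidate, {}).get(field_name)
        if text:
            return text, candidate
    return None, None


# Hypothetical translation layers for one entry, for illustration only.
layers = {
    "en": {"name": "Example Magnifier", "description": "Magnifies the screen."},
    "pt": {"name": "Lupa de Exemplo"},
}
name_result = translate(layers, "name", "pt-BR")
description_result = translate(layers, "description", "pt-BR")
```

Returning the locale that was actually used lets a browser tool mark text that is shown in a fallback language, which also tells translators what is still missing.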


Masterplan:

  • Collection of Use Cases
  • Generation of example data entries
  • UI design and discussion
  • Decision of technology and implementation