Technology Evaluation - Survey Functionality

Background

In order to collect important information from GPII users, APCP needs a survey pop-up functionality to be developed. An integral part of this is to enable the PSP to decode simple payloads sent by the survey server which will instruct the PSP under what circumstances to show a survey to the user. The purpose of this document is to examine whether there are any suitable existing systems/modules/libraries that can both help with this task and at the same time be minimalistic and easy to use and extend.

Requirements

  1. The PSP should be able to parse instructions sent by the survey server for when to show or hide a survey.
  2. The PSP should be able to determine when the events described by the survey server instructions occur at some point in the future. Examples of such events include:
    1. Elapsing of a certain amount of time:
      1. Keyed in for at least X minutes.
      2. Not being keyed in for at least Y days. (for next phase)
      3. Account created at least Y days ago. (for next phase)
      4. Not having completed the survey for more than X minutes. (potentially for a future phase)
    2. User interactions with the PSP:
      1. Opening the PSP. (potentially for a future phase)
      2. Changing a setting in the PSP. (potentially for a future phase)
    3. Complex interaction events in conjunction with a time-based element:
      1. Last survey completed at least Y days ago. (for next phase)
      2. Keying out after being keyed in for at least X minutes. (potentially for a future phase)
      3. Displaying a survey in X minutes after it has been paused by the user. (potentially for a future phase)
  3. It should be easy to add new instructions to the survey server and make the PSP parse and execute them.

In case it is determined that the requirements above can be fulfilled best by a third-party module, it should in addition conform to the following:

  1. Be compatible with the used Node and Electron versions.
  2. Be open-source and with a proper license.
  3. Not introduce too much complexity.

Potential Candidates

We reviewed several libraries which were very similar to each other, as they were all rules engines. Below is a summary of our findings:

Drools
  • Meets requirements:
    • easy encoding of instructions: no (encoded in a custom format)
    • no need to investigate any further as this is a Java library
  • Node support: no, Java library
  • Community (contributors, stability, future):
    • over 120 contributors, 11 000 commits, 113 releases
    • regular releases every few months with new features and bugfixes
  • Complexity (syntax, minimalism): overly complex with AI algorithms, ontologies, inference engine (pattern matcher), etc.
  • Tests: yes, extensive
  • Documentation and examples: extensive documentation, videos, forum, chat and product support and consulting
  • License: Apache 2.0

json-rules-engine
  • Meets requirements:
    • easy encoding of instructions: yes (as rules in JSON format)
    • evaluate instructions: yes
    • notify that the instruction event occurs in the future: partially (facts need to be supplied to the rules engine continuously)
    • ability to easily add new instructions: yes (conjunction, disjunction, various arithmetic and boolean operators)
  • Node support: yes
  • Community (contributors, stability, future):
    • single contributor
    • steady commits
    • widely used (4K downloads for the last month)
    • few issues (last issue fix is from several days ago)
  • Complexity (syntax, minimalism): fairly simple syntax; dependent on several but minimalistic modules
  • Tests: yes, extensive
  • Documentation and examples: extensive documentation and examples covering both simple and advanced usages
  • License: ISC

node-rules
  • Meets requirements:
    • easy encoding of instructions: partially (uses a JSON-friendly format but the conditions and consequences need to be functions)
    • evaluate instructions: yes
    • notify that the instruction event occurs in the future: partially (facts need to be supplied to the rules engine continuously)
    • ability to easily add new instructions: no (the condition and the consequence of a new rule need to be defined in terms of JavaScript)
  • Node support: yes
  • Community (contributors, stability, future):
    • single contributor
    • broadly tested
    • widely used (3K downloads for the last month)
    • 21 issues and no commits for a year
  • Complexity (syntax, minimalism): easy to understand; depends only on the underscore module
  • Tests: yes
  • Documentation and examples: good documentation and examples for simple and more complex usages
  • License: MIT

Nools
  • Meets requirements:
    • easy encoding of instructions: no (specified programmatically or encoded in a custom format)
    • evaluate instructions: yes
    • notify that the instruction event occurs in the future: partially (facts need to be supplied to the rules engine continuously)
    • ability to easily add new instructions: no (because of the custom format and the requirements for JavaScript knowledge)
  • Node support: yes
  • Community (contributors, stability, future):
    • project is no longer maintained
    • almost no development since late 2016
    • widely used (over 2.5K downloads for the last month)
    • 1 release
    • 38 open issues
  • Complexity (syntax, minimalism): the syntax seems odd and complicated
  • Tests: yes, extensive
  • Documentation and examples: extensive documentation and some examples
  • License: MIT

Durable rules
  • Meets requirements:
    • easy encoding of instructions: partially (JSON-like formats with conditions and actions defined as functions)
    • evaluate instructions: yes
    • notify that the instruction event occurs in the future: partially (facts need to be supplied to the rules engine continuously)
    • ability to easily add new instructions: no (conditions and actions need to be provided as JavaScript functions)
  • Node support: yes
  • Community (contributors, stability, future):
    • 7 contributors
    • not much activity since September 2017
    • 1 release
    • 48 open issues on GitHub
    • almost 800 downloads for the last month
  • Complexity (syntax, minimalism): the schema seems a bit complicated
  • Tests: yes, extensive
  • Documentation and examples: extensive documentation and examples covering both simple and advanced usages
  • License: MIT

rulesengine
  • Meets requirements:
    • easy encoding of instructions: partially (JSON-like formats with conditions and actions defined as functions)
    • evaluate instructions: yes
    • notify that the instruction event occurs in the future: partially (facts need to be supplied to the rules engine continuously)
    • ability to easily add new instructions: no (conditions and actions need to be provided as JavaScript functions)
  • Node support: yes
  • Community (contributors, stability, future):
    • single contributor
    • no commits for almost 2 years
    • a single release
    • no issues reported
    • low number of downloads (18 for the last month)
  • Complexity (syntax, minimalism): fairly easy to understand and use; depends on the underscore and async modules
  • Tests: yes
  • Documentation and examples: simplistic documentation with almost no examples (example usages can be inferred by looking at the tests)
  • License: MIT

Detailed Analysis

Most of the reviewed modules rely on rules (a combination of conditions and a set of actions/consequences) and facts (various pieces of application information). This is the typical workflow:

  1. A set of rules which depend on different facts is registered.
  2. A rules engine checks if the rules’ conditions are satisfied against the provided facts.
  3. If the conditions of a given rule are satisfied, its action set is executed.
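
The workflow above can be illustrated with a minimal, library-free sketch. It is not the implementation of any of the reviewed modules; createEngine, addRule and run are hypothetical names chosen for this example, and the rule reuses the keyedInBefore and keyedOut facts discussed later in this document.

```javascript
// Hypothetical, library-free illustration of the rules/facts workflow.
// A rule pairs a condition (a predicate over facts) with an action.
function createEngine() {
    var rules = [];
    return {
        // Step 1: register a rule that depends on some facts.
        addRule: function (rule) {
            rules.push(rule);
        },
        // Steps 2 and 3: check every rule against the supplied facts and
        // execute the action of each rule whose condition is satisfied.
        run: function (facts) {
            return rules.filter(function (rule) {
                return rule.condition(facts);
            }).map(function (rule) {
                return rule.action(facts);
            });
        }
    };
}

var engine = createEngine();
engine.addRule({
    condition: function (facts) {
        return facts.keyedInBefore >= 5000 && facts.keyedOut === true;
    },
    action: function () {
        return "show-survey";
    }
});

// The rule does not fire for the first facts object and fires for the second.
var early = engine.run({keyedInBefore: 1000, keyedOut: false});
var late = engine.run({keyedInBefore: 7100, keyedOut: true});
```

Note that the engine only evaluates the facts it is given at the moment run is called; nothing in this workflow notifies the application when a time-based condition becomes true later on.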

The requirements section above places emphasis on survey instructions which are in most cases satisfied once a specified amount of time has passed. None of the reviewed modules provides such functionality out of the box. However, as they do provide means for condition evaluation, a custom system which combines one of the modules with a polling strategy may be developed.

On the other hand, it seems possible to fulfill the requirements without using a third-party library. The solution in this case will use Infusion only and will require that some type of a scheduling mechanism is devised.

In order to compare the two techniques, we will examine how the complex event “Keying out after being keyed in for at least X minutes” can be detected. For the first approach we would be using the rules engine that comes with the json-rules-engine module as this module seems to be the best among the potential candidates judging by its popularity, documentation, examples and ease of use.

Polling

This approach introduces a facts manager: a component which stores application facts (e.g. time since key-in, time since account creation, time since the last survey was taken, etc.) and continuously supplies them to the rules engine. Below is an example workflow together with some code snippets:

  1. A user keys in and the PSP receives the survey trigger rules.

    The JSON payload for the scenario in question could look like this:

    {
        id: "id",
        conditions: {
            all: [{
                fact: "keyedInBefore",
                operator: "greaterThanInclusive",
                value: 5000
            }, {
                fact: "keyedOut",
                operator: "equal",
                value: true
            }]
        }
    }

    The schema is human-readable and easy to understand: there are two conditions which define the accepted values for the keyedInBefore and the keyedOut facts. Both conditions must be satisfied in order for the PSP to report that the trigger has occurred. This schema is inherited from json-rules-engine, so all of its operators are supported without any further code changes.

  2. The rules’ conditions and success actions are registered with the rules engine:

    // Engine is a named export in json-rules-engine 2.x and later
    var Engine = require("json-rules-engine").Engine;

    // instruction is the payload received from the survey server
    var engine = new Engine();
    engine.addRule({conditions: instruction.conditions, event: {type: instruction.id}});
    engine.on("success", function (event) {
        // invoked when the conditions of a rule are satisfied;
        // event.type holds the id of the instruction
    });

  3. The facts manager starts tracking the application facts:
    1. Obtains/calculates their initial values.
    2. Polls and updates them at regular intervals.

    The keyedInBefore fact can be updated every second to guarantee that its value is up to date. The keyedOut fact, on the other hand, need not be polled. It is sufficient for its value to be modified only during an explicit key-out.

    How often a fact should be updated depends on the nature of the fact itself and can vary greatly between facts. Some may need to be updated once every minute, whereas others need much less frequent updates (e.g. once a week). This means that the facts manager can use a different polling interval for each fact.

    A further optimization would be to restrict the supported polling frequencies to multiples of some time unit. If this time unit is 1 minute, it will not be possible to define rules whose actions are performed, say, 30 or 45 seconds after a particular event has happened. This reduces flexibility and could lead to a loss of accuracy (i.e. some triggers may be reported later than they actually occurred).

  4. Whenever a fact is updated, all application facts are supplied to the rules engine.

    The initial provisioning of application facts to the rules engine will happen a second after the user has keyed in and will look like this:

    engine.run({
        keyedInBefore: 1000,
        keyedOut: false
    });

    Let’s suppose that the user keys out about 7 seconds after keying in. In this case, the provisioning of facts triggered by the key-out could look like this:

    engine.run({
        keyedInBefore: 7100,
        keyedOut: true
    });

  5. Given the new facts, the rules engine evaluates whether the conditions of the different rules are satisfied.
  6. When the user keys out, one last provisioning of facts, which includes the new state of the keyedOut fact, is done. Finally, the survey trigger rules are cleared from the rules engine and the facts manager stops tracking facts.
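
To make the role of the facts manager more concrete, here is a minimal sketch of one possible shape for it. Everything in it is an assumption made for illustration: the names createFactsManager, poll, set and stop, the injectable timers object, and the simplification that every provisioning recomputes all polled facts. The supplyFacts callback stands in for a function that would call engine.run(facts).

```javascript
// Hypothetical facts manager for the polling approach. Every name below
// (createFactsManager, poll, set, stop) is an illustrative assumption.
function createFactsManager(supplyFacts, timers) {
    // Real timers by default; a fake implementation can be injected in tests.
    timers = timers || {
        setInterval: function (fn, ms) { return setInterval(fn, ms); },
        clearInterval: function (handle) { clearInterval(handle); }
    };
    var eventFacts = {}; // facts changed only by explicit events, e.g. keyedOut
    var pollers = {};    // fact name -> function returning the current value
    var handles = [];

    // Recompute the polled facts and hand ALL facts to the consumer.
    function supply() {
        var facts = Object.assign({}, eventFacts);
        Object.keys(pollers).forEach(function (name) {
            facts[name] = pollers[name]();
        });
        supplyFacts(facts);
    }

    return {
        // Poll a fact at its own interval.
        poll: function (name, intervalMs, compute) {
            pollers[name] = compute;
            handles.push(timers.setInterval(supply, intervalMs));
        },
        // Update an event-driven fact; silent = true skips the provisioning.
        set: function (name, value, silent) {
            eventFacts[name] = value;
            if (!silent) { supply(); }
        },
        stop: function () {
            handles.forEach(function (handle) { timers.clearInterval(handle); });
            handles = [];
        }
    };
}

// Deterministic usage with a fake timer and a simulated clock.
var ticks = [];
var fakeTimers = {
    setInterval: function (fn) { ticks.push(fn); return ticks.length - 1; },
    clearInterval: function () {}
};
var now = 0;       // milliseconds since key-in
var supplied = []; // every facts object that would go to engine.run()
var manager = createFactsManager(function (facts) {
    supplied.push(facts);
}, fakeTimers);

manager.set("keyedOut", false, true);
manager.poll("keyedInBefore", 1000, function () { return now; });

now = 1000;
ticks[0]();                    // first poll, one second after key-in
now = 7100;
manager.set("keyedOut", true); // explicit key-out triggers a provisioning
manager.stop();
```

In this run the consumer receives the same two facts objects as in the engine.run calls shown in the workflow. The sketch also makes the cost of the approach visible: the consumer is invoked on every tick of every polled fact, whether or not any rule can be satisfied yet.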

Scheduling (Infusion only)

A minimalistic Infusion-only approach based on timers is also possible. It will include a simpler facts manager, which will initially retrieve and store the facts, and a survey trigger manager, which will observe the triggers’ conditions. Unlike the polling approach, the facts will not be updated at regular intervals. Here is an example workflow:

  1. A user keys in and the facts manager loads the facts.

    With this approach, there will be two types of facts in the PSP: static and dynamic. A static fact does not change once it is retrieved (e.g. keyedInTimestamp, accountCreatedTimestamp), whereas a dynamic fact’s state will change over time (e.g. keyedOut). Once a user has keyed in, static facts are obtained (probably from the GPII core via the PSP channel) and listeners are registered for the dynamic facts.

  2. Once the facts are loaded, the survey triggers are requested from the survey server.

    As in the polling approach, a survey trigger is defined in a JSON format and has a human-readable schema, but the conditions object is simpler:

    {
        id: "id",
        conditions: [{
            type: "keyedInBefore",
            value: 5000
        }, {
            type: "keyedOut",
            value: true
        }]
    }

    Alternative conditions cannot be defined with this format. However, using boolean algebra, triggers whose conditions are complex and include disjunction can be split into multiple triggers which only include conjunctions of conditions.

  3. The triggers are registered with the survey trigger manager, which starts tracking whether their conditions are satisfied.

    Each trigger has a unique id. If a survey trigger already exists in the system and then a trigger with the same id is registered, the latter will override the former. This means that the conditions of the former trigger will no longer be observed.

    A dedicated trigger handler will be in charge of determining if a trigger has occurred. For each of the trigger’s conditions, the trigger handler will delegate the responsibility of ascertaining whether this condition is satisfied to a specific condition handler. The exact condition handler will be determined by the type of the condition.

    Depending on the nature of the condition’s fact, the condition handler can:

    • schedule a timer
    • wait to be notified that the value of the fact has changed.

    For the payload above:

    • there will be a trigger handler with two dynamic condition handler subcomponents: keyedInBeforeHandler and keyedOutHandler.
    • depending on the value of the keyedInTimestamp fact, the keyedInBeforeHandler will register a timer for at most 5 seconds and will notify the trigger handler that the condition has been fulfilled once the time is up.
    • if the user keys out, the keyedOutHandler will notify the trigger handler that the keyedOut condition has been met.

    This also shows that there can be different condition handlers which observe the same fact in a different way.

  4. Once all conditions have been satisfied, the trigger itself is considered satisfied and the PSP passes this information to the survey server.
  5. When the user keys out, all the trigger handlers are cleared from the survey trigger manager.
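
The interplay between the trigger handler and its condition handlers can be sketched in plain JavaScript. This is only an illustration under stated assumptions, not the Infusion component design: conditionHandlers, createTriggerHandler and the env object (with now, setTimeout and onFactChanged) are hypothetical names, and clearing the handlers when the user keys out is omitted for brevity.

```javascript
// Hypothetical sketch of the scheduling approach in plain JavaScript (no
// Infusion). All names are illustrative assumptions.
var conditionHandlers = {
    // Time-based condition: schedule a timer for the remaining time only.
    keyedInBefore: function (condition, facts, env, satisfied) {
        var elapsed = env.now() - facts.keyedInTimestamp;
        var remaining = Math.max(0, condition.value - elapsed);
        env.setTimeout(satisfied, remaining);
    },
    // Event-based condition: wait to be notified that the fact has changed.
    keyedOut: function (condition, facts, env, satisfied) {
        env.onFactChanged("keyedOut", function (value) {
            if (value === condition.value) { satisfied(); }
        });
    }
};

// Delegates every condition to the handler registered for its type and
// reports the trigger once all of its conditions have been satisfied.
function createTriggerHandler(trigger, facts, env, onTrigger) {
    var pending = trigger.conditions.length;
    trigger.conditions.forEach(function (condition) {
        var done = false;
        conditionHandlers[condition.type](condition, facts, env, function () {
            if (done) { return; } // count each condition only once
            done = true;
            pending -= 1;
            if (pending === 0) { onTrigger(trigger.id); }
        });
    });
}

// Deterministic usage with a fake environment instead of real timers.
var timers = [];    // scheduled {fn, delay} entries
var listeners = {}; // fact name -> listener
var env = {
    now: function () { return 2000; }, // two seconds after key-in
    setTimeout: function (fn, delay) { timers.push({fn: fn, delay: delay}); },
    onFactChanged: function (name, listener) { listeners[name] = listener; }
};
var reported = [];
createTriggerHandler({
    id: "id",
    conditions: [{type: "keyedInBefore", value: 5000}, {type: "keyedOut", value: true}]
}, {keyedInTimestamp: 0}, env, function (id) { reported.push(id); });

var scheduledDelay = timers[0].delay;       // only the remaining time is awaited
timers[0].fn();                             // the timer elapses
var reportedBeforeKeyOut = reported.length; // the trigger has not fired yet
listeners.keyedOut(true);                   // the user keys out
```

Because the keyedInTimestamp fact is static, the time-based handler needs to read it only once and can then rely on a single timer, which is what removes the need for polling.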

Conclusion

Both approaches fulfill the requirements listed at the start of the document and provide enough flexibility and extensibility.

By far the biggest asset of json-rules-engine is that it can evaluate rules with complex (and even nested) conditions which include conjunction and disjunction. However, this comes at a price: the application facts need to be polled frequently. Furthermore, adding a third-party module to the project will bring additional code complexity.
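
As noted in the Scheduling section, disjunction can still be expressed with conjunction-only triggers by splitting a complex trigger into several simpler ones. The following is a minimal sketch of such a split, assuming a nested all/any structure similar to the one used by json-rules-engine. The toConjunctions helper, the pspOpened condition type and the generated trigger ids are hypothetical and only serve the example.

```javascript
// Hypothetical helper: expands nested "all"/"any" conditions into a list of
// alternatives, each of which is a plain conjunction (array) of conditions.
function toConjunctions(node) {
    if (node.any) {
        // Disjunction: the alternatives of every branch are concatenated.
        return node.any.reduce(function (result, child) {
            return result.concat(toConjunctions(child));
        }, []);
    }
    if (node.all) {
        // Conjunction: combine every alternative collected so far with
        // every alternative of the next branch (cross product).
        return node.all.reduce(function (result, child) {
            var alternatives = toConjunctions(child);
            var combined = [];
            result.forEach(function (left) {
                alternatives.forEach(function (right) {
                    combined.push(left.concat(right));
                });
            });
            return combined;
        }, [[]]);
    }
    // Leaf condition: a single alternative with a single condition.
    return [[node]];
}

var nested = {
    all: [
        {type: "keyedInBefore", value: 5000},
        {any: [
            {type: "keyedOut", value: true},
            {type: "pspOpened", value: true} // hypothetical condition type
        ]}
    ]
};

// Each alternative becomes its own conjunction-only trigger with its own id.
var triggers = toConjunctions(nested).map(function (conditions, index) {
    return {id: "id-" + index, conditions: conditions};
});
```

For the nested example this yields two triggers, one combining keyedInBefore with keyedOut and one combining keyedInBefore with pspOpened. The number of generated triggers can grow quickly for deeply nested conditions, so the split is practical mainly for the simple cases anticipated here.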

The Infusion-only approach inherently includes a kind of scheduling mechanism. This enables the application to be notified asynchronously when a given trigger is satisfied at some point in the future, while eliminating the need for constant look-up of facts.

At first, when the smart survey specification was not available, using a rules engine seemed better as it provided enough extensibility to support rules with a high level of complexity. Now that we have a clearer vision of the possible future requirements for the survey functionality, we prefer to stick to the Infusion-only approach.