Monday, November 21, 2011

Primary Key Variables and Rule Activations in Pachinko

While Pachinko is a Rete-inspired rule engine, it contains features that make it suitable for stream-oriented event processing. One such feature is its use of Primary Key variables.

In the simplest form of rule engines, a rule fires any time a fact the rule depends on is modified. This is how Pachinko operates if all the rule parameters used are Variables.

However, in a stream-processing environment, it's not always desirable for a rule to fire whenever any one of its parameters changes. For example, a rule that compares the share price of a seldom-traded stock against the current Dow Jones average would consume significant compute resources if it recalculated every time either the share price or the Dow changed. The share price might change only a couple of times a day, but the Dow fluctuates continuously.

In that case, what's desired is a way to tell the engine to recalculate only when an important value (the share price) changes, using the most recent unimportant value (the Dow) without recalculating every time that unimportant value changes.

To make Pachinko do this, use PKVariables for the important values and Variables for the unimportant values. The rule won't fire at all until all variables have received values, but thereafter, the rule will only recalculate when one of the PKVariables changes value.
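The activation policy described above can be illustrated with a small stand-alone sketch. This is a toy model, not Pachinko's API: the class KeyedRuleSketch, its update method, and the variable names SHARE_PRICE and DOW are all made up for illustration. It also assumes one particular reading of the policy, namely that the rule fires once when the last missing value arrives (whichever variable that is) and afterwards only when a key value changes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of key-driven activation; NOT Pachinko's actual API.
class KeyedRuleSketch {
    private final Set<String> keyNames;                  // play the role of PKVariables
    private final Set<String> allNames = new HashSet<>();
    private final Map<String, Double> values = new HashMap<>();
    private int firings = 0;

    KeyedRuleSketch(Set<String> keyNames, Set<String> ordinaryNames) {
        this.keyNames = new HashSet<>(keyNames);
        allNames.addAll(keyNames);
        allNames.addAll(ordinaryNames);
    }

    /** Stores a new value and returns true if the rule fired as a result. */
    boolean update(String name, double value) {
        Double previous = values.put(name, value);
        boolean changed = previous == null || previous != value;
        if (!values.keySet().containsAll(allNames)) {
            return false;                                // not every variable is bound yet
        }
        // Assumption: the update that completes the bindings fires the rule once.
        boolean firstActivation = firings == 0;
        if (firstActivation || (changed && keyNames.contains(name))) {
            firings++;
            return true;
        }
        return false;                                    // ordinary-variable change: ignored
    }

    int firings() {
        return firings;
    }

    public static void main(String[] args) {
        KeyedRuleSketch rule = new KeyedRuleSketch(Set.of("SHARE_PRICE"), Set.of("DOW"));
        rule.update("DOW", 11500.0);        // SHARE_PRICE still unbound: no firing
        rule.update("SHARE_PRICE", 12.50);  // everything bound: fires
        rule.update("DOW", 11510.0);        // ordinary value changed: no firing
        rule.update("DOW", 11495.0);        // ordinary value changed: no firing
        rule.update("SHARE_PRICE", 12.75);  // key value changed: fires
        System.out.println("firings = " + rule.firings());
    }
}
```

In this sketch the two Dow updates in the middle cost a map write each and nothing more, which is the whole point of separating key values from ordinary ones.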

To get the latest version of Pachinko incorporating this feature and sample JUnit tests exercising it, go here.

Thursday, November 17, 2011

PACHINKO high-speed embeddable rule engine

I just released the first version of PACHINKO into the wild. You can find it up on GitHub at:

Get Pachinko

Pachinko is a small-footprint Rete-inspired rule engine runtime designed to be embedded in your Java code. It accepts rules written in Java and executes them in a very fast, single-threaded fashion. It is lightweight enough that, for multicore applications, a separate instance can be run on each thread, and rules can access shared state.

Performance is currently (on a MacBook Core2i7) under 200 nanoseconds for a single rule evaluation, and about 9 microseconds for evaluation of a single PrimaryKey rule in a corpus of 1000 rules on the same variable. (This latter number should improve considerably in the next version.)

Some of its interesting features include:

- It is built on top of the ROUX monadic function library, which is in large part why it is so fast. Using this library, there is no copying of values between alpha and beta memories, and no hash lookups.

- Persistence of the rule state can be easily accommodated by persisting the changed monads between rule invocations.

- Rules can be written in plain Java code by subclassing DefaultCARule, or by dynamically assembling a monadic expression which is passed to the rule, allowing for lightweight UI-driven rule specification at runtime.

- Rules can be added or changed while the engine is running.

- Because it uses a Rete-inspired dependency graph for feeding state to the free variables of rules, considerable economies of both scaling and evaluation can be achieved. Rule conditions are not evaluated unless all free variables have current state present.

- Rule activation can be controlled in a manner amenable to performant stream processing. By default, rules activate when all their free variables are bound to a current value. However, if one or more variables are defined as PKVariables, they are used as keys. A rule with one or more PKVariables will not activate until all of its free variables are bound to a current value; after that, it is re-activated whenever one of its PKVariables changes value. Changes in value to ordinary variables have no effect on activation.

- For the common case where a number of rules are defined on a single fact, but each is expected to fire only when the fact attains a certain value, there is an optimization available. By defining the variable as a PKVariable and providing an ActivationValue for it, the rule condition will not be evaluated unless the value of the PKVariable is equal to its ActivationValue.
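The ActivationValue optimization in the last bullet can be pictured with a small stand-alone sketch. It is a toy illustration only and does not use Pachinko's API: the class ActivationValueSketch, the nested GatedRule, the onKeyChanged method, and the event names StopEvent and PauseEvent are all hypothetical. Each rule compares the new key value against its own activation value first, and runs its (possibly expensive) condition only when they match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy illustration of activation-value gating; NOT Pachinko's actual API.
class ActivationValueSketch {

    /** A rule keyed on a single fact, plus the one value that should wake it up. */
    static final class GatedRule {
        final String activationValue;
        final Predicate<String> condition;
        int conditionEvaluations = 0;
        int actionsRun = 0;

        GatedRule(String activationValue, Predicate<String> condition) {
            this.activationValue = activationValue;
            this.condition = condition;
        }

        /** Called whenever the key fact changes value. */
        void onKeyChanged(String newValue) {
            if (!activationValue.equals(newValue)) {
                return;                      // cheap gate: condition is never evaluated
            }
            conditionEvaluations++;
            if (condition.test(newValue)) {
                actionsRun++;                // stands in for the rule's action
            }
        }
    }

    public static void main(String[] args) {
        // Three rules on the same fact, each waiting for a different value.
        List<GatedRule> rules = new ArrayList<>();
        for (String value : new String[] {"StartEvent", "StopEvent", "PauseEvent"}) {
            rules.add(new GatedRule(value, v -> true));
        }

        // Feed two values of the key fact to every rule.
        for (String event : new String[] {"IdleEvent", "StartEvent"}) {
            for (GatedRule rule : rules) {
                rule.onKeyChanged(event);
            }
        }

        int total = 0;
        for (GatedRule rule : rules) {
            total += rule.conditionEvaluations;
        }
        System.out.println("condition evaluations = " + total);
    }
}
```

Without the gate, the two updates would trigger six condition evaluations across the three rules; with it, only the rule waiting for StartEvent evaluates its condition, and only once.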

Here is a simple example of a Pachinko Java rule:

public class StartEventRule extends DefaultCARule {
  int _event = -1;
  int _status = -1;

  public StartEventRule() {
    super();
    _event = addPkVariable(new PKVariable("EVENT", "StartEvent"), "StartEvent");
    _status = addOptionalVariable(new Variable("STATUS", "NOT_STARTED"));
  }

  @Override
  public boolean evaluateCondition(IMonadex context) {
    return context.bindValue(_event).equals("StartEvent");
  }

  @Override
  public void doAction(IReadWriteMonadex context) {
    context.returnValue(_status, "STARTED");
  }
}

And an example of the above rule in use:

@Test
public void simpleStartEventTest() {
  // Initialize rule system with its set of rules:
  CARuleSystem ruleSystem = new CARuleSystem(new StartEventRule());

  // Set some data into the rule system and process any resulting activations...
  IReadWriteMonadex readWriteContext = ruleSystem.freeVariables();
  readWriteContext.returnValue("EVENT", "IdleEvent");
  ruleSystem.executeActivations();

  // Verify that the rule system did not change state...
  assertEquals("NOT_STARTED", readWriteContext.getMonad("STATUS").bindValue(readWriteContext));

  // Now do it again, only with the expected value for EVENT...
  readWriteContext.returnValue("EVENT", "StartEvent");
  ruleSystem.executeActivations();

  // Verify that the rule system did change state this time:
  assertEquals("STARTED", readWriteContext.getMonad("STATUS").bindValue(readWriteContext));
}