[Subversion] / Trellis / README.txt  

View of /Trellis/README.txt

Parent Directory | Revision Log
Revision: 2200 - (download)
Fri Jun 23 04:33:08 2006 UTC (17 years, 10 months ago) by pje
File size: 17597 byte(s)
Refactor to use a Cells-like forward/backward propagation system.  
Separated thread-specific "current observer" state from trellis state.  
Tests still pass, but are far from sufficient and the docs are crap; 
you're better off not reading them yet.  Effectively this is just a 
checkpoint to have a jumping-off point for more extensive changes, like 
making values know what trellis they belong to, and redoing the 
docs/tests to maximize both line and scenario coverages.
===============================================================
Event-Driven Dependency Management with ``peak.events.trellis``
===============================================================

``peak.events.trellis`` is a dependency management framework for updating
values in a program.  When a value is changed, other values are automatically
updated, and code can be run to respond to value changes, somewhat like a
spreadsheet.

There are two things that are different from most other dependency-management
tools: synchronous updating, and automatic dependency discovery.

Synchronous updates means that all values are conceptually updated in lockstep,
such that there is never a time when the program's state is inconsistent due to
values being updated out of order.  In most event-driven systems, updates
are *asynchronous*, meaning that any value can change at any time, and some
values might change multiple times in response to a single input change,
leading to subtle bugs, especially as programs become more complex.

In contrast, synchronous updates ensure that every relevant value is updated
at most once for a change in a given input, making the system's dynamic
behavior easier to understand and far more likely to be bug-free.

Automatic dependency discovery means that there is no need to explicitly
"subscribe" or "listen" to anything.  Values that use other values in their
calculations automatically become dependent on the values they use.  If a
subsequent recalculation results in a new dependency relationship, that's kept
up-to-date also.

The concepts and algorithms used are courtesy of two systems, the "trellis
process architecture" described by David Gelernter in his book, "Mirror
Worlds", and the Lisp "Cells" library by Ken Tilton.  ``peak.events.trellis``
also takes some of its terminology from the Smalltalk "Value Model" frameworks.


.. contents:: **Table of Contents**


--------------
Programmer API
--------------

Value Objects
=============

``Value`` objects are the basis of the framework.  Each holds a value upon
which other values may depend.  There are three basic value types, ``Value``,
``Input``, and ``Constant``::

    >>> from peak.events.trellis import Value, Input, Constant

``Input`` objects simply hold a value::

    >>> v = Input(42)
    >>> v.value
    42

but ``Value`` objects use a *rule* to determine their value::

    >>> v_times_two = Value(lambda: v.value * 2)
    >>> v_times_two.value
    84

And that value always reflects the current values of any values that it depends
on::

    >>> v.value = 23
    >>> v_times_two.value
    46

``Constant`` values are initialized with a value, and are unchangeable
thereafter::

    >>> c = Constant(99)
    >>> c.value
    99
    >>> c.value -= 1
    Traceback (most recent call last):
      ...
    AttributeError: Constants can't be changed

Most of the time you won't use ``Constant`` objects directly, though, because
``Value`` objects automatically become ``Constant`` when they no longer depend
on any other ``Value`` or ``Input`` for their calculation.


-------------------
Internals and Tests
-------------------

Update State
============

The ``Trellis`` class manages the logic of updating::

    >>> from peak.events.trellis import Trellis
    >>> t = Trellis()

The principal job of a ``Trellis`` is to co-ordinate value changes by deferring
recursive updates and recalculations until any in-progress updates or
recalculations have been completed.

The process of propagating changes across a trellis's value holders is called a
"pulse".  Each pulse is numbered sequentially, beginning with zero::

    >>> t.pulse
    0

A pulse occurs during a change notification
Change propagation within a trellis can only happen during a pulse, but changes
to data or system state that is *outside* a trellis can only be processed
between pulses.
the tr

This process entails two phases when the last recursive update is completed:

1. Deferred recalculations are processed to complete all change propagation
2. Observers and deferred updates are invoked in a new update cycle.

Changes are handled using the ``call_between_pulses()`` method, which takes a
callback function and any number of positional arguments.  If a pulse is not
in progress, the callback is invoked immediately with the supplied arguments::

    >>> def hello(*args): print args
    >>> def what_version(): print "version =", t.pulse

    >>> t.call_between_pulses(hello, "Hello, world!")
    ('Hello, world!',)

    >>> t.call_between_pulses(what_version)
    version = 0

The ``with_propagation()`` method takes a callback and any number of positional
arguments.  The callback is run in an "update" context, meaning that there is
a new version number::

    >>> t.with_propagation(what_version)
    version = 1

Unless the call to ``with_propagation()`` is already inside a call to
``with_propagation()``::

    >>> t.with_propagation(t.with_propagation, what_version)
    version = 2

If a ``call_between_pulses()`` call occurs during a pulse, its execution is
deferred until all outstanding updates are finished, at which point all the
callbacks are run in the order they were given, in a *new* managed update::

    >>> def do_it():
    ...     what_version()
    ...     t.call_between_pulses(what_version)
    ...     t.call_between_pulses(hello, "Hello, world!")
    ...     t.call_between_pulses(hello, "Goodbye", "World!")
    ...     print "finished do_it"

    >>> t.with_propagation(do_it)
    version = 3
    finished do_it
    version = 4
    ('Hello, world!',)
    ('Goodbye', 'World!')

And of course this deferred callback queue is cleared between runs::

    >>> t.with_propagation(lambda: None)
    >>> what_version()
    version = 5

``call_between_pulses()`` callbacks can also be registered while callbacks are taking
place, and they will all be called in another version::

    >>> def do_it():
    ...     what_version()
    ...     t.call_between_pulses(what_version)
    ...     t.call_between_pulses(t.call_between_pulses, what_version)
    ...     t.call_between_pulses(t.call_between_pulses, what_version)
    ...     print "finished do_it"
    >>> t.with_propagation(do_it)
    version = 6
    finished do_it
    version = 7
    version = 8
    version = 8

Essentially, this means that ``with_propagation()`` recurses until no new
``call_between_pulses()`` callbacks are registered.  That is, ``with_propagation()``
tries to return the system to a stable state by repeatedly firing callbacks
until there is nothing that still needs calling back.


Data Pulse Axioms
=================

Overview: updates must be synchronous (all changed values are updated at
once), consistent (no rule sees out of date values), and minimal (only
necessary rules run).

1. Per-Cell "As Of" Value: Every value has a "current-as-of" update count, that
is initialized with a value that is less than the global update count will ever
be::

    >>> v = Input(42)
    >>> def rule(): print "computing"; return v.value
    >>> c = Value(rule)
    >>> c.pulse
    -1

2. Global Update Counter: There is a global update counter that is incremented
whenever an ``Input`` value is changed.  This guarantees that there is a
globally-consistent notion of the "time" at which updates occur::

    >>> from peak.events.trellis import state

    >>> start = state.pulse
    >>> state.pulse - start
    0

3. Inputs Move The System Forward: When an ``Input`` changes, it increments
the global update count and stores the new value in its own update count::

    >>> i = Input(22)
    >>> state.pulse - start
    1

    >>> i.pulse - start
    1

    >>> i.value = 21
    >>> i.pulse - start
    2
    >>> state.pulse - start
    2

4. XXX Out-of-dateness: A value is out of date if its update count is lower than
the update count of any of the values it depends on.

5. Out-of-date Before: When a ``Value`` object's ``.value`` is queried, its
rule is only run if the value is out of date; otherwise a cached previous value
is returned.  This guarantees that a rule is not run unless its dependencies
have changed since the last time the rule was run::

    >>> c.value
    computing
    42

6. Up-to-date After: Once a ``Value`` object's rule is run (or its ``.value``
is set, if it is an ``Input``), its update count must be equal to the global
update count.  This guarantees that a rule will not run more than once per
update::

    >>> c.pulse == state.pulse
    True

    >>> c.value
    42


Dependency Management Axioms
============================

Overview: values automatically notice when other value depend on them, then
notify them at most once if there is a change.

1. Thread-local "current rule": There is a thread-local variable that always
   contains the ``Value`` whose rule is currently being evaluated in the
   corresponding thread.  This variable can be empty (e.g. None)::

    >>> from peak.events.trellis import current_observer
    >>> print current_observer()
    None

2. "Currentness" Maintenance: While a ``Value`` object's rule is being run, the
   variable described in #1 must be set to point to the ``Value`` whose rule is
   being run.  When the rule is finished, the variable must be restored to
   whatever value it had before the rule began.  (Guarantees that values will
   be able to tell who is using them::

    >>> def rule():
    ...     print "computing", current_observer()
    >>> v = Value(rule)

    >>> print current_observer()   # between calculations
    None

    >>> v.value                 # during calculations
    computing <peak.events.trellis.Value object at ...>

    >>> print current_observer()   # returns to None
    None

    >>> def rule1():
    ...     print "computing r1?", current_observer() is r1
    ...     v = r2.value
    ...     print "r2 value =", v
    ...     print "computing r1?", current_observer() is r1

    >>> def rule2():
    ...     print "computing r2?", current_observer() is r2
    ...     return 42

    >>> r1, r2 = Value(rule1), Value(rule2)
    >>> r1.value        # verify that this works recursively
    computing r1? True
    computing r2? True
    r2 value = 42
    computing r1? True

3. Dependency Creation: When a value is read, it adds the "currently-being
   evaluated" value as a listener that it will notify of changes::

    >>> v1 = Input(99)
    >>> v2 = Value(lambda: v1.value * 2)
    >>> v3 = Value(lambda: v2.value * 2)

    >>> v1.listeners()
    []
    >>> v2.listeners()
    []
    >>> v3.listeners()
    []

    >>> v2.value    # causes v1 to depend on v2
    198
    >>> v1.listeners() == [v2]
    True

    >>> v2.listeners()
    []
    >>> v3.value
    396

    >>> v2.listeners() == [v3]
    True

4. Dependency Creation Order: New listeners are added only *after* the value
   being read has brought itself up-to-date, and notified any *previous*
   listeners of the change.  This ensures that the listening value does not
   receive redundant notification if the listened-to value has to be brought
   up-to-date first::

    >>> i1 = Input(1)
    >>> r1 = Value(lambda: i1.value)
    >>> r2 = Value(lambda x=[]: x or x.append(r1.value))    # one-time rule
    >>> print r2.value
    None

    >>> i1.listeners() == [r1]   # r1 is i1's only listener
    True
    >>> r1.listeners() == [r2]   # r2 is r1's only listener
    True

    >>> def showme():
    ...     r1.value    # r3 will be a listener of r1 now
    ...     if r1.listeners()==[r3]:
    ...         print "listeners of r1==[r3]"
    ...     elif r3 in r1.listeners():
    ...         print "subscribed"
    ...     if r1.pulse >= state.pulse: print "r1 is up-to-date"
    ...     if r2.pulse >= state.pulse: print "r2 is up-to-date"

    >>> r3 = Value(showme)
    >>> r3.value
    subscribed
    r1 is up-to-date
    r2 is up-to-date

    >>> i1.value = 2    # r2 will be notified and unsubscribed before listening
    listeners of r1==[r3]
    r1 is up-to-date
    r2 is up-to-date

    >>> i1.value = 3    # and r2 will not keep up with r1 any more
    listeners of r1==[r3]
    r1 is up-to-date

    >>> type(r2)        # because it's now a constant
    <class 'peak.events.trellis.Constant'>

5. Dependency Minimalism: A listener should only be added if it is not already
   present in the value's listener collection.  This isn't strictly
   mandatory, the system behavior will be correct but inefficient if this
   requirement isn't met::

   >>> i1 = Input(1)
   >>> r1 = Value(lambda: i1.value + i1.value)
   >>> r1.value
   2
   >>> i1.listeners()==[r1]
   True

6. Dependency Removal: Just before a value's rule is run, it must cease to be a
   listener for any other values.  (Guarantees that a dependency from a
   previous update cannot trigger an unnecessary repeated calculation.)

7. Dependency Notification: Whenever a ``Value`` or ``Input`` changes, it must
   notify all of its listeners that it has changed, in such a way that *none*
   of the listeners are asked to recalculate their value until *all* of
   the listeners have first been notified of the change.  (This guarantees
   that inconsistent views cannot occur.)

7a. Deferred Recalculation: The recalculation of listeners (not the
    notification of the listeners' out-of-dateness) must be deferred if a rule
    is currently being evaluated.  When the last rule being evaluated is
    finished, all deferred recalculations must occur.  This guarantees that in
    the absence of circular dependencies, no cell can ask for a value that's in
    the process of being calculated.)

8. One-Time Notification Only: A value's listeners are removed from its
   listener collection as soon as they have been notified.  In particular, the
   value's collection of listeners must be cleared *before* *any* of the
   listeners are asked to recalculate themselves.  (This guarantees that
   listeners reinstated as a side effect of recalculation will not get a
   duplicate notification in the current update, or miss a notification in a
   future update.)

9. Conversion to Constant: If a ``Value`` object's rule is run and no
   dependencies were created, it must become a ``Constant``, doing no further
   listener additions or notification, once any necessary notifications to
   existing listeners are completed.  That is, if the rule's run changed its
   value, it must notify its existing listeners, but then the listener
   collection must be cleared -- *again*, in addition to the clearing described
   in #8.  (See #4 for the actual test of this.)

10. No Changes During Notification: It is an error to change an ``Input`` value
    while change notifications are taking place.

11. Weak Notification: Automatically created inter-value links must not inhibit
    garbage collection of either value::

    >>> del v3
    >>> v2.listeners()
    []

    >>> del v2
    >>> v1.listeners()
    []


Update Algorithm
================

A value must be computed if and only if it is out of date; otherwise, it should
just return its previous cached value.  A value is out of date if a
value it depends on has changed since the value was last calculated.  The
``pulse`` attribute tracks the version the value was last calculated "as of",
and the ``needs`` attribute records the latest version of any value this value
depends on.  If ``needs>=pulse``, the value is out of date::

    >>> x = Input(23)
    >>> def rule():
    ...     print "computing y from x"
    ...     return x.value * 2
    >>> y = Value(rule)
    >>> y.value
    computing y from x
    46

    >>> y.value
    46

    >>> x.value = 10
    computing y from x
    >>> y.value
    20

    >>> x.value = 10
    >>> y.value
    20

    >>> def rule2():
    ...     print "computing z from y"
    ...     return y.value - 1
    >>> z = Value(rule2)
    >>> z.value
    computing z from y
    19

    >>> x.value = 7
    computing y from x
    computing z from y


When a value changes, the values that depend on it must be brought up to date.
To ensure that no stale values can ever be seen, values must be marked
"out of date" before any values are recomputed.  Thus, update notification
has two phases: first, the listeners of a value are marked out-of-date, and
only then does recomputation occur.  This breadth-first traversal ensures that
inter-value dependencies can't cause a stale value to be seen, as might happen
if updates were done depth-first.

A value must not be marked out of date using a dependency that no longer
exists, however.  For example, if a value C depends on values A and B, and A
changes, then later B, the change to B should not mark C out-of-date unless
a new dependency was set up.

    >>> def C_rule():
    ...     print "computing",
    ...     if A.value<5:
    ...         print A.value, B.value
    ...     else:
    ...         print "...done"

    >>> A = Input(1)
    >>> B = Input(2)
    >>> C = Value(C_rule)

    >>> C.value
    computing 1 2
    >>> C.value

    >>> A.value = 3
    computing 3 2

    >>> B.value = 4
    computing 3 4

    >>> A.value = 5
    computing ...done

    >>> B.value = 6     # nothing happens, since C no longer depends on B
    >>> A.value = 3
    computing 3 6

    >>> B.value = 7     # but now it's back depending on B again.
    computing 3 7

    >>> B.value = 7
    >>> A.value = 1
    computing 1 7

    >>> A.value = 1


TODO
====

* Allow custom comparison function for "changedness"

* Allow rules to see their value owner, previous value, ???

* Value attributes

* Circular dependency checking

* Observers

* Values should belong to a specific trellis, rather than a single global one

* Rollback


cvs-admin@eby-sarna.com

Powered by ViewCVS 1.0-dev

ViewCVS and CVS Help