[Subversion] / BytecodeAssembler / README.txt

Diff of /BytecodeAssembler/README.txt

-version 2194, Sat Jun 17 06:19:20 2006 UTC
+version 2457, Sun Dec 30 20:10:36 2007 UTC
  Line 9
  Line 9
  Line 9
  bytecode instead of on these mechanical issues.
  In addition to a low-level opcode-oriented API for directly generating specific
- bytecodes, this module also offers an extensible mini-AST framework for
+ Python bytecodes, this module also offers an extensible mini-AST framework for
  generating code from high-level specifications.  This framework does most of
  the work needed to transform tree-like structures into linear bytecode
  instructions, and includes the ability to do compile-time constant folding.
+ Changes since version 0.2:
+ * Added ``Suite``, ``TryExcept``, and ``TryFinally`` node types
+ * Added a ``Getattr`` node type that does static or dynamic attribute access
+   and constant folding
+ * Fixed ``code.from_function()`` not copying the ``co_filename`` attribute when
+   ``copy_lineno`` was specified.
+ * The ``repr()`` of AST nodes doesn't include a trailing comma for 1-argument
+   node types any more.
+ * Added a ``Pass`` symbol that generates no code, a ``Compare()`` node type
+   that does n-way comparisons, and ``And()`` and ``Or()`` node types for doing
+   logical operations.
+ * The ``COMPARE_OP()`` method now accepts operator strings like ``"<="``,
+   ``"not in"``, ``"exception match"``, and so on, as well as numeric opcodes.
+   See the standard library's ``opcode`` module for a complete list of the
+   strings accepted (in the ``cmp_op`` tuple).  ``"<>"`` is also accepted as an
+   alias for ``"!="``.
+ * Added code to verify that forward jump offsets don't exceed a 64KB span, and
+   support absolute backward jumps to locations >64KB.
+ Changes since version 0.1:
+ * Constant handling has been fixed so that it doesn't confuse equal values of
+   differing types (e.g. ``1.0`` and ``True``), or equal unhashable objects
+   (e.g. two empty lists).
+ * Removed ``nil``, ``ast_curry()`` and ``folding_curry()``, replacing them with
+   the ``nodetype()`` decorator and ``fold_args()``; please see the docs for
+   more details.
+ * Added stack tracking across jumps, globally verifying stack level prediction
+   consistency and automatically rejecting attempts to generate dead code.  It
+   should now be virtually impossible to accidentally generate bytecode that can
+   crash the interpreter.  (If you find a way, let me know!)
  Changes since version 0.0.1:
  * Added massive quantities of new documentation and examples
- Line 35
+ Line 76
  Line 35
  Line 76
  * Various bug fixes
  There are a few features that aren't tested yet, and not all opcodes may be
- fully supported.  Notably, the following features are still NOT reliably
+ fully supported.  Also note the following limitations:
- supported yet:
- * Wide jump addressing (for generated bytecode>64K in size)
+ * Jumps to as-yet-undefined labels cannot span a distance greater than 65,535
+   bytes.
  * The ``dis()`` module in Python 2.3 has a bug that makes it show incorrect
    line numbers when the difference between two adjacent line numbers is
- Line 190
+ Line 231
  Line 190
  Line 231
  method::
      >>> c = Code()
-     >>> where = c.here()         # get a location at the start of the code
      >>> c.LOAD_CONST(42)
+     >>> where = c.here()         # get a location near the start of the code
+     >>> c.DUP_TOP()
+     >>> c.POP_TOP()
      >>> c.JUMP_ABSOLUTE(where)   # now jump back to it
      >>> dis(c.code())
-    >>    0 LOAD_CONST               1 (42)
+          0 LOAD_CONST               1 (42)
-JUMP_ABSOLUTE            0
+             >>    3 DUP_TOP
+POP_TOP
+JUMP_ABSOLUTE            3
  But if you are jumping *forward*, you will need to call the jump or setup
  method without any arguments.  The return value will be a "forward reference"
- Line 205
+ Line 249
  Line 205
  Line 249
  been reached::
      >>> c = Code()
-     >>> forward = c.JUMP_ABSOLUTE() # create a jump and a forward reference
+     >>> c.LOAD_CONST(99)
+     >>> forward = c.JUMP_IF_TRUE() # create a jump and a forward reference
      >>> c.LOAD_CONST(42)            # this is what we want to skip over
+     >>> c.POP_TOP()
      >>> forward()   # calling the reference changes the jump to point here
      >>> c.LOAD_CONST(23)
      >>> c.RETURN_VALUE()
      >>> dis(c.code())
-          0 JUMP_ABSOLUTE            6
+          0 LOAD_CONST               1 (99)
-LOAD_CONST               1 (42)
+JUMP_IF_TRUE             4 (to 10)
-             >>    6 LOAD_CONST               2 (23)
+LOAD_CONST               2 (42)
-RETURN_VALUE
+POP_TOP
+             >>   10 LOAD_CONST               3 (23)
+RETURN_VALUE
      >>> eval(c.code())
- Line 246
+ Line 294
  Line 246
  Line 294
      >>> c.LOAD_CONST(None)  # in real code, this'd be a Python code constant
      >>> c.MAKE_CLOSURE(0,2) # no defaults, 2 free vars in the new function
+ The ``COMPARE_OP`` method takes an argument which can be a valid comparison
+ integer constant, or a string containing a Python operator, e.g.::
+     >>> c = Code()
+     >>> c.LOAD_CONST(1)
+     >>> c.LOAD_CONST(2)
+     >>> c.COMPARE_OP('not in')
+     >>> dis(c.code())
+          0 LOAD_CONST               1 (1)
+LOAD_CONST               2 (2)
+COMPARE_OP               7 (not in)
+ The full list of valid operator strings can be found in the standard library's
+ ``opcode`` module.  ``"<>"`` is also accepted as an alias for ``"!="``::
+     >>> c.LOAD_CONST(3)
+     >>> c.COMPARE_OP('<>')
+     >>> dis(c.code())
+          0 LOAD_CONST               1 (1)
+LOAD_CONST               2 (2)
+COMPARE_OP               7 (not in)
+LOAD_CONST               3 (3)
+COMPARE_OP               3 (!=)
  High-Level Code Generation
  ==========================
- Line 279
+ Line 351
  Line 279
  Line 351
 LOAD_CONST               0 (None)
 LOAD_CONST               8 (<code object <lambda> at ...>)
+ Note that although some values of different types may compare equal to each
+ other, ``Code`` objects will not substitute a value of a different type than
+ the one you requested::
+     >>> c = Code()
+     >>> c(1, True, 1.0, 1L)     # equal, but different types
+     >>> dis(c.code())
+          0 LOAD_CONST               1 (1)
+LOAD_CONST               2 (True)
+LOAD_CONST               3 (1.0)
+LOAD_CONST               4 (1L)
  Simple Containers
  -----------------
- Line 323
+ Line 406
  Line 323
  Line 406
  a constant, rather than generating code to recreate the tuple using a series of
  ``LOAD_CONST`` operations followed by a ``BUILD_TUPLE``.
+ If the value wrapped in a ``Const`` is not hashable, it is compared by identity
+ rather than value.  This prevents equal mutable values from being reused by
+ accident, e.g. if you plan to mutate the "constant" values later::
+     >>> c = Code()
+     >>> c(Const([]), Const([]))     # equal, but not the same object!
+     >>> dis(c.code())
+          0 LOAD_CONST               1 ([])
+LOAD_CONST               2 ([])
+ Thus, although ``Const`` objects hash and compare based on equality for
+ hashable types::
+     >>> hash(Const(3)) == hash(3)
+     True
+     >>> Const(3)==Const(3)
+     True
+ They hash and compare based on object identity for non-hashable types::
+     >>> c = Const([])
+     >>> hash(c) == hash(id(c.value))
+     True
+     >>> c == Const(c.value)     # compares equal if same object
+     True
+     >>> c == Const([])          # but is not equal to a merely equal object
+     False
+ ``Suite`` and ``Pass``
+ ----------------------
+ On occasion, it's helpful to be able to group a sequence of opcodes,
+ expressions, or statements together, to be passed as an argument to other node
+ types.  The ``Suite`` node type accomplishes this::
+     >>> from peak.util.assembler import Suite, Pass
+     >>> c = Code()
+     >>> c.return_(Suite([Const(42), Code.DUP_TOP, Code.POP_TOP]))
+     >>> dis(c.code())
+          0 LOAD_CONST               1 (42)
+DUP_TOP
+POP_TOP
+RETURN_VALUE
+ And ``Pass`` is a shortcut for an empty ``Suite``, that generates nothing::
+     >>> Suite([])
+     Pass
+     >>> c = Code()
+     >>> c(Pass)
+     >>> c.return_(None)
+     >>> dis(c.code())
+          0 LOAD_CONST               0 (None)
+RETURN_VALUE
  Local and Global Names
  ----------------------
- Line 371
+ Line 512
  Line 371
  Line 512
 LOAD_DEREF               1 (z)
+ Obtaining Attributes
+ --------------------
+ The ``Getattr`` node type takes an expression and an attribute name.  The
+ attribute name can be a constant string, in which case a ``LOAD_ATTR`` opcode
+ is used, and constant folding is done if possible::
+     >>> from peak.util.assembler import Getattr
+     >>> c = Code()
+     >>> c(Getattr(Local('x'), '__class__'))
+     >>> dis(c.code())
+          0 LOAD_FAST                0 (x)
+LOAD_ATTR                0 (__class__)
+     >>> Getattr(Const(object), '__class__') # const expression, const result
+     Const(<type 'type'>)
+ Or the attribute name can be an expression, in which case a ``getattr()`` call
+ is compiled instead::
+     >>> c = Code()
+     >>> c(Getattr(Local('x'), Local('y')))
+     >>> dis(c.code())
+          0 LOAD_CONST               1 (<built-in function getattr>)
+LOAD_FAST                0 (x)
+LOAD_FAST                1 (y)
+CALL_FUNCTION            2
  Calling Functions and Methods
  -----------------------------
- Line 439
+ Line 611
  Line 439
  Line 611
      >>> c = Code()
      >>> c.return_()
+     >>> dis(c.code())
+          0 LOAD_CONST               0 (None)
+RETURN_VALUE
+     >>> c = Code()
      >>> c( Return() )
      >>> dis(c.code())
           0 LOAD_CONST               0 (None)
 RETURN_VALUE
-LOAD_CONST               0 (None)
-RETURN_VALUE
  Labels and Jump Targets
- Line 455
+ Line 630
  Line 455
  Line 630
  current location.  For example::
      >>> c = Code()
-     >>> forward = c.JUMP_FORWARD()
+     >>> c.LOAD_CONST(99)
-     >>> c( 1, 2, forward, Return(3) )
+     >>> forward = c.JUMP_IF_FALSE()
+     >>> c( 1, Code.POP_TOP, forward, Return(3) )
      >>> dis(c.code())
-          0 JUMP_FORWARD             6 (to 9)
+          0 LOAD_CONST               1 (99)
-LOAD_CONST               1 (1)
+JUMP_IF_FALSE            4 (to 10)
-LOAD_CONST               2 (2)
+LOAD_CONST               2 (1)
-              >>   9 LOAD_CONST               3 (3)
+POP_TOP
-RETURN_VALUE
+             >>   10 LOAD_CONST               3 (3)
+RETURN_VALUE
  However, there's an easier way to do the same thing, using ``Label`` objects::
- Line 470
+ Line 647
  Line 470
  Line 647
      >>> c = Code()
      >>> skip = Label()
-     >>> c(skip.JUMP_FORWARD, 1, 2, skip, Return(3))
+     >>> c(99, skip.JUMP_IF_FALSE, 1, Code.POP_TOP, skip, Return(3))
      >>> dis(c.code())
-          0 JUMP_FORWARD             6 (to 9)
+          0 LOAD_CONST               1 (99)
-LOAD_CONST               1 (1)
+JUMP_IF_FALSE            4 (to 10)
-LOAD_CONST               2 (2)
+LOAD_CONST               2 (1)
-              >>   9 LOAD_CONST               3 (3)
+POP_TOP
-RETURN_VALUE
+             >>   10 LOAD_CONST               3 (3)
+RETURN_VALUE
  This approach has the advantage of being easy to use in complex trees.
  ``Label`` objects have attributes corresponding to every opcode that uses a
- Line 492
+ Line 670
  Line 492
  Line 670
      AssertionError: Label previously defined
+ N-Way Comparisons
+ -----------------
+ You can generate N-way comparisons using the ``Compare()`` node type::
+     >>> from peak.util.assembler import Compare
+     >>> c = Code()
+     >>> c(Compare(Local('a'), [('<', Local('b'))]))
+     >>> dis(c.code())
+          0 LOAD_FAST                0 (a)
+LOAD_FAST                1 (b)
+COMPARE_OP               0 (<)
+-way comparisons generate code that's a bit more complex.  Here's a three-way
+ comparison (``a<b<c``)::
+     >>> c = Code()
+     >>> c.return_(Compare(Local('a'), [('<', Local('b')), ('<', Local('c'))]))
+     >>> dis(c.code())
+          0 LOAD_FAST                0 (a)
+LOAD_FAST                1 (b)
+DUP_TOP
+ROT_THREE
+COMPARE_OP               0 (<)
+JUMP_IF_FALSE           10 (to 24)
+POP_TOP
+LOAD_FAST                2 (c)
+COMPARE_OP               0 (<)
+JUMP_FORWARD             2 (to 26)
+             >>   24 ROT_TWO
+POP_TOP
+             >>   26 RETURN_VALUE
+ And a four-way (``a<b>c!=d``)::
+     >>> c = Code()
+     >>> c.return_(
+     ...     Compare( Local('a'), [
+     ...         ('<', Local('b')), ('>', Local('c')), ('!=', Local('d'))
+     ...     ])
+     ... )
+     >>> dis(c.code())
+          0 LOAD_FAST                0 (a)
+LOAD_FAST                1 (b)
+DUP_TOP
+ROT_THREE
+COMPARE_OP               0 (<)
+JUMP_IF_FALSE           22 (to 36)
+POP_TOP
+LOAD_FAST                2 (c)
+DUP_TOP
+ROT_THREE
+COMPARE_OP               4 (>)
+JUMP_IF_FALSE           10 (to 36)
+POP_TOP
+LOAD_FAST                3 (d)
+COMPARE_OP               3 (!=)
+JUMP_FORWARD             2 (to 38)
+             >>   36 ROT_TWO
+POP_TOP
+             >>   38 RETURN_VALUE
  Constant Detection and Folding
  ==============================
- Line 517
+ Line 759
  Line 517
  Line 759
      >>> const_value(Local('x'))
      Traceback (most recent call last):
        ...
-     NotAConstant: <bound method str.Local of 'x'>
+     NotAConstant: Local('x')
  Tuples of constants are recursively replaced by constant tuples::
- Line 532
+ Line 774
  Line 532
  Line 774
      >>> const_value( (1,Global('y')) )
      Traceback (most recent call last):
        ...
-     NotAConstant: <bound method str.Global of 'y'>
+     NotAConstant: Global('y')
  As do any types not previously described here::
- Line 547
+ Line 789
  Line 547
  Line 789
      [1, 2]
+ Folding Function Calls
+ ----------------------
  The ``Call`` wrapper can also do simple constant folding, if all of its input
  parameters are constants.  (Actually, the `args` and `kwargs` arguments must be
  *sequences* of constants and 2-tuples of constants, respectively.)
- Line 555
+ Line 800
  Line 555
  Line 800
  ``Const`` node instead of a ``Call`` node::
      >>> Call( Const(type), [1] )
-     <bound method type.Const of <type 'int'>>
+     Const(<type 'int'>)
  Thus, you can also take the ``const_value()`` of such calls::
- Line 566
+ Line 811
  Line 566
  Line 811
  passed in to another ``Call``::
      >>> Call(Const(type), [Call( Const(dict), [], [('x',27)] )])
-     <bound method type.Const of <type 'dict'>>
+     Const(<type 'dict'>)
  Notice that this folding takes place eagerly, during AST construction.  If you
  want to implement delayed folding after constant propagation or variable
- Line 604
+ Line 849
  Line 604
  Line 849
  ``globals()``, in other words.
+ Logical And/Or
+ --------------
+ You can evaluate logical and/or expressions using the ``And`` and ``Or`` node
+ types::
+     >>> from peak.util.assembler import And, Or
+     >>> c = Code()
+     >>> c.return_( And([Local('x'), Local('y')]) )
+     >>> dis(c.code())
+          0 LOAD_FAST                0 (x)
+JUMP_IF_FALSE            4 (to 10)
+POP_TOP
+LOAD_FAST                1 (y)
+             >>   10 RETURN_VALUE
+     >>> c = Code()
+     >>> c.return_( Or([Local('x'), Local('y')]) )
+     >>> dis(c.code())
+          0 LOAD_FAST                0 (x)
+JUMP_IF_TRUE             4 (to 10)
+POP_TOP
+LOAD_FAST                1 (y)
+             >>   10 RETURN_VALUE
+ True or false constants are folded automatically, avoiding code generation
+ for intermediate values that will never be used in the result::
+     >>> c = Code()
+     >>> c.return_( And([1, 2, Local('y')]) )
+     >>> dis(c.code())
+          0 LOAD_FAST                0 (y)
+RETURN_VALUE
+     >>> c = Code()
+     >>> c.return_( And([1, 2, Local('y'), 0]) )
+     >>> dis(c.code())
+          0 LOAD_FAST                0 (y)
+JUMP_IF_FALSE            4 (to 10)
+POP_TOP
+LOAD_CONST               1 (0)
+             >>   10 RETURN_VALUE
+     >>> c = Code()
+     >>> c.return_( Or([1, 2, Local('y')]) )
+     >>> dis(c.code())
+          0 LOAD_CONST               1 (1)
+RETURN_VALUE
+     >>> c = Code()
+     >>> c.return_( Or([False, Local('y'), 3]) )
+     >>> dis(c.code())
+          0 LOAD_FAST                0 (y)
+JUMP_IF_TRUE             4 (to 10)
+POP_TOP
+LOAD_CONST               1 (3)
+             >>   10 RETURN_VALUE
  Custom Code Generation
  ======================
- Line 625
+ Line 931
  Line 625
  Line 931
  As you can see, the ``Code.DUP_TOP()`` is called on the code instance, causing
  a ``DUP_TOP`` opcode to be output.  This is sometimes a handy trick for
  accessing values that are already on the stack.  More commonly, however, you'll
- want to implement more sophisticated callables, perhaps something like::
+ want to implement more sophisticated callables.
+ To make it easy to create diverse target types, a ``nodetype()`` decorator is
+ provided::
-     >>> from peak.util.assembler import ast_curry
+     >>> from peak.util.assembler import nodetype
+ It allows you to create code generation target types using functions.  Your
+ function should take one or more arguments, with a ``code=None`` optional
+ argument in the last position.  It should check whether ``code is None`` when
+ called, and if so, return a tuple of the preceding arguments.  If ``code``
+ is not ``None``, then it should do whatever code generating tasks are required.
+ For example::
      >>> def TryFinally(block1, block2, code=None):
      ...     if code is None:
-     ...         return ast_curry(TryFinally, block1, block2)
+     ...         return block1, block2
      ...     code(
      ...         Code.SETUP_FINALLY,
      ...             block1,
- Line 639
+ Line 955
  Line 639
  Line 955
      ...             block2,
      ...         Code.END_FINALLY
      ...     )
+     >>> TryFinally = nodetype()(TryFinally)
+ Note: although the nodetype() generator can be used above the function
+ definition in either Python 2.3 or 2.4, it cannot be done in a doctest under
+ Python 2.3, so this document doesn't attempt to demonstrate that.  Under
+.4, you would do something like this::
+     @nodetype()
+     def TryFinally(...):
+ and code that needs to also work under 2.3 should do something like this::
+     nodetype()
+     def TryFinally(...):
+ But to keep the examples here working with doctest, we'll be doing our
+ ``nodetype()`` calls after the end of the function definitions, e.g.::
      >>> def ExprStmt(value, code=None):
      ...     if code is None:
-     ...         return ast_curry(ExprStmt, value)
+     ...         return value,
      ...     code( value, Code.POP_TOP )
+     >>> ExprStmt = nodetype()(ExprStmt)
      >>> c = Code()
      >>> c( TryFinally(ExprStmt(1), ExprStmt(2)) )
- Line 657
+ Line 991
  Line 657
  Line 991
 POP_TOP
 END_FINALLY
- The ``ast_curry()`` utility function returns an ``instancemethod`` chain that
+ The ``nodetype()`` decorator is virtually identical to the ``struct()``
- binds the given arguments to the given function, creating a hashable and
+ decorator in the DecoratorTools package, except that it does not support
- comparable data structure -- a trivial sort of "AST node".  Just follow the
+ ``*args``, does not create a field for the ``code`` argument, and generates a
- code pattern above, using a ``code=None`` final argument, and returning a
+ ``__call__()`` method that reinvokes the wrapped function to do the actual
- curried version of the function if ``code is None``.  Otherwise, your function
+ code generation.
- should simply do whatever is needed to "generate" the arguments.
+ Among the benefits of this decorator are:
- (This is exactly the same pattern that ``peak.util.assembler`` uses internally
- to implement ``Const``, ``Call``, ``Local``, and other wrapper functions.)
+ * It gives your node types a great debugging format::
- The ``ast_curry()`` utility function isn't quite perfect; due to a quirk of the
+     >>> tf = TryFinally(ExprStmt(1), ExprStmt(2))
- ``instancemethod`` type, it can't save arguments whose value is ``None``: if
+     >>> tf
- you pass a ``None`` argument to ``ast_curry()``, it will be replaced with a
+     TryFinally(ExprStmt(1), ExprStmt(2))
- special ``nil`` object that tests as false, and generates a ``None`` constant
- when code is generated for it.  If your function accepts any arguments that
+ * It makes named fields accessible::
- might have a value of ``None``, you must correctly handle the cases where you
- receive a value of ``nil`` (found in ``peak.util.assembler``) instead of
+     >>> tf.block1
- ``None``.
+     ExprStmt(1)
- However, if you can use ``ast_curry()`` to generate your AST nodes, you will
+     >>> tf.block2
- have objects that are hashable and comparable by default, as long as none of
+     ExprStmt(2)
- your child nodes are unhashable or incomparable.  This can be useful for
- algorithms that require comparing AST subtrees, such as common subexpression
+ * Hashing and comparison work as expected (handy for algorithms that require
- elimination.
+   comparing or caching AST subtrees, such as common subexpression
+   elimination)::
+     >>> ExprStmt(1) == ExprStmt(1)
+     True
+     >>> ExprStmt(1) == ExprStmt(2)
+     False
+ Please see the `struct decorator documentation`_ for info on how to customize
+ node types further.
+ .. _struct decorator documentation: http://peak.telecommunity.com/DevCenter/DecoratorTools#the-struct-decorator
+ Note: hashing only works if all the values you return in your argument tuple
+ are hashable, so you should try to convert them if possible.  For example, if
+ an argument accepts any sequence, you should probably convert it to a tuple
+ before returning it.  Most of the examples in this document, and the node types
+ supplied by ``peak.util.assembler`` itself do this.
+ Constant Folding in Custom Targets
+ ----------------------------------
  If you want to incorporate constant-folding into your AST nodes, you can do
  so by checking for constant values and folding them at either construction
- or code generation time.  For example, this ``And`` node type folds constants
+ or code generation time.  For example, this ``And`` node type (a simpler
- during code generation, by not generating unnecessary branches when it can
+ version of the one included in ``peak.util.assembler``) folds constants during
+ code generation, by not generating unnecessary branches when it can
  prove which way a branch will go::
      >>> from peak.util.assembler import NotAConstant
      >>> def And(values, code=None):
      ...     if code is None:
-     ...         return ast_curry(And, tuple(values))
+     ...         return tuple(values),
      ...     end = Label()
      ...     for value in values[:-1]:
      ...         try:
      ...             if const_value(value):
      ...                 continue    # true constants can be skipped
-     ...             else:           # and false ones end the chain right away
-     ...                 return code(value, end)
      ...         except NotAConstant:    # but non-constants require code
      ...             code(value, end.JUMP_IF_FALSE, Code.POP_TOP)
+     ...         else:       # and false constants end the chain right away
+     ...             return code(value, end)
      ...     code(values[-1], end)
+     >>> And = nodetype()(And)
      >>> c = Code()
      >>> c.return_( And([1, 2]) )
- Line 725
+ Line 1083
  Line 725
  Line 1083
 LOAD_CONST               1 (False)
              >>   10 RETURN_VALUE
+ The above example only folds constants at code generation time, however.  You
+ can also do constant folding at AST construction time, using the
+ ``fold_args()`` function.  For example::
+     >>> from peak.util.assembler import fold_args
+     >>> def Getattr(ob, name, code=None):
+     ...     try:
+     ...         name = const_value(name)
+     ...     except NotAConstant:
+     ...         return Call(Const(getattr), [ob, name])
+     ...     if code is None:
+     ...         return fold_args(Getattr, ob, name)
+     ...     code(ob)
+     ...     code.LOAD_ATTR(name)
+     >>> Getattr = nodetype()(Getattr)
+     >>> const_value(Getattr(1, '__class__'))
+     <type 'int'>
+ The ``fold_args()`` function tries to evaluate the node immediately, if all of
+ its arguments are constants, by creating a temporary ``Code`` object, and
+ running the supplied function against it, then doing an ``eval()`` on the
+ generated code and wrapping the result in a ``Const``.  However, if any of the
+ arguments are non-constant, the original arguments (less the function) are
+ returned. This causes a normal node instance to be created instead of a
+ ``Const``.
+ This isn't a very *fast* way of doing partial evaluation, but it makes it
+ really easy to define new code generation targets without writing custom
+ constant-folding code for each one.  Just ``return fold_args(ThisType, *args)``
+ instead of ``return args``, if you want your node constructor to be able to do
+ eager evaluation.  If you need to, you can check your parameters in order to
+ decide whether to call ``fold_args()`` or not; this is in fact how ``Call``
+ implements its ``fold`` argument and the suppression of folding when
+ the call has no arguments.
+ (By the way, this same ``Getattr`` node type is also available
  Setting the Code's Calling Signature
  ====================================
- Line 758
+ Line 1155
  Line 758
  Line 1155
      >>> c1 = Code.from_function(f1, copy_lineno=True)
      >>> c1.co_firstlineno
+     >>> c1.co_filename is f1.func_code.co_filename
+     True
  If you create a ``Code`` instance from a function that has nested positional
  arguments, the returned code object will include a prologue to unpack the
- Line 865
+ Line 1264
  Line 865
  Line 1264
  stack_size
      The predicted height of the runtime value stack, as of the current opcode.
-     Its value is automatically updated by most opcodes, but you may want to
+     Its value is automatically updated by most opcodes, but if you are doing
-     save and restore it for things like try/finally blocks.  If you increase
+     something sufficiently tricky (as in the ``Switch`` demo, below) you may
-     the value of this attribute, you should also update the ``co_stacksize``
+     need to explicitly set it.
-     attribute if it is less than the new ``stack_size``.
+     The ``stack_size`` automatically becomes ``None`` after any unconditional
+     jump operations, such as ``JUMP_FORWARD``, ``BREAK_LOOP``, or
+     ``RETURN_VALUE``.  When the stack size is ``None``, the only operations
+     that can be performed are the resolving of forward references (which will
+     set the stack size to what it was when the reference was created), or
+     manually setting the stack size.
  co_freevars
-     A tuple of strings naming a function's "cell" variables.  Defaults to an
+     A tuple of strings naming a function's "free" variables.  Defaults to an
      empty tuple.  A function's free variables are the variables it "inherits"
      from its surrounding scope.  If you're going to use this, you should set
      it only once, before generating any code that references any free *or* cell
- Line 912
+ Line 1317
  Line 912
  Line 1317
  co_stacksize
      The maximum amount of stack space the code will require to run.  This
-     value is usually updated automatically as you generate code.  However, if
+     value is updated automatically as you generate code or change
-     you manually set a new ``stack_size`` that is larger than the current
+     the ``stack_size`` attribute.
-     ``co_stacksize``, you should increase the ``co_stacksize`` to match, so
-     that ``co_stacksize`` is always the largest stack size the code will
-     generate at runtime.
+ Stack Size Tracking and Dead Code Detection
+ ===========================================
+ ``Code`` objects automatically track the predicted stack size as code is
+ generated, by updating the ``stack_size`` attribute as each operation occurs.
+ A history is kept so that backward jumps can be checked to ensure that the
+ current stack height is the same as at the jump's target.  Similarly, when
+ forward jumps are resolved, the stack size at the jump target is checked
+ against the stack size at the jump's origin.  If there are multiple jumps to
+ the same location, they must all have the same stack size at the origin and
+ the destination.
+ In addition, whenever any unconditional jump code is generated (i.e.
+ ``JUMP_FORWARD``, ``BREAK_LOOP``, ``CONTINUE_LOOP``, ``JUMP_ABSOLUTE``, or
+ ``RETURN_VALUE``), the predicted ``stack_size`` is set to ``None``.  This
+ means that the ``Code`` object does not know what the stack size will be at
+ the current location.  You cannot issue *any* instructions when the predicted
+ stack size is ``None``, as you will receive an ``AssertionError``::
+     >>> c = Code()
+     >>> fwd = c.JUMP_FORWARD()
+     >>> print c.stack_size  # forward jump marks stack size as unknown
+     None
+     >>> c.LOAD_CONST(42)
+     Traceback (most recent call last):
+       ...
+     AssertionError: Unknown stack size at this location
+ Instead, you must resolve a forward reference (or define a previously-jumped to
+ label).  This will propagate the stack size at the source of the jump to the
+ current location, updating the stack size::
+     >>> fwd()
+     >>> c.stack_size
+ Note, by the way, that this means it is impossible for you to generate static
+ "dead code".  In other words, you cannot generate code that isn't reachable.
+ You should therefore check if ``stack_size`` is ``None`` before generating
+ code that might be unreachable.  For example, consider this ``If``
+ implementation::
+     >>> def If(cond, then, else_=Pass, code=None):
+     ...     if code is None:
+     ...         return cond, then, else_
+     ...     else_clause = Label()
+     ...     end_if = Label()
+     ...     code(cond, else_clause.JUMP_IF_FALSE, Code.POP_TOP, then)
+     ...     code(end_if.JUMP_FORWARD, else_clause, Code.POP_TOP, else_)
+     ...     code(end_if)
+     >>> If = nodetype()(If)
+ It works okay if there's no dead code::
+     >>> c = Code()
+     >>> c( If(23, 42, 55) )
+     >>> dis(c.code())   # Python 2.3 may peephole-optimize this code
+          0 LOAD_CONST               1 (23)
+JUMP_IF_FALSE            7 (to 13)
+POP_TOP
+LOAD_CONST               2 (42)
+JUMP_FORWARD             4 (to 17)
+             >>   13 POP_TOP
+LOAD_CONST               3 (55)
+ But it breaks if you end the "then" block with a return::
+     >>> c = Code()
+     >>> c( If(23, Return(42), 55) )
+     Traceback (most recent call last):
+       ...
+     AssertionError: Unknown stack size at this location
+ What we need is something like this instead::
+     >>> def If(cond, then, else_=Pass, code=None):
+     ...     if code is None:
+     ...         return cond, then, else_
+     ...     else_clause = Label()
+     ...     end_if = Label()
+     ...     code(cond, else_clause.JUMP_IF_FALSE, Code.POP_TOP, then)
+     ...     if code.stack_size is not None:
+     ...         end_if.JUMP_FORWARD(code)
+     ...     code(else_clause, Code.POP_TOP, else_, end_if)
+     >>> If = nodetype()(If)
+ As you can see, the dead code is now eliminated::
+     >>> c = Code()
+     >>> c( If(23, Return(42), 55) )
+     >>> dis(c.code())   # Python 2.3 may peephole-optimize this code
+          0 LOAD_CONST               1 (23)
+JUMP_IF_FALSE            5 (to 11)
+POP_TOP
+LOAD_CONST               2 (42)
+RETURN_VALUE
+             >>   11 POP_TOP
+LOAD_CONST               3 (55)
  Blocks, Loops, and Exception Handling
- Line 946
+ Line 1450
  Line 946
  Line 1450
      >>> c.code()
      <code object <lambda> ...>
- ``Code`` objects also check that the stack level as of a ``POP_BLOCK`` is the
- same as it was when the block was set up::
-     >>> c = Code()
-     >>> c.SETUP_LOOP()
-     >>> c.LOAD_CONST(23)
-     >>> c.POP_BLOCK()
-     Traceback (most recent call last):
-       ...
-     AssertionError: Stack level mismatch: actual=1 expected=0
  Exception Stack Size Adjustment
  -------------------------------
- When you ``POP_BLOCK`` for a ``SETUP_EXCEPT`` or ``SETUP_FINALLY``, the code's
+ When you issue a ``SETUP_EXCEPT`` or ``SETUP_FINALLY``, the code's maximum
- maximum stack size is raised to ensure that it's at least 3 items higher than
+ stack size is raised to ensure that it's at least 3 items higher than
  the current stack size.  That way, there will be room for the items that Python
  puts on the stack when jumping to a block's exception handling code::
      >>> c = Code()
      >>> c.SETUP_FINALLY()
      >>> c.stack_size, c.co_stacksize
-     (0, 0)
-     >>> c.POP_BLOCK()
-     >>> c.END_FINALLY()
-     >>> c.stack_size, c.co_stacksize
      (0, 3)
  As you can see, the current stack size is unchanged, but the maximum stack size
- Line 982
+ Line 1471
  Line 982
  Line 1471
      >>> c = Code()
      >>> c(1,2,3,4, *[Code.POP_TOP]*4)   # push 4 things, then pop 'em
      >>> c.SETUP_FINALLY()
-     >>> c.POP_BLOCK()
-     >>> c.END_FINALLY()
      >>> c.stack_size, c.co_stacksize
      (0, 4)
- Line 992
+ Line 1479
  Line 992
  Line 1479
      >>> c = Code()
      >>> c.SETUP_LOOP()
-     >>> break_to = c.POP_BLOCK()
      >>> c.stack_size, c.co_stacksize
      (0, 0)
- Line 1000
+ Line 1486
  Line 1000
  Line 1486
  Try/Except Blocks
  -----------------
- In the case of ``SETUP_EXCEPT``, the *current* stack size is also increased by
+ In the case of ``SETUP_EXCEPT``, the *current* stack size is increased by 3
-, because the code following the ``POP_BLOCK`` will be the exception handler
+ after a ``POP_BLOCK``, because the code that follows will be an exception
- and will thus always have exception items on the stack::
+ handler and will thus always have exception items on the stack::
      >>> c = Code()
      >>> c.SETUP_EXCEPT()
- Line 1059
+ Line 1545
  Line 1059
  Line 1545
              >>   10 LOAD_CONST               0 (None)
 RETURN_VALUE
- Labels have a ``POP_BLOCK`` attribute that you can pass in when generating
+ (Labels have a ``POP_BLOCK`` attribute that you can pass in when generating
- code.
+ code.)
+ And, for generating typical try/except blocks, you can use the ``TryExcept``
+ node type, which takes a body, a sequence of exception-type/handler pairs,
+ and an optional "else" clause::
+     >>> from peak.util.assembler import TryExcept
+     >>> c = Code()
+     >>> c.return_(
+     ...     TryExcept(
+     ...         Return(1),                                      # body
+     ...         [(Const(KeyError),2), (Const(TypeError),3)],    # handlers
+     ...         Return(4)                                       # else clause
+     ...     )
+     ... )
+     >>> dis(c.code())
+          0 SETUP_EXCEPT             8 (to 11)
+LOAD_CONST               1 (1)
+RETURN_VALUE
+POP_BLOCK
+JUMP_FORWARD            43 (to 54)
+             >>   11 DUP_TOP
+LOAD_CONST               2 (<type 'exceptions.KeyError'>)
+COMPARE_OP              10 (exception match)
+JUMP_IF_FALSE           10 (to 31)
+POP_TOP
+POP_TOP
+POP_TOP
+POP_TOP
+LOAD_CONST               3 (2)
+JUMP_FORWARD            27 (to 58)
+             >>   31 POP_TOP
+DUP_TOP
+LOAD_CONST               4 (<type 'exceptions.TypeError'>)
+COMPARE_OP              10 (exception match)
+JUMP_IF_FALSE           10 (to 52)
+POP_TOP
+POP_TOP
+POP_TOP
+POP_TOP
+LOAD_CONST               5 (3)
+JUMP_FORWARD             6 (to 58)
+             >>   52 POP_TOP
+END_FINALLY
+             >>   54 LOAD_CONST               6 (4)
+RETURN_VALUE
+             >>   58 RETURN_VALUE
  Try/Finally Blocks
- Line 1097
+ Line 1630
  Line 1097
  Line 1630
  adjusts the maximum expected stack size to accomodate up to three values being
  put on the stack by the Python interpreter for exception handling.
+ For your convenience, the ``TryFinally`` node type can also be used to generate
+ try/finally blocks::
+     >>> from peak.util.assembler import TryFinally
+     >>> c = Code()
+     >>> c( TryFinally(ExprStmt(1), ExprStmt(2)) )
+     >>> dis(c.code())
+          0 SETUP_FINALLY            8 (to 11)
+LOAD_CONST               1 (1)
+POP_TOP
+POP_BLOCK
+LOAD_CONST               0 (None)
+             >>   11 LOAD_CONST               2 (2)
+POP_TOP
+END_FINALLY
  Loops
  -----
- Line 1168
+ Line 1717
  Line 1168
  Line 1717
  And ``CONTINUE_LOOP`` is automatically replaced with a ``JUMP_ABSOLUTE`` if
  it occurs directly inside a loop block::
+     >>> c.LOAD_CONST(57)
      >>> c.SETUP_LOOP()
+     >>> fwd = c.JUMP_IF_TRUE()
      >>> c.CONTINUE_LOOP(c.here())
+     >>> fwd()
      >>> c.BREAK_LOOP()
      >>> c.POP_BLOCK()()
      >>> dis(c.code())
-          0 SETUP_LOOP               5 (to 8)
+          0 LOAD_CONST               1 (57)
-             >>    3 JUMP_ABSOLUTE            3
+SETUP_LOOP               8 (to 14)
-BREAK_LOOP
+JUMP_IF_TRUE             3 (to 12)
-POP_BLOCK
+             >>    9 JUMP_ABSOLUTE            9
+             >>   12 BREAK_LOOP
+POP_BLOCK
  In other words, ``CONTINUE_LOOP`` only really emits a ``CONTINUE_LOOP`` opcode
  if it's inside some other kind of block within the loop, e.g. a "try" clause::
      >>> c = Code()
+     >>> c.LOAD_CONST(57)
      >>> c.SETUP_LOOP()
      >>> loop = c.here()
      >>> c.SETUP_FINALLY()
+     >>> fwd = c.JUMP_IF_TRUE()
      >>> c.CONTINUE_LOOP(loop)
+     >>> fwd()
      >>> c.POP_BLOCK()
      >>> c.END_FINALLY()
      >>> c.POP_BLOCK()()
      >>> dis(c.code())
-          0 SETUP_LOOP              12 (to 15)
+          0 LOAD_CONST               1 (57)
-             >>    3 SETUP_FINALLY            7 (to 13)
+SETUP_LOOP              15 (to 21)
-CONTINUE_LOOP            3
+             >>    6 SETUP_FINALLY           10 (to 19)
-POP_BLOCK
+JUMP_IF_TRUE             3 (to 15)
-LOAD_CONST               0 (None)
+CONTINUE_LOOP            6
-             >>   13 END_FINALLY
+             >>   15 POP_BLOCK
-POP_BLOCK
+LOAD_CONST               0 (None)
+             >>   19 END_FINALLY
+POP_BLOCK
  ----------------------
- Line 1244
+ Line 1801
  Line 1244
  Line 1801
  Stack size tracking::
-     >>> c = Code()
+     >>> c = Code()          # 0
-     >>> c.LOAD_CONST(1)
+     >>> c.LOAD_CONST(1)     # 1
-     >>> c.POP_TOP()
+     >>> c.POP_TOP()         # 0
-     >>> c.LOAD_CONST(2)
+     >>> c.LOAD_CONST(2)     # 1
-     >>> c.LOAD_CONST(3)
+     >>> c.LOAD_CONST(3)     # 2
      >>> c.co_stacksize
-     >>> c.BINARY_ADD()
+     >>> c.stack_history
-     >>> c.LOAD_CONST(4)
+     [0, ..., 1, 0, ..., 1]
+     >>> c.BINARY_ADD()      # 1
+     >>> c.LOAD_CONST(4)     # 2
      >>> c.co_stacksize
+     >>> c.stack_history
+     [0, ..., 1, 0, 1, ..., 2, ..., 1]
      >>> c.LOAD_CONST(5)
      >>> c.LOAD_CONST(6)
      >>> c.co_stacksize
- Line 1290
+ Line 1851
  Line 1290
  Line 1851
 LOAD_ATTR                1 (bar)
 DELETE_FAST              0 (baz)
+ Stack tracking on jumps::
+     >>> c = Code()
+     >>> else_ = Label()
+     >>> end = Label()
+     >>> c(99, else_.JUMP_IF_TRUE, Code.POP_TOP, end.JUMP_FORWARD)
+     >>> c(else_, Code.POP_TOP, end)
+     >>> dis(c.code())
+          0 LOAD_CONST               1 (99)
+JUMP_IF_TRUE             4 (to 10)
+POP_TOP
+JUMP_FORWARD             1 (to 11)
+             >>   10 POP_TOP
+     >>> c.stack_size
+     >>> c.stack_history
+     [0, 1, 1, 1, 1, 1, 1, 0, None, None, 1]
+     >>> c = Code()
+     >>> fwd = c.JUMP_FORWARD()
+     >>> c.LOAD_CONST(42)    # forward jump marks stack size unknown
+     Traceback (most recent call last):
+       ...
+     AssertionError: Unknown stack size at this location
+     >>> c.stack_size = 0
+     >>> c.LOAD_CONST(42)
+     >>> fwd()
+     Traceback (most recent call last):
+       ...
+     AssertionError: Stack level mismatch: actual=1 expected=0
  Sequence operators and stack tracking:
- Line 1526
+ Line 2125
  Line 1526
  Line 2125
 STORE_FAST               4 (a)
 STORE_FAST               5 (b)
- Constant folding for *args and **kw::
+ Constant folding for ``*args`` and ``**kw``::
      >>> c = Code()
      >>> c.return_(Call(Const(type), [], [], (1,)))
- Line 1542
+ Line 2141
  Line 1542
  Line 2141
 RETURN_VALUE
  Demo: "Computed Goto"/"Switch Statement"
  ========================================
- Line 1551
+ Line 2149
  Line 1551
  Line 2149
      >>> from peak.util.assembler import LOAD_CONST, POP_BLOCK
-     >>> def Pass(code=None):
-     ...     if code is None:
-     ...         return Pass
-     >>> def NewConst(value, code=None):
-     ...     if code is None:
-     ...         return ast_curry(NewConst, value)
-     ...     code.emit_arg(LOAD_CONST, len(code.co_consts))
-     ...     code.co_consts.append(value)
-     ...     code.stackchange((0,1))
      >>> import sys
      >>> WHY_CONTINUE = {'2.3':5, '2.4':32, '2.5':32}[sys.version[:3]]
      >>> def Switch(expr, cases, default=Pass, code=None):
      ...     if code is None:
-     ...         return ast_curry(Switch, expr, tuple(cases), default)
+     ...         return expr, tuple(cases), default
      ...
      ...     d = {}
      ...     else_block  = Label()
- Line 1576
+ Line 2163
  Line 1576
  Line 2163
      ...
      ...     code(
      ...         end_switch.SETUP_LOOP,
-     ...             Call(NewConst(d.get), [expr]),
+     ...             Call(Const(d.get), [expr]),
      ...         else_block.JUMP_IF_FALSE,
      ...             WHY_CONTINUE, Code.END_FINALLY
      ...     )
      ...
-     ...     cursize = code.stack_size
+     ...     cursize = code.stack_size - 1   # adjust for removed WHY_CONTINUE
      ...     for key, value in cases:
      ...         d[const_value(key)] = code.here()
-     ...         code(value, cleanup.JUMP_FORWARD)
+     ...         code.stack_size = cursize
+     ...         code(value)
+     ...         if code.stack_size is not None: # if the code can fall through,
+     ...             code(cleanup.JUMP_FORWARD)  # jump forward to the cleanup
      ...
      ...     code(
      ...         else_block,
- Line 1593
+ Line 2183
  Line 1593
  Line 2183
      ...             Code.POP_BLOCK,
      ...         end_switch
      ...     )
+     >>> Switch = nodetype()(Switch)
      >>> c = Code()
      >>> c.co_argcount=1
- Line 1608
+ Line 2199
  Line 1608
  Line 2199
      >>> dis(c.code())
-          0 SETUP_LOOP              36 (to 39)
+          0 SETUP_LOOP              30 (to 33)
 LOAD_CONST               1 (<...method get of dict...>)
 LOAD_FAST                0 (x)
 CALL_FUNCTION            1
-JUMP_IF_FALSE           18 (to 33)
+JUMP_IF_FALSE           12 (to 27)
 LOAD_CONST               2 (...)
 END_FINALLY
 LOAD_CONST               3 (42)
 RETURN_VALUE
-JUMP_FORWARD            12 (to 38)
+LOAD_CONST               4 ('foo')
-LOAD_CONST               4 ('foo')
+RETURN_VALUE
-RETURN_VALUE
+             >>   27 POP_TOP
-JUMP_FORWARD             5 (to 38)
+LOAD_CONST               5 (27)
-             >>   33 POP_TOP
+RETURN_VALUE
-LOAD_CONST               5 (27)
+POP_BLOCK
-RETURN_VALUE
+             >>   33 LOAD_CONST               0 (None)
-             >>   38 POP_BLOCK
+RETURN_VALUE
-             >>   39 LOAD_CONST               0 (None)
-RETURN_VALUE
  TODO
  ====
- * AST introspection
-     * ast_type(node): called function, Const, or node.__class__
-       * tuples are Const if their contents are; no other types are Const
-     * ast_children(node): tuple of argument values for curried types, const value,
-       or empty tuple.  If node is a tuple, the value must be flattened.
-     * is_const(node): ast_type(node) is Const
- * Inline builtins (getattr, operator.getitem, etc.) to opcodes
-     * Getattr/Op/Unary("symbol", arg1 [, arg2]) node types -> Call() if folding
-     * Call() translates functions back to Ops if inlining
- * Pretty printing and short-naming of ASTs
  * Test NAME vs. FAST operators flag checks/sets
  * Test code flags generation/cloning
+ * Exhaustive tests of all opcodes' stack history effects
+ * YIELD_EXPR should set CO_GENERATOR; stack effects depend on Python version

-Generate output suitable for use with a patch program
+Legend:



Removed from v.2194
 


changed lines


 
Added in v.2457
 Legend:



Removed from v.2194
 


changed lines


 
Added in v.2457
-Removed from v.2194
+Added in v.2457

cvs-admin@eby-sarna.com

Diff of /BytecodeAssembler/README.txt

ViewCVS and CVS Help