18 | March | 2019 | FreeSandal

Rock It 《ML》 JupyterLab 【D】 Code 《7》 Semantics 【6】 Interpretation ‧ 3 (Part 1)

What, then, is a "code object"?

Code Objects

Code objects are a low-level detail of the CPython implementation. Each one represents a chunk of executable code that hasn’t yet been bound into a function.

PyCodeObject
The C structure of the objects used to describe code objects. The fields of this type are subject to change at any time.
PyTypeObject PyCode_Type
This is an instance of PyTypeObject representing the Python code type.
int PyCode_Check(PyObject *co)
Return true if co is a code object.
int PyCode_GetNumFree(PyCodeObject *co)
Return the number of free variables in co.
PyCodeObject* PyCode_New(int argcount, int kwonlyargcount, int nlocals, int stacksize, int flags, PyObject *code, PyObject *consts, PyObject *names, PyObject *varnames, PyObject *freevars, PyObject *cellvars, PyObject *filename, PyObject *name, int firstlineno, PyObject *lnotab)
Return a new code object. If you need a dummy code object to create a frame, use PyCode_NewEmpty() instead. Calling PyCode_New() directly can bind you to a precise Python version since the definition of the bytecode changes often.
PyCodeObject* PyCode_NewEmpty(const char *filename, const char *funcname, int firstlineno)
Return a new empty code object with the specified filename, function name, and first line number. It is illegal to exec() or eval() the resulting code object.
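The same idea can be seen from the Python side without touching the C API. The following is a minimal sketch, assuming only the built-ins compile() and exec() and the standard types module; the source string and the name double are made up for illustration.

```python
import types

# Hypothetical source text, compiled but not yet executed.
source = "def double(y):\n    return y * 2\n"
module_code = compile(source, "<sketch>", "exec")

# compile() hands back a code object: executable, but not bound to a function.
print(type(module_code).__name__)

# The body of `double` is itself a code object, stored among the
# constants of the enclosing module-level code object.
inner = next(c for c in module_code.co_consts if isinstance(c, types.CodeType))
print(inner.co_name, inner.co_argcount)

# Executing the module code runs the `def` statement, which binds the
# inner code object into a function object.
namespace = {}
exec(module_code, namespace)
double = namespace["double"]
print(double(21))
```

Only after the exec() call does a callable function exist; before that, inner is just the "chunk of executable code" the quoted text describes.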

Reading this text feels like falling into a fog!

With only a glimmer of light, shapes are hard to make out ☻

29.12. inspect — Inspect live objects

Source code: Lib/inspect.py


The inspect module provides several useful functions to help get information about live objects such as modules, classes, methods, functions, tracebacks, frame objects, and code objects. For example, it can help you examine the contents of a class, retrieve the source code of a method, extract and format the argument list for a function, or get all the information you need to display a detailed traceback.

There are four main kinds of services provided by this module: type checking, getting source code, inspecting classes and functions, and examining the interpreter stack.
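Three of these four services can be tried in a few lines. This is only a sketch: the functions area and who_am_i are hypothetical, and source retrieval (inspect.getsource) is left out because it requires the source file to be available on disk.

```python
import inspect

def area(width, height=1.0, *, unit="m"):
    """Hypothetical example function."""
    return f"{width * height} {unit}^2"

# Type checking: is this a function, and is its __code__ a code object?
is_func = inspect.isfunction(area)
is_code = inspect.iscode(area.__code__)
print(is_func, is_code)

# Inspecting functions: extract and format the argument list.
signature = inspect.signature(area)
print(signature)
for name, parameter in signature.parameters.items():
    print(name, parameter.kind, parameter.default)

# Examining the interpreter stack: the current frame knows which
# code object it is executing.
def who_am_i():
    frame = inspect.currentframe()
    return frame.f_code.co_name

print(who_am_i())
```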

……

Type  Attribute          Description
code  co_argcount        number of arguments (not including keyword-only arguments, * or ** args)
      co_code            string of raw compiled bytecode
      co_cellvars        tuple of names of cell variables (referenced by containing scopes)
      co_consts          tuple of constants used in the bytecode
      co_filename        name of file in which this code object was created
      co_firstlineno     number of first line in Python source code
      co_flags           bitmap of CO_* flags
      co_lnotab          encoded mapping of line numbers to bytecode indices
      co_freevars        tuple of names of free variables (referenced via a function's closure)
      co_kwonlyargcount  number of keyword-only arguments (not including ** arg)
      co_name            name with which this code object was defined
      co_names           tuple of names other than arguments and function locals
      co_nlocals         number of local variables
      co_stacksize       virtual machine stack space required
      co_varnames        tuple of names of arguments and local variables
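To make the attribute list less abstract, here is a small sketch that reads some of these attributes from a function's code object. The function sample is hypothetical, and co_lnotab is deliberately not touched, since newer Python versions deprecate it.

```python
import inspect

def sample(a, b=2, *args, key=None, **kwargs):
    total = a + b
    return total, len(args), key

code = sample.__code__

print(code.co_name)            # name the code object was defined with
print(code.co_argcount)        # positional parameters: a, b
print(code.co_kwonlyargcount)  # keyword-only parameters: key
print(code.co_varnames)        # parameters first, then other locals
print(code.co_names)           # non-local names used, such as the global len
print(code.co_nlocals)         # number of entries in co_varnames

# co_flags is a bitmap; the CO_* constants are exposed by inspect.
has_varargs = bool(code.co_flags & inspect.CO_VARARGS)
has_varkw = bool(code.co_flags & inspect.CO_VARKEYWORDS)
print(has_varargs, has_varkw)
```

Note that len shows up in co_names rather than co_varnames, because it is looked up by name at run time instead of living in a local slot.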

 

Like climbing a mountain without a map!

 

Who knows where we are, deep in the clouds?

 

So it is best to set out with a guide ☆

Peter Goldsborough

Disassembling Python Bytecode

In Python, the dis module allows disassembly of Python code into the individual instructions executed by the Python interpreter (usually CPython) for each line. Passing a module, function or other piece of code to the dis.dis function prints a human-readable representation of the underlying, disassembled bytecode. This is useful for analyzing and hand-tuning tight loops or performing other kinds of necessary, fine-grained optimizations.

Basic Usage

The main function you will interact with when disassembling Python code is dis.dis. It takes either a function, method, class, module, code string, generator or byte sequence of raw bytecode and prints the disassembly of that code object to stdout (if no explicit file argument is specified). In the case of a class, it will disassemble each method (including static and class methods). For a module, it disassembles all functions in that module.

Let’s see this in practice. Take the following code:

import dis

class Foo(object):
  def __init__(self):
    pass

  def foo(self, x):
    return x + 1

def bar():
  x = 5
  y = 7
  z = x + y
  return z

def main():
  dis.dis(bar) # disassembles `bar`
  dis.dis(Foo) # disassembles each method in `Foo`

if __name__ == "__main__":
  main() # without this call nothing is disassembled

This will print:

14           0 LOAD_CONST               1 (5)
             3 STORE_FAST               0 (x)

15           6 LOAD_CONST               2 (7)
             9 STORE_FAST               1 (y)

16          12 LOAD_FAST                0 (x)
            15 LOAD_FAST                1 (y)
            18 BINARY_ADD
            19 STORE_FAST               2 (z)

17          22 LOAD_FAST                2 (z)
            25 RETURN_VALUE

Disassembly of __init__:
 8           0 LOAD_CONST               0 (None)
             3 RETURN_VALUE

Disassembly of foo:
11           0 LOAD_FAST                1 (x)
             3 LOAD_CONST               1 (1)
             6 BINARY_ADD
             7 RETURN_VALUE

Also, we can disassemble an entire module from the command line using python -m dis module_file.py. Either way, at this point, we should probably discuss the format of the disassembly output. The columns returned are the following:

  1. The original line of code the disassembly is referencing.
  2. The address of the bytecode instruction.
  3. The name of the instruction.
  4. The index of the argument in the code block’s name and constant table.
  5. The human-friendly mapping from the argument index (4) to the actual value or name being referenced.
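The same columns are also available as structured data. The sketch below assumes Python 3.4 or later, where dis.get_instructions exists; it collects columns 2 to 5 for the bar function from above and leaves the line-number column out, because the attribute carrying it has changed between Python versions.

```python
import dis

def bar():
    x = 5
    y = 7
    z = x + y
    return z

rows = []
for ins in dis.get_instructions(bar):
    # offset -> column 2, opname -> column 3, arg -> column 4, argrepr -> column 5
    rows.append((ins.offset, ins.opname, ins.arg, ins.argrepr))

for offset, opname, arg, argrepr in rows:
    arg_text = "" if arg is None else str(arg)
    print(f"{offset:>4} {opname:<20} {arg_text:>4} {argrepr}")
```

The exact opcode names and offsets printed depend on the interpreter version, so the output will generally not match the listing above byte for byte.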

For (4), it is important to understand that all code objects in Python, that is, isolated code blocks like functions, have internal name and constant tables. These tables are simply sequences: the constant table holds constants such as string literals, numbers or special values such as None that appear at least once in the code block, while the name table holds variable names. These names, in turn, serve as keys into a dictionary mapping such symbols to actual values. Instruction arguments are indices into these tables, rather than the values stored in them, so that arguments can have a uniform length (always two bytes in the bytecode format shown here). As you can imagine, storing variable-length strings directly in the bytecode would make advancing a program counter a great deal more complex.
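The table lookup can be reproduced by hand. In this sketch, which assumes dis.get_instructions is available and uses a hypothetical function greet, each LOAD_CONST argument is used as an index into co_consts and compared with the value the dis module resolved on its own.

```python
import dis

def greet():
    greeting = "hello"
    target = "world"
    return greeting + ", " + target

code = greet.__code__
print(code.co_consts)    # the constant table
print(code.co_varnames)  # the table of local variable names

lookups = []
for ins in dis.get_instructions(greet):
    if ins.opname == "LOAD_CONST":
        # ins.arg is only an index; the value lives in the constant table.
        lookups.append((ins.arg, code.co_consts[ins.arg], ins.argval))

for index, from_table, from_dis in lookups:
    print(index, repr(from_table), repr(from_dis))
```

String constants are used here on purpose: how small integers are stored can differ between interpreter versions, which would make the example less portable.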

……

Interpreting Bytecode

Disassembled bytecode instructions are already quite low-level (a.k.a. cool). However, we can go even deeper and understand the bytecode itself, i.e. the binary or hexadecimal representation of the instructions in compiled and assembled bytecode. For this, let’s define a function and mess a little more with its __code__ attribute:

def function():
  x = 5
  l = [1, 2]
  return len(l) + x

Through function.__code__ we can gain access to the code object associated with the function. Furthermore, function.__code__.co_code returns the actual bytecode:

In [1]: function.__code__.co_code
Out[1]: b'd\x01\x00}\x00\x00d\x02\x00d\x03\x00g\x02\x00}\x01\x00t\x00\x00|\x01\x00\x83\x01\x00|\x00\x00\x17S'

Yes! Bytes! Just what I like for breakfast. But what can we actually make of these delicious bites of bytecode? Well, we know that these bytes specify instructions, some taking arguments and some not. Each opcode occupies a single byte, and arguments (such as the indices into the name and constant tables) occupy further bytes. Furthermore, fortunately enough, the dis module (as well as the opcode module) provides an opname table and an opmap map. The former is a simple list, laid out such that indexing it with the opcode of an instruction returns the name (mnemonic) of that instruction. The latter, dis.opmap, maps instruction mnemonics to their bytecode numbers:

In [1]: dis.opname[69]
Out[1]: 'GET_YIELD_FROM_ITER'

In [2]: dis.opmap['LOAD_CONST']
Out[2]: 100
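Because opcode numbers are reassigned between CPython releases, the values 69 and 100 above belong to the interpreter used in that session. A version-independent way to see that the two tables are inverses of each other is the round trip sketched here (the two mnemonics are picked only as examples):

```python
import dis

round_trips = {}
for mnemonic in ("LOAD_CONST", "RETURN_VALUE"):
    number = dis.opmap[mnemonic]         # mnemonic -> opcode number
    round_trips[mnemonic] = dis.opname[number]  # opcode number -> mnemonic
    print(mnemonic, number, round_trips[mnemonic])
```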

So, if we know the byte value describing a certain instruction, we now know how to get the instruction name. All that’s left is interpreting the arguments of these instructions. For this we need to know whether or not the instruction takes arguments in the first place. To get this information, we can make use of dis.hasconst, dis.hasname, dis.hasjrel, dis.hasjabs and others. Each of these is a list in the dis module containing the opcodes that take a constant argument, a name argument, a relative or absolute jump target, or some other kind of parameter. For example, dis.hasnargs is also such a list, containing all opcodes related to function calls, such as CALL_FUNCTION, CALL_FUNCTION_VAR (for functions taking *args) or CALL_FUNCTION_KW (for functions taking **kwargs). It is noteworthy that, in the bytecode format shown here (CPython 3.5 and earlier), an instruction that takes arguments at all takes exactly one, occupying 16 bits (two bytes); from CPython 3.6 on, every instruction is two bytes long and carries a one-byte argument.
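Putting the pieces together, raw bytecode can be decoded by hand. The sketch below is not the three-byte layout of the session above: it assumes CPython 3.6 or later, where every instruction is one opcode byte followed by one argument byte, and it ignores EXTENDED_ARG prefixes, which a function this small does not need. The constant dis.HAVE_ARGUMENT marks the boundary between opcodes that ignore their argument and those that use it.

```python
import dis

def function():
    x = 5
    l = [1, 2]
    return len(l) + x

raw = function.__code__.co_code

decoded = {}
for offset in range(0, len(raw), 2):
    opcode = raw[offset]
    # Only opcodes at or above dis.HAVE_ARGUMENT use their argument byte.
    arg = raw[offset + 1] if opcode >= dis.HAVE_ARGUMENT else None
    decoded[offset] = (dis.opname[opcode], arg)

for offset, (name, arg) in decoded.items():
    print(offset, name, arg)
```

On recent interpreters some of the printed slots are inline cache entries (shown as CACHE), which dis.dis hides by default; at every offset that dis does report, the hand-decoded mnemonic should agree with it.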

 

 

 

 

 

 

 

 
