Nested generator expression - unexpected result

Problem description

Here's the test code:

units = [1, 2]
tens = [10, 20]
nums = (a + b for a in units for b in tens)
units = [3, 4]
tens = [30, 40]
[x for x in nums]

Under the assumption that the generator expression on line 3 (nums = ...) forms an iterator I would expect the final result to reflect the final assigned values for units and tens. OTOH, if that generator expression were to be evaluated at line 3, producing the result tuple, then I'd expect the first definitions of units and tens to be used.

What I see is a MIX; i.e., the result is [31, 41, 32, 42]!?

Can anyone explain this behavior?

Solution

A generator expression creates a function of sorts; one with just one argument, the outermost iterable.

Here that's units, and that is bound as an argument to the generator expression when the generator expression is created.

All other names are either locals (such as a and b), globals, or closures. tens is looked up as a global, so it is looked up each time you advance the generator.

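As a result, units is bound to the generator on line 3, while tens is looked up only when you iterate over the generator expression on the last line. So a runs over the original [1, 2], b runs over the new [30, 40], and you get [31, 41, 32, 42].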
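To make the timing visible, here is a minimal sketch that logs when each iterable expression is evaluated. The names binding_order_demo and source are made up for this illustration and are not part of the question's code. Because everything lives inside a function, source is a closure variable rather than a global, but the timing is the same.

```python
def binding_order_demo():
    """Record when each iterable expression of a genexp is evaluated."""
    log = []

    def source(label, values):
        # Hypothetical helper: notes the moment it is called, then returns values.
        log.append("evaluated " + label)
        return values

    nums = (a + b
            for a in source("outer", [1, 2])
            for b in source("inner", [10, 20]))
    log.append("generator created")

    result = list(nums)  # the inner iterable is evaluated only during iteration
    return log, result


log, result = binding_order_demo()
print(log)
print(result)
```

The outer iterable is logged before "generator created", while the inner one is logged afterwards, once per item of the outer loop.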

You can see this when compiling the generator to bytecode and inspecting that bytecode:

>>> import dis
>>> genexp_bytecode = compile('(a + b for a in units for b in tens)', '<file>', 'single')
>>> dis.dis(genexp_bytecode)
  1           0 LOAD_CONST               0 (<code object <genexpr> at 0x10f013ae0, file "<file>", line 1>)
              3 LOAD_CONST               1 ('<genexpr>')
              6 MAKE_FUNCTION            0
              9 LOAD_NAME                0 (units)
             12 GET_ITER
             13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             16 PRINT_EXPR
             17 LOAD_CONST               2 (None)
             20 RETURN_VALUE

The MAKE_FUNCTION bytecode turns the generator expression code object into a function, which is called immediately with iter(units) as its argument. The tens name is not referenced at all here.
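If you would rather not read raw bytecode (its exact layout changes between Python versions), the generator's code object tells the same story. This is a rough sketch that assumes CPython; the namespace dict is only a stand-in for the module globals so the example stays self-contained.

```python
# Stand-in for module globals (an assumption made for this illustration).
namespace = {"units": [1, 2], "tens": [10, 20]}
nums = eval("(a + b for a in units for b in tens)", namespace)

code = nums.gi_code
print(code.co_varnames)  # locals of the hidden function; the first is the implicit argument
print(code.co_names)     # names the hidden function looks up while it runs

# Rebind both names before iterating; only `tens` can still be affected.
namespace["units"] = [3, 4]
namespace["tens"] = [30, 40]
result = list(nums)
print(result)
```

In CPython the implicit argument shows up under the name '.0'. tens appears among the names looked up at run time, while units does not appear in the generator's code object at all.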

This is documented in the original generator expressions PEP (PEP 289):

Only the outermost for-expression is evaluated immediately, the other expressions are deferred until the generator is run:

g = (tgtexp  for var1 in exp1 if exp2 for var2 in exp3 if exp4)

is equivalent to:

def __gen(bound_exp):
    for var1 in bound_exp:
        if exp2:
            for var2 in exp3:
                if exp4:
                    yield tgtexp
g = __gen(iter(exp1))
del __gen
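Applying that translation by hand to the code from the question gives something like the sketch below. The wrapper function desugared_demo is only there to keep the example self-contained (so tens is a closure variable here instead of a global); the body follows the PEP's template.

```python
def desugared_demo():
    units = [1, 2]
    tens = [10, 20]

    # Hand-written stand-in for: nums = (a + b for a in units for b in tens)
    def __gen(bound_exp):
        for a in bound_exp:
            for b in tens:  # `tens` is read each time this inner loop starts
                yield a + b

    nums = __gen(iter(units))  # `units` is evaluated right here
    del __gen

    units = [3, 4]   # too late: nums already holds an iterator over [1, 2]
    tens = [30, 40]  # still in time: the inner loop has not started yet
    return list(nums)


result = desugared_demo()
print(result)  # [31, 41, 32, 42]
```

Written this way, the mixed result is no longer surprising: the argument was evaluated at the call, and the body has not run yet when the names are rebound.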

and in the generator expressions reference:

Variables used in the generator expression are evaluated lazily when the __next__() method is called for generator object (in the same fashion as normal generators). However, the leftmost for clause is immediately evaluated, so that an error produced by it can be seen before any other possible error in the code that handles the generator expression. Subsequent for clauses cannot be evaluated immediately since they may depend on the previous for loop. For example: (x*y for x in range(10) for y in bar(x)).
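The point about error timing can be checked with a small sketch. failing_source is a hypothetical helper that stands in for any iterable expression that raises when it is evaluated.

```python
def error_timing_demo():
    events = []

    def failing_source():
        # Hypothetical helper: any iterable expression that raises when evaluated.
        raise ValueError("iterable expression failed")

    # Leftmost for clause: the error surfaces while the genexp is being created.
    try:
        gen = (x for x in failing_source())
    except ValueError:
        events.append("outer: raised at creation")

    # Later for clause: creation succeeds, the error waits for the first next().
    gen = (x * y for x in range(3) for y in failing_source())
    events.append("inner: created without error")
    try:
        next(gen)
    except ValueError:
        events.append("inner: raised on first next()")

    return events


events = error_timing_demo()
for event in events:
    print(event)
```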

The PEP has an excellent section motivating why names (other than the outermost iterable) are bound late; see Early Binding vs. Late Binding.
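If what you actually wanted was the first definitions of both units and tens, one possible workaround (a sketch, not something the PEP prescribes) is to route every value through the outermost iterable, since that is the only part evaluated up front. The function name early_binding_demo is made up for this example.

```python
def early_binding_demo():
    units = [1, 2]
    tens = [10, 20]

    # Both lists sit inside the outermost iterable, so both are captured now.
    nums = (a + b for u, t in [(units, tens)] for a in u for b in t)

    units = [3, 4]
    tens = [30, 40]
    return list(nums)


result = early_binding_demo()
print(result)  # [11, 21, 12, 22]
```

Note that this captures references to the list objects, not copies, so mutating the original lists in place before iterating would still show through. Calling list(...) on the generator expression right away is the simplest alternative when laziness is not needed.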
