Nested generator expression - unexpected result
Problem description
Here's the test code:
units = [1, 2]
tens = [10, 20]
nums = (a + b for a in units for b in tens)
units = [3, 4]
tens = [30, 40]
[x for x in nums]
Under the assumption that the generator expression on line 3 (nums = ...) forms an iterator, I would expect the final result to reflect the final assigned values for units and tens. OTOH, if that generator expression were to be evaluated at line 3, producing the result tuple, then I'd expect the first definitions of units and tens to be used.
What I see is a MIX; i.e., the result is [31, 41, 32, 42]!?
Can anyone explain this behavior?
Solution
A generator expression creates a function of sorts; one with just one argument, the outermost iterable.
Here that's units, and it is bound as an argument to the generator expression when the generator expression is created.
All other names are either locals (such as a and b), globals, or closures. tens is looked up as a global, so it is looked up each time the inner loop starts while you advance the generator.
As a result, units is bound to the generator on line 3, while tens is looked up only when you iterate over the generator expression on the last line.
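If the "first definitions" result is what you want, one option is to bind tens at creation time as well. The following is a minimal sketch, assuming a hypothetical helper named make_nums (not part of the question) that receives both lists as arguments, so the generator body reads the helper's local tens rather than the global that is rebound later:

```python
def make_nums(units, tens):
    # Hypothetical helper: both lists are bound as arguments at call time,
    # so the inner loop reads this local `tens` (via a closure), not the
    # module-level name that gets rebound further down.
    return (a + b for a in units for b in tens)

units = [1, 2]
tens = [10, 20]
late = (a + b for a in units for b in tens)   # `tens` is looked up lazily
early = make_nums(units, tens)                # `tens` is captured here

units = [3, 4]
tens = [30, 40]

late_result = list(late)
early_result = list(early)
print(late_result)   # [31, 41, 32, 42]
print(early_result)  # [11, 21, 12, 22]
```

This only guards against the names being rebound. Mutating either original list in place before iterating would still be visible to both generators.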
You can see this when compiling the generator to bytecode and inspecting that bytecode:
>>> import dis
>>> genexp_bytecode = compile('(a + b for a in units for b in tens)', '<file>', 'single')
>>> dis.dis(genexp_bytecode)
  1           0 LOAD_CONST               0 (<code object <genexpr> at 0x10f013ae0, file "<file>", line 1>)
              3 LOAD_CONST               1 ('<genexpr>')
              6 MAKE_FUNCTION            0
              9 LOAD_NAME                0 (units)
             12 GET_ITER
             13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             16 PRINT_EXPR
             17 LOAD_CONST               2 (None)
             20 RETURN_VALUE
The MAKE_FUNCTION bytecode turned the generator expression code object into a function, and it is called immediately, passing in iter(units) as the argument. The tens name is not referenced at all here.
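The exact opcodes in the disassembly above differ between Python versions, so a check that does not depend on them can be useful. A rough sketch, assuming CPython still stores the generator body as a nested code object among the outer code's constants (true for the versions I am aware of): compare which names each code object looks up.

```python
import types

outer = compile('(a + b for a in units for b in tens)', '<file>', 'eval')

# The generator body is a separate code object stored in the constants
# of the code that creates the generator.
inner = next(c for c in outer.co_consts if isinstance(c, types.CodeType))

print(outer.co_names)  # names looked up when the generator is created
print(inner.co_names)  # names looked up while the generator runs
```

units should appear only in the outer tuple and tens only in the inner one, which matches the split between creation time and iteration time described above.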
This is documented in the original generator expressions PEP (PEP 289):
Only the outermost for-expression is evaluated immediately, the other expressions are deferred until the generator is run:
g = (tgtexp for var1 in exp1 if exp2 for var2 in exp3 if exp4)
is equivalent to:
def __gen(bound_exp):
    for var1 in bound_exp:
        if exp2:
            for var2 in exp3:
                if exp4:
                    yield tgtexp
g = __gen(iter(exp1))
del __gen
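Applied to the code from the question, that equivalence can be written out by hand. This is only an illustrative sketch; the function name _gen is made up here, and the real generator expression has no if clauses, so those are dropped:

```python
def _gen(bound_exp):
    # Only the outermost iterable arrives as an argument, already evaluated.
    for a in bound_exp:
        # `tens` is an ordinary name lookup, performed each time this
        # inner loop starts, i.e. while the generator is running.
        for b in tens:
            yield a + b

units = [1, 2]
tens = [10, 20]
nums = _gen(iter(units))   # iter(units) is evaluated on this line

units = [3, 4]
tens = [30, 40]

result = list(nums)
print(result)  # [31, 41, 32, 42]
```

The hand-written version produces the same mixed result as the generator expression, for the same reason.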
and in the generator expressions reference:
Variables used in the generator expression are evaluated lazily when the __next__() method is called for generator object (in the same fashion as normal generators). However, the leftmost for clause is immediately evaluated, so that an error produced by it can be seen before any other possible error in the code that handles the generator expression. Subsequent for clauses cannot be evaluated immediately since they may depend on the previous for loop. For example: (x*y for x in range(10) for y in bar(x)).
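The difference in when errors surface is easy to observe. In this small sketch, missing_outer and missing_inner are deliberately undefined names chosen for illustration:

```python
# Outermost iterable: the undefined name is looked up right away,
# so creating the generator expression already fails.
try:
    gen = (x for x in missing_outer)
    outer_error = None
except NameError:
    outer_error = "creation"

# Inner iterable: creation succeeds, and the lookup only fails once
# the generator is advanced for the first time.
gen = (x + y for x in [1] for y in missing_inner)
try:
    next(gen)
    inner_error = None
except NameError:
    inner_error = "iteration"

print(outer_error, inner_error)  # creation iteration
```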
The PEP has an excellent section motivating why names (other than the outermost iterable) are bound late; see Early Binding versus Late Binding.