multiprocessing.pool.MaybeEncodingError: 'TypeError("cannot serialize '_io.BufferedReader' object",)'
Question
Why does the code below work only with multiprocessing.dummy, but not with plain multiprocessing?
import urllib.request
#from multiprocessing.dummy import Pool #this works
from multiprocessing import Pool

urls = ['http://www.python.org', 'http://www.yahoo.com', 'http://www.scala.org', 'http://www.google.com']

if __name__ == '__main__':
    with Pool(5) as p:
        results = p.map(urllib.request.urlopen, urls)
Error:
Traceback (most recent call last):
  File "urlthreads.py", line 31, in <module>
    results = p.map(urllib.request.urlopen, urls)
  File "C:\Users\patri\Anaconda3\lib\multiprocessing\pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\patri\Anaconda3\lib\multiprocessing\pool.py", line 657, in get
    raise self._value
multiprocessing.pool.MaybeEncodingError: Error sending result: '[<http.client.HTTPResponse object at 0x0000016AEF204198>]'. Reason: 'TypeError("cannot serialize '_io.BufferedReader' object")'
What's missing so that it works without "dummy"?
Recommended answer
The http.client.HTTPResponse object you get back from urlopen() has an _io.BufferedReader object attached, and this object cannot be pickled.
>>> import pickle
>>> import urllib.request
>>> pickle.dumps(urllib.request.urlopen('http://www.python.org').fp)
Traceback (most recent call last):
...
    pickle.dumps(urllib.request.urlopen('http://www.python.org').fp)
TypeError: cannot serialize '_io.BufferedReader' object
multiprocessing.Pool needs to pickle (serialize) the results to send them back to the parent process, and this fails here. Since dummy uses threads instead of processes, no pickling takes place, because threads in the same process naturally share their memory.
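To illustrate the difference without touching the network, here is a minimal sketch. It assumes an io.BufferedReader wrapped around an in-memory io.BytesIO as a stand-in for the reader attached to a real HTTPResponse; open_reader and is_picklable are hypothetical helper names used only for this illustration.

```python
import io
import pickle
from multiprocessing.dummy import Pool as ThreadPool


def open_reader(payload):
    # Assumption: an in-memory BufferedReader stands in for the reader
    # that a real HTTPResponse keeps in its .fp attribute.
    return io.BufferedReader(io.BytesIO(payload))


def is_picklable(obj):
    # Hypothetical helper: report whether pickle accepts the object.
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False


# The thread pool hands the reader objects back directly, without pickling.
with ThreadPool(2) as pool:
    readers = pool.map(open_reader, [b"first", b"second"])

picklable = [is_picklable(r) for r in readers]
contents = [r.read() for r in readers]
```

The readers come back from the thread pool intact and can still be read, even though pickle rejects each of them. A process-based Pool would have to pickle the same objects to return them, which is where the error in the question is raised.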
A general solution to this TypeError is:
- read out the buffer and save the content (if needed)
- remove the reference to '_io.BufferedReader' from the object you are trying to pickle
In your case, calling .read() on the http.client.HTTPResponse will empty and remove the buffer, so a function for converting the response into something pickleable could simply do this:
def read_buffer(response):
    response.text = response.read()
    return response
Example:
import pickle
import urllib.request

r = urllib.request.urlopen('http://www.python.org')
r = read_buffer(r)
pickle.dumps(r)
# Out: b'\x80\x03chttp.client\nHTTPResponse...
Before you consider this approach, make sure you really want to use multiprocessing instead of multithreading. For I/O-bound tasks like the one you have here, multithreading would be sufficient, since most of the time is spent waiting for the response anyway (no CPU time needed). Multiprocessing and the IPC involved also introduce substantial overhead.
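If a process pool is still wanted, one possible variation is to do the reading inside the worker and send back only plain bytes, so that no response object ever crosses the process boundary. The sketch below is an assumption-laden illustration rather than the method from the answer above: fetch and fetch_all are hypothetical names, and the default of 5 processes simply mirrors the question.

```python
import urllib.request
from multiprocessing import Pool


def fetch(url):
    # Runs in the worker process: open the URL, read the whole body,
    # and return only picklable values (a str and a bytes object).
    with urllib.request.urlopen(url) as response:
        return url, response.read()


def fetch_all(urls, processes=5):
    # Hypothetical wrapper: map fetch over the URLs with a process pool.
    with Pool(processes) as pool:
        return pool.map(fetch, urls)
```

Calling fetch_all(urls) under the `if __name__ == '__main__':` guard from the question should then return a list of (url, body) tuples. Headers or status codes would have to be extracted inside fetch as well if they are needed, since the response object itself is not returned.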