Pipe a large amount of data to stdin while using subprocess.Popen
Problem description
I'm struggling to understand what the Pythonic way of solving this simple problem is.
My problem is quite simple. If you use the following code, it will hang. This is well documented in the subprocess module docs.
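import subprocess

proc = subprocess.Popen(['cat', '-'],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        )
for i in range(100000):
    # Blocks once both pipe buffers are full: cat cannot write more output
    # because nothing reads its stdout until communicate() is reached.
    proc.stdin.write(b'%d\n' % i)
output = proc.communicate()[0]
print(output.decode())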
Searching for a solution (there is a very insightful thread, but I've lost it now), I found this one, among others, that uses an explicit fork:
import os
import sys
from subprocess import Popen, PIPE

def produce(to_sed):
    # Runs in the forked child: feed the data to the subprocess.
    for i in range(100000):
        to_sed.write(b"%d\n" % i)
        to_sed.flush()
    #this would happen implicitly, anyway, but is here for the example
    to_sed.close()

def consume(from_sed):
    # Runs in the parent: read the subprocess output until EOF.
    while 1:
        res = from_sed.readline()
        if not res:
            sys.exit(0)
            #sys.exit(proc.poll())
        print('received: ', [res])

def main():
    proc = Popen(['cat', '-'], stdin=PIPE, stdout=PIPE)
    to_sed = proc.stdin
    from_sed = proc.stdout

    pid = os.fork()
    if pid == 0:
        # Child: only writes, so close the read end.
        from_sed.close()
        produce(to_sed)
        return
    else:
        # Parent: only reads, so close the write end.
        to_sed.close()
        consume(from_sed)

if __name__ == '__main__':
    main()
While this solution is conceptually very easy to understand, it uses one more process and strikes me as too low-level compared to the subprocess module (which exists precisely to hide this kind of thing).
I'm wondering: is there a simple and clean solution using the subprocess module that won't hang, or do I have to take a step back and implement an old-style select loop or an explicit fork to get this pattern?
Thanks.
Recommended answer
If you want a pure Python solution, you need to put either the reader or the writer in a separate thread. The threading module is a lightweight way to do this, with convenient access to shared objects and no messy forking.
import subprocess
import threading
import sys

proc = subprocess.Popen(['cat', '-'],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        )

def writer():
    # Runs in a background thread so the main thread is free to read.
    for i in range(100000):
        proc.stdin.write(b'%d\n' % i)
    # Closing stdin signals EOF to cat, which lets the read loop below finish.
    proc.stdin.close()

thread = threading.Thread(target=writer)
thread.start()
for line in proc.stdout:
    sys.stdout.write(line.decode())
thread.join()
proc.wait()
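When the whole input is already available and fits in memory, a shorter variant is to pass it to communicate() through its input argument instead of writing to proc.stdin by hand. communicate() feeds stdin and collects stdout at the same time, so neither pipe fills up. The sketch below is an illustration rather than part of the original answer, and the helper name pipe_with_communicate is hypothetical. Note that communicate() keeps both the input and the output in memory, so it does not suit unbounded streams.

```python
import subprocess

def pipe_with_communicate(cmd, data):
    """Send `data` (bytes) to the command's stdin and return its stdout.

    Hypothetical helper name, used only for this illustration.
    """
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # communicate() writes the input and reads the output concurrently,
    # then closes stdin and waits for the process to exit.
    output, _ = proc.communicate(input=data)
    return output

if __name__ == '__main__':
    payload = b''.join(b'%d\n' % i for i in range(100000))
    result = pipe_with_communicate(['cat', '-'], payload)
    print(len(result.splitlines()))
```

The threaded version above remains the better fit when the data is produced incrementally or is too large to hold in memory at once.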
It might be neat to see the subprocess module modernized to support streams and coroutines, which would allow pipelines that mix Python pieces and shell pieces to be constructed more elegantly.
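The standard library's asyncio module does offer subprocess support built on streams and coroutines, which comes close to what is described here. The following is a minimal sketch of the same cat round trip in that style, assuming Python 3.7 or later; the function name pipe_with_asyncio is hypothetical, and this is one possible arrangement rather than a recommended design.

```python
import asyncio

async def pipe_with_asyncio(cmd, lines):
    """Write `lines` (an iterable of bytes) to cmd's stdin, return its output lines.

    Hypothetical helper name, used only for this illustration.
    """
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
    )

    async def writer():
        for line in lines:
            proc.stdin.write(line)
            # drain() pauses this coroutine while the write buffer is full,
            # giving the reader a chance to run.
            await proc.stdin.drain()
        # Closing stdin signals EOF to the child process.
        proc.stdin.close()

    async def reader():
        received = []
        async for line in proc.stdout:
            received.append(line)
        return received

    # Run the writer and the reader concurrently on one thread.
    _, received = await asyncio.gather(writer(), reader())
    await proc.wait()
    return received

if __name__ == '__main__':
    data = [b'%d\n' % i for i in range(100000)]
    out = asyncio.run(pipe_with_asyncio(['cat', '-'], data))
    print(len(out))
```

Here the writer and the reader are coroutines sharing a single thread, so no explicit thread or fork is needed.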