Python file parsing: Build tree from text file
Problem description
I have an indented text file that will be used to build a tree. Each line represents a node, and its indentation represents its depth and therefore which node it is a child of.
For example, a file might look like
ROOT
    Node1
        Node2
            Node3
                Node4
    Node5
    Node6
This indicates that ROOT contains three children (Node1, Node5, and Node6), Node1 has one child (Node2), Node2 has one child (Node3), and so on.
I have come up with a recursive algorithm and programmed it, and it works, but it's kind of ugly. In particular, it handles the example above very crudely when going from Node4 to Node5.
It uses the "indent count" as the basis for recursion: if the number of indents equals the current depth + 1, I go one level deeper. But this means that when I read a line with fewer indents, I have to come back up one level at a time, checking the depth each time.
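The "indent count" described above can be taken from the leading tabs only, so that a tab appearing inside a node name is not counted. This is a minimal sketch, assuming one tab per tree level; `indent_depth` is a hypothetical helper name and is not part of the original code:

```python
def indent_depth(line):
    """Return the number of leading tab characters in `line`."""
    # Assumption: the file indents with exactly one tab per tree level.
    return len(line) - len(line.lstrip("\t"))


print(indent_depth("ROOT"))        # 0
print(indent_depth("\t\tNode2"))   # 2
print(indent_depth("\tNode\t5"))   # 1 (the inner tab is ignored)
```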
Here is what I have:
def _recurse_tree(node, parent, depth):
    tabs = 0

    while node:
        # indentation level = number of tab characters in the line
        tabs = node.count("\t")
        if tabs == depth:
            print("%s: %s" % (parent.strip(), node.strip()))
        elif tabs == depth + 1:
            node = _recurse_tree(node, prev, depth+1)
            tabs = node.count("\t")

            # check if we have to surface some more
            if tabs == depth:
                print("%s: %s" % (parent.strip(), node.strip()))
            else:
                return node
        else:
            return node

        prev = node
        node = inFile.readline().rstrip()

    # end of input: hand the empty line back so callers can unwind
    return node

inFile = open("test.txt")
root = inFile.readline().rstrip()
node = inFile.readline().rstrip()
_recurse_tree(node, root, 1)
Right now I am just printing out the nodes to verify that the parent node is correct for each line, but maybe there is a cleaner way to do it? Especially the case in the elif block when I'm coming back from each recursion call.
The big issue is the "lookahead", which I think is what caused the ugliness in question. The code can be shortened slightly:
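def _recurse_tree(parent, depth, source):
    last_line = source.readline().rstrip()
    while last_line:
        # indentation level = number of tab characters in the line
        tabs = last_line.count('\t')
        if tabs < depth:
            # the line belongs to an ancestor; return it to the caller
            break
        node = last_line.strip()
        if tabs >= depth:
            if parent is not None:
                print("%s: %s" % (parent, node))
            # the recursive call returns the next line it could not consume
            last_line = _recurse_tree(node, tabs+1, source)
    return last_line

inFile = open("test.txt")
_recurse_tree(None, 0, inFile)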
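The function above only prints parent/child pairs. As a sketch of how the same recursion might build an actual tree, the variant below returns nested `(name, children)` tuples. It assumes Python 3, one tab per indentation level, and any file-like object as input; `build_tree` and `_read_children` are hypothetical names introduced here for illustration.

```python
import io


def _read_children(depth, source):
    """Read nodes indented by at least `depth` tabs.

    Returns (children, pending_line), where pending_line is the first line
    that belongs to a shallower level ('' at end of input).
    """
    children = []
    line = source.readline().rstrip()
    while line:
        tabs = len(line) - len(line.lstrip("\t"))
        if tabs < depth:
            break  # this line belongs to an ancestor; hand it back
        name = line.strip()
        # Recurse for the subtree; the call returns the next unconsumed line.
        grandchildren, line = _read_children(tabs + 1, source)
        children.append((name, grandchildren))
    return children, line


def build_tree(source):
    """Return the top-level nodes as a list of (name, children) tuples."""
    children, _ = _read_children(0, source)
    return children


sample = "ROOT\n\tNode1\n\t\tNode2\n\t\t\tNode3\n\t\t\t\tNode4\n\tNode5\n\tNode6\n"
tree = build_tree(io.StringIO(sample))
root_name, root_children = tree[0]
print(root_name, [name for name, _ in root_children])
# ROOT ['Node1', 'Node5', 'Node6']
```

Returning the pending line alongside the children keeps the lookahead explicit: each call either consumes a line or passes it back up unchanged, so the jump from Node4 to Node5 needs no special case.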
Since we're talking recursion, I took pains to avoid any global variables (source and last_line). It would be more Pythonic to make them members of some parser object.
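One way to read that suggestion is a small class that keeps `source` and `last_line` as attributes instead of passing them through every call. The following is only a sketch under that interpretation, assuming Python 3 and one tab per level; `IndentParser` and its methods are hypothetical names, not from the original answer.

```python
import io


class IndentParser:
    """Collects (parent, child) pairs from a tab-indented file-like object."""

    def __init__(self, source):
        self.source = source   # file-like object being read
        self.last_line = ""    # most recently read line, not yet consumed
        self.pairs = []        # collected (parent, child) tuples

    def _advance(self):
        self.last_line = self.source.readline().rstrip()

    def _recurse(self, parent, depth):
        self._advance()
        while self.last_line:
            tabs = len(self.last_line) - len(self.last_line.lstrip("\t"))
            if tabs < depth:
                break  # leave last_line for an enclosing call to handle
            node = self.last_line.strip()
            if parent is not None:
                self.pairs.append((parent, node))
            self._recurse(node, tabs + 1)

    def parse(self):
        self._recurse(None, 0)
        return self.pairs


sample = "ROOT\n\tNode1\n\t\tNode2\n\t\t\tNode3\n\t\t\t\tNode4\n\tNode5\n\tNode6\n"
for parent, child in IndentParser(io.StringIO(sample)).parse():
    print("%s: %s" % (parent, child))
# ROOT: Node1
# Node1: Node2
# Node2: Node3
# Node3: Node4
# ROOT: Node5
# ROOT: Node6
```

Because the pending line lives on the object, the recursive method no longer needs a return value; each call simply stops when the stored line is too shallow for it.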