Why does Python 'for word in words:' iterate on individual characters instead of words?
Problem description
When I run the following code on a string words:
def word_feats(words):
    return dict([(word, True) for word in words])

print(word_feats("I love this sandwich."))
The resulting dictionary is keyed by individual letters instead of words:
{'a': True, ' ': True, 'c': True, 'e': True, 'd': True, 'I': True, 'h': True, 'l': True, 'o': True, 'n': True, 'i': True, 's': True, 't': True, 'w': True, 'v': True, '.': True}
What am I doing wrong?
Recommended answer
You need to explicitly split the string on whitespace:
def word_feats(words):
    return dict([(word, True) for word in words.split()])
This uses str.split() without arguments, which splits on runs of whitespace of any width (including tabs and line separators). Without the split, a string is simply a sequence of individual characters, so iterating over it directly loops over each character.
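To make the difference concrete, here is a minimal sketch comparing direct iteration over a string with iteration over the result of str.split(). The variable names are illustrative only and are not part of the original question.

```python
sentence = "I love this sandwich."

# Iterating over a str yields one-character strings, spaces included.
chars = [ch for ch in sentence]

# str.split() with no arguments returns a list of whitespace-separated words.
words = [word for word in sentence.split()]

# Runs of mixed whitespace (spaces, tabs, newlines) count as one separator.
mixed = "I \t love\n\nthis".split()

print(chars[:6])  # ['I', ' ', 'l', 'o', 'v', 'e']
print(words)      # ['I', 'love', 'this', 'sandwich.']
print(mixed)      # ['I', 'love', 'this']
```

Note that the trailing period stays attached to the last word; str.split() only looks at whitespace.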
Splitting into words, however, is an explicit operation you have to perform yourself, because different use cases have different needs for how a string should be divided into parts. Does punctuation count, for example? What about parentheses or quotes: should words grouped by those stay together? And so on.
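As a rough illustration of how those needs can differ, the sketch below splits the same text three ways using only the standard library. The sample text and the choice of re.findall and shlex.split are assumptions made for this example; the original answer does not prescribe either one.

```python
import re
import shlex

text = 'I love "club sandwich" today.'

# Plain whitespace split: punctuation and quote marks stay attached.
plain = text.split()

# Regex split: keep only runs of word characters, dropping punctuation.
no_punct = re.findall(r"\w+", text)

# Shell-style split: quoted groups are kept together as one item.
quoted = shlex.split(text)

print(plain)     # ['I', 'love', '"club', 'sandwich"', 'today.']
print(no_punct)  # ['I', 'love', 'club', 'sandwich', 'today']
print(quoted)    # ['I', 'love', 'club sandwich', 'today.']
```

Which of these counts as "the words" depends entirely on the application, which is why Python does not guess on your behalf.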
If all you are doing is setting all values to True, it'll be much more efficient to use dict.fromkeys() instead:
def word_feats(words):
    return dict.fromkeys(words.split(), True)
Demo:
>>> def word_feats(words):
...     return dict.fromkeys(words.split(), True)
...
>>> print(word_feats("I love this sandwich."))
{'I': True, 'this': True, 'love': True, 'sandwich.': True}
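The demo keeps the trailing period in the key 'sandwich.'. If that is unwanted, one possible variation (a sketch only, with a hypothetical helper name word_feats_stripped that does not appear in the original answer) strips leading and trailing punctuation from each word before building the dictionary:

```python
import string


def word_feats_stripped(words):
    """Hypothetical variant: map each whitespace-separated word to True,
    after removing punctuation from both ends of the word."""
    cleaned = (word.strip(string.punctuation) for word in words.split())
    # Skip items that were made up of punctuation only.
    return dict.fromkeys((word for word in cleaned if word), True)


print(word_feats_stripped("I love this sandwich."))
# {'I': True, 'love': True, 'this': True, 'sandwich': True}
```

This only trims the ends of each word, so internal punctuation such as the apostrophe in "don't" is left alone.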