Load svmlight format error
Problem description
When I try to use the svmlight Python package with data I have already converted to svmlight format, I get an error. It should be pretty basic, and I don't understand what's happening. Here's the code:
import svmlight
training_data = open('thedata', "w")
model=svmlight.learn(training_data, type='classification', verbosity=0)
I also tried:
training_data = numpy.load('thedata')
and
training_data = __import__('thedata')
Recommended answer
One obvious problem is that you truncate your data file when you open it, because you specify write mode "w". That leaves no data to read.
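The truncation is easy to confirm without svmlight. The following is a minimal sketch that uses a hypothetical temporary file as a stand-in for 'thedata':

```python
import os
import tempfile

# Hypothetical stand-in for 'thedata': a temp file holding one svmlight-format line.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write("1 1:0.5 2:1.0\n")

size_before = os.path.getsize(path)

# Opening in write mode truncates the file immediately, before anything is written.
handle = open(path, "w")
handle.close()

size_after = os.path.getsize(path)
print(size_before, size_after)

os.remove(path)
```

Mode "r", the default, leaves the contents intact.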
Anyway, you don't need to read the file like that. If your data file is like the one in this example, it is a Python file, so you need to import it instead. This should work:
import svmlight
from data import train0 as training_data # assuming your data file is named data.py
# or you could use __import__()
#training_data = __import__('data').train0
model = svmlight.learn(training_data, type='classification', verbosity=0)
You might want to compare your data against that of the example.
Edit after the data file format was clarified
The input file needs to be parsed into a list of tuples like this:
[(target, [(feature_1, value_1), (feature_2, value_2), ... (feature_n, value_n)]),
(target, [(feature_1, value_1), (feature_2, value_2), ... (feature_n, value_n)]),
...
]
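As a concrete illustration of that structure, here is a small hand-written data set together with a check of its shape. The values and the helper name `looks_like_training_data` are made up for illustration and are not part of the svmlight package:

```python
# Made-up example: two documents with targets +1 and -1 and sparse (feature, value) pairs.
training_data = [
    (1, [(1, 0.5), (3, 1.2), (7, 0.1)]),
    (-1, [(2, 0.9), (3, 0.4)]),
]


def looks_like_training_data(data):
    """Return True if data is a list of (target, [(feature, value), ...]) tuples."""
    if not isinstance(data, list):
        return False
    for item in data:
        if not (isinstance(item, tuple) and len(item) == 2):
            return False
        target, features = item
        if not isinstance(target, (int, float)):
            return False
        if not isinstance(features, list):
            return False
        for pair in features:
            if not (isinstance(pair, tuple) and len(pair) == 2):
                return False
    return True


print(looks_like_training_data(training_data))
```

A file object, a file name, or a NumPy array would fail this check, which matches the kinds of input tried in the question.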
The svmlight package does not appear to support reading files in the SVM file format, and it has no parsing functions, so the parsing has to be implemented in Python. SVM files look like this:
<target> <feature>:<value> <feature>:<value> ... <feature>:<value> # <info>
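For example, one line in that format can be taken apart by hand. This is only a sketch using a sample line; it assumes integer feature ids and ignores variants such as the `qid:` field used for ranking:

```python
line = "1 1:0.43 3:0.12 9284:0.2 # abcdef"

# Drop the trailing comment, then split the rest on whitespace.
body, _, info = line.partition("#")
fields = body.split()

# The first field is the target; the rest are <feature>:<value> pairs.
target = float(fields[0])
features = []
for field in fields[1:]:
    feature, _, value = field.partition(":")
    features.append((int(feature), float(value)))

print((target, features))
```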
Here is a parser that converts from that file format to the structure required by the svmlight package:
def svm_parse(filename):
    """Yield (target, [(feature, value), ...]) tuples from an SVM-format file."""

    def _convert(t):
        """Convert feature and value to appropriate types"""
        return (int(t[0]), float(t[1]))

    with open(filename) as f:
        for line in f:
            line = line.split('#')[0].strip()  # remove any comment
            if not line:
                continue  # skip blank and comment-only lines
            data = line.split()
            target = float(data[0])
            features = [_convert(feature.split(':')) for feature in data[1:]]
            yield (target, features)
You can use it like this:
import svmlight
training_data = list(svm_parse('thedata'))
model=svmlight.learn(training_data, type='classification', verbosity=0)
