为什么泡菜比 np.save 花费这么多时间?-Python问题

Why does pickle take so much longer than np.save?(为什么泡菜比 np.save 花费这么多时间?)

本文介绍了为什么泡菜比 np.save 花费这么多时间?的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我想保存 dict 或数组.

我尝试使用 np.save 和 pickle 并发现前者总是花费更少的时间.

I try both with np.save and with pickle and see that the former always take much less time.

我的实际数据要大得多，但我只是在这里展示一小块用于演示目的:

My actual data is much bigger but I just present a small piece here for demonstration purposes:

import numpy as np
#import numpy.array as array
import time
import pickle

b = {0: [np.array([0, 0, 0, 0])], 1: [np.array([1, 0, 0, 0]), np.array([0, 1, 0, 0]), np.array([0, 0, 1, 0]), np.array([0, 0, 0, 1]), np.array([-1,  0,  0,  0]), np.array([ 0, -1,  0,  0]), np.array([ 0,  0, -1,  0]), np.array([ 0,  0,  0, -1])], 2: [np.array([2, 0, 0, 0]), np.array([1, 1, 0, 0]), np.array([1, 0, 1, 0]), np.array([1, 0, 0, 1]), np.array([ 1, -1,  0,  0]), np.array([ 1,  0, -1,  0]), np.array([ 1,  0,  0, -1])], 3: [np.array([1, 0, 0, 0]), np.array([0, 1, 0, 0]), np.array([0, 0, 1, 0]), np.array([0, 0, 0, 1]), np.array([-1,  0,  0,  0]), np.array([ 0, -1,  0,  0]), np.array([ 0,  0, -1,  0]), np.array([ 0,  0,  0, -1])], 4: [np.array([2, 0, 0, 0]), np.array([1, 1, 0, 0]), np.array([1, 0, 1, 0]), np.array([1, 0, 0, 1]), np.array([ 1, -1,  0,  0]), np.array([ 1,  0, -1,  0]), np.array([ 1,  0,  0, -1])], 5: [np.array([0, 0, 0, 0])], 6: [np.array([1, 0, 0, 0]), np.array([0, 1, 0, 0]), np.array([0, 0, 1, 0]), np.array([0, 0, 0, 1]), np.array([-1,  0,  0,  0]), np.array([ 0, -1,  0,  0]), np.array([ 0,  0, -1,  0]), np.array([ 0,  0,  0, -1])], 2: [np.array([2, 0, 0, 0]), np.array([1, 1, 0, 0]), np.array([1, 0, 1, 0]), np.array([1, 0, 0, 1]), np.array([ 1, -1,  0,  0]), np.array([ 1,  0, -1,  0]), np.array([ 1,  0,  0, -1])], 7: [np.array([1, 0, 0, 0]), np.array([0, 1, 0, 0]), np.array([0, 0, 1, 0]), np.array([0, 0, 0, 1]), np.array([-1,  0,  0,  0]), np.array([ 0, -1,  0,  0]), np.array([ 0,  0, -1,  0]), np.array([ 0,  0,  0, -1])], 8: [np.array([2, 0, 0, 0]), np.array([1, 1, 0, 0]), np.array([1, 0, 1, 0]), np.array([1, 0, 0, 1]), np.array([ 1, -1,  0,  0]), np.array([ 1,  0, -1,  0]), np.array([ 1,  0,  0, -1])]}


start_time = time.time()
with open('testpickle', 'wb') as myfile:
    pickle.dump(b, myfile)
print("--- Time to save with pickle: %s milliseconds ---" % (1000*time.time() - 1000*start_time))

start_time = time.time()
np.save('numpy', b)
print("--- Time to save with numpy: %s milliseconds ---" % (1000*time.time() - 1000*start_time))

start_time = time.time()
with open('testpickle', 'rb') as myfile:
    g1 = pickle.load(myfile)
print("--- Time to load with pickle: %s milliseconds ---" % (1000*time.time() - 1000*start_time))

start_time = time.time()
g2 = np.load('numpy.npy')
print("--- Time to load with numpy: %s milliseconds ---" % (1000*time.time() - 1000*start_time))

给出输出:

--- Time to save with pickle: 4.0 milliseconds ---
--- Time to save with numpy: 1.0 milliseconds ---
--- Time to load with pickle: 2.0 milliseconds ---
--- Time to load with numpy: 1.0 milliseconds ---

根据我的实际大小(字典中约 100,000 个键)，时差更加明显.

The time difference is even more pronounced with my actual size (~100,000 keys in the dict).

为什么 pickle 的保存和加载时间都比 np.save 长?

Why does pickle take longer than np.save, both for saving and for loading?

我应该什么时候使用 pickle?

When should I use pickle?