Update large sqlite database in chunks
Question
I have a sqlite database (approx. 11 GB) that has multiple tables, including the tables distance and vertices. The table distance is pretty large (120 million rows); vertices is smaller (15,000 rows). I want to use sqlite3 in Python to update one column of distance with values from another column in vertices. The table vertices has an index on column cat and another index on orig_cat.
What I am doing:
import sqlite3
db_path='path/to/db.db'
conn = sqlite3.connect(db_path)
cur = conn.cursor()
cur.execute('''UPDATE distance SET
from_orig_v = (SELECT orig_cat FROM vertices WHERE cat=distance.source)''')
However, running that update statement on such a large database causes a memory error. Memory usage increases steadily until the process crashes. I am looking for advice on how to perform such a large update without running out of memory. Maybe processing the update in chunks (i.e. rows of the distance table) and committing after e.g. 1000 updates to free memory? How would that be done in python/sqlite?
Recommended answer
It should be possible to update chunks with statements like this:
UPDATE distance SET ... WHERE rowid BETWEEN 100000 AND 200000;
You don't need to use multiple transactions; the only thing that actually must be kept in memory is the list of rows to be updated in a single statement. (In theory, when memory runs out, you should get an appropriate error message. In practice, some OSes overcommit memory and don't tell the application about it until it's too late.)
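As a rough illustration of the rowid-range idea in Python, here is a minimal sketch. The function name update_in_chunks and the chunk_size parameter are made up for this example, and it assumes distance is an ordinary rowid table (not declared WITHOUT ROWID). Following the point above, it commits once at the end rather than after every chunk.

```python
import sqlite3

def update_in_chunks(conn, chunk_size=100_000):
    """Update distance.from_orig_v one rowid range at a time.

    Hypothetical helper: returns the number of UPDATE statements executed.
    """
    cur = conn.cursor()
    # Find the span of rowids present in the table.
    cur.execute("SELECT MIN(rowid), MAX(rowid) FROM distance")
    lo, hi = cur.fetchone()
    if lo is None:  # empty table, nothing to update
        return 0
    chunks = 0
    for start in range(lo, hi + 1, chunk_size):
        # Each statement only touches rows in [start, start + chunk_size - 1].
        cur.execute(
            """UPDATE distance SET from_orig_v =
                   (SELECT orig_cat FROM vertices WHERE cat = distance.source)
               WHERE rowid BETWEEN ? AND ?""",
            (start, start + chunk_size - 1),
        )
        chunks += 1
    # A single transaction for all chunks; commit once when done.
    conn.commit()
    return chunks
```

It could be called as update_in_chunks(sqlite3.connect(db_path)). The chunk size is a guess to be tuned; gaps in the rowid sequence only mean that some chunks update fewer rows.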