问题描述
假设我们有一组原始数据:
Suppose we have a collection of raw data:
{ "person": "David, age 102"}
{ "person": "Max, age 8" }
我们想将该集合转换为:
and we'd like to transform that collection to:
{ "age": 102 }
{ "age": 8 }
仅使用 mongo(d) 引擎.(如果所有人名或年龄的长度相等, $substr 可以完成这项工作,)有可能吗?
using only mongo(d) engine. (If all person names or ages had equal lengths, $substr could do the job, ) Is it possible?
假设正则表达式是微不足道的/d+/
Suppose regex is trivial /d+/
推荐答案
MongoDB 3.4版本中的最优方式.
此版本的 mongod 提供 $split 运算符,它当然会拆分字符串,如此处所示.
然后我们使用 $let 变量运算符.然后可以在 in 表达式中使用新值,以使用 $arrayElemAt 运算符返回指定索引处的元素;0 表示第一个元素,-1 表示最后一个元素.
We then assign the the newly computed value to a variable using the $let variable operator. The new value can then be use in the in expression to return the "name" and the "age" values using the $arrayElemAt operator to return the element at a specified index; 0 for the first element and -1 for the last element. 
请注意,在 in 表达式中,我们需要拆分最后一个元素才能返回整数字符串.
Note that in the in expression we need to split the last element in order to return the string of integer.
最后我们需要迭代 Cursor 对象并使用 Number 或 parseInt 并使用批量操作和 bulkWrite() 方法到 $set 这些字段的值以获得最大效率.
Finally we need to iterate the Cursor object and cast the convert the string of integer to numeric using Number or parseInt and use bulk operation and the bulkWrite() method to $set the value for those field for maximum efficiency.
let requests = [];
db.coll.aggregate(
    [
        { "$project": {  
            "person": { 
                "$let": { 
                    "vars": { 
                        "infos":  { "$split": [ "$person", "," ] } 
                    }, 
                    "in": { 
                        "name": { "$arrayElemAt": [ "$$infos", 0 ] }, 
                        "age": { 
                            "$arrayElemAt": [ 
                                { "$split": [ 
                                    { "$arrayElemAt": [ "$$infos", -1 ] }, 
                                    " " 
                                ]}, 
                                -1 
                            ] 
                        } 
                    } 
                } 
            }  
        }}
    ] 
).forEach(document => { 
    requests.push({ 
        "updateOne": { 
            "filter": { "_id": document._id }, 
            "update": { 
                "$set": { 
                    "name": document.person.name, 
                    "age": Number(document.person.age) 
                },
                "$unset": { "person": " " }
            } 
        } 
    }); 
    if ( requests.length === 500 ) { 
        // Execute per 500 ops and re-init
        db.coll.bulkWrite(requests); 
        requests = []; 
    }} 
);
 // Clean up queues
if(requests.length > 0) {
    db.coll.bulkWrite(requests);
}
<小时>
MongoDB 3.2 或更新版本.
MongoDB 3.2 弃用了旧的 Bulk() API 及其相关的方法 并提供bulkWrite() 方法,但它不提供 $split 运算符,因此我们这里唯一的选择是使用 mapReduce() 方法来转换我们的数据,然后使用批量操作更新集合.
MongoDB 3.2 or newer.
MongoDB 3.2 deprecates the old Bulk() API and its associated methods and provides the bulkWrite() method but it doesn't provide the $split operator so the only option we have here is to use the mapReduce() method to transform our data then update the collection using bulk operation.
var mapFunction = function() { 
    var person = {}, 
    infos = this.person.split(/[,s]+/); 
    person["name"] = infos[0]; 
    person["age"] = infos[2]; 
    emit(this._id, person); 
};
var results = db.coll.mapReduce(
    mapFunction, 
    function(key, val) {}, 
    { "out": { "inline": 1 } }
)["results"];
results.forEach(document => { 
    requests.push({ 
        "updateOne": { 
            "filter": { "_id": document._id }, 
            "update": { 
                "$set": { 
                    "name": document.value.name, 
                    "age": Number(document.value.age) 
                }, 
                "$unset": { "person": " " }
            } 
        } 
    }); 
    if ( requests.length === 500 ) { 
        // Execute per 500 operations and re-init
        db.coll.bulkWrite(requests); 
        requests = []; 
    }} 
);
// Clean up queues
if(requests.length > 0) {
    db.coll.bulkWrite(requests);
}
<小时>
MongoDB 版本 2.6 或 3.0.
我们需要使用现已弃用的 Bulk API.p>
var bulkOp = db.coll.initializeUnorderedBulkOp();
var count = 0;
results.forEach(function(document) { 
    bulkOp.find({ "_id": document._id}).updateOne(
        { 
            "$set": { 
                "name": document.value.name, 
                "age": Number(document.value.age)
            },
            "$unset": { "person": " " }
        }
    );
    count++;
    if (count === 500 ) {
        // Execute per 500 operations and re-init
        bulkOp.execute();
        bulkOp = db.coll.initializeUnorderedBulkOp();
    }
});
// clean up queues
if (count > 0 ) {
    bulkOp.execute();
}
                        这篇关于通过拆分字段值来重塑文档的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!



大气响应式网络建站服务公司织梦模板
高端大气html5设计公司网站源码
织梦dede网页模板下载素材销售下载站平台(带会员中心带筛选)
财税代理公司注册代理记账网站织梦模板(带手机端)
成人高考自考在职研究生教育机构网站源码(带手机端)
高端HTML5响应式企业集团通用类网站织梦模板(自适应手机端)