How to push data from DynamoDB through a stream (Python question)

Problem description

Below is the JSON file:

[
    {
        "year": 2013,
        "title": "Rush",
        "actors": [
            "Daniel Bruhl",
            "Chris Hemsworth",
            "Olivia Wilde"
        ]
    },
    {
        "year": 2013,
        "title": "Prisoners",
        "actors": [
            "Hugh Jackman",
            "Jake Gyllenhaal",
            "Viola Davis"
        ]
    }
]

Below is the code that pushes the data to DynamoDB. I created an S3 bucket named testjsonbucket and saved the JSON above in it as moviedataten.json. I also created a DynamoDB table with year (Number) as the primary partition key and title (String) as the primary sort key.

import json
from decimal import Decimal

import boto3

s3 = boto3.resource('s3')
obj = s3.Object('testjsonbucket', 'moviedataten.json')
# s3.Object has no .json attribute; get() returns a dict whose 'Body' is a stream.
body = obj.get()['Body'].read().decode('utf-8')


def load_movies(movies, dynamodb=None):
    """Write each movie to the Movies table (keys: year and title)."""
    if not dynamodb:
        dynamodb = boto3.resource('dynamodb')

    table = dynamodb.Table('Movies')
    for movie in movies:
        year = int(movie['year'])
        title = movie['title']
        print("Adding movie:", year, title)
        table.put_item(Item=movie)


def lambda_handler(event, context):
    # DynamoDB rejects Python floats, so parse any float as Decimal.
    movie_list = json.loads(body, parse_float=Decimal)
    load_movies(movie_list)

I want to push the data from DynamoDB into Elasticsearch.
I have created an Elasticsearch domain: https://xx.x.x.com/testelas
I have gone through this link: https://aws.amazon.com/blogs/compute/indexing-amazon-dynamodb-content-with-amazon-elasticsearch-service-using-aws-lambda/
I also clicked Manage Stream.

My requirement:

Any change in DynamoDB has to be reflected in Elasticsearch.

Solution

This Lambda only writes the documents to DynamoDB, and I would not recommend adding code to it that also pushes the same objects to Elasticsearch. A Lambda function should perform a single task, and pushing the same documents to Elasticsearch is better handled through a DynamoDB stream.

What if Elasticsearch is down or unavailable? How would you handle that inside this Lambda?
What if you want to disable the indexing in the future? You would need to modify the Lambda instead of controlling it from the AWS API or the AWS console. With a stream, you just disable the stream when required, with no changes to the Lambda code above.
What if you want to send only modified or TTL-expired items to Elasticsearch?
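The last point, forwarding only modified or TTL-expired items, comes down to a filter over stream records once a stream is in place. A minimal sketch, assuming the standard DynamoDB Streams record shape (`eventName`, and a `userIdentity` of type `Service` with principal `dynamodb.amazonaws.com` on TTL deletions); the function names are illustrative and not part of any library:

```python
# Sketch: keep only MODIFY records and TTL expirations from a stream batch.
TTL_PRINCIPAL = "dynamodb.amazonaws.com"


def is_ttl_delete(record):
    """True when a REMOVE record was produced by DynamoDB's TTL process."""
    identity = record.get("userIdentity") or {}
    return (
        record.get("eventName") == "REMOVE"
        and identity.get("type") == "Service"
        and identity.get("principalId") == TTL_PRINCIPAL
    )


def select_records(event):
    """Return the stream records worth forwarding to Elasticsearch."""
    return [
        record
        for record in event.get("Records", [])
        if record.get("eventName") == "MODIFY" or is_ttl_delete(record)
    ]
```

A plain user-initiated delete arrives as REMOVE without that `userIdentity` block, so it is skipped by this filter.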

So create a DynamoDB stream that delivers the changes to another Lambda, which is responsible for pushing the documents to Elasticsearch. With this option the stream records can also carry both the old and the new version of each item.
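Setting this up from Python could look roughly like the sketch below. It assumes boto3 clients are passed in (`boto3.client("dynamodb")` and `boto3.client("lambda")`), that the table is the `Movies` table from the question, and that the indexing Lambda already exists; the helper names and the batch size are illustrative choices, not requirements.

```python
def enable_stream(dynamodb_client, table_name="Movies"):
    """Turn on the table's stream with both old and new item images."""
    response = dynamodb_client.update_table(
        TableName=table_name,
        StreamSpecification={
            "StreamEnabled": True,
            "StreamViewType": "NEW_AND_OLD_IMAGES",
        },
    )
    return response["TableDescription"]["LatestStreamArn"]


def connect_lambda(lambda_client, stream_arn, function_name):
    """Subscribe the indexing Lambda to the stream; returns the mapping UUID."""
    response = lambda_client.create_event_source_mapping(
        EventSourceArn=stream_arn,
        FunctionName=function_name,
        StartingPosition="LATEST",
        BatchSize=100,  # illustrative value
    )
    return response["UUID"]


def pause_indexing(lambda_client, mapping_uuid):
    """Stop delivering stream records without touching either Lambda's code."""
    return lambda_client.update_event_source_mapping(
        UUID=mapping_uuid, Enabled=False
    )


# Usage (needs AWS credentials; "movies-to-es" is a hypothetical function name):
#   import boto3
#   arn = enable_stream(boto3.client("dynamodb"))
#   uuid = connect_lambda(boto3.client("lambda"), arn, "movies-to-es")
```

Disabling the event source mapping is one way to switch indexing off later while the Lambda that writes to DynamoDB stays unchanged.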

You can also look at this article, which describes another approach: data-streaming-from-dynamodb-to-elasticsearch

For the approach above, look at this GitHub project: dynamodb-stream-elasticsearch.

const { pushStream } = require('dynamodb-stream-elasticsearch');

const { ES_ENDPOINT, INDEX, TYPE } = process.env;

function myHandler(event, context, callback) {
  console.log('Received event:', JSON.stringify(event, null, 2));
  pushStream({ event, endpoint: ES_ENDPOINT, index: INDEX, type: TYPE })
    .then(() => {
      callback(null, `Successfully processed ${event.Records.length} records.`);
    })
    .catch((e) => {
      callback(`Error ${e}`, null);
    });
}

exports.handler = myHandler;
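The handler above is Node.js. Since the question uses Python, here is a rough Python sketch of the same idea. It does not use the dynamodb-stream-elasticsearch package; it turns stream records into an Elasticsearch bulk API body by hand. The index name `movies`, the `year|title` document id, and all function names are assumptions made for illustration, and actually sending the body (a POST to the domain's `_bulk` endpoint, usually with signed requests on AWS) is left out.

```python
import json

KEY_ATTRS = ("year", "title")  # partition key and sort key from the question


def from_attr(value):
    """Convert one DynamoDB-typed attribute value into plain Python."""
    (tag, inner), = value.items()
    if tag == "S" or tag == "BOOL":
        return inner
    if tag == "N":
        try:
            return int(inner)
        except ValueError:
            return float(inner)
    if tag == "NULL":
        return None
    if tag == "L":
        return [from_attr(item) for item in inner]
    if tag == "M":
        return {key: from_attr(item) for key, item in inner.items()}
    raise ValueError(f"unsupported attribute type: {tag}")


def build_bulk_body(event, index="movies"):
    """Build a newline-delimited bulk body from a DynamoDB stream event."""
    lines = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        doc_id = "|".join(str(from_attr(change["Keys"][k])) for k in KEY_ATTRS)
        meta = {"_index": index, "_id": doc_id}
        if record["eventName"] == "REMOVE":
            lines.append(json.dumps({"delete": meta}))
        else:  # INSERT or MODIFY: index the new version of the item
            doc = {k: from_attr(v) for k, v in change["NewImage"].items()}
            lines.append(json.dumps({"index": meta}))
            lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n" if lines else ""


def lambda_handler(event, context):
    body = build_bulk_body(event)
    print(body)  # replace with a POST of `body` to <endpoint>/_bulk
    return {"records": len(event.get("Records", []))}
```

Because the document id is derived from the table keys, a MODIFY overwrites the earlier document and a REMOVE deletes it, so the index tracks the table.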
