Search DynamoDB tables using Elasticsearch/Kibana via Logstash plugin

Written by mannem on . Posted in Dynamo DB, Elasticsearch


The Logstash plugin for Amazon DynamoDB gives you a near real-time view of the data in your DynamoDB table. It uses DynamoDB Streams to parse and output data as it is added to the table. After you install and activate the plugin, it first scans the existing data in the specified table, then starts consuming your updates through Streams and sends them to Elasticsearch or to any other Logstash output of your choice.

Logstash is a data pipeline service that processes and parses data, and then outputs it to a selected location in a selected format. Elasticsearch is a distributed, full-text search server. For more information about Logstash and Elasticsearch, go to https://www.elastic.co/products/elasticsearch.

Amazon Elasticsearch Service is a managed service that makes it easy to deploy, operate, and scale Elasticsearch in the AWS Cloud. For details, see aws.amazon.com/elasticsearch-service/


This article comes with an installation guide, tested on an EC2 instance, in which all the prerequisites are installed and Logstash is configured to connect to Amazon Elasticsearch Service through the input/output plugins and start indexing records from DynamoDB. The full instructions are at:
https://github.com/mannem/logstash-input-dynamodb


Logstash configuration:

After running a similar command on the shell, Logstash should successfully start and begin indexing the records from your DynamoDB table.


Throughput considerations:


Kibana:


References:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Tools.DynamoDBLogstash.html

https://aws.amazon.com/blogs/aws/new-logstash-plugin-search-dynamodb-content-using-elasticsearch/

https://github.com/awslabs/logstash-input-dynamodb


Similar plugins:

https://github.com/kzwang/elasticsearch-river-dynamodb

Query AWS ES cluster by signing http requests with AWS IAM roles (python)

Written by mannem on . Posted in Elasticsearch

The public AWS documentation provides some Python examples for signing HTTP requests with an IAM user’s credentials in order to access other AWS resources, in this case an AWS ES cluster whose access policy is restricted to those IAM users.

If you wish to restrict access to the ES cluster with IAM roles instead, the signing process is a bit different.

The document (http://docs.aws.amazon.com/general/latest/gr/sigv4-signed-request-examples.html) seems to cover only IAM users, not IAM roles.


Changing the signed header

Signing requests with IAM role credentials needs one additional value, the session token, which is added to the request under the header name ‘x-amz-security-token’.

So, in ESrequest.py, replace this line:

headers = {'x-amz-date':amzdate, 'Authorization':authorization_header}

with the following, which should work for signing requests with IAM role credentials:

headers = {'x-amz-date':amzdate, 'Authorization':authorization_header, 'x-amz-security-token':token}

Here, the session ‘token’ string must be obtained from the corresponding IAM role. (The credential trio for a role, that is the access key, secret key, and session token, is rotated frequently and carries an expiration time, so make sure you are using an unexpired token.)
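To put the header change in context, here is a minimal sketch of a Signature Version 4 signer for a GET request that attaches the session token when one is supplied. It is not the contents of ESrequest.py: the function names, the parameters, and the default service name "es" are assumptions made for illustration. It follows the approach described above, where only host and x-amz-date are signed and the token is added to the headers after the signature is computed.

```python
import datetime
import hashlib
import hmac


def _hmac_sha256(key, msg):
    """Return the raw HMAC-SHA256 digest of msg under key."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def signing_key(secret_key, datestamp, region, service):
    """Derive the SigV4 signing key: date -> region -> service -> aws4_request."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), datestamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def signed_headers_for_get(host, path, region, access_key, secret_key,
                           token=None, now=None, service="es"):
    """Build request headers for a GET with an empty query string and body.

    Hypothetical helper: 'token' is the session token of an IAM role;
    leave it as None when signing with plain IAM user credentials.
    """
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amzdate = now.strftime("%Y%m%dT%H%M%SZ")
    datestamp = now.strftime("%Y%m%d")

    # Canonical request: method, URI, query string, headers, signed headers, payload hash.
    canonical_headers = "host:" + host + "\n" + "x-amz-date:" + amzdate + "\n"
    signed_headers = "host;x-amz-date"
    payload_hash = hashlib.sha256(b"").hexdigest()
    canonical_request = "\n".join(
        ["GET", path, "", canonical_headers, signed_headers, payload_hash])

    # String to sign, scoped to date/region/service.
    scope = "/".join([datestamp, region, service, "aws4_request"])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amzdate, scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()])

    key = signing_key(secret_key, datestamp, region, service)
    signature = hmac.new(key, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()

    authorization_header = (
        "AWS4-HMAC-SHA256 Credential=" + access_key + "/" + scope +
        ", SignedHeaders=" + signed_headers + ", Signature=" + signature)

    headers = {"x-amz-date": amzdate, "Authorization": authorization_header}
    if token:
        # The extra header needed for IAM role credentials.
        headers["x-amz-security-token"] = token
    return headers
```

The returned dictionary can be passed as the headers of an HTTP GET to the cluster endpoint. Whether a given setup also expects the token among the signed headers is not covered by this sketch.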


Obtaining ‘token’ string from IAM Roles

If an EC2 instance is assuming a role, you can get the token from the instance metadata with:

curl http://169.254.169.254/latest/meta-data/iam/security-credentials/MyEc2Role
(http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html#instance-metadata-security-credentials)

In Python, all of these credentials can be fetched with the ‘requests’ module and parsed from the JSON response.
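As a rough illustration, the sketch below reads the same metadata URL as the curl command and pulls out the access key, secret key, session token, and expiration. To stay dependency-free it uses the standard library's urllib rather than ‘requests’; the function names are hypothetical, and it assumes the instance answers plain metadata GETs the way the curl example does.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Same endpoint as the curl example; the role name is appended to it.
METADATA_URL = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"


def parse_role_credentials(body):
    """Extract the credential trio and expiration from the metadata JSON text."""
    doc = json.loads(body)
    expiration = datetime.strptime(
        doc["Expiration"], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return {
        "access_key": doc["AccessKeyId"],
        "secret_key": doc["SecretAccessKey"],
        "token": doc["Token"],
        "expiration": expiration,
    }


def is_expired(creds, now=None):
    """Return True once the credentials' expiration time has been reached."""
    now = now or datetime.now(timezone.utc)
    return now >= creds["expiration"]


def fetch_role_credentials(role_name, timeout=2):
    """Fetch and parse credentials for role_name (only works on an EC2 instance)."""
    with urllib.request.urlopen(METADATA_URL + role_name, timeout=timeout) as resp:
        return parse_role_credentials(resp.read().decode("utf-8"))
```

On an instance, fetch_role_credentials("MyEc2Role") would return the values to sign with, and is_expired can be checked before each request so that a fresh set is fetched once the old one has run out.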

The token can be obtained similarly from other services that assume IAM roles, such as Lambda.


Find the end-to-end code here: https://github.com/mannem/elasticsearchDev/blob/master/ES_V4Signing_IAMRoles