Archive for November 10, 2017

Querying DynamoDB export data using Athena (with LIST attribute)

Written by mannem on . Posted in Athena, AWS BIG DATA

Presto supports multiple Array and JSON functions using which you can write queries to get the required results. There is no single way to define a CREATE TABLE and later your QUERIES. You can have CREATE TABLE predefine the schema structure and later your queries can refer the elements you need in the schema. Or you can define your schema as a string and later use functions on your queries to parse this string to get the required results.

https://prestodb.io/docs/current/functions/array.html

https://prestodb.io/docs/current/functions/json.html

In addition, Athena has some examples to use most of the presto functions :

http://docs.aws.amazon.com/athena/latest/ug/querying-arrays.html

http://docs.aws.amazon.com/athena/latest/ug/rows-and-structs.html

http://docs.aws.amazon.com/athena/latest/ug/querying-JSON.html

Now, I was able to use the following CREATE TABLE syntax on the DynamoDB items having List of strings. I was able to flatten out the list of strings using some of the functions like CAST. Please note that this is not the only way to define table and query. My query might be over complexing what your are trying to get. There might be a simpler way as well. So, its really important that you understand the data types that you define and what each query returns and use the correct functions as each function takes a datatype and returns another datatype.