3 features of AWS Athena released in 2022 you do not want to miss

Amazon Athena has been one of the most popular analytics services since its launch, and Amazon has invested a lot of effort in modernizing it. Athena is a serverless, interactive analytics service built on open-source frameworks that supports open table and file formats.

In 2022, Amazon released some very powerful features that make Athena a cost-effective, reliable, and versatile solution for analytics. Here are three you do not want to miss:

Reusing query results

You can choose to reuse the last stored result of a query. A reused result scans no data, so it saves the scan cost. You can specify a maximum age for reusing query results; the default is 60 minutes. Use this feature only when you are sure the results will not change within that time frame.
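As a minimal sketch of how result reuse can be requested programmatically, the snippet below builds the arguments for the boto3 `start_query_execution` call with a `ResultReuseConfiguration`. The database name, S3 output location, and helper names are hypothetical placeholders, and the 7-day upper bound reflects the limit as I understand it from the Athena documentation.

```python
from typing import Any, Dict

# Assumption: Athena caps the reuse window at 7 days (10080 minutes).
MAX_REUSE_AGE_MINUTES = 7 * 24 * 60


def build_query_request(sql: str, database: str, output_location: str,
                        max_age_minutes: int = 60) -> Dict[str, Any]:
    """Build start_query_execution arguments with result reuse enabled."""
    if not 0 <= max_age_minutes <= MAX_REUSE_AGE_MINUTES:
        raise ValueError("max_age_minutes must be between 0 and 10080")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
        # Reuse a stored result if one exists that is no older than the limit.
        "ResultReuseConfiguration": {
            "ResultReuseByAgeConfiguration": {
                "Enabled": True,
                "MaxAgeInMinutes": max_age_minutes,
            }
        },
    }


def run_with_reuse(request: Dict[str, Any]) -> str:
    """Submit the query; needs boto3 and AWS credentials to actually run."""
    import boto3  # imported lazily so the builder works without boto3

    athena = boto3.client("athena")
    return athena.start_query_execution(**request)["QueryExecutionId"]
```

A call such as `build_query_request("SELECT 1", "sales", "s3://example-bucket/results/")` keeps the 60-minute default; pass a smaller `max_age_minutes` for data that changes more often.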

Apache Spark Support

You can interactively run Apache Spark code written in Python (PySpark) from the Athena console or through Athena notebooks. Because Athena is a serverless service, autoscaling and high availability are built in for your Spark code. Athena notebooks are compatible with Jupyter notebooks and contain a list of cells that are executed in order.
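To illustrate what a notebook cell might look like, here is a hedged sketch that holds a small PySpark snippet as a string and builds the arguments for the boto3 `start_calculation_execution` call, which submits code to an existing Athena Spark session. The S3 path, column name, session id, and helper names are hypothetical, and the snippet assumes the session exposes a ready-made `spark` object, as Athena notebooks do.

```python
from typing import Any, Dict

# PySpark code as it might appear in a single notebook cell.
# Assumption: `spark` is predefined by the Athena Spark session.
PYSPARK_CELL = """
df = spark.read.parquet("s3://example-bucket/events/")
df.groupBy("event_type").count().show()
"""


def build_calculation_request(session_id: str,
                              code: str = PYSPARK_CELL) -> Dict[str, Any]:
    """Build start_calculation_execution arguments for one code block."""
    return {
        "SessionId": session_id,
        "Description": "Count events by type",
        "CodeBlock": code,
    }


def submit_cell(request: Dict[str, Any]) -> str:
    """Submit the cell; needs boto3, credentials, and an idle Spark session."""
    import boto3  # imported lazily so the builder works without boto3

    athena = boto3.client("athena")
    response = athena.start_calculation_execution(**request)
    return response["CalculationExecutionId"]
```

In the console you would simply paste the two PySpark lines into a cell and run it; the wrapper only matters when you drive the session from a script.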

Amazon Athena connector for Kafka

You can now run SQL queries on streaming data, which means you can join streaming data with the historical data lying in your Amazon S3 data lake. The connector supports multiple Kafka deployments, including Amazon MSK, self-managed Kafka, and Confluent Kafka.
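The join described above can be sketched as a small helper that builds the SQL text. Every catalog, schema, topic, table, and column name here is a hypothetical placeholder: the sketch assumes the Kafka connector has been registered as an Athena data source called `kafka_catalog` and that the historical table lives in the default `awsdatacatalog`.

```python
def quote_ident(name: str) -> str:
    """Double-quote an identifier, escaping any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_join_query(kafka_catalog: str, kafka_schema: str, topic: str,
                     lake_database: str, lake_table: str,
                     join_key: str) -> str:
    """Build SQL that joins a Kafka topic with a table in the S3 data lake."""
    stream = ".".join(quote_ident(p)
                      for p in (kafka_catalog, kafka_schema, topic))
    history = ".".join(quote_ident(p)
                       for p in ("awsdatacatalog", lake_database, lake_table))
    key = quote_ident(join_key)
    return (
        "SELECT s.*, h.*\n"
        f"FROM {stream} AS s\n"
        f"JOIN {history} AS h ON s.{key} = h.{key}"
    )


if __name__ == "__main__":
    print(build_join_query("kafka_catalog", "orders_registry", "orders",
                           "sales", "orders_history", "order_id"))
```

The resulting text can be pasted into the Athena query editor or sent through the API like any other query.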

#AWSAthena #athena #serverlesscomputing #BigData #DataLake #StreamingData #LearnAWS #Innovation #NewIn2022 #CloudComputing #AWSAnalytics

Nice write-up, Pravin. You reminded me of some options that might improve my day-to-day. I've been using Athena for large queries. I have been finding the initial query run time is slow, but I enjoy that you can stream results out of the S3 bucket no matter the size. It is kind of an interesting way to cache a query; larger results offset that initial run time. I can see some potential in tying data results to queries on CloudWatch logs, linked to scheduling or messaging. All things I need to explore now thanks to your post; keep it up. I always enjoyed picking your brain when we worked together.

Pawan Kumar, AWS-CSA

2y

Great article, Pravin Dwiwedi, and to the point. It highlights the latest features added to Amazon Athena. I love the 'reusability of query result without extra cost' feature.
