When you’re dealing with large amounts of data, it’s helpful to get a quick overview, which is exactly what aggregations provide in SQL. Aggregations, also known as “GROUP BY queries”, provide a bird’s-eye view, so you can quickly gain insights from vast volumes of data.
That’s why we are excited to announce support for aggregations in R2 SQL, Cloudflare’s serverless, distributed analytics query engine, which runs SQL queries over data stored in R2 Data Catalog. Aggregations will allow users of R2 SQL to spot important trends and changes in the data, generate reports, and find anomalies in logs.
This release builds on the already supported filter queries, which are foundational for analytical workloads and let users find needles in haystacks of Apache Parquet files.
In this post, we’ll unpack the utility and quirks of aggregations, and then dive into how we extended R2 SQL to support running such queries over vast amounts of data stored in R2 Data Catalog.
The importance of aggregations in analytics
Aggregations, or “GROUP BY queries”, generate a short summary of the underlying data.
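To make “a short summary” concrete, here is a minimal sketch of a GROUP BY query. It is only an illustration under stated assumptions: the `requests` table, its columns, and its rows are hypothetical, and the query runs against an in-memory SQLite database through Python’s standard library rather than against R2 SQL, whose supported syntax may differ.

```python
import sqlite3

# Hypothetical log table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (path TEXT, status INTEGER, bytes INTEGER)")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?)",
    [
        ("/", 200, 512),
        ("/", 200, 480),
        ("/login", 500, 90),
        ("/login", 200, 300),
        ("/api", 500, 120),
    ],
)

# Collapse the raw rows into one summary row per status code.
summary = conn.execute(
    """
    SELECT status, COUNT(*) AS request_count, SUM(bytes) AS total_bytes
    FROM requests
    GROUP BY status
    ORDER BY status
    """
).fetchall()

for row in summary:
    print(row)
# (200, 3, 1292)
# (500, 2, 210)
```

Five input rows become two output rows, one per distinct `status` value, each carrying a count and a total. That reduction from many rows to a few is what makes aggregations useful as an overview.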
A common use case for aggregations is generating reports. Consider a table called “sales”, which contains