> hypequery

Aggregation

Group data and calculate summaries with aggregations

Aggregation

Aggregation functions calculate summaries across groups of rows. Combine them with groupBy() to analyze data patterns and generate insights.

Overview

Use aggregations to:

  • Calculate totals, averages, and counts
  • Group data by one or more columns
  • Filter grouped results with having()
  • Build analytics and reporting queries

Aggregation Functions

FunctionDescriptionExample
sum(column, alias)Sum of valuessum('total', 'revenue')
count(column, alias)Count rows or non-null valuescount('id', 'user_count')
avg(column, alias)Average valueavg('price', 'avg_price')
min(column, alias)Minimum valuemin('created_at', 'first_seen')
max(column, alias)Maximum valuemax('created_at', 'last_seen')
distinctCount(column, alias)Count of distinct valuesdistinctCount('user_id', 'unique_users')

Sum

Calculate the total of a column:

const revenue = await db.table('orders')
  .sum('total', 'total_revenue')
  .execute();

Sum by Group

const revenueByCountry = await db.table('orders')
  .select(['country'])
  .sum('total', 'total_revenue')
  .groupBy(['country'])
  .execute();

Count

Count rows or non-null values:

const userCount = await db.table('users')
  .count('id', 'total_users')
  .execute();

Count by Group

const usersByCountry = await db.table('users')
  .select(['country'])
  .count('id', 'user_count')
  .groupBy(['country'])
  .execute();

Count All Rows

Use count('*') or count('some_column') to count all rows. The column parameter is required by ClickHouse but the value doesn't matter when counting all rows.

Avg

Calculate the average value:

const avgOrder = await db.table('orders')
  .avg('total', 'average_order_value')
  .execute();

Average by Group

const avgPriceByCategory = await db.table('products')
  .select(['category'])
  .avg('price', 'avg_price')
  .groupBy(['category'])
  .execute();

Min / Max

Find minimum or maximum values:

const priceRange = await db.table('products')
  .min('price', 'min_price')
  .max('price', 'max_price')
  .execute();

Min/Max by Group

const dateRanges = await db.table('events')
  .select(['user_id'])
  .min('created_at', 'first_event')
  .max('created_at', 'last_event')
  .groupBy(['user_id'])
  .execute();

Distinct Count

Count unique values:

const uniqueUsers = await db.table('events')
  .distinctCount('user_id', 'unique_users')
  .execute();

Distinct Count by Group

const dailyActiveUsers = await db.table('events')
  .select([datePart('day', 'created_at', 'date')])
  .distinctCount('user_id', 'daily_active_users')
  .groupBy(['date'])
  .execute();

Group By

Group results by one or more columns:

Single Column Group By

const results = await db.table('orders')
  .select(['country'])
  .sum('total', 'total_revenue')
  .count('id', 'order_count')
  .groupBy(['country'])
  .execute();

Multiple Columns Group By

const results = await db.table('orders')
  .select(['country', 'status'])
  .sum('total', 'total_revenue')
  .groupBy(['country', 'status'])
  .execute();

Group By Syntax

All columns in select() must either be:

  • Listed in groupBy()
  • Wrapped in an aggregation function (sum, count, avg, etc.)

Having

Filter grouped results using having():

const results = await db.table('orders')
  .select(['country'])
  .sum('total', 'total_revenue')
  .groupBy(['country'])
  .having('total_revenue > 10000')
  .execute();

Where vs Having

  • WHERE: Filters rows before grouping
  • HAVING: Filters groups after aggregation

Having with Multiple Conditions

const results = await db.table('orders')
  .select(['country'])
  .sum('total', 'total_revenue')
  .count('id', 'order_count')
  .groupBy(['country'])
  .having('total_revenue > 10000 AND order_count > 50')
  .execute();

Having with Parameters

const results = await db.table('orders')
  .select(['country'])
  .sum('total', 'total_revenue')
  .groupBy(['country'])
  .having('total_revenue > ?', [10000])
  .execute();

Common Patterns

Sales by Category

const salesByCategory = await db.table('orders')
  .innerJoin('products', 'orders.product_id', 'products.id')
  .select(['products.category'])
  .sum('orders.total', 'revenue')
  .count('orders.id', 'order_count')
  .avg('orders.total', 'avg_order_value')
  .groupBy(['products.category'])
  .orderBy('revenue', 'DESC')
  .execute();

Daily Active Users

const dailyActive = await db.table('events')
  .select([
    toStartOfDay('created_at', 'date'),
  ])
  .distinctCount('user_id', 'active_users')
  .groupBy(['date'])
  .orderBy('date', 'ASC')
  .execute();

User Activity Summary

const userStats = await db.table('events')
  .select(['user_id'])
  .count('id', 'total_events')
  .min('created_at', 'first_event')
  .max('created_at', 'last_event')
  .groupBy(['user_id'])
  .having('total_events > 10')
  .orderBy('total_events', 'DESC')
  .execute();

Revenue by Time Period

const monthlyRevenue = await db.table('orders')
  .select([
    toStartOfMonth('created_at', 'month'),
  ])
  .sum('total', 'revenue')
  .count('id', 'order_count')
  .groupBy(['month'])
  .orderBy('month', 'ASC')
  .execute();

Funnel Analysis

const funnel = await db.table('events')
  .select(['event_type'])
  .distinctCount('user_id', 'unique_users')
  .groupBy(['event_type'])
  .orderBy('unique_users', 'DESC')
  .execute();

Grouping with Expressions

You can group by expressions and calculated columns:

const results = await db.table('orders')
  .select([
    selectExpr('if(total > 100, \'high\', \'low\')', 'value_segment'),
  ])
  .sum('total', 'revenue')
  .count('id', 'order_count')
  .groupBy(['value_segment'])
  .execute();

Multiple Aggregations

Combine multiple aggregation functions:

const stats = await db.table('orders')
  .select(['country'])
  .sum('total', 'total_revenue')
  .count('id', 'order_count')
  .avg('total', 'avg_order')
  .min('total', 'min_order')
  .max('total', 'max_order')
  .groupBy(['country'])
  .execute();

Aggregation with Joins

const results = await db.table('orders')
  .innerJoin('users', 'orders.user_id', 'users.id')
  .select(['users.country'])
  .sum('orders.total', 'revenue')
  .count('orders.id', 'order_count')
  .groupBy(['users.country'])
  .execute();

Type Safety

TypeScript ensures aggregations are used correctly:

// ✅ Valid - number column aggregation
await db.table('orders')
  .sum('total', 'revenue')
  .execute();

// ❌ Error - can't aggregate string column
await db.table('users')
  .sum('name', 'total_name') // TypeScript error
  .execute();

// ✅ Valid - groupBy columns exist
await db.table('orders')
  .select(['country'])
  .sum('total', 'revenue')
  .groupBy(['country'])
  .execute();

// ❌ Error - groupBy column doesn't exist
await db.table('orders')
  .select(['country'])
  .sum('total', 'revenue')
  .groupBy(['invalid_column']) // TypeScript error
  .execute();

Performance Tips

Aggregation Performance

  • Filter with where() before grouping when possible
  • Use distinctCount() instead of count(DISTINCT column)
  • Consider materialized views for frequently-run aggregations
  • Use appropriate ClickHouse settings for large aggregations

Filter Before Grouping

// ✅ Better - filter first
const results = await db.table('orders')
  .where('status', 'eq', 'completed')
  .where('created_at', 'gte', '2024-01-01')
  .select(['country'])
  .sum('total', 'revenue')
  .groupBy(['country'])
  .execute();

// ❌ Slower - group everything then filter
const results = await db.table('orders')
  .select(['country'])
  .sum('total', 'revenue')
  .groupBy(['country'])
  .having('sum(total) > 1000') // Less efficient
  .execute();

On this page