Shrinking a Search Engine to Fit in Your Browser — Part 2: Feature-Gated Aggregations

This is Part 2 of a series on shrinking Pizza Engine’s WASM binary from 1.21 MB to 245 KB. In Part 1, we designed zero-overhead typed bindings. Now we cut the gzipped binary by 38% with a single feature flag.

The Problem

Not every deployment needs aggregations. A document viewer that mounts a segment and runs keyword queries doesn’t need terms buckets, histograms, or significance scoring. But that code was shipping unconditionally — inflating the .wasm binary users must download before the first keystroke.

The Approach

We introduced an aggs Cargo feature that gates the entire aggregation subsystem:

[features]
default = ["aggs"]
aggs = []

wasm_nano = ["wasm", "wasm_panic_hook"]                        # no aggs
wasm_mini = ["wasm_nano", "query_string_parser", "wasm_dsl", "wasm_aggs"]

When aggs is disabled:

  • The search::aggregator module is not compiled at all
  • compute_aggregations() trait method disappears from StoreReader
  • The executor skips the aggregation pass entirely

When aggs is enabled (default, mini, ultra):

  • Full aggregation pipeline with 20+ types
  • filter/filters/adjacency_matrix with real query evaluation
  • top_hits / top_metrics with sort support
  • significant_terms with JLH scoring

What Gets Compiled Out

src/search/aggregator/
├── mod.rs          ─┐
├── types.rs         │
├── executor.rs      ├─ all gated behind cfg(feature = "aggs")
├── fast.rs          │
└── filter_eval.rs  ─┘

The gating is surgical — 6 lines across 5 files:

  1. Module declaration: #[cfg(feature = "aggs")] pub mod aggregator;
  2. Trait method: #[cfg(feature = "aggs")] fn compute_aggregations(...)
  3. Executor call-site: #[cfg(feature = "aggs")] { ... }
  4. WASM glue: aggs parsing in wasm/search.rs
  5. Struct field: dual-typed, BTreeMap<String, AggResult> with aggs, Option<()> without
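To make the first three gates concrete, here is a minimal sketch of how a gated module, a gated trait method, and a gated call site fit together. It is not Pizza Engine's actual code: `Hit`, `MemoryStore`, `execute`, and the counting "aggregation" are hypothetical stand-ins, and only `StoreReader`, `compute_aggregations`, and the `aggs` feature name come from the list above.

```rust
// Sketch only: `Hit`, `MemoryStore`, and `execute` are hypothetical names.
// Built without `--features aggs`, every item marked `#[cfg(feature = "aggs")]`
// is removed before type checking, so none of it reaches the binary.

#[cfg(feature = "aggs")]
pub mod aggregator {
    /// Counts the hits; a stand-in for a real aggregation.
    pub fn count(hits: &[super::Hit]) -> usize {
        hits.len()
    }
}

#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub doc_id: u32,
}

pub trait StoreReader {
    fn search(&self, term: &str) -> Vec<Hit>;

    // The trait method exists only when the feature is on.
    #[cfg(feature = "aggs")]
    fn compute_aggregations(&self, hits: &[Hit]) -> usize {
        aggregator::count(hits)
    }
}

pub struct MemoryStore {
    pub docs: Vec<(u32, String)>,
}

impl StoreReader for MemoryStore {
    fn search(&self, term: &str) -> Vec<Hit> {
        self.docs
            .iter()
            .filter(|(_, text)| text.contains(term))
            .map(|(id, _)| Hit { doc_id: *id })
            .collect()
    }
}

/// Returns the hits and, when compiled in, an aggregation result.
pub fn execute<R: StoreReader>(reader: &R, term: &str) -> (Vec<Hit>, Option<usize>) {
    let hits = reader.search(term);

    // Gated call site: exactly one of these two statements survives.
    #[cfg(feature = "aggs")]
    let aggs = Some(reader.compute_aggregations(&hits));
    #[cfg(not(feature = "aggs"))]
    let aggs = None;

    (hits, aggs)
}
```

Because the gated items are dropped before type checking, a build without the feature has nothing left that refers to the `aggregator` module, so the compiler never sees it.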

Size Results

All measurements: --release, wasm-strip, target wasm32-unknown-unknown.

Tier                  Raw WASM   Gzip
nano (no aggs)        3.11 MB    753 KB
mini (+aggs, +DSL)    5.18 MB    1.24 MB
ultra (everything)    5.24 MB    1.25 MB

The aggs feature accounts for roughly 2 MB raw / 483 KB gzipped of code. Disabling it keeps nano at 753 KB over the wire.

Design Decisions

Why keep aggregations in SearchResult without the feature?

Removing the field breaks JSON consumers that expect the key. Instead we define it as Option<()> (always None, always skipped in serialization) so the struct layout stays compatible.
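One way to express a dual-typed field is a pair of `cfg`-selected type aliases, sketched below. The alias name `Aggregations`, the `AggResult` body, the `total` field, and `empty_result` are assumptions made for illustration; only the two field types and the `SearchResult` name come from the text. In a serde-based struct, the skipping would typically be done with `#[serde(skip_serializing_if = "Option::is_none")]`, which is left out here to keep the sketch dependency-free.

```rust
#[cfg(feature = "aggs")]
use std::collections::BTreeMap;

// Hypothetical stand-in for the real aggregation result type.
#[cfg(feature = "aggs")]
#[derive(Debug, Default, Clone, PartialEq)]
pub struct AggResult {
    pub doc_count: u64,
}

// The field keeps its name in both builds; only its type changes.
#[cfg(feature = "aggs")]
pub type Aggregations = BTreeMap<String, AggResult>;
#[cfg(not(feature = "aggs"))]
pub type Aggregations = Option<()>;

#[derive(Debug, Default)]
pub struct SearchResult {
    pub total: u64,
    pub aggregations: Aggregations,
}

/// Builds a result with no aggregations; compiles unchanged in both builds
/// because both alias targets implement `Default`.
pub fn empty_result(total: u64) -> SearchResult {
    SearchResult {
        total,
        aggregations: Aggregations::default(),
    }
}
```

Code that names the field, such as `empty_result`, needs no `cfg` attributes of its own, and the `Option<()>` placeholder occupies a single byte.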

Why default = ["aggs"]?

Native builds always want aggregations. Only WASM nano opts out. Default-on means existing dependents don’t change their Cargo.toml.

Why not also gate scripting inside aggs?

They’re orthogonal concerns. scripting is independently useful for custom scoring. Keeping them separate lets users combine freely:

  • aggs without scripting — all aggregations except scripted_metric
  • scripting without aggs — custom scoring in queries
  • both — full power
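The three combinations can be sketched with the `cfg!` macro, which evaluates to a compile-time boolean. The function name `capabilities` and the label strings are hypothetical; the feature names `aggs` and `scripting` and the `scripted_metric` dependency on both are taken from the list above.

```rust
/// Lists which optional capabilities were compiled into this build.
pub fn capabilities() -> Vec<&'static str> {
    let mut caps = Vec::new();
    if cfg!(feature = "aggs") {
        caps.push("aggregations");
    }
    if cfg!(feature = "scripting") {
        caps.push("custom scoring scripts");
    }
    // scripted_metric needs both features at once.
    if cfg!(all(feature = "aggs", feature = "scripting")) {
        caps.push("scripted_metric");
    }
    caps
}
```

Unlike `#[cfg]`, `cfg!` leaves both branches in the source for type checking and relies on the optimizer to drop the dead one, so it suits small reporting helpers like this rather than gating whole subsystems.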

Build Commands

# Nano — smallest, basic search only
cargo build --release --target wasm32-unknown-unknown \
  --no-default-features --features wasm_nano

# Mini — adds DSL + aggregations
cargo build --release --target wasm32-unknown-unknown \
  --no-default-features --features wasm_mini

Progress

  1.21 MB ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ original
   753 KB ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━                 ← Part 2 (-38%)
  <300 KB ━━━━━━━━━━━━━                                  ← goal

Milestone

  • Created aggs feature flag
  • Gated aggregator module (20+ agg types, 5 files)
  • Gated compute_aggregations trait + executor call-site
  • Gated WASM aggregation glue code
  • Zero breakage: all 2288 tests pass with and without aggs

Result: 1.21 MB → 753 KB gzipped (−38%)
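
The percentage can be checked with a few lines of arithmetic. This sketch assumes decimal units (1.21 MB = 1210 KB); the function name `percent_reduction` is made up for illustration.

```rust
/// Percentage saved when a size drops from `before_kb` to `after_kb`.
pub fn percent_reduction(before_kb: f64, after_kb: f64) -> f64 {
    (1.0 - after_kb / before_kb) * 100.0
}
```

Calling `percent_reduction(1210.0, 753.0)` gives about 37.8, which rounds to the 38% quoted above.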

Next: Part 3 — Eliminating serde_json
