How We Measure Generosity

Every provider in the Volksdroid catalog carries a generosity score from 1 to 10. The score answers one question a builder actually has: how much can I get done for free here, and what will it cost me in friction? This document defines exactly how that number is produced so the ranking is transparent, reproducible, and honest.

The score is not a single opinion. It is a weighted blend of five measured dimensions, normalized within each category, then calibrated so the results spread across the full range instead of clumping at the top.

The five dimensions

Each provider is scored 0–100 on five dimensions. The dimensions are deliberately orthogonal — they capture different ways a free tier can be good or bad.

Dimension	Weight	What it captures
Quantity	35%	How much you actually get for free, normalized within the category — tokens/requests, storage GB, compute hours, credit dollars, scheduled jobs, seats, domains.
Renewability	20%	Whether the allowance refills. A monthly/daily renewable tier is worth far more over time than a one-time finite credit, which still beats a short trial.
Accessibility	20%	How low the barrier to start is: no card, no approval, no eligibility gate, instant signup — versus a card on file, a minimum balance, student/startup status, a region lock, or manual review.
Usability	15%	How production-viable the free tier is: latest/frontier capability, always-on, no crippling feature cuts — versus sleeps-when-idle, no high availability, throttled, or evaluation-only.
Longevity	10%	How durable the offer is: a stable, long-standing free tier versus a promo, beta, or time-boxed program that may disappear.

Quantity carries the most weight because “how much is free” is the first thing a builder cares about. But it is intentionally not the whole story: a small renewable, no-strings tier routinely outranks a large one-time credit behind a card, because over the life of a real project the renewable tier delivers more and demands less.

How each dimension is scored (no subjective judgment)

Every dimension is computed by an explicit formula over the provider’s typed data, then percentile-ranked within its category so each indicator spreads across the full 0–100 range and providers in the same category land on different numbers unless their underlying values tie. The formulas live in scripts/score-generosity.py (the single source of truth for the formulas) and are described below. Ranking is what turns raw facts into comparable points: for each indicator we sort every provider in a category by the formula’s output and assign a 0–100 percentile (ties share the average rank).
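
The ranking step can be sketched as follows. This is a minimal illustration, not the code in scripts/score-generosity.py: the function name percentile_ranks, the rank / (n - 1) scaling, and the value of 50 for a category with a single provider are all assumptions made for the example.

```python
def percentile_ranks(values):
    """Map raw formula outputs to 0-100 percentile ranks within one category.

    Tied values share the average of the ranks they span.
    """
    n = len(values)
    if n == 0:
        return []
    if n == 1:
        return [50.0]  # assumption: a lone provider sits mid-scale

    # Indices of the providers, ordered by their raw value (ascending).
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n

    i = 0
    while i < n:
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2  # 0-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Scale 0..n-1 ranks onto 0..100.
    return [100.0 * r / (n - 1) for r in ranks]


# Four providers; the two middle ones tie and share a rank.
example = percentile_ranks([5, 50, 50, 500])
```

With this scaling the lowest provider lands on 0, the highest on 100, and the two tied providers both land on 50.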

Quantity — the real, web-verified value of the category’s primary key metric (table below), ranked within the category. Unlimited/unmetered ranks at the top, an absent allowance at the bottom. The primary metric and its supporting metrics are shown on every card and listed in full in the detail panel.

Category	Primary metric	Supporting metrics
LLM providers	Free tokens (M tokens/mo)	Requests/day, Tokens/min, Trial credit, Commercial use
Databases	Storage (GB)	Egress, Compute, Row reads, Databases
App platforms	Bandwidth (GB/mo)	Build minutes, Function calls, Custom domains
Serverless functions	Invocations (M/mo)	Compute (GB-s), CPU/call, Bandwidth
Schedulers	Cron jobs (count)	Min interval, Executions, Run history, Job timeout
Storage	Storage (GB)	Free egress, Write ops, Read ops
Workspaces	Capacity (records/seats)	Storage, RAM, Workspaces
Domains	Free domains (count)	DNS records, Dynamic-DNS refresh, Ad-free
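
One way to make "unlimited at the top, absent at the bottom" fall out of an ordinary sort is to map both cases onto infinities before ranking. The sketch below assumes a hypothetical encoding in which an absent allowance is None and an unmetered one is the string "unlimited"; the real typed data may represent these differently.

```python
import math


def quantity_key(value):
    """Sort key for a category's primary metric.

    Hypothetical encoding: None means no allowance, "unlimited" means unmetered.
    """
    if value is None:
        return -math.inf  # absent allowance ranks at the bottom
    if value == "unlimited":
        return math.inf  # unlimited/unmetered ranks at the top
    return float(value)


# Illustrative storage allowances in GB for four made-up providers.
providers = {"a": 5, "b": "unlimited", "c": None, "d": 0.5}
ordered = sorted(providers, key=lambda name: quantity_key(providers[name]))
```

Here ordered runs from least to most generous: "c", "d", "a", "b".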

Requests → tokens. Many LLM free tiers publish only a request cap (“50 requests/day”, “30 RPM”). To normalize everything to the primary unit (M tokens/month) we convert using a researched average of ~1,000 tokens per request (prompt + completion for a typical single-turn chat/agent call): tokens/month ≈ requests/day × 30 × 1,000, so a cap of 50 requests/day counts as about 1.5 M tokens/month. Heavy RAG/long-context traffic runs higher, but ~1,000 is the right reference class for normalizing consumer-grade request limits.
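
As a small sketch of that conversion, using the 30-day month and ~1,000 tokens per request stated above (the function and constant names are illustrative, not taken from the scoring script):

```python
TOKENS_PER_REQUEST = 1_000  # reference average from the text above
DAYS_PER_MONTH = 30


def requests_per_day_to_m_tokens_per_month(requests_per_day):
    """Convert a daily request cap to millions of tokens per month."""
    tokens_per_month = requests_per_day * DAYS_PER_MONTH * TOKENS_PER_REQUEST
    return tokens_per_month / 1_000_000


m_tokens = requests_per_day_to_m_tokens_per_month(50)  # 1.5
```

The sketch covers only per-day caps, since that is the conversion the formula defines.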

Renewability — an ordinal from renewal + pricingModel + limitations, then ranked:

5 daily/monthly renewable, or a perpetual free_forever/open_source_selfhost tier with nothing to deplete
4 annual renewal
2 one-time finite pool (total_volume_cap, finite_credit, or renewal: one_time)
1 time-limited (time_limited) or trial
3 otherwise
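
The ordinal above might be computed along these lines. The list does not say which rule wins when several apply, so the precedence here (time-limited first, then finite pools, then renewable tiers) is an assumption, as are the renewal values "daily", "monthly", and "annual" and the choice to accept finite_credit either as a limitation or as a pricing model.

```python
def renewability_ordinal(renewal, pricing_model, limitations):
    """Return the 1-5 renewability ordinal for one provider (illustrative)."""
    limits = set(limitations)

    # 1: time-limited offers and trials.
    if "time_limited" in limits or pricing_model == "trial":
        return 1

    # 2: a one-time finite pool that is used up and never refills.
    finite_pool = (
        bool(limits & {"total_volume_cap", "finite_credit"})
        or pricing_model == "finite_credit"
        or renewal == "one_time"
    )
    if finite_pool:
        return 2

    # 5: refills daily/monthly, or is perpetual with nothing to deplete.
    if renewal in {"daily", "monthly"}:
        return 5
    if pricing_model in {"free_forever", "open_source_selfhost"}:
        return 5

    # 4: refills once a year.
    if renewal == "annual":
        return 4

    # 3: anything not covered above.
    return 3
```

For example, renewability_ordinal("monthly", "freemium", []) gives 5, while a one-time credit gives 2 regardless of its size.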

Accessibility — start at 100, subtract documented barrier penalties, then rank (floored at 5):

−25 card required · −30 minimum balance held · −40 restricted to students/startups/businesses/region/OSS · −35 manual approval · −50 waitlist
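
A minimal sketch of the accessibility formula, before ranking. The barrier keys are hypothetical labels for the penalties listed above, and the sketch assumes the floor of 5 applies to the raw score rather than to the percentile.

```python
# Penalty points per documented barrier (keys are illustrative names).
ACCESS_PENALTIES = {
    "card_required": 25,
    "minimum_balance": 30,
    "restricted_eligibility": 40,  # students/startups/businesses/region/OSS
    "manual_approval": 35,
    "waitlist": 50,
}


def accessibility_raw(barriers):
    """Start at 100, subtract each distinct barrier once, floor at 5."""
    score = 100 - sum(ACCESS_PENALTIES[b] for b in set(barriers))
    return max(score, 5)


no_strings = accessibility_raw([])                  # 100
card_only = accessibility_raw(["card_required"])    # 75
heavily_gated = accessibility_raw(
    ["card_required", "manual_approval", "waitlist"]
)                                                   # floored at 5
```

The floor matters because the penalties sum to more than 100, so a heavily gated provider would otherwise go negative.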

Usability — start at 100, subtract quality-cut penalties read from the typed limitations, then rank (floored at 5):

−40 non-commercial · −25 sleeps when idle · −15 throttled · −8 single-instance · −6 per feature cut (max 4) · −5 attribution required
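
The usability formula has the same shape, plus a capped per-feature penalty. In this sketch the flag names are hypothetical, "(max 4)" is read as "at most four feature cuts are counted", and the floor is again assumed to apply to the raw score before ranking.

```python
# Penalty points per quality cut (keys are illustrative names).
USABILITY_PENALTIES = {
    "non_commercial": 40,
    "sleeps_when_idle": 25,
    "throttled": 15,
    "single_instance": 8,
    "attribution_required": 5,
}
FEATURE_CUT_PENALTY = 6
MAX_FEATURE_CUTS = 4  # assumption: cuts beyond the fourth are not counted


def usability_raw(flags, feature_cuts=0):
    """Start at 100, subtract quality-cut penalties, floor at 5."""
    score = 100 - sum(USABILITY_PENALTIES[f] for f in set(flags))
    score -= FEATURE_CUT_PENALTY * min(feature_cuts, MAX_FEATURE_CUTS)
    return max(score, 5)


sleepy = usability_raw(["sleeps_when_idle"], feature_cuts=2)  # 100 - 25 - 12 = 63
```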

Longevity — an ordinal from pricingModel, capped at 2 when time_limited, then ranked:

5 free_forever / freemium / open_source_selfhost · 4 byok · 3 finite_credit / student · 2 startup · 1 trial
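
The longevity mapping is a plain lookup with one cap. The sketch below takes time_limited as a boolean argument for simplicity, which is an assumption about how that flag is passed; a pricing model outside the listed values raises KeyError because the text defines no default.

```python
LONGEVITY_BY_MODEL = {
    "free_forever": 5,
    "freemium": 5,
    "open_source_selfhost": 5,
    "byok": 4,
    "finite_credit": 3,
    "student": 3,
    "startup": 2,
    "trial": 1,
}


def longevity_ordinal(pricing_model, time_limited=False):
    """Return the 1-5 longevity ordinal, capped at 2 for time-limited offers."""
    value = LONGEVITY_BY_MODEL[pricing_model]
    return min(value, 2) if time_limited else value


stable = longevity_ordinal("freemium")                     # 5
promo = longevity_ordinal("freemium", time_limited=True)   # 2
```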

The composite (0–100)

Each dimension sub-score is its 0–100 within-category percentile rank. The composite is their weighted sum:

raw = 0.35·quantity + 0.20·renewability + 0.20·accessibility
    + 0.15·usability + 0.10·longevity

The result is stored on every provider as generosityRaw. Because the inputs are percentile ranks rather than raw values, the composite spreads across the range instead of clustering at the top.
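
The weighted sum can be sketched directly from the weights in the dimension table. The two sets of sub-scores below are invented for illustration; they show how a small renewable, no-strings tier can outrank a large one-time credit behind a card.

```python
WEIGHTS = {
    "quantity": 0.35,
    "renewability": 0.20,
    "accessibility": 0.20,
    "usability": 0.15,
    "longevity": 0.10,
}


def composite(components):
    """Weighted sum of the five 0-100 percentile sub-scores."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Hypothetical sub-scores, not real catalog data.
small_renewable = {
    "quantity": 40, "renewability": 90, "accessibility": 100,
    "usability": 80, "longevity": 100,
}
big_one_time = {
    "quantity": 95, "renewability": 20, "accessibility": 30,
    "usability": 70, "longevity": 40,
}

raw_small = composite(small_renewable)  # about 74
raw_big = composite(big_one_time)       # about 57.75
```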

Normalizing to the 1–10 score

The composite is mapped to the displayed 1–10 score with a robust min–max stretch. We take the 2nd and 98th percentiles of all composites as the low/high anchors and linearly rescale, clamping to [1, 10]:

score = clamp( 1 + (raw − P2) / (P98 − P2) · 9 , 1, 10 )

Using the 2nd/98th percentiles instead of the absolute min/max means a single outlier can’t compress everyone else, while the full 1–10 range is still used. The resulting distribution is well spread: the mean is about 5.9 and every tier is populated.
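
A short sketch of the stretch-and-clamp step. The text does not say how the 2nd and 98th percentiles are computed, so the linear-interpolation percentile helper here is an assumption, and the composites are a synthetic 0–100 ramp rather than real data.

```python
def percentile(sorted_values, p):
    """Linearly interpolated p-th percentile (0-100) of a pre-sorted list."""
    pos = (len(sorted_values) - 1) * p / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def to_score(raw, p2, p98):
    """Rescale a composite to 1-10 between the P2/P98 anchors, then clamp.

    Assumes p98 > p2, i.e. the composites are not all identical.
    """
    scaled = 1 + (raw - p2) / (p98 - p2) * 9
    return max(1.0, min(10.0, scaled))


# Synthetic composites 0, 1, ..., 100 for illustration only.
composites = sorted(float(x) for x in range(101))
p2, p98 = percentile(composites, 2), percentile(composites, 98)

middle = to_score(50.0, p2, p98)    # 5.5
outlier = to_score(100.0, p2, p98)  # clamped to 10.0
```

Anything below P2 clamps to 1 and anything above P98 clamps to 10, which is what keeps a single extreme composite from stretching the scale.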

Because every dimension is ranked within a category, sub-scores are most meaningful when comparing providers in the same category (LLM vs. LLM, database vs. database). The composite can also be compared across categories, since all of its inputs are percentiles, but the catalog is organized by category because that is the comparison that matters to a builder.

Tier bands

The 1–10 score maps to a named tier for quick scanning:

Tier	Score	Meaning
Exceptional	9.0–10	Best-in-class free capacity with minimal strings attached.
Generous	7.0–8.9	A strong, genuinely useful free tier with manageable limits.
Solid	5.0–6.9	A workable free tier with real constraints to plan around.
Limited	3.0–4.9	Narrow free allowance or notable barriers; fine for trials.
Minimal	1.0–2.9	A nominal free tier: short trial, tiny pool, or heavy gating.
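
The band lookup can be sketched with lower bounds only. Comparing against each band's floor is a choice made for this example so that a score between the printed bounds (say 8.95) still gets a tier; the table itself implies scores are shown to one decimal place.

```python
# (lower bound, tier name), highest band first.
TIER_FLOORS = [
    (9.0, "Exceptional"),
    (7.0, "Generous"),
    (5.0, "Solid"),
    (3.0, "Limited"),
    (1.0, "Minimal"),
]


def tier_for(score):
    """Return the named tier for a 1-10 generosity score."""
    if not 1.0 <= score <= 10.0:
        raise ValueError("score must be within [1, 10]")
    for floor, name in TIER_FLOORS:
        if score >= floor:
            return name


tier = tier_for(7.4)  # "Generous"
```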

What the score is not

It is not a quality rating of the product. A provider can have an excellent paid product and a stingy free tier (low score), or a mediocre product and a remarkably generous free tier (high score). We measure the free offer, not the company.
It is not a substitute for reading the limitations. Two providers can share a score for very different reasons. The typed limitations attached to each provider tell you which constraints apply to your use case.
It is not static. Free tiers change. Each provider records whether its numbers were web-verified, and the score is recomputed when the underlying data is refreshed.

Where the numbers live

Every provider’s markdown frontmatter carries the full, inspectable assessment:

generosityScore — the calibrated 1–10 score
generosityRaw — the 0–100 composite
generosityTier — the named tier
generosityComponents — the five dimension sub-scores (0–100 each)
limitations — the typed, exhaustive limitation list (see Free-Tier Conditions & Limitations)

The rubric itself — dimensions, weights, anchors, tier bands, and the limitation taxonomy — is defined in code at src/data/generosity.ts as the single source of truth shared by the catalog UI, the content schema, and this document.