Design Decisions

A deep dive into the decisions behind Mantle's rating system, from Glicko-2 engine selection to climbing-specific adaptations.

Mantle Team, March 31, 2026

Design Philosophy

Most climbing apps treat logging as the end goal. We treat it as the starting point. Every feature in Mantle is derived from one simple action: logging a climb in under five seconds.

Ratings, ranks, insights, achievements, leaderboards, and social features are all computed from the same climb data. No extra input required. The system does the thinking so climbers can focus on climbing.

We made a few foundational decisions early that shaped everything else. First, ratings update per-session, not per-climb. A session is a natural unit in climbing - you warm up, push your limits, and cool down. Evaluating performance across the whole session produces more stable, meaningful ratings than reacting to individual climbs.

Second, bouldering and sport climbing are fully independent tracks. They use different muscles, different skills, and different grade scales. A V7 boulderer might climb 5.11a on ropes - or 5.13a. Combining them into one number would be meaningless.

Third, the system only rewards real climbing behavior. There are no shortcuts to inflate your rating. Repeating the same easy grade gives diminishing returns. Top-roping gets a discount versus leading. Every ascent type, from onsight to hangdog, has a carefully calibrated weight.

Why Glicko-2

Choosing the right rating engine was the most important technical decision we made. We didn't just pick one. We built two complete engines and ran 1,188 simulations to compare them.

Simulation summary: 1,188 runs across a range of climber profiles and parameter combinations, with two engines tested.

We simulated climber profiles ranging from beginners to elite athletes, each with realistic session patterns, strengths, and weaknesses. Both Glicko-2 and an IRT/Rasch model were built and tested against these profiles.

The results were remarkably close - maximum divergence of just 16 rating points, with 100% agreement on rank assignments. But Glicko-2 won on three key factors:

Feature	Elo	IRT / Rasch	Glicko-2
Tracks confidence	✕	Partial	✓
Tracks volatility	✕	✕	✓
Confidence decays with inactivity	✕	✕	✓
Session-based periods	Partial	✓	✓
Well-established in competitive systems	✓	✕	✓

Glicko-2's rating deviation, its measure of uncertainty, naturally grows while a climber is inactive. The system becomes less certain about your rating the longer you stop climbing, and more responsive when you return. Volatility tracking identifies whether a climber performs consistently or erratically, further tuning how much each session moves their rating.
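To make the inactivity behavior concrete, here is a minimal sketch of the standard Glicko-2 rule for a player who sits out rating periods: the rating and volatility stay put while the deviation widens. The function name, the example numbers, and the idea of counting idle periods are illustrative assumptions, not Mantle's actual code.

```python
import math

GLICKO2_SCALE = 173.7178  # standard Glicko-2 conversion between rating and internal scale


def inflate_rd(rd: float, volatility: float, idle_periods: int) -> float:
    """Rating deviation after `idle_periods` rating periods with no sessions.

    Applies the standard Glicko-2 step phi' = sqrt(phi^2 + sigma^2) once per
    idle period, written here in closed form.
    """
    phi = rd / GLICKO2_SCALE
    phi_after = math.sqrt(phi * phi + idle_periods * volatility * volatility)
    return phi_after * GLICKO2_SCALE


# Hypothetical climber: deviation 60, volatility 0.06, ten idle periods.
print(round(inflate_rd(60.0, 0.06, 10), 1))
```

A wider deviation means the next session is allowed to move the rating further, which is the "more responsive when you return" effect described above.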

Fixed-Opponent Adaptation

Glicko-2 was designed for head-to-head matchups (chess, gaming). We adapted it by treating each climbing route as a fixed-rated opponent. A V5 boulder “plays” at 1400 rating points. When you send it, you beat that opponent. When you fall, you lose. The route's rating never changes - only yours does.
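As a rough illustration of the fixed-opponent idea, the sketch below computes the standard Glicko-2 expected score of a climber against a route. Treating the route's deviation as zero (its rating is fixed and assumed certain) is our assumption for this example; the function names are hypothetical.

```python
import math

GLICKO2_SCALE = 173.7178  # standard Glicko-2 conversion factor


def g(phi: float) -> float:
    """Glicko-2 weighting that discounts an opponent with uncertain rating."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi * phi / math.pi ** 2)


def expected_score(climber_rating: float, route_rating: float,
                   route_rd: float = 0.0) -> float:
    """Expected result (0..1) for the climber against a fixed-rated route."""
    mu = (climber_rating - 1500.0) / GLICKO2_SCALE
    mu_route = (route_rating - 1500.0) / GLICKO2_SCALE
    phi_route = route_rd / GLICKO2_SCALE  # assumed 0: the route never changes
    return 1.0 / (1.0 + math.exp(-g(phi_route) * (mu - mu_route)))


print(expected_score(1400, 1400))  # evenly matched: a 1400 climber on a V5
print(expected_score(1400, 1820))  # underdog: the same climber on a V7
```

Sending a route with a low expected score is a bigger surprise to the engine, so it pushes the rating up more than sending a route the climber was expected to finish.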

The 9-Stage Pipeline

Before any climb affects your rating, it passes through a 9-stage processing pipeline. Each stage addresses a specific climbing reality that a raw rating system wouldn't handle correctly.

Grade to base rating

Each grade maps to a fixed rating value. The gaps between grades widen as the grades get harder, reflecting the real-world truth that the jump from V10 to V11 is far larger than V1 to V2.

Feel modifier

If a climb felt soft or hard for the grade, the route's effective rating shifts by half a grade gap. A "hard V4" is treated closer to a V5. This captures the subjectivity of gym grading.

Top-rope discount

Top-roping subtracts one full grade gap from the route's rating. Leading the same route on a rope is significantly harder than top-roping it. The discount reflects that mechanical advantage.

Multi-attempt expansion

A send after 4 attempts is broken down so the system sees both the difficulty (you fell 3 times) and the success (you eventually sent it).

Fractional type scoring

Each ascent type gets a calibrated score: onsight (1.0), flash (0.9), send (0.75), hangdog (0.4), attempt (0.0). An onsight is worth more because climbing without any prior information is genuinely harder.

Diminishing returns

Your 2nd send at the same grade in a session counts 80%, the 3rd at 60%, the 4th and beyond at 40%. This prevents farming. Sending ten V3s shouldn't have the same impact as sending one V7.

Attempt dampening

Falling on a route 300+ points above your current level carries a reduced penalty rather than the full one, and repeated hard attempts are reduced further. We want to encourage trying hard without letting repeated falls dominate rating changes.

Sigmoid impact weighting

A weighting curve determines how much each climb matters. Very easy routes barely move the needle. Routes at your level count meaningfully, and harder routes count more, up to a cap that prevents fluky sends from distorting your rating.

Exclusion and force-include

Climbs with negligible impact are filtered out for cleaner results. But if every climb in a session would be filtered, the most meaningful one is kept, guaranteeing every session produces a rating update.
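The route-side stages at the top of the pipeline (base rating, feel modifier, top-rope discount) can be sketched as a single function. This is a minimal sketch under stated assumptions: the article does not say which neighboring gap "a grade gap" refers to, so we assume a hard feel uses the gap to the next grade up, and a soft feel and the top-rope discount use the gap to the grade below. All names are hypothetical.

```python
def effective_route_rating(base: float, gap_below: float, gap_above: float,
                           feel: str = "normal", top_rope: bool = False) -> float:
    """Route rating after the feel modifier and the top-rope discount.

    gap_below / gap_above are the rating gaps to the adjacent grades.
    """
    rating = base
    if feel == "hard":
        rating += gap_above / 2   # half a grade gap toward the next grade
    elif feel == "soft":
        rating -= gap_below / 2   # half a grade gap toward the previous grade
    if top_rope:
        rating -= gap_below       # one full grade gap off for top-roping
    return rating


# A "hard V4": V4 is 1220, with V3 160 points below and V5 180 points above.
print(effective_route_rating(1220, gap_below=160, gap_above=180, feel="hard"))  # 1310.0
```

The hard V4 lands halfway between V4 (1220) and V5 (1400), matching the "treated closer to a V5" description.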

Grade Mapping

The gap between climbing grades isn't linear. It's exponential. The jump from V0 to V1 might take a few weeks. The jump from V10 to V11 might take years. Our grade-to-rating mapping reflects this.

Each grade maps to a fixed rating value that serves as the route's “opponent rating.” For the V-scale, gaps between consecutive grades grow steadily: the first two gaps are 100 points, and each gap after that is 20 points larger than the one before. Other scales follow the same principle but with varying gap patterns to reflect how sub-grades within a full grade (like 5.10a through 5.10d) are closer together.

Rating Gaps by Scale

Grade	Rating	Gap from previous grade
VB	600	-
V0	700	+100
V1	800	+100
V2	920	+120
V3	1060	+140
V4	1220	+160
V5	1400	+180
V6	1600	+200
V7	1820	+220
V8	2060	+240
V9	2320	+260
V10	2600	+280
V11	2900	+300
V12	3220	+320

This widening curve means the rating system naturally distinguishes between grade levels at every ability range. A V2 climber sending a V3 and a V9 climber sending a V10 both represent meaningful jumps, but the V10 send moves the rating more because the gap is genuinely larger.
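The V-scale values above follow a simple rule that can be generated rather than hard-coded. The sketch below is illustrative only; labeling the 600 and 700 entries as VB and V0 is inferred from the V2, V5, V7, and V10 to V12 values quoted in this article, and the function name is hypothetical.

```python
def v_scale_ratings(max_grade: int = 12) -> dict[str, int]:
    """Build the V-scale grade-to-rating map up to V<max_grade>."""
    ratings = {"VB": 600, "V0": 700}  # first gap is 100 points
    rating, gap = 700, 100
    for v in range(1, max_grade + 1):
        rating += gap                 # step up by the current gap
        ratings[f"V{v}"] = rating
        gap += 20                     # each later gap is 20 points wider
    return ratings


ratings = v_scale_ratings()
print(ratings["V5"], ratings["V7"], ratings["V12"])  # 1400 1820 3220
```

Because the gaps grow by a constant 20 points, the V9 to V10 step (280 points) is more than twice the V2 to V3 step (140 points).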

Four Grade Scales, One Source of Truth

Mantle supports V-scale, YDS, Font, and French grades. All grade-to-rating mappings are centrally managed, so every part of Mantle stays in sync. This means we can adjust grade mappings without shipping app updates.

Ascent Scoring

Not all sends are equal. Onsighting a route on your first look is harder than sending it after ten attempts with beta from a friend. Our scoring system captures these nuances.

Fractional Type Scores

We chose a fractional scoring system over binary (send/no-send) to differentiate between ascent styles. Each type gets a carefully calibrated multiplier:

Onsight - 1.0

Full credit. You climbed it on your first attempt with no prior information. The purest measure of ability.

Flash - 0.9

Nearly full credit. First attempt, but you had beta (watched someone else, read the sequence, etc.). Still impressive, slightly less than blind.

Send (boulder) / Redpoint (lead) / Clean (toprope) - 0.75

Standard credit for a successful ascent after working the moves. Multi-attempt sends expand in the pipeline: a 3-attempt redpoint becomes 2 falls + 1 redpoint, with the expanded falls weighted at 0.25x.

Repeat (boulder) - 0.75

Re-sending a problem you've already completed. Same score, but not expanded for multi-attempt - attempt count is ignored. Naturally hits diminishing returns at the same grade, which is correct since you're repeating known movement.

Hangdog (lead/toprope) - 0.4

Partial credit. You completed the route but rested on the rope between moves. Demonstrates grade awareness and endurance but with significant assistance. Not expanded for multi-attempt - the hangdog itself is the result, regardless of how many attempts it took to get there.

Attempt - 0.0

No base credit for the outcome, but attempts still matter. They feed into weighting and fairness controls - falls on easy routes get boosted weight (capped at 0.3), falls 300+ points above your rating get dampened to 0.35 with repeat-attempt reduction, and hard-attempt learning credit is capped per climb and per session. Multi-attempt bails expand to N separate falls.
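The type scores and expansion rules above can be sketched as a small lookup plus one function. This is a simplified sketch, not the production pipeline: it returns (score, weight) pairs, uses the 0.25 weight for expanded falls, and assumes a weight of 1.0 for everything else, leaving the later dampening and boosting stages out. All names are hypothetical.

```python
TYPE_SCORES = {
    "onsight": 1.0, "flash": 0.9,
    "send": 0.75, "redpoint": 0.75, "clean": 0.75, "repeat": 0.75,
    "hangdog": 0.4, "attempt": 0.0,
}
EXPANDED_FALL_WEIGHT = 0.25          # weight of falls expanded out of a send
NOT_EXPANDED = {"repeat", "hangdog"} # attempt count is ignored for these


def expand_ascent(ascent_type: str, attempts: int = 1) -> list[tuple[float, float]]:
    """Expand one logged climb into a list of (score, weight) outcomes."""
    score = TYPE_SCORES[ascent_type]
    if ascent_type == "attempt":
        # A multi-attempt bail becomes N separate falls (weight 1.0 assumed).
        return [(0.0, 1.0)] * attempts
    if ascent_type in NOT_EXPANDED or attempts <= 1:
        return [(score, 1.0)]
    falls = [(0.0, EXPANDED_FALL_WEIGHT)] * (attempts - 1)
    return falls + [(score, 1.0)]


print(expand_ascent("redpoint", 3))  # [(0.0, 0.25), (0.0, 0.25), (0.75, 1.0)]
```

The 3-attempt redpoint comes out as two lightly weighted falls followed by the redpoint itself, while a hangdog logged with any attempt count stays a single 0.4 result.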

Context-Aware Types

The app shows different ascent types depending on discipline and protection. Bouldering offers onsight, flash, send, repeat, and attempt. Lead climbing replaces send with redpoint and adds hangdog. Top-rope uses clean instead of redpoint. This prevents invalid combinations and keeps logging fast.

Why Diminishing Returns Matter

Without diminishing returns, a climber could send twenty V3s in a session and gain the same rating boost as someone who sent a V7. The linear falloff (80%, 60%, 40% for repeated grades) rewards pushing into harder territory rather than farming easy sends.
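Here is a small sketch of the falloff, assuming the count is kept per grade within a single session and that sends are processed in the order they were logged. The function names are illustrative.

```python
from collections import Counter


def repeat_multiplier(nth_send: int) -> float:
    """Credit multiplier for the nth send at the same grade in a session."""
    return {1: 1.0, 2: 0.8, 3: 0.6}.get(nth_send, 0.4)  # 4th and beyond: 40%


def session_multipliers(grades: list[str]) -> list[float]:
    """Multiplier for each send, given the grades in logged order."""
    seen: Counter[str] = Counter()
    multipliers = []
    for grade in grades:
        seen[grade] += 1
        multipliers.append(repeat_multiplier(seen[grade]))
    return multipliers


print(session_multipliers(["V3", "V3", "V4", "V3", "V3", "V3"]))
# [1.0, 0.8, 1.0, 0.6, 0.4, 0.4]
```

The first V4 still counts in full because the counter is tracked per grade, so stepping up in difficulty resets the falloff.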

The Impact Curve

The impact weighting is what makes the whole system feel fair. A 1400-rated climber sending a V5 (1400 points) gets moderate impact. Sending a V7 (1820 points) gets high impact. They're punching above their weight. But sending a V2 (920 points) has almost zero impact. They're not proving anything new.
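One way to picture this curve is a plain logistic function of the gap between route and climber. The sketch below is an assumption-heavy illustration: the actual curve shape, midpoint, and the 150-point scale are not published in this article, and the function name is hypothetical.

```python
import math


def impact_weight(climber_rating: float, route_rating: float,
                  scale: float = 150.0) -> float:
    """Logistic weight in (0, 1): low for easy routes, saturating for hard ones."""
    diff = route_rating - climber_rating
    return 1.0 / (1.0 + math.exp(-diff / scale))


for grade, route in [("V2", 920), ("V5", 1400), ("V7", 1820)]:
    print(grade, round(impact_weight(1400, route), 3))
```

With these assumed parameters the V2 lands near zero, the V5 sits at the midpoint, and the V7 approaches the ceiling of 1.0, which acts as the cap on very hard sends.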

Every session is guaranteed to update your rating. Even if you only climb easy warmups, the force-include rule ensures the most impactful climb still counts.
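The exclusion and force-include rule can be sketched in a few lines. The 0.05 threshold and the function name are hypothetical; the article does not state what counts as negligible impact.

```python
def select_climbs(weights: list[float], min_weight: float = 0.05) -> list[int]:
    """Return the indices of climbs that should count toward the update."""
    kept = [i for i, w in enumerate(weights) if w >= min_weight]
    if not kept and weights:
        # Force-include: keep the single most impactful climb.
        kept = [max(range(len(weights)), key=weights.__getitem__)]
    return kept


print(select_climbs([0.01, 0.30, 0.02]))  # [1]  normal filtering
print(select_climbs([0.01, 0.03, 0.02]))  # [1]  all filtered, best one kept
```

In the second call every climb falls below the threshold, so the highest-weight climb is kept and the session still produces an update.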

The Rank System

Raw numbers are precise but impersonal. Ranks give climbers an identity, a tier that reflects where they are on their climbing journey. The seven tier names are drawn from climbing culture.

Prospect is just starting out. Scrambler is finding their feet. Sender is consistent. Crusher is strong. Projector is working hard problems. Dirtbag is living the climbing life. Stonemasters is elite, named after the legendary Yosemite climbing crew of the 1970s.

Rank	Boulder	Sport
Prospect	VB - V0	5.5 - 5.7
Scrambler	V1 - V2	5.8 - 5.9
Sender	V3 - V4	5.10a - 5.10d
Crusher	V5 - V6	5.11a - 5.11c
Projector	V7 - V8	5.11d - 5.12b
Dirtbag	V9 - V10	5.12c - 5.12d
Stonemasters	V11+	5.13a+

The boundaries are asymmetric between disciplines because the grade scales don't map one-to-one. A V5 boulderer and a 5.11a sport climber are at roughly the same tier, but the exact rating numbers differ.

Safety Cap

No session can change your rating by more than 500 points. This prevents a single outlier session, whether exceptionally good or bad, from wildly distorting your rank. If the cap triggers, the system adjusts proportionally to stay internally consistent.
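A minimal sketch of the cap follows. It clamps the change to 500 points and also returns the fraction of the proposed change that survived, on the assumption that a proportional adjustment would reuse that fraction elsewhere; the article does not specify what gets scaled, and the names are illustrative.

```python
MAX_SESSION_DELTA = 500.0


def cap_session_update(old_rating: float, proposed_rating: float) -> tuple[float, float]:
    """Clamp a session's rating change; return (new_rating, applied_fraction)."""
    delta = proposed_rating - old_rating
    capped = max(-MAX_SESSION_DELTA, min(MAX_SESSION_DELTA, delta))
    fraction = capped / delta if delta != 0 else 1.0
    return old_rating + capped, fraction


new_rating, fraction = cap_session_update(1400.0, 2100.0)
print(new_rating)  # 1900.0
```

Here a proposed jump of 700 points is cut to 500, so roughly 71% of the original change is applied.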

Integrity & Trust

A rating system is only as good as the data behind it. If anyone can fabricate climbs, ratings become meaningless. We built multiple layers of verification without making the logging experience slower.

Partner Verification

Tag a climbing partner during your session. They receive a notification and can confirm your session with one tap. We intentionally chose full-session confirmation over per-climb flagging. The social pressure of confirming an entire session is a stronger trust signal.

Witness Vouching

For partners without Mantle accounts, generate a QR code or shareable link. Anyone can verify your session from a lightweight web page. No account needed. Links expire after 48 hours and are single-use.

Anti-Fraud Protection

Vouch links are rate-limited to 3 per device per 24 hours, and the verifier's device must differ from the climber's. This prevents self-verification with a second browser tab.

Vouch Percentage

Your verification rate across your last 15 sessions is displayed on your profile. The indicator is green at 60% or higher, yellow from 30% to 59%, and red below 30%. This creates a visible credibility signal. Other climbers can see at a glance how trustworthy your logs are.
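The profile indicator can be sketched as a short function over a climber's session history. The function name and the handling of an empty history (shown as 0% and red) are assumptions for illustration.

```python
def vouch_status(verified: list[bool], window: int = 15) -> tuple[float, str]:
    """Verification rate over the most recent sessions, plus its color band."""
    recent = verified[-window:]          # only the last `window` sessions count
    if not recent:
        return 0.0, "red"                # assumed behavior with no sessions yet
    pct = 100.0 * sum(recent) / len(recent)
    if pct >= 60:
        color = "green"
    elif pct >= 30:
        color = "yellow"
    else:
        color = "red"
    return pct, color


print(vouch_status([True] * 9 + [False] * 6))  # (60.0, 'green')
```

Because only the most recent 15 sessions are counted, an old run of unverified sessions ages out once enough verified ones follow it.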

Leaderboard Opt-In

Gym leaderboards require explicit opt-in and public privacy settings. Changing your home gym automatically opts you out of your previous gym's leaderboard. No one appears on a leaderboard they didn't choose to join.

Tamper-Proof Calculations

All rating calculations happen on our servers, not on your device. There's no way to manipulate the math locally, and sessions are processed in strict order to ensure accuracy.

We also built 65 achievements that only reward real climbing milestones: grade firsts, volume benchmarks, style accomplishments, and social engagement. None can be earned through gaming or shortcuts.