Design Decisions

A deep dive into the decisions behind Mantle's rating system, from Glicko-2 engine selection to climbing-specific adaptations.

Design Philosophy

Most climbing apps treat logging as the end goal. We treat it as the starting point. Every feature in Mantle is derived from one simple action: logging a climb in under five seconds.

Ratings, ranks, insights, achievements, leaderboards, and social features are all computed from the same climb data. No extra input required. The system does the thinking so climbers can focus on climbing.

We made a few foundational decisions early that shaped everything else. First, ratings update per-session, not per-climb. A session is a natural unit in climbing - you warm up, push your limits, and cool down. Evaluating performance across the whole session produces more stable, meaningful ratings than reacting to individual climbs.

Second, bouldering and sport climbing are fully independent tracks. They use different muscles, different skills, and different grade scales. A V7 boulderer might climb 5.11a on ropes - or 5.13a. Combining them into one number would be meaningless.

Third, the system only rewards real climbing behavior. There are no shortcuts to inflate your rating. Repeating the same easy grade gives diminishing returns. Top-roping gets a discount versus leading. Every ascent type, from onsight to hangdog, has a carefully calibrated weight.

Why Glicko-2

Choosing the right rating engine was the most important technical decision we made. We didn't just pick one. We built two complete engines and ran 1,188 simulations to compare them.

1,188 simulation runs
11 climber profiles
54 parameter combos
2 engines tested

We simulated climber profiles ranging from beginners to elite athletes, each with realistic session patterns, strengths, and weaknesses. Both Glicko-2 and an IRT/Rasch model were built and tested against these profiles.

The results were remarkably close - maximum divergence of just 16 rating points, with 100% agreement on rank assignments. But Glicko-2 won on three key factors:

Features compared across Elo, IRT / Rasch, and Glicko-2:
Tracks confidence (partial for one engine)
Tracks volatility
Confidence decays with inactivity
Session-based periods (partial for one engine)
Well-established in competitive systems

Glicko-2's rating deviation, its measure of uncertainty, naturally grows while a climber is inactive. The system becomes less certain about your rating the longer you stop climbing, and more responsive when you return. Volatility tracking identifies whether a climber performs consistently or erratically, further tuning how much each session moves their rating.
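As a rough illustration of how uncertainty widens during a layoff, the sketch below applies the standard Glicko-2 rule for a rating period with no results. The function name, the example volatility, and the idea of counting idle time in whole rating periods are assumptions made for illustration, not details of Mantle's engine.

```python
import math

GLICKO2_SCALE = 173.7178  # standard conversion factor between the display scale and Glicko-2's internal scale


def rd_after_inactivity(rd: float, volatility: float, idle_periods: int) -> float:
    """Grow a rating deviation over rating periods in which nothing was logged.

    Standard Glicko-2 rule for an idle period: phi' = sqrt(phi^2 + sigma^2).
    """
    phi = rd / GLICKO2_SCALE
    for _ in range(idle_periods):
        phi = math.sqrt(phi ** 2 + volatility ** 2)
    return phi * GLICKO2_SCALE


# Illustrative values only: the deviation rises a little with every idle period.
short_break = rd_after_inactivity(50.0, 0.06, 1)
long_break = rd_after_inactivity(50.0, 0.06, 10)
```

A larger deviation means the next session is allowed to move the rating further, which matches the "more responsive when you return" behavior described above.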

Fixed-Opponent Adaptation

Glicko-2 was designed for head-to-head matchups (chess, gaming). We adapted it by treating each climbing route as a fixed-rated opponent. A V5 boulder “plays” at 1400 rating points. When you send it, you beat that opponent. When you fall, you lose. The route's rating never changes - only yours does.
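The fixed-opponent idea can be sketched with the standard Glicko-2 update equations. This is a minimal sketch under stated assumptions: the route is given zero rating deviation (so its g-factor is 1), volatility is held constant instead of being re-estimated, and the 1500 center is the Glicko-2 default, since Mantle's actual constants are not stated. The name `update_rating` is hypothetical.

```python
import math

SCALE = 173.7178   # standard Glicko-2 scale factor
CENTER = 1500.0    # Glicko-2 default center (assumed; not stated in the source)


def update_rating(rating, rd, volatility, results):
    """Update a climber against fixed-rated routes.

    results: list of (route_rating, score) pairs with score in [0, 1].
    Returns (new_rating, new_rd). Volatility is left unchanged (simplification).
    """
    if not results:
        return rating, rd
    mu = (rating - CENTER) / SCALE
    phi = rd / SCALE
    inv_v = 0.0      # inverse of the estimated variance, sum of E * (1 - E)
    surprise = 0.0   # sum of (actual score - expected score)
    for route_rating, score in results:
        mu_route = (route_rating - CENTER) / SCALE
        # Route deviation is assumed to be 0, so g(phi_route) = 1.
        expected = 1.0 / (1.0 + math.exp(-(mu - mu_route)))
        inv_v += expected * (1.0 - expected)
        surprise += score - expected
    phi_star = math.sqrt(phi ** 2 + volatility ** 2)
    new_phi = 1.0 / math.sqrt(1.0 / phi_star ** 2 + inv_v)
    new_mu = mu + new_phi ** 2 * surprise
    return new_mu * SCALE + CENTER, new_phi * SCALE


# A 1400-rated climber meets a V5 "opponent" rated 1400.
after_send, rd_after_send = update_rating(1400.0, 100.0, 0.06, [(1400.0, 1.0)])
after_fall, rd_after_fall = update_rating(1400.0, 100.0, 0.06, [(1400.0, 0.0)])
```

Only the climber's numbers are returned. The route rating is read but never written, which is the whole adaptation.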

The 9-Stage Pipeline

Before any climb affects your rating, it passes through a 9-stage processing pipeline. Each stage addresses a specific climbing reality that a raw rating system wouldn't handle correctly.

1. Grade to base rating
Each grade maps to a fixed rating value. The gaps between grades widen as the grades get harder, reflecting the real-world truth that the jump from V10 to V11 is far larger than V1 to V2.
2. Feel modifier
If a climb felt soft or hard for the grade, the route's effective rating shifts by half a grade gap. A "hard V4" is treated closer to a V5. This captures the subjectivity of gym grading.
3. Top-rope discount
Top-roping subtracts one full grade gap from the route's rating. Leading a route is significantly harder than top-roping it, and the discount reflects that difference.
4. Multi-attempt expansion
A send on the fourth attempt is broken down so the system sees both the difficulty (you fell 3 times) and the success (you eventually sent it).
5. Fractional type scoring
Each ascent type gets a calibrated score: onsight (1.0), flash (0.9), send (0.75), hangdog (0.4), attempt (0.0). An onsight is worth more because climbing without any prior information is genuinely harder.
6. Diminishing returns
Your 2nd send at the same grade in a session counts 80%, the 3rd at 60%, the 4th and beyond at 40%. This prevents farming. Sending ten V3s shouldn't have the same impact as sending one V7.
7. Attempt dampening
Falling on a route well above your current level gets a neutral score instead of a penalty. We don't punish you for trying routes that are out of reach for now.
8. Sigmoid impact weighting
A weighting curve determines how much each climb matters. Routes near your level have the highest impact. Very easy routes barely move the needle. Very hard ones are capped to prevent fluky sends from distorting your rating.
9. Exclusion and force-include
Climbs with negligible impact are filtered out for cleaner results. But if every climb in a session would be filtered, the most meaningful one is kept, guaranteeing every session produces a rating update.
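The early stages of the pipeline (feel modifier, top-rope discount, multi-attempt expansion) can be sketched as follows. This is an illustration under assumptions: the `Climb` fields, the function names, and the choice of which neighboring gap counts as "a grade gap" are all hypothetical, and the example ratings are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Climb:
    base_rating: float          # stage 1 output for the logged grade
    grade_gap: float            # rating gap applied for this grade (assumed: gap to the adjacent grade)
    feel: str = "normal"        # "soft", "normal", or "hard"
    top_rope: bool = False
    ascent_type: str = "send"   # final outcome of the climb
    attempts: int = 1           # total tries, including the final one


def effective_rating(climb: Climb) -> float:
    """Stages 2 and 3: shift the route rating for feel and top-rope."""
    rating = climb.base_rating
    if climb.feel == "hard":
        rating += climb.grade_gap / 2
    elif climb.feel == "soft":
        rating -= climb.grade_gap / 2
    if climb.top_rope:
        rating -= climb.grade_gap
    return rating


def expand_attempts(climb: Climb) -> list:
    """Stage 4: one result per try, falls first, then the final outcome."""
    rating = effective_rating(climb)
    falls = [(rating, "attempt")] * (climb.attempts - 1)
    return falls + [(rating, climb.ascent_type)]


# A "hard V4" lands halfway toward the next grade; a fourth-try send becomes 3 falls + 1 send.
hard_v4 = effective_rating(Climb(base_rating=1220, grade_gap=180, feel="hard"))
fourth_try = expand_attempts(Climb(base_rating=1400, grade_gap=180, attempts=4))
```

Each expanded result can then be scored and weighted on its own by the later stages.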

Grade Mapping

The gap between climbing grades isn't linear. It's exponential. The jump from V0 to V1 might take a few weeks. The jump from V10 to V11 might take years. Our grade-to-rating mapping reflects this.

Each grade maps to a fixed rating value that serves as the route's “opponent rating.” The gaps between consecutive grades grow steadily: they start at 100 points and, from V2 upward, increase by 20 points per grade.

V-Scale Rating Gaps

V0: +100
V1: +100
V2: +120
V3: +140
V4: +160
V5: +180
V6: +200
V7: +220
V8: +240
V9: +260
V10: +280
V11: +300
V12: +320

This widening curve means the rating system naturally distinguishes between grade levels at every ability range. A V2 climber sending a V3 and a V9 climber sending a V10 both represent meaningful jumps, but the V10 send moves the rating more because the gap is genuinely larger.
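The gap table can be turned into absolute ratings once one grade is anchored. The sketch below assumes that each listed gap is the step up into that grade, and anchors V5 at the 1400 points quoted earlier. That reading is an inference, though it reproduces the V2 (920) and V7 (1820) values cited in the Impact Curve section. The function name is hypothetical.

```python
# Gap listed against each grade, read as the step from the previous grade (assumption).
V_GAPS = {
    "V0": 100, "V1": 100, "V2": 120, "V3": 140, "V4": 160, "V5": 180, "V6": 200,
    "V7": 220, "V8": 240, "V9": 260, "V10": 280, "V11": 300, "V12": 320,
}


def build_ratings(gaps: dict, anchor_grade: str, anchor_rating: int) -> dict:
    """Derive a rating for every grade from one anchored grade and the gap table."""
    grades = list(gaps)
    anchor = grades.index(anchor_grade)
    ratings = {anchor_grade: anchor_rating}
    # Walk upward: a grade sits its own gap above the grade below it.
    for i in range(anchor + 1, len(grades)):
        ratings[grades[i]] = ratings[grades[i - 1]] + gaps[grades[i]]
    # Walk downward: a grade sits below its upper neighbor by that neighbor's gap.
    for i in range(anchor - 1, -1, -1):
        ratings[grades[i]] = ratings[grades[i + 1]] - gaps[grades[i + 1]]
    return ratings


V_RATINGS = build_ratings(V_GAPS, "V5", 1400)
```

Because the gaps grow by a fixed 20 points per grade, the resulting ratings rise faster and faster without any per-grade hand tuning.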

Four Grade Scales, One Source of Truth

Mantle supports V-scale, YDS, Font, and French grades. All grade-to-rating mappings are centrally managed, so every part of Mantle stays in sync. This means we can adjust grade mappings without shipping app updates.

Ascent Scoring

Not all sends are equal. Onsighting a route on your first look is harder than sending it after ten attempts with beta from a friend. Our scoring system captures these nuances.

Fractional Type Scores

We chose a fractional scoring system over binary (send/no-send) to differentiate between ascent styles. Each type gets a carefully calibrated multiplier:

Onsight - 1.0
Full credit. You climbed it on your first attempt with no prior information. The purest measure of ability.
Flash - 0.9
Nearly full credit. First attempt, but you had beta (watched someone else, read the sequence, etc.). Still impressive, slightly less than blind.
Send / Redpoint / Clean / Repeat - 0.75
Standard credit for a successful ascent. Whether you worked the moves over multiple sessions or sent it clean after a few tries, the result is the same.
Hangdog - 0.4
Partial credit. You climbed the route but hung on the rope between moves. It demonstrates grade awareness but with significant assistance.
Attempt - 0.0
No credit for the outcome, but attempts still matter. They feed into the weighting system and affect how much your rating moves.

Context-Aware Types

The app shows different ascent types depending on discipline and protection. Bouldering offers onsight, flash, send, repeat, and attempt. Lead climbing adds redpoint and pinkpoint. Top-rope uses clean instead of redpoint. This prevents invalid combinations and keeps logging fast.

Why Diminishing Returns Matter

Without diminishing returns, a climber could send twenty V3s in a session and gain the same rating boost as someone who sent a V7. The linear falloff (80%, 60%, 40% for repeated grades) rewards pushing into harder territory rather than farming easy sends.
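Put together, the type scores and the repeat falloff might look like the sketch below. The score and multiplier values come from the text. Which ascent types count as a "send" for the repeat counter, and the names used here, are assumptions for illustration.

```python
from collections import Counter

TYPE_SCORES = {
    "onsight": 1.0, "flash": 0.9,
    "send": 0.75, "redpoint": 0.75, "clean": 0.75, "repeat": 0.75,
    "hangdog": 0.4, "attempt": 0.0,
}
# Assumed: only clean ascents advance the per-grade repeat counter.
SEND_TYPES = {"onsight", "flash", "send", "redpoint", "clean", "repeat"}
REPEAT_MULTIPLIERS = (1.0, 0.8, 0.6, 0.4)  # 1st, 2nd, 3rd, 4th and beyond


def repeat_weights(climbs):
    """climbs: ordered (grade, ascent_type) pairs from one session.

    Returns one multiplier per climb; non-send results keep full weight.
    """
    sends_at_grade = Counter()
    weights = []
    for grade, ascent_type in climbs:
        if ascent_type in SEND_TYPES:
            index = min(sends_at_grade[grade], len(REPEAT_MULTIPLIERS) - 1)
            weights.append(REPEAT_MULTIPLIERS[index])
            sends_at_grade[grade] += 1
        else:
            weights.append(1.0)
    return weights


session = [("V3", "send")] * 5 + [("V7", "flash")]
weights = repeat_weights(session)
```

In this example the five V3 sends carry a combined weight of 3.2 sends, while the single V7 flash keeps its full weight.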

The Impact Curve

The impact weighting is what makes the whole system feel fair. A 1400-rated climber sending a V5 (1400 points) gets moderate impact. Sending a V7 (1820 points) gets high impact. They're punching above their weight. But sending a V2 (920 points) has almost zero impact. They're not proving anything new.

Every session is guaranteed to update your rating. Even if you only climb easy warmups, the force-include rule ensures the most impactful climb still counts.
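One way to picture the impact curve and the force-include rule is a capped logistic on the gap between route and climber. The curve width, cap, and exclusion threshold below are hypothetical values chosen only so the three examples above behave as described. The real curve and its parameters are not given in the text.

```python
import math

CURVE_WIDTH = 150.0  # hypothetical: rating points per logistic unit
MAX_WEIGHT = 0.95    # hypothetical cap for routes far above the climber
MIN_WEIGHT = 0.05    # hypothetical exclusion threshold


def impact_weight(climber_rating: float, route_rating: float) -> float:
    """Weight in (0, MAX_WEIGHT]: near 0 for easy routes, 0.5 at the climber's level."""
    diff = route_rating - climber_rating
    return min(MAX_WEIGHT, 1.0 / (1.0 + math.exp(-diff / CURVE_WIDTH)))


def select_climbs(climber_rating: float, route_ratings: list) -> list:
    """Drop negligible climbs, but always keep the most impactful one."""
    weighted = [(r, impact_weight(climber_rating, r)) for r in route_ratings]
    kept = [(r, w) for r, w in weighted if w >= MIN_WEIGHT]
    if not kept and weighted:
        kept = [max(weighted, key=lambda pair: pair[1])]  # force-include
    return kept


at_level = impact_weight(1400, 1400)   # V5 for a 1400-rated climber
above = impact_weight(1400, 1820)      # V7
warmup = impact_weight(1400, 920)      # V2
warmups_only = select_climbs(1400, [920, 800])
```

With these illustrative numbers a warmup-only session still yields exactly one climb for the rating update, the hardest of the warmups.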

The Rank System

Raw numbers are precise but impersonal. Ranks give climbers an identity, a tier that reflects where they are on their climbing journey. The seven tier names are drawn from climbing culture.

Prospect is just starting out. Scrambler is finding their feet. Sender is consistent. Crusher is strong. Projector is working hard problems. Dirtbag is living the climbing life. Stonemasters is elite, named after the legendary Yosemite climbing crew of the 1970s.

Rank | Boulder | Sport
Prospect | VB - V1 | 5.5 - 5.7
Scrambler | V1 - V3 | 5.8 - 5.9
Sender | V3 - V5 | 5.10a - 5.11a
Crusher | V5 - V7 | 5.11a - 5.11d
Projector | V7 - V9 | 5.11d - 5.12c
Dirtbag | V9 - V11 | 5.12c - 5.13a
Stonemasters | V11+ | 5.13a+

The boundaries are asymmetric between disciplines because the grade scales don't map one-to-one. A V5 boulderer and a 5.11a sport climber are at roughly the same tier, but the exact rating numbers differ.
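A rank lookup for the boulder track could be sketched as a threshold search. The rating thresholds here are assumptions: they place each tier's lower bound at the rating of its boundary grade, using values derived from the V-scale gap table with V5 anchored at 1400. The actual cutoffs are not stated.

```python
from bisect import bisect_right

# (assumed lower-bound rating, tier name); comments show the boundary grade.
BOULDER_TIERS = [
    (800, "Scrambler"),      # V1
    (1060, "Sender"),        # V3
    (1400, "Crusher"),       # V5
    (1820, "Projector"),     # V7
    (2320, "Dirtbag"),       # V9
    (2900, "Stonemasters"),  # V11
]
_THRESHOLDS = [threshold for threshold, _ in BOULDER_TIERS]


def boulder_rank(rating: float) -> str:
    """Return the tier whose lower bound is the highest one not above the rating."""
    index = bisect_right(_THRESHOLDS, rating)
    return "Prospect" if index == 0 else BOULDER_TIERS[index - 1][1]
```

A sport-track lookup would use the same function shape with its own threshold list, since the two tracks are rated independently.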

Safety Cap

No session can change your rating by more than 500 points. This prevents a single outlier session, whether exceptionally good or bad, from wildly distorting your rank. If the cap triggers, the system adjusts proportionally to stay internally consistent.
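The cap itself is a simple clamp. In the sketch below, the returned scale factor is one guess at what "adjusts proportionally" could mean, namely a ratio that other per-session outputs could be multiplied by. The function name and that interpretation are assumptions.

```python
MAX_SESSION_DELTA = 500.0  # from the text: no session moves a rating by more than 500 points


def cap_session_change(old_rating: float, proposed_rating: float):
    """Clamp a session's rating change and report how much it was scaled down.

    Returns (capped_rating, scale), where scale is 1.0 when the cap did not trigger.
    """
    delta = proposed_rating - old_rating
    capped = max(-MAX_SESSION_DELTA, min(MAX_SESSION_DELTA, delta))
    scale = capped / delta if delta else 1.0
    return old_rating + capped, scale


big_jump = cap_session_change(1400, 2100)   # proposed +700, capped to +500
small_jump = cap_session_change(1400, 1450)  # within the cap, unchanged
```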

Integrity & Trust

A rating system is only as good as the data behind it. If anyone can fabricate climbs, ratings become meaningless. We built multiple layers of verification without making the logging experience slower.

Partner Verification
Tag a climbing partner during your session. They receive a notification and can confirm your session with one tap. We intentionally chose full-session confirmation over per-climb flagging. The social pressure of confirming an entire session is a stronger trust signal.
Witness Vouching
For partners without Mantle accounts, generate a QR code or shareable link. Anyone can verify your session from a lightweight web page. No account needed. Links expire after 48 hours and are single-use.
Anti-Fraud Protection
Vouch links are rate-limited to 3 per device per 24 hours, and the verifier's device must differ from the climber's. This prevents self-verification with a second browser tab.
Vouch Percentage
Your verification rate across your last 15 sessions is displayed on your profile, color-coded green (60%+), yellow (30-59%), or red (below 30%). This creates a visible credibility signal. Other climbers can see at a glance how trustworthy your logs are.
Leaderboard Opt-In
Gym leaderboards require explicit opt-in and public privacy settings. Changing your home gym automatically opts you out of your previous gym's leaderboard. No one appears on a leaderboard they didn't choose to join.
Tamper-Proof Calculations
All rating calculations happen on our servers, not on your device. There's no way to manipulate the math locally, and sessions are processed in strict order to ensure accuracy.
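The vouch percentage described above reduces to a rolling ratio with color bands. This is a minimal sketch using the 15-session window and the 60% and 30% thresholds from the text. The function name, the input shape, and the handling of a profile with no sessions are assumptions.

```python
WINDOW = 15  # most recent sessions considered


def vouch_status(sessions_verified: list):
    """sessions_verified: booleans, oldest first, True if the session was confirmed.

    Returns (percentage, color), or (None, None) when there are no sessions yet.
    """
    recent = sessions_verified[-WINDOW:]
    if not recent:
        return None, None
    percentage = 100.0 * sum(recent) / len(recent)
    if percentage >= 60:
        color = "green"
    elif percentage >= 30:
        color = "yellow"
    else:
        color = "red"
    return percentage, color


# Older verified sessions fall out of the window once 15 newer ones exist.
lapsed = vouch_status([True] * 5 + [False] * 15)
```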
We also built 65 achievements that only reward real climbing milestones: grade firsts, volume benchmarks, style accomplishments, and social engagement. None can be earned through gaming or shortcuts.