I noticed that some of your goals center around player statistics, rankings, team scrambling and balancing, and things of that sort. I've recently built and deployed a skill rating and team balancing system that's currently being tested on one of my servers. I've finally gotten around to documenting its general design, and what follows is a description of how it works. The system and the documentation are both still a work in progress, and since testing so far has been limited to Arena, some parts may not fully generalize to other TF2 gamemodes. Feedback, questions, and suggestions are welcome.
Overview of SynElo: yazoo.tf's Arena Rating & Balancing System
Session Flow and Protocol
SynElo has two components: the RBS (ratings, balance, scramble) webservice and a SourceMod plugin. They communicate with each other over HTTP. The plugin reports round outcomes and requests optimal team splits, while the webservice owns all the math.
On player connect, the plugin sends the player's SteamID64 to the webservice, which returns that player's current rating, visibility status, and rounds played. New (and unknown) players are assigned
Round 1 is always a 1v1 duel. At the end of round 1, the plugin requests an optimal split from the webservice so that the cache is populated before round 2 begins. Round 2 is always scrambled. From round 3 onward, a scramble triggers when one team wins 3 rounds in a row, and optimal splits are requested at the end of every round.
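The round-by-round scramble rule above can be sketched as a small state tracker. This is a minimal sketch rather than the plugin's actual logic: `ScrambleTracker` and its method names are hypothetical, and it assumes the win streak resets after a scramble, which the description does not state.

```python
from dataclasses import dataclass
from typing import Optional

STREAK_TO_SCRAMBLE = 3  # from the text: one team wins 3 rounds in a row


@dataclass
class ScrambleTracker:
    """Hypothetical helper that tracks the round number and the current win streak."""
    round_number: int = 0
    streak_team: Optional[str] = None
    streak_len: int = 0

    def end_round(self, winner: str) -> bool:
        """Record a finished round; return True if the next round should be scrambled."""
        self.round_number += 1
        if winner == self.streak_team:
            self.streak_len += 1
        else:
            self.streak_team, self.streak_len = winner, 1

        if self.round_number == 1:
            scramble = True  # round 2 is always scrambled
        else:
            scramble = self.streak_len >= STREAK_TO_SCRAMBLE
        if scramble:
            # Assumption: a scramble resets the streak.
            self.streak_team, self.streak_len = None, 0
        return scramble
```

Calling `end_round` once per round end keeps the decision in one place, so the caller only has to act on the returned flag.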
At round end, the plugin sends the webservice a rating update payload that includes the SteamID64 of each player on each team, the winning team, the map name, and a server identifier. The service computes new ratings for all participants and returns them. A participant is any player who was on a team when the spawn doors went up and who did not disconnect before dealing damage, receiving damage, or healing another player. During this time the plugin also requests optimal splits, which the webservice computes and returns to the plugin for caching.
The split cache holds optimal team assignments (splits) for every team size from 2v2 up to
If the webservice is unavailable, failed rating update payloads are queued in an 8-slot ring buffer and retried at the next round end, before the new update is sent. For scramble requests, the plugin continues to use the most recently cached splits if a fresh request fails or times out. If the cache is empty and the service is unreachable, scrambling is deferred: the "scramble pending" flag stays set and the plugin retries at the next opportunity.
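The retry behavior for rating updates can be illustrated with a bounded queue. This is a minimal sketch under stated assumptions: `UpdateSender` and the `send` callback are hypothetical names, the oldest payload is assumed to be dropped when the 8 slots overflow, and retrying is assumed to stop at the first failure so that updates stay in order. None of these details are confirmed by the description.

```python
from collections import deque
from typing import Callable, Deque

RETRY_SLOTS = 8  # from the text: 8-slot ring buffer


class UpdateSender:
    """Hypothetical wrapper; `send` returns True when the webservice accepted the payload."""

    def __init__(self, send: Callable[[dict], bool]) -> None:
        self.send = send
        # A bounded deque drops its oldest entry once full, like a ring buffer.
        self.pending: Deque[dict] = deque(maxlen=RETRY_SLOTS)

    def on_round_end(self, payload: dict) -> None:
        # Retry queued payloads first, oldest first.
        while self.pending:
            if not self.send(self.pending[0]):
                break  # service still down; keep the rest queued
            self.pending.popleft()
        # Send the new payload only once the backlog is clear.
        if self.pending or not self.send(payload):
            self.pending.append(payload)
```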
The webservice API has 3 endpoints:
- Update: Submitting round outcomes and receiving updated ratings
- Scramble: Submitting eligible player pool and receiving optimal splits for all team sizes
- Admin: Getting/setting/resetting individual ratings and querying pool quality indicators
All requests carry a shared secret in the headers.
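As an illustration of what a round-outcome request might look like, here is a hedged sketch using only the Python standard library. The base URL, the `/update` path, the `X-SynElo-Secret` header name, and the JSON field names are all placeholders invented for this example; the description only states which pieces of information are sent and that a shared secret travels in the headers.

```python
import json
import urllib.request

# Everything below is a placeholder: the real URL, path, header name, and
# JSON field names are not given in the description.
BASE_URL = "http://localhost:8080"
SECRET_HEADER = "X-SynElo-Secret"


def build_update_request(secret, teams, winner, map_name, server_id):
    """Build (but do not send) a round-outcome request for the Update endpoint."""
    body = {
        "teams": teams,      # team name -> list of SteamID64 strings
        "winner": winner,
        "map": map_name,
        "server": server_id,
    }
    return urllib.request.Request(
        f"{BASE_URL}/update",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", SECRET_HEADER: secret},
        method="POST",
    )
```

The request object would be passed to `urllib.request.urlopen` with a timeout; building it separately makes the payload easy to inspect and to queue for a later retry.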
Parameters and Formulas
Parameters
| Symbol | Value | Description |
|---|---|---|
| | 1000 | Starting rating |
| | 100 | Absolute rating floor |
| | 2200 | Maximum rating on shadow graduation |
| | 50 | Rounds before rating is made visible |
| | 30 | Days of inactivity before stale flag |
| | 2 | Minimum K-factor |
| | 72 | Maximum K-factor |
| | 2400 | Logistic scale parameter (tune after 500 rounds) |
| | 500 | Rolling window size (rounds) |
| | 0.25 | Velocity weight in convergence score |
| | 0.25 | WR-excess weight in convergence score |
| | 0.50 | Volatility weight in convergence score |
| | 400 | Gaussian compression width |
| | 32 | Maximum pool size for exact solver |
Rating Algorithm
F1: Win Probability
where
F2: Rating Update
where
F3: Convergence Score
Let
where
- $\mathbf{vel}$ captures directional drift: is the rating still moving in a consistent direction?
- $\mathbf{wre}$ captures predictive miscalibration: is the model's win probability estimate still off? Normalized by 0.10; a 10% WR deviation saturates the signal.
- $\mathbf{vol}$ captures dispersion, the normalized spread of rating values within the window regardless of direction.
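To make the weighting concrete, here is a rough sketch of how the three signals might be combined. It assumes the convergence score is a plain weighted sum using the weights from the parameter table, that each signal is already scaled to [0, 1], and that the WR-excess signal is the absolute gap between observed and expected win rate clamped at 1. The function names are hypothetical, and `vel` and `vol` are taken as inputs because their exact definitions are not reproduced here.

```python
W_VEL, W_WRE, W_VOL = 0.25, 0.25, 0.50  # weights from the parameter table
WR_SATURATION = 0.10  # a 10% win-rate deviation saturates the signal


def wr_excess(observed_wr, expected_wr):
    """Win-rate miscalibration, normalized by 0.10 and clamped to [0, 1]."""
    return min(abs(observed_wr - expected_wr) / WR_SATURATION, 1.0)


def convergence_score(vel, wre, vol):
    """Assumed weighted sum of the three signals, each already in [0, 1]."""
    return W_VEL * vel + W_WRE * wre + W_VOL * vol
```

Under these assumptions, a player whose observed win rate is 5 points away from the expected one contributes a WR-excess signal of about 0.5, and anything 10 points or more away contributes the maximum of 1.0.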
F4: Gaussian K Compression
Applied in both directions from
F5: Dynamic K-Factor
F6: Brier Score
where
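For reference, the textbook Brier score is the mean squared difference between predicted probabilities and binary outcomes. The sketch below implements that standard definition; whether SynElo computes it over the rolling window or over some other sample is not shown here, and the function name is hypothetical.

```python
def brier_score(predictions, outcomes):
    """Mean squared error between predicted win probabilities and 0/1 outcomes."""
    if len(predictions) != len(outcomes) or not predictions:
        raise ValueError("need equally sized, non-empty sequences")
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)
```

A predictor that always outputs 0.5 scores 0.25, so values below that indicate the win probability estimates carry real information.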
Gosper Partition Enumeration (GPE)
F7: Objective Function
Player 0 (highest-rated) is anchored to team A, exploiting the symmetry that any partition and its mirror are functionally identical. This halves the search space without discarding any optimal solution.
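The anchoring argument can be checked with a small brute-force comparison. This is only an illustrative sketch using `itertools.combinations` rather than the Gosper-based enumeration; the function name is hypothetical, the pool size is assumed to be even, and the score is the absolute difference of team rating sums, as in the pseudocode.

```python
from itertools import combinations


def best_split_bruteforce(ratings, anchor=True):
    """Return (score, team_a, team_b, partitions_checked) for an even-sized pool."""
    n = len(ratings)
    k = n // 2
    players = range(n)
    best = None
    checked = 0
    if anchor:
        # Player 0 is fixed on team A; choose the other k-1 teammates.
        candidates = ((0,) + rest for rest in combinations(range(1, n), k - 1))
    else:
        candidates = combinations(players, k)
    for team_a in candidates:
        team_b = tuple(p for p in players if p not in team_a)
        score = abs(sum(ratings[p] for p in team_a) - sum(ratings[p] for p in team_b))
        checked += 1
        if best is None or score < best[0]:
            best = (score, team_a, team_b)
    return best + (checked,)
```

For a 6-player pool the anchored search checks 10 partitions and the unanchored one checks 20, and both find the same minimum score, because every unanchored partition is the mirror image of an anchored one.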
F8: Gosper's Hack (next k-subset)
Given a bitmask
Enumerates all
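A runnable Python version of Gosper's hack, for experimentation. It mirrors the `gosper_next` pseudocode in F9 but uses integer division (`//`), since `/` produces a float in Python. The `k_subsets` generator is a hypothetical helper added for this example, and it assumes `k >= 1`.

```python
def gosper_next(x):
    """Return the next larger integer with the same number of set bits as x (x > 0)."""
    c = x & -x  # isolate the lowest set bit
    r = x + c   # ripple the carry through the lowest run of ones
    return (((r ^ x) // c) >> 2) | r


def k_subsets(n, k):
    """Yield every mask with exactly k set bits below 1 << n, in increasing order."""
    x = (1 << k) - 1  # smallest mask with k bits set
    limit = 1 << n
    while x < limit:
        yield x
        x = gosper_next(x)
```

For example, `list(k_subsets(4, 2))` yields the six masks `3, 5, 6, 9, 10, 12`, which are `0011, 0101, 0110, 1001, 1010, 1100` in binary.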
F9: Two Layer Enumeration
Implement what was shown in F8 as the function gosper_next(...) and then find the best split. The pseudocode does not deliberately approximate any specific language, though there is a loose mix of python-style syntax and C-style bitwise operators. It even syntax highlights mostly-correctly when python is specified as the language. Wowee!

    indices_of_set_bits(x):
        result = []
        for i in 0..N-1:
            if (x >> i) & 1:                // is bit i of x a 1?
                result.append(i)
        return result

    gosper_next(x):
        c = x & (-x)                        // isolate the lowest set bit
        r = x + c                           // carry that bit forward
        return ((r ^ x) / c) >> 2 | r       // fix the lower bits and combine

    find_best_split(ratings, N):
        best_score = INF
        best_split = null
        for k in 2 .. floor(N/2):
            outer = (1 << (2k-1)) - 1       // lowest (2k-1)-bit mask
            while outer < (1 << (N-1)):     // stay within N-1 bits; player 0 excluded
                active = { i + 1 : i in indices_of_set_bits(outer) }   // bit i of outer is player i+1
                inner = (1 << (k-1)) - 1    // lowest (k-1)-bit mask
                while inner < (1 << (2k-1)):    // stay within the active subset
                    A = { 0 } + { active[i] : bit i set in inner }     // bit i is 1? take him to A
                    B = { active[i] : bit i clear in inner }           // bit i is 0? take him to B
                    score = |Sum(ratings[A]) - Sum(ratings[B])|        // bit i is 2? take him to Detroit
                    if score == 0:
                        return A, B
                    if score < best_score:
                        best_score = score
                        best_split = (A, B)
                    inner = gosper_next(inner)
                outer = gosper_next(outer)
        return best_split

F10: Complexity Bounds
The time-complexity of GPE is
The lower bound applies when the pool is balanced and early exit fires frequently. The upper bound is a pathological worst case. Observed latency for
The exact solver is used for
Generalization to TF2 Outside of Arena
Structural Differences From Arena:
- No queue. Mid-round joins go straight to their engine-assigned team.
- No player substitutions outside of Arena.
- There is a Waiting For Players period.
Rating Participation
Any player who has dealt damage, received damage, or healed another player during a round should be rated for that round. Win probability uses round-start team sums regardless of when a participant joined: the probability is treated as a property of the match, not of any individual player.
Scramble Triggers By Gamemode:
- Attack/Defend, Payload: the attackers run out of time without capturing any objective, or 2 full rounds have been completed
- KOTH, 5CP: one team never captures a control point before the other team wins
- CTF: one team never captures the flag before the other team wins
- All modes: unconditional scramble when the Waiting For Players period expires. This is directly analogous to the round 2 Arena scramble that always fires.
Streak-based scrambles do not apply outside Arena. Some rounds can take 20–30 minutes and some modes have no time cap.
Autobalance on Mid-Round Joins:
Only trigger when
F11: Target Transfer Rating
When
Move the player on the larger team whose rating is closest to
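The selection step can be sketched as a nearest-rating lookup. This is a minimal sketch: `pick_transfer` is a hypothetical name, the target transfer rating from F11 is taken as an input rather than computed, and ties are broken by dictionary order, which is an assumption.

```python
def pick_transfer(larger_team, target):
    """Return the SteamID64 on the larger team whose rating is closest to target.

    `larger_team` maps SteamID64 -> rating. `target` is the F11 target transfer
    rating, computed elsewhere; its formula is not reproduced here.
    """
    if not larger_team:
        raise ValueError("larger team is empty")
    return min(larger_team, key=lambda sid: abs(larger_team[sid] - target))
```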
Adversarial behavior such as intentionally throwing is self-defeating. A suppressed rating causes GPE to place the player against weaker opponents, whom they then defeat, and alongside stronger teammates, who help them win. Both effects push the rating back up, so the system self-corrects.