Wed. June 14th:
This essay will focus on the balancing philosophy expressed in Overwatch. It will analyze whether the Overwatch team’s apparent goal of balance in all skill brackets simultaneously is achievable, and make a few suggestions to this end.
A Philosophical Problem
Before one evaluates the success and failure of specific Overwatch patches, one must establish a clear value–a metric by which to gauge the degree to which a change makes the game better or worse.
Jeff Kaplan, in an AMA three months ago, described the Blizzard approach as a ‘triangle’. “I feel like there are 3 key factors that guide us: The players, statistics and… us… our own feelings as players.” He continued on to add that “Internally, we have a ‘competitive’ playtest that’s helpful to get good feedback from Diamond+ players who work here […] None of this is perfect… but we try hard to listen to feedback and keep the game balanced.”
Ultimately, the system Jeff describes here (also confirmed by other Dev posts on the battle.net forums) is one that seeks to achieve relative balance throughout the skill spectrum. All three points of his triangle reflect this reality: player feedback, developer intuition, and even statistics to some extent abstract away from player skill. Keeping a sharp eye on professional pickrates would be revealing, but it isn’t clear that this is happening. The notion of balance-for-all seems nice enough prima facie, but further analysis reveals a considerable challenge to successfully implementing this broad balance goal.
This fundamental challenge is skill curve differential. Different heroes in Overwatch have remarkably different rates of return on skill growth investment; this is to say that they have significantly distinct skill curves. I use ‘skill curve’ here to mean the rate at which performance (i.e. game impact) increases with constant skill growth.
To illustrate the skill curve differential problem, consider two heroes: Genji and Junkrat. Now consider two players for each of these heroes: one in the 10th percentile of skill (worse than 90% of players) and one in the 90th percentile of skill (worse than just 10% of players). The 90th percentile Junkrat is certainly more impactful than the 10th percentile Junkrat player, but the gulf between the 90th and 10th percentile Genji players is vastly larger. The 10th percentile Genji player is a glorified rock-slinger. Unable to consistently leverage dash resets or find high value reflects, he or she has far less game impact than the 10th percentile Junkrat player. When we reach the right tail of the skill distribution, however, exactly the opposite situation holds. Against strong opponents, Junkrat lacks high-level outplay options and is ultimately left to punish misplays or exploit weak links in the opposing team. This is why, at the very highest levels of professional play, he is essentially unplayable outside a very small niche. For our high level Genji player, it is a different story. The design of the character yields exponential gains in game impact from skill growth: as accuracy, speed, and aggression increase, so do mobility and longevity in a positive feedback cycle.
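The shape of this argument can be sketched with a toy model. Everything in it is an assumption made for illustration: the two curve functions, their coefficients, and the 0-to-1 skill and impact scales are invented, not measured from the game. A shallow straight line stands in for Junkrat (high floor, low slope) and a convex curve stands in for Genji (low floor, accelerating returns).

```python
# Toy skill curves. All numbers are made up for illustration only.
# skill is a percentile expressed as a fraction in [0, 1];
# impact is an arbitrary "game impact" score.

def junkrat_impact(skill: float) -> float:
    """Hypothetical shallow curve: useful immediately, grows slowly."""
    return 0.35 + 0.25 * skill


def genji_impact(skill: float) -> float:
    """Hypothetical convex curve: weak at first, accelerating later."""
    return 0.10 + 0.90 * skill ** 2


def crossover(lo: float = 0.0, hi: float = 1.0, tol: float = 1e-6) -> float:
    """Bisection search for the skill level where the two curves meet.

    Assumes Junkrat is ahead at `lo` and Genji is ahead at `hi`,
    which holds for the toy curves above.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if genji_impact(mid) < junkrat_impact(mid):
            lo = mid  # Junkrat still ahead, crossover is to the right
        else:
            hi = mid  # Genji already ahead, crossover is to the left
    return (lo + hi) / 2


if __name__ == "__main__":
    for pct in (0.10, 0.90):
        print(f"skill {pct:.0%}: Junkrat {junkrat_impact(pct):.3f}, "
              f"Genji {genji_impact(pct):.3f}")
    print(f"curves cross near skill percentile {crossover():.1%}")
```

With these invented curves the ordering flips somewhere between the two percentiles: Junkrat is ahead at the 10th and Genji at the 90th. Only that flip matters here; the exact crossing point is an artifact of the chosen coefficients.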
Every character has a skill curve of some slope, that is, there is no hero which can honestly be said to require ‘no skill’. However, one can see the skill curve differential problem even embedded in core hero statistics. Ana and Mercy are both powerful single target healers (comparing two different heroes will always be comparing apples to oranges, but hopefully this example is nonetheless illustrative).
Heal Rates: (source: owinfinity.com)
ANA: 75 healed per shot * 1.3 shots per second * (% accuracy) = Effective Heals per second
MERCY: 60 = Effective Heals per second
These Effective Heals per second values equalize when the Ana player’s accuracy reaches ~61.5%. That is to say that an Ana player with accuracy lower than that value will heal less per second than a Mercy, and an Ana with higher accuracy will heal more per second. The point of equalization isn’t particularly important, but the fact that Ana is able to do her central job as a healer (healing) faster and more reliably the higher her accuracy gets reveals that her skill curve is steeper than that of Mercy. This cuts both ways; at the far left tail of the skill distribution (where % accuracy values are generally much lower) Mercy outputs more heals per second than Ana. Mercy has important decisions to make in order to maximize her survivability, but Ana’s self defense options are no less complex or skill-dense (in fact they, like her healing rate, are significantly more responsive to skill increases).
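The break-even figure follows directly from the two heal rates quoted above. Here is a minimal sketch of that arithmetic, using only the essay's numbers (75 per shot, 1.3 shots per second, 60 per second for Mercy); the function and constant names are invented for the example.

```python
# Heal-rate figures as quoted in the essay (source given there: owinfinity.com).
ANA_HEAL_PER_SHOT = 75.0
ANA_SHOTS_PER_SECOND = 1.3
MERCY_HEAL_PER_SECOND = 60.0


def ana_effective_hps(accuracy: float) -> float:
    """Ana's effective heals per second at a given accuracy (0.0 to 1.0)."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be a fraction between 0 and 1")
    return ANA_HEAL_PER_SHOT * ANA_SHOTS_PER_SECOND * accuracy


def break_even_accuracy() -> float:
    """Accuracy at which Ana's effective heal rate equals Mercy's."""
    return MERCY_HEAL_PER_SECOND / (ANA_HEAL_PER_SHOT * ANA_SHOTS_PER_SECOND)


if __name__ == "__main__":
    print(f"break-even accuracy: {break_even_accuracy():.1%}")
    for acc in (0.40, 0.60, 0.80):
        print(f"Ana at {acc:.0%} accuracy: {ana_effective_hps(acc):.1f} HPS "
              f"vs Mercy {MERCY_HEAL_PER_SECOND:.1f} HPS")
```

Ana's output is a straight line through zero in accuracy while Mercy's is flat, so the two rates meet at exactly one point: 60 / 97.5, or about 61.5%.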
Ana and Mercy, Genji and Junkrat: contrasting these pairs reveals the central difficulty of simultaneously satisfying players across the entire skill distribution. Professional players lament that Junkrat is meme-trash-tier in organized competitive play while he simultaneously reigns as the uncontested King of Brawl Winnin’ and The Silver Division. Ana’s winrate, meanwhile, steadily climbs with skill tier from a tragic 38.9% in Bronze to a respectable 51.9% in Grandmaster.
Nowhere are the consequences of the skill curve differentials more apparent than when comparing Ranked Matchmaking (of any level) to organized professional play (hereinafter ‘eSports’). Mercy, statistically speaking, performs well (above 50% winrate) all the way up to the top few percentiles of Ranked Matchmaking with a remarkably high pick rate. In professional play, she goes virtually untouched outside of the Pharah + Mercy combo.
This difference is a consequence of an added challenge of balancing popular online games that are also eSports. True coordination (in composition choice and game style) radically changes the way the game is played. Because Resurrect is fundamentally reactive, high level teams will often simply not allow an unsupported Mercy to garner value from her ultimate. She will be hunted by flanking DPS while the rest of the team intentionally staggers kills or saves ultimates to reduce the effectiveness of any Resurrect the Mercy is able to cast. As someone who has spent every season of Ranked Matchmaking at the very highest level of play, I can attest that these sorts of plays are rarely if ever made in Ranked, regardless of the skill level of players on either team. I contend that an important reason Ana is so weak in low-tier play is that she demands coordinated protection to fully leverage her abilities (coordination that is virtually nonexistent at low level play). Likewise, Mercy is incredibly punishing of undirected or uncoordinated play. Fail to hunt her down at the proper time or forget to save a key ultimate to counterplay Resurrect and a teamfight is quickly lost.
So what can we do? How can Overwatch feel fresh and full of optionality in an eSports context while also remaining balanced and enjoyable to play for those further to the left on the skill distribution?
Skill curve differential isn’t going anywhere, and in my opinion it shouldn’t. Blizzard intelligently marketed Overwatch much more widely than the traditional first person shooter target audience. This wasn’t just a marketing strategy though; the game design purposely features heroes, for instance Mercy, that aren’t so demanding of traditional arena shooter skills and rather allow positioning and decision making to determine game impact. In the long run, I think that this is a good thing. Purity is the enemy of innovation while community stagnancy is in direct opposition to promotion to a wider audience (something absolutely critical to achieving a public perception of legitimacy for eSports and even gaming as a hobby).
The only important question that remains is how to rise to the challenge of balancing for diverse skill tiers simultaneously. The approach that I’d like to see taken more often is the differentiation of mechanical changes and statistical changes.
Sometimes a number gets into the game that is simply broken. Bastion’s 35% value for his Ironclad passive springs to mind as a classic example of “utterly fucking busted”. Sometimes a character just doesn’t have the stats to compare favorably against his/her/its closest substitutes; pre-buff Soldier 76 is a good example. I don’t have date-accurate statistics for the strength of these heroes across skill tiers, but I contend that pre-buff Soldier 76 was probably too weak at every point on the skill distribution and pre-nerf Bastion was probably too strong at every point on the skill distribution. For these kinds of across-the-board balance issues, statistical adjustments are warranted as they will have similar impacts on players of all skill levels.
These are the easy variety of balance problems. For the more complex varieties, a mechanical change in isolation or a combination of mechanical & statistical changes is necessary.
A strong example of a very good combination buff is the recent (live) patch to Hanzo. Hanzo felt a little too weak across the board, but at a high level aggressive compositions came to render him nearly obsolete. The 10% charge time buff to Hanzo is significant, but I would argue that even more impactful for high level players is the ability to hold a charged arrow while wall-climbing and to spawn your Dragonstrike early if the arrow collides with a wall. These changes make the space of options for Hanzo players significantly wider and enable much more aggressive and independent play. However, this kind of freedom doesn’t aid those who aren’t ready to use it. The change in totality made Hanzo players of all skill levels slightly stronger but had a significantly greater impact on expert players who can most creatively leverage the new mechanics. Widening the space of options doesn’t make a big difference to players who weren’t already pushing the boundaries of how a hero can be effectively played.
We can use this same mechanical vs statistical differentiation to better examine the past Genji nerf that removed his ability to triple jump in one continuous airtime via wall climbing. For low tier players who weren’t even aware of this possibility, the change had virtually zero impact. For high tier players who were exploiting it as often as possible to maximize mobility and survivability, the change had real consequences to Genji’s overall strength and playability. My honest assessment of the Overwatch development team is that they never thought about these differential effects and instead saw the triple jump as just an unintended bug to be patched out. The ledge-dash-super-jump mechanic was probably thought of similarly, and patching it out only really affected the few hundred (I doubt it was really this many) players who could hit it reliably enough to implement it as part of their play style. The important lesson here is that these pure-mechanical patches had radically different impacts on players of different skill levels.
These two examples provide a powerful blueprint for the formulation of balance adjustments that demand different impacts upon different skill tiers:
If a hero is in a good place for low-skill players but too weak for high-skill players: widen the option space by loosening mechanical restrictions and let creativity and talent shine through as increased game impact by high-skill players.
If a hero is popular and strong in the hands of high-skill players but a bit weak when used by newer players, combine a statistical buff with a restriction of option space. Make the hero more narrowly defined and yet more powerful within that narrow role. This variety of change must be done most carefully, though, as elite players will always seek to exploit any statistical buffs to their maximum potential even if it requires playing the hero in a radically different way (see the most recent attempt at nerfing Lucio).
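The blueprint, together with the earlier rule for across-the-board problems, amounts to a small lookup from where a hero is struggling to the kind of change to try. The sketch below is one way to write that down; the `Strength` enum, the `suggest_change` function, and the returned strings are all invented for illustration, and cases the essay does not discuss are reported as such rather than guessed at.

```python
from enum import Enum


class Strength(Enum):
    """Hypothetical coarse rating of a hero within one skill bracket."""
    WEAK = "weak"
    FINE = "fine"
    STRONG = "strong"


STATISTICAL = "statistical adjustment (same direction for everyone)"
WIDEN = "mechanical change: widen the option space"
NARROW_AND_BUFF = "statistical buff combined with a narrower option space"
NO_RULE = "no rule given in the essay for this case"


def suggest_change(low_skill: Strength, high_skill: Strength) -> str:
    """Map a (low-skill, high-skill) rating pair to the essay's suggested approach."""
    # Too weak or too strong at every skill level: adjust the numbers.
    if low_skill == high_skill and low_skill is not Strength.FINE:
        return STATISTICAL
    # Fine for newer players, too weak for experts: loosen mechanical restrictions.
    if low_skill is Strength.FINE and high_skill is Strength.WEAK:
        return WIDEN
    # Strong for experts, weak for newer players: buff stats, restrict options.
    if low_skill is Strength.WEAK and high_skill is Strength.STRONG:
        return NARROW_AND_BUFF
    return NO_RULE


if __name__ == "__main__":
    print(suggest_change(Strength.WEAK, Strength.WEAK))
    print(suggest_change(Strength.FINE, Strength.WEAK))
    print(suggest_change(Strength.WEAK, Strength.STRONG))
```

Reducing each bracket to a single three-way rating is a deliberate simplification. A real assessment would presumably look at pick and win rates across many tiers, not two.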
That’s the theory, but here are my resultant suggestions for real balance changes. Feel free to leave feedback on the article as a whole or just these ideas! My twitter is @jake_overwatch 🙂
Bastion is a worse choice than Soldier 76 virtually 100% of the time in high level play, but has a comfortable niche at median and below skill. Remove the self-stun upon Tank Transformation and when returning to Recon mode to allow for more aggressive initiations and the option to use Tank Form as effective counterplay in a fight. Also remove or adjust the Self-Repair animations that block the crosshair (that shit’s just annoying, yo). Average players will play just like they always have, but those smart enough to leverage these adjustments into a much more aggressive style will reap the rewards.
Widowmaker has felt incredibly map-dependent across the skill spectrum even after her charge-time buff. Decrease Hookshot cooldown (I suggest by 2-3 seconds) to increase mobility and escape options versus the dive composition that has come to define the meta. It is very dangerous to buff this hero with pure DPS, but giving her a slightly less narrow role might help her pick and win rates with skilled players.
Junkrat is an effective spammer that applies a ton of pressure to slow team compositions. His ultimate is reasonably effective against newer players but rarely finds sufficient utility in high level play to justify what is very often a suicide play. Give the Rip Tire a new ability (activated with whatever key is bound to Ability 1) that allows it to hop into a drift (yes I do mean cart-racer style) with a short cooldown. This will give stronger players options to bait out counterplays and reasonably juke players with moderate aiming skill while being difficult to abuse by those lower-tier players that don’t have a precise understanding of which counterplays they need to bait and which enemies they need to juke.
(maybe I want to roleplay Junkenstein)
7 thoughts on “The Fundamentals of Balance”
I think you forgot to mention something very critical, which Jeff was also talking about in terms of balancing.
He talked about how dangerous “perceived balance” would be. How devastating it would be for heroes to be deemed bad by the community, just because.
We had the same with Ana, who was viewed as bad by so many players, when she first came out. Then she got buffed. Then everyone watched great players git gud at Ana and show how incredibly powerful she was. And then she had a 110% pick rate and a 169% win rate across all rankings. And then people cried for nerfs.
I think that’s why Blizzard is so scared to make major adjustments to Sombra. She feels weak with her garbage ass damage, but her utility is just so insane if she is played in a coordinated environment. The ability to disable the abilities and ultimates of multiple enemy heroes during a team fight… That is absolutely insane. It provides so much more value in terms of additional damage, damage reduction, straight up tanking, even healing and so on… Just with the press of the Q button in the right moment.
Simply breathtaking analysis.
There’s a much simpler and better solution: balance for the highest level of play. If a hero is disproportionately strong at the highest level of play, the only way to fix that is either a balance change or the discovery of previously unknown tech that beats them (or, in other words, discovering that they weren’t actually that strong to begin with.)
Conversely, if a hero is too hard to fight against or too hard to play effectively at a low level, there is always a way to fix that: just get better at the game!
What you ignore is that for many people, you don’t “just get better at the game.” The vast majority of players do not put in the same time a professional player puts in. And most of those can’t even put in as much as the top tier non-pro players.
Just because someone plays the game more casually doesn’t mean that they deserve a worse experience. They deserve consideration just like any other player. Neither space should be wholesale sacrificed for the other.
The only balance a casual player recognizes is “the thing that just killed me is bullshit,” which has next to nothing to do with actual balance unless something is monstrously overpowered across the entire spectrum of the game. They’re not going to have a worse experience just because the actual balance of the game is better.
Intermediate players are another story, but intermediate players almost by definition are players who know enough to try and imitate pros: hero winrates in plat to diamond, for instance, basically reflect the winrates in GM, just less drastically. Ideal balance for top-level play will generally result in good-enough balance for them, and I say this as a ~3k player myself.
I have to agree entirely with this article. My higher win % has not translated to a higher SR rating. My quick play hidden SR seems to be much more accurate as I’ve played many more games in this mode to slowly work past the SR hump I used to be in. It’s sad when my quick play teams are far better than my comp play teams.
In competitive, despite a 55-60% win ratio, my SR slowly slid down. At the same time many friends with a 40% win ratio would slowly rise. I couldn’t comprehend this and so became frustrated with competitive play and eventually quit it.
I would fully support a pure win weighted SR as it would more accurately reflect my contributions to winning, not how amazing I was with a particular hero compared to everyone else.
Great analysis. One area I think you miss is how easy a hero is to counter. Symmetra is a good example of this. Early parts of her skill curve go up very quickly, but you reach a plateau after which improving becomes much harder, and I feel like perfecting elements of her play at the highest level actually requires immense knowledge and good judgement. She might still need that skill differentiation, but I personally don’t think so.
However, as long as major pieces in her kit are nearly trivial to counter with a bit of coordination and the right heroes, she’ll never see top tier play. Top tier teams can easily work around her setup, and also know to switch to Winston, D. Va, or Pharah, leaving Symmetra with virtually no recourse for counterplay. However, at lower ranks where this kind of switching and coordination is rarer, Symmetra’s power is extremely high.
In these cases, I think the target of the change should be to reduce her vulnerability to her counters, but also considering a small nerf to her potency. This makes her less overwhelming when people do not counter her while making her more powerful when her opponents are making an effort to counter her.
My suggestion for this would be to increase her Sentry Turret’s health to 50, increase their range to 20, and reduce their damage by a small amount (I’m unsure of the number needed). The non-trivial health would normalize the turrets to a more consistent difficulty to clear. The extra range makes them more of a threat to close range heroes with mobility who currently out range them easily. The damage nerf makes them a little less punishing to everyone who now has a more normal time dealing with turrets instead of an easy time.