A Thing That Most People Who Read Stew's Letter Already Know But I Felt Compelled To Articulate Again

It wasn’t that long ago that cutting-edge video games looked like this:

Donkey Kong (1981)

A few years later, they looked like this:

Super Mario Bros. (1985)

Then this:

Super Mario 64 (1996)

Then this:

Super Mario Sunshine (2002)

And now this:

Super Mario Odyssey (2017)

It’s official: video game graphics no longer suck. And the same is true of nearly every popular application that we’ve historically used computers for. Simple tools like Microsoft Paint have evolved into far more capable products like Adobe Photoshop. Google Maps is vastly better than any consumer mapping software from 15 years ago.

Up until the past couple of years, I figured that was pretty much the future of progress: computers would keep getting better and better at the stuff they were already used for. Occasionally, a product might come along that seemed entirely “new,” but it was most likely just a clever combination of existing technology (like the early version of Siri). Computers would get faster and cheaper, but at the core, I thought, they’d still have the same basic capabilities.

It turns out, though, that we recently crossed some threshold of computational-power-per-dollar that has enabled computers to do things that aren’t just exponential improvements to the stuff they could already do. Instead, computers are starting to do entirely novel things that lack a historical precedent.

Instead of rendering ever-better video game graphics, for example, a program can now learn how to play complex video games without any prior knowledge and destroy the best human players within a trivial amount of time.

An artificial intelligence program defeats some of the world’s best Dota 2 players in one-on-one matches. It essentially knew nothing about the game when it first started playing. (2017)

One of the important new things that computers can do is “learn” from enormous amounts of data. Historically, teaching a computer what The Incredible Hulk looked like or how to beat somebody at chess required writing out a bunch of complex rules for a program to follow - and oftentimes a program following those rules performed worse than a toddler at the same task.

Today, we can show a computer a bunch of photos of The Incredible Hulk and it will start to “understand” the general characteristics of what makes The Hulk different from other stuff. Then, when it’s shown a photo it’s never seen before, it will be able to make a highly accurate guess as to whether The Hulk is in it or not. It can learn this skill in seconds.
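For the curious, here is a toy sketch of what “learning from examples” can look like in code. It is not how real image recognition works (that takes neural networks and far more data); it is a bare-bones nearest-centroid classifier, and everything in it is made up for illustration: the four-pixel “photos,” the color values, and the function names. The point is only that nobody writes a rule saying “The Hulk is green.” The program works that out from the labeled examples.

```python
# Toy "learning from examples": a nearest-centroid classifier.
# Each "photo" is a hypothetical list of (r, g, b) pixels; all values are invented.

def mean_color(image):
    """Average (r, g, b) over all pixels of one image."""
    n = len(image)
    return tuple(sum(pixel[c] for pixel in image) / n for c in range(3))

def train(labeled_images):
    """Compute one average color (centroid) per label from labeled examples."""
    sums, counts = {}, {}
    for image, label in labeled_images:
        feat = mean_color(image)
        prev = sums.get(label, (0.0, 0.0, 0.0))
        sums[label] = tuple(p + f for p, f in zip(prev, feat))
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(s / counts[label] for s in sums[label]) for label in sums}

def predict(centroids, image):
    """Return the label whose centroid is closest to this image's average color."""
    feat = mean_color(image)

    def dist2(label):
        return sum((a - b) ** 2 for a, b in zip(feat, centroids[label]))

    return min(centroids, key=dist2)

GREEN, DARK_GREEN = (30, 200, 40), (20, 150, 30)
RED, BLUE = (200, 30, 30), (30, 40, 200)

training = [
    ([GREEN] * 4, "hulk"),
    ([GREEN, GREEN, DARK_GREEN, GREEN], "hulk"),
    ([RED] * 4, "not hulk"),
    ([BLUE, RED, BLUE, BLUE], "not hulk"),
]

model = train(training)

# A "photo" the program has never seen before: a slightly different green.
unseen = [(40, 180, 60)] * 4
print(predict(model, unseen))
```

Swap in different training examples and the same code learns a different distinction, which is the whole trick: the rules come from the data rather than from the programmer.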

If that doesn’t blow your mind, consider some research published by Google’s DeepMind this month. Researchers were able to create an AI program that could look at a single 2D image it had never seen before and “imagine” what the same scene would look like from different vantage points.

Using a single 2D image that it has never seen before (left), an artificial intelligence program predicts the properties of the surrounding area (right). (2018)

That is nuts. And it’s just a preview of what’s coming.

The DeepMind research in particular sheds light on how a “uniquely” human attribute - in this case, the ability to infer depth and space from a stationary view of a scene - can now be mimicked by a computer, albeit crudely. 

In order to get a computer to learn, we must represent specific aspects of the world as data that it can interpret. The thing I under-appreciated until recently is just how many of our senses and skills can be represented as data:

  • Sense of vision -> an array of various color values
  • Written language -> sequences of characters
  • Voice -> waveforms
  • Certain emotions -> changes in pupil dilation that can be easily measured

Those are just a few wildly over-simplified examples. There's a far wider and more nuanced range of human attributes that can be quantified in fairly precise data structures.
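To make the first three items in that list concrete, here is roughly what they might look like inside a program. This is a minimal sketch with invented values: the 2x2 image, the sample rate, and the 440 Hz tone are assumptions chosen for illustration, not anything from the research mentioned above.

```python
import math

# Vision: a tiny 2x2 image as rows of (red, green, blue) values, each 0-255.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

# Written language: a sequence of characters, stored as Unicode code points.
text = "Hulk"
code_points = [ord(ch) for ch in text]

# Voice: a waveform, i.e. air pressure sampled at regular intervals.
SAMPLE_RATE = 8000  # samples per second (assumed for illustration)
FREQUENCY = 440.0   # a pure tone standing in for a voice (assumed)
waveform = [
    math.sin(2 * math.pi * FREQUENCY * t / SAMPLE_RATE)
    for t in range(80)  # 80 samples = 10 milliseconds of sound
]

print(code_points)   # [72, 117, 108, 107]
print(len(waveform))  # 80
```

Once a sense is just a list of numbers like these, it can be fed to the same kind of learning program as any other list of numbers.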

Now that we have machines capable of learning from this data, it seems likely that we are entering into an unprecedented moment in history. Our computers won't just be getting better at their old tricks; they'll be wrapping their tentacles around more and more previously-out-of-reach domains.

This isn't the essay where I say whether that's terrifying or exciting or both. This is where I say something obvious, but worth repeating:

We are alive during a time of ludicrous technological change.