Natural (Language) Interfaces


This blog post is not about Siri, sorry.

I remember when the best way to control a games console was like this:

But over the years, controllers started to look like this:

From a single red button to a plethora of buttons, triggers, D-pads, joysticks, joysticks which act as buttons and switches, it’s no wonder that there was a bit of a “revolution” when this hit the market:

But everyone has been a little fascinated with this for the last couple of years. And not surprisingly – this is one of the interfaces that we use to control the world. It seems natural to use it for direct manipulation.

And despite the fact that the hardware is obviously capable of it, games designers haven’t been making use of one of the other obvious interfaces. One that we humans excel at.

I don’t mean using a headset to bark commands at team-members; I mean using defined commands to instruct a game element. Yes, these games exist (Shouter being one of the most well-known) but the sophistication is low.

What I’m looking for is the difference between Newton and Palm, but in terms of voice. Newton tried to recognise your handwriting while Palm made you learn a certain alphabet. For games, at this stage, we need to create a basic control set that can be easily recognised by a language processor. Whether that is in understanding actual words or whether it is mapping wave patterns – it doesn’t matter. The point is to use our voice to control games.

The instructions can be short, they can be words, they can be screams and cries. When I call “Retreat”, my units should start to retreat back to base, making a tactical withdrawal. When I order “Advance”, they should use cover and opportunity to advance upon the enemy position. And when I shout “Charge”, you get the idea.
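As a rough sketch of what such a basic control set might look like in code: the snippet below assumes some speech recogniser has already turned the shout into text, and only accepts an utterance that is exactly one known command word. The names `Order`, `VOCABULARY` and `parse_order` are made up for illustration, and the three commands are simply the ones mentioned above.

```python
from enum import Enum
from typing import Optional


class Order(Enum):
    """Hypothetical unit orders, taken from the examples in the post."""
    RETREAT = "retreat"
    ADVANCE = "advance"
    CHARGE = "charge"


# Assumed fixed vocabulary: one spoken word per order (the "Palm" approach).
VOCABULARY = {order.value: order for order in Order}


def parse_order(transcript: str) -> Optional[Order]:
    """Return an order only if the whole utterance is a single known command."""
    # Normalise case and drop surrounding whitespace and shouted punctuation.
    word = transcript.strip().strip("!.?").lower()
    # Anything outside the vocabulary (including full sentences) is ignored.
    return VOCABULARY.get(word)


if __name__ == "__main__":
    for heard in ["Retreat", " CHARGE! ", "hold on a second"]:
        print(repr(heard), "->", parse_order(heard))
```

Matching the whole utterance rather than searching for a keyword inside it is a deliberate choice here: ordinary conversation that happens to contain a command word is not treated as an order.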

(images not used with permission)

0 thoughts on “Natural (Language) Interfaces”

  1. Sorry, but bad idea…

    First, it was tried with a Star Trek game, I think it was Bridge Commander, but don’t quote me. It was a novelty and largely ignored because physical controls were much more responsive.

    Secondly, you have to consider the environment in which people play games. Could you handle a few kids, each screaming at the TV in their bedrooms? Or how about a partner at 1am? Or would you appreciate a friend trolling you by shouting charge, while you should be retreating? Bottom line is voice is social, but gaming is largely an insular experience – even multiplayer games (it’s still YOUR experience).

  2. Hm, not sure basing a 2012-era interface on an experience you had in 2002, using 2000-era technology, is a valid comparison.

    The voice commands are an alternative, an addition to an interface, not a replacement.

    The one thing that frustrates me with “gamers” is their inability to articulate. There are few gamers who actually use the mic on their computer at all. Even when they are being chowed down on by a horde of zombies, we’re expected to notice a little line of text?

    I think your observations are archaic. I appreciate the point about social vs insular but, again, I disagree. Gaming for me is a social experience (apart from it being about the only time I talk to my bro).

  3. Totally disagree, and I think you just proved my point – It’s about the only time you talk to your brother.

    X: “Hold position…Flank Left”
    Y: “Wedge formation”
    iPhone: “beep!”
    X: “Sounds like your iPhone needs CHARGEd…”
    X: “CRAP! RETREAT RETREAT!”

    If your voice is taken up controlling a game then it will kill the social aspect. I remember being a teenager, sat around a console with friends – if you die you pass the controller to the left. There were between 3 and 5 of us in the room. They chatted away, or were sometimes backseat drivers. What happens in that situation? Nobody else in the room is allowed to speak?

    If you look at it in terms of sheer physics, with a physical controller you pass electrons down a wire, with zero interference between your responses and the game world. Voice, on the other hand, adds a new point of failure. Your vocal response has to travel through a medium full of noise and interference before it is interpreted by the game world.

    To make voice as reliable and precise as a physical controller, a player would either have to 1) wear a headset with a noise-cancelling microphone, or 2) have hardware/software with an omni-directional microphone that could differentiate between voices with zero latency – which ain’t going to happen anytime soon.

    Nah, I’ll stick with the electrons, occasional elbow-nudges and ability to talk to friends without worrying I’ll inadvertently say my last words.
