At-a-glance functions for modelling utility-based game AI

One of my current projects involves developing a decision-making AI for autonomous characters (i.e. NPCs) in a computer game.

The idea is pretty straightforward: instead of switching between a finite set of states based on simple triggers (e.g. if Pacman has eaten a power pill then “run away”, else “chase”), each character constantly assesses the actions available to them in their current environment, and assigns a utility (i.e. the benefit) to each of those actions on a continuous scale. Then, being rational agents, each character simply chooses to perform the action that provides them with the greatest utility. I’m no psychologist, but that seems like a reasonable model of human behaviour to me.

For example, when a character is already at full health, the utility from picking up a health-giving medikit is zero, so you’d never expect the character to perform that action. As the character’s health diminishes, the relative utility of the medikit increases, but in what way? Is it a linear scale?

image

Or would a character remain relatively content with 90%, 80%, or 70% health, but then feel increasing urgency to look for medikits when their health drops under 30%, say? More like this:

image

Likewise, consider how the utility of searching for an ammo box changes depending on how many bullets the character already has, or how the utility of running away varies with the relative strength of an opponent’s weapon.

When deciding how to assign utility to different actions, I find it helpful to refer to the following graphs and consider which function best describes how I think a rational human player would determine the utility of that action, based on different environment variables. Remember, because this is artificial intelligence, there are no “correct” answers. What’s nice is that you can swap different models in and out until you find something that looks right, without worrying about whether it has sound psychological reasoning.

 

Step Function

If x > 0.5 then y = 1, else y = 0.

This is equivalent to the simple Boolean trigger logic that the Pacman ghosts use: “Has Pacman eaten a power pill? Then definitely run away!”. Note, however, that the utility assigned to this action when the condition is true doesn’t have to be 100%. It’s possible to step up in stages, or to step up only a certain amount, so that other actions may still have greater utility even when the condition is true.

image
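
As a rough sketch of the step idea in Python (the function names, the default threshold of 0.5, and the example stage values are my own assumptions for illustration), a partial step and a staged step might look like this:

```python
def step_utility(x, threshold=0.5, low=0.0, high=1.0):
    """Return `high` when x exceeds `threshold`, otherwise `low`."""
    return high if x > threshold else low


def staged_utility(x, stages):
    """Step up in stages.

    `stages` is a list of (threshold, utility) pairs sorted by threshold.
    The utility of the highest threshold that x exceeds is returned,
    or 0.0 if x exceeds none of them.
    """
    utility = 0.0
    for threshold, value in stages:
        if x > threshold:
            utility = value
    return utility


print(step_utility(0.7))            # full step up to 1.0
print(step_utility(0.7, high=0.6))  # only steps up to 0.6
print(staged_utility(0.7, [(0.25, 0.3), (0.5, 0.6), (0.75, 1.0)]))
```

Capping `high` below 1.0 is what lets another action still win even when the condition is true.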

 

Linear Function

y = mx + c

Remember that you can change both the gradient (m) and the intercept (c), but a given increase in the underlying variable will always produce the same change in utility (m per unit of x). To give the line a downward slope, set m to be less than 0.

image
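
Here is a minimal sketch of the linear case. The `clamp` helper and the medikit example (health normalised to the range 0 to 1) are assumptions of mine, not something the formula requires:

```python
def linear_utility(x, m=1.0, c=0.0):
    """Utility that changes by m for every unit increase in x."""
    return m * x + c


def clamp(value, low=0.0, high=1.0):
    """Keep a utility inside [low, high] so it stays comparable with others."""
    return max(low, min(high, value))


# Hypothetical example: medikit utility falls as health (0..1) rises,
# so the gradient m is negative and the intercept c is 1.
for health in (1.0, 0.5, 0.25):
    print(health, clamp(linear_utility(health, m=-1.0, c=1.0)))
```

Clamping is optional, but it stops a steep gradient from producing utilities outside the range the other curves use.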

 

Increasing rate of increase (i.e. accelerating, power-law increase)

y = x^a where a>1

As the independent variable increases, the marginal utility grows: each further increase adds more utility than the last.

image

 

Decreasing rate of increase (i.e. root-like increase with diminishing returns)

y = x^a where 0 < a < 1

When the independent variable is small, a little increase leads to a big increase in utility from that action. As the independent variable gets larger, the marginal utility becomes smaller and smaller.

image
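
The two curves y = x^a with a > 1 and with 0 < a < 1 differ only in the exponent, so one function can cover both. This is a sketch that assumes x has already been normalised to the range 0 to 1 (my assumption, which keeps the result in that range too):

```python
def power_utility(x, a):
    """y = x^a for x in [0, 1].

    a > 1     -> slow start, then accelerating increase
    0 < a < 1 -> fast start, then diminishing returns
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x is assumed to be normalised to [0, 1]")
    if a <= 0:
        raise ValueError("a must be positive")
    return x ** a


for x in (0.25, 0.5, 0.75, 1.0):
    print(x, power_utility(x, 2), power_utility(x, 0.5))
```

Both versions pass through (0, 0) and (1, 1), which makes them easy to swap for one another while experimenting.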

 

Exponential Decay

y = a^x where 0 < a < 1

When the independent variable is small, a little increase leads to a substantial decrease in utility. As the independent variable increases, each further decrease in utility becomes smaller.

image
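
A small sketch of exponential decay, printing how much utility is lost at each step so the shrinking drops are visible. The default base of 0.5 is just an illustrative choice:

```python
def decay_utility(x, a=0.5):
    """y = a^x with 0 < a < 1: utility starts at 1 and decays towards 0."""
    if not 0.0 < a < 1.0:
        raise ValueError("a must be strictly between 0 and 1")
    return a ** x


previous = decay_utility(0)
for x in range(1, 5):
    current = decay_utility(x)
    # The third column is the drop since the previous step.
    print(x, current, previous - current)
    previous = current
```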

 

Sigmoid Curve

y = 1/(1+e^x) (or y = 1/(1+e^-x) for reverse)

This gives an S-shaped curve that, as defined above, is centred about x=0, but it is easy to shift so that the middle of the curve (where the gradient is steepest) lies wherever is appropriate.

image
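
One way to shift and stretch the curve is to add a midpoint and a steepness parameter. Both parameter names, and the healing example with health on a 0 to 100 scale, are assumptions for this sketch:

```python
import math


def sigmoid_utility(x, midpoint=0.0, steepness=1.0, rising=True):
    """S-shaped utility between 0 and 1, equal to 0.5 at `midpoint`.

    rising=True  gives y = 1/(1+e^-z); rising=False gives y = 1/(1+e^z),
    where z = steepness * (x - midpoint).
    """
    z = steepness * (x - midpoint)
    if not rising:
        z = -z
    # Two algebraically equal forms, chosen so exp() never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical example: urgency to heal rises sharply once health nears 30.
for health in (90, 50, 30, 10):
    print(health, round(sigmoid_utility(health, midpoint=30, steepness=0.2, rising=False), 3))
```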

 

Frequently, the utility of an action (y) varies not with a single variable (x) as shown here, but with several. For example, the utility of attacking another character may increase linearly with the value of the prize for success, but decrease with the square of that character’s relative strength. That’s no problem: just combine the utilities as:

Utility from attacking = w1(prize to be won) – w2(relative strength^2)

By changing the functions used to calculate the utility of each action, and adjusting the weights (w1 and w2 above), you can create surprisingly sophisticated AI decision-making behaviour using only the handful of functions above.
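
The weighted combination above translates almost directly into code. In this sketch the function name, the default weights, and the example numbers are all made up for illustration:

```python
def attack_utility(prize, relative_strength, w1=1.0, w2=1.0):
    """w1 * (prize to be won) - w2 * (relative strength)^2."""
    return w1 * prize - w2 * relative_strength ** 2


# Same situation, different weights: raising w2 makes the character
# care more about the opponent's strength than about the prize.
print(attack_utility(prize=10, relative_strength=2))          # 10 - 4
print(attack_utility(prize=10, relative_strength=2, w2=3.0))  # 10 - 12
```

A negative result simply means attacking is worse than doing nothing, which is a useful signal when it is compared against the other actions.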

 

Choosing the Action with the Greatest Utility

As an example, suppose that your computer-controlled agent was capable of three actions: attack, heal, or reload, and that you had chosen utility curves for those actions based, respectively, on the perceived enemy strength, current health, and number of bullets held, as follows:

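[Three graphs: utility of attacking against enemy strength, utility of healing against health, utility of reloading against bullets held]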

We can now determine what the “best” course of action would be for the character at any point in time by plugging in the current values of strength, health, and bullets and reading off the associated utility of each action. For example, starting from this situation:

  • Enemy Strength = 6   ==> Utility of Attacking = 40
  • Health = 70          ==> Utility of Healing = 14
  • Bullets = 5          ==> Utility of Reloading = 30

The action with the greatest utility is to attack. So, the character starts to attack, and fires off a few bullets. The situation now becomes:

  • Enemy Strength = 6   ==> Utility of Attacking = 40
  • Health = 70          ==> Utility of Healing = 14
  • Bullets = 3          ==> Utility of Reloading = 50

Now, the greatest utility comes from reloading. So the character reloads their gun. Unfortunately, in doing so they take a few hits, damaging their health:

  • Enemy Strength = 6   ==> Utility of Attacking = 40
  • Health = 50          ==> Utility of Healing = 50
  • Bullets = 20         ==> Utility of Reloading = 0

At this point, health has become an issue, and the greatest utility now comes from healing.
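
The selection step itself is tiny. This sketch skips the curves (which only exist as graphs here) and feeds in the utilities read off in the three situations above. The `choose_action` name and the dictionary layout are my own choices:

```python
def choose_action(utilities):
    """Return the name of the action with the greatest utility."""
    return max(utilities, key=utilities.get)


# Utilities read off the curves in the three situations described above.
situations = [
    {"attack": 40, "heal": 14, "reload": 30},
    {"attack": 40, "heal": 14, "reload": 50},
    {"attack": 40, "heal": 50, "reload": 0},
]

for utilities in situations:
    print(choose_action(utilities))
```

In a real game the dictionary would be rebuilt every decision tick by calling each action's utility function with the current game state.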

By tweaking the utility curves, you can create differences in characters’ personalities. For example, a “brave” character would derive greater utility from attacking than a more cowardly character would, so the attacking utility curve for the brave character would be higher than that of the cowardly one. Likewise, a cautious character might place greater emphasis on reloading and healing than on attacking.
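
One simple way to make a curve “higher” for a given personality is to scale its output by a per-action multiplier. That is only one possible approach, and the multipliers and names below are invented for illustration:

```python
def choose_with_personality(utilities, personality=None):
    """Pick the best action after scaling each utility by a personality weight.

    `personality` maps action names to multipliers; actions that are not
    listed keep a multiplier of 1.0.
    """
    personality = personality or {}
    weighted = {
        action: value * personality.get(action, 1.0)
        for action, value in utilities.items()
    }
    return max(weighted, key=weighted.get)


utilities = {"attack": 40, "heal": 14, "reload": 50}
brave = {"attack": 1.5}                   # hypothetical multipliers
cautious = {"heal": 1.5, "reload": 1.5}

print(choose_with_personality(utilities))            # no personality
print(choose_with_personality(utilities, brave))     # attack is boosted to 60
print(choose_with_personality(utilities, cautious))  # reload is boosted to 75
```

Here the brave character keeps attacking in a situation where the others stop to reload, even though all three share the same underlying curves.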


3 Responses to At-a-glance functions for modelling utility-based game AI

  1. Alex Norcliffe says:

    Fascinating post. I’d love to see your thoughts on how to represent the domain itself (agents, actions & utility) if you’re using OO that is
