There are many possible ways to judge the intelligence of a potential AGI. However, there are two criteria for intelligence that matter the most.

1. Is the AGI intelligent enough to destroy humanity in the pursuit of its goals?

2. Is the AGI intelligent enough to be able to steer the future into a region where no AGI will ever arise that would destroy humanity?

If an Unfriendly AGI reaches the level of intelligence described in (1) before a Friendly AGI reaches the level of intelligence described in (2), mankind perishes.

To paraphrase Glengarry Glen Ross: First prize in the race between FAI and UFAI is universal utopia. Second prize is, everyone dies. (No gift certificate, not even a lousy copy of our home game.)

## Wednesday, March 26, 2008

## Sunday, March 9, 2008

### Making the Best of Incomparable Utility Functions

Suppose there are only two possible worlds: the small world w, which contains a small number of people, and the large world W, which contains a much larger number of people. You currently estimate that there is a .5 chance that w is the real world and a .5 chance that W is the real world. You don't have an intuitive way of "comparing utilities" between these worlds, so as a muddled compromise you currently spend half your resources on activities AW that will maximize utility function UW if W is real, and the other half on activities Aw that will maximize utility function Uw if w is real.

This is not a pure utilitarian strategy, so we know that our activities may be suboptimal: we can probably do better in terms of maximizing our total expected utility. Here is one utilitarian strategy, among many, that usually dominates our current muddled compromise.

Calculate expected utility EU' of a local state s as:

EU'(s) = P(W) U'(s|W) + P(w) U'(s|w)

With P being the probability, and U' being any new utility function that obeys these constraints:

1. U'(s|W), the utility of s if W is real, is an affine transformation of UW.

2. U'(s|w), the utility of s if w is real, is an affine transformation of Uw.

3. If the only activities you could do were AW and Aw, and P(W) and P(w) are both .5, then EU' is maximized when you spend half of your resources on AW and half on Aw.
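One way to see how the three constraints pin down U' is a small numerical sketch. Everything concrete here is an assumption for illustration: the square-root (diminishing returns) shapes of `u_big` and `u_small`, the factor of 1000 versus 10, and the helper names. The sketch only rescales each original utility (an affine transformation with the additive constant set to zero) so that, at equal probabilities, the marginal value of resources is the same in both worlds at the even split.

```python
import math

# Hypothetical utilities as a function of the fraction x of resources spent.
# The diminishing-returns shape and the constants are assumptions.
def u_big(x):
    return 1000.0 * math.sqrt(x)   # stands in for UW

def u_small(x):
    return 10.0 * math.sqrt(x)     # stands in for Uw

def marginal(u, x, h=1e-6):
    """Central-difference estimate of du/dx at x."""
    return (u(x + h) - u(x - h)) / (2.0 * h)

def calibrate(u_w_big, u_w_small):
    """Scale factors that equalize marginal utility at the 50/50 split."""
    return 1.0 / marginal(u_w_big, 0.5), 1.0 / marginal(u_w_small, 0.5)

def expected_utility(x, p_big, a_big, a_small):
    """EU' when a fraction x goes to AW and 1 - x goes to Aw."""
    return p_big * a_big * u_big(x) + (1.0 - p_big) * a_small * u_small(1.0 - x)

def best_split(p_big, a_big, a_small, steps=10000):
    """Grid search for the fraction of resources on AW that maximizes EU'."""
    grid = (i / steps for i in range(steps + 1))
    return max(grid, key=lambda x: expected_utility(x, p_big, a_big, a_small))

a_big, a_small = calibrate(u_big, u_small)
print(best_split(0.5, a_big, a_small))  # close to 0.5, satisfying constraint 3
```

Because these assumed utilities are concave, the point where the marginal values match is the maximum. With linear utilities the optimum would sit at a corner, so constraint 3 could hold only through indifference.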

What have we gained?

1. We now have a way to evaluate the utility of a single action that will help in both W and w.

2. We now have a way to evaluate the utility of a single action that helps in W but harms in w, or vice versa.

3. As future evidence comes in about whether W or w is real, you have a way to optimally shift resources between AW activities and Aw activities.
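The gains listed above can be illustrated with a short sketch. It assumes U' has already been calibrated, so that an action's effect in each world can be written as a change in U' units. The function name `action_value` and the example numbers are hypothetical.

```python
def action_value(p_big, delta_big, delta_small):
    """Expected change in U' from one action.

    delta_big:   change in U'(s|W) if W is the real world
    delta_small: change in U'(s|w) if w is the real world
    """
    return p_big * delta_big + (1.0 - p_big) * delta_small

# Hypothetical action: helps by 3 units if W is real, harms by 4 units if w is real.
at_even_odds = action_value(0.5, 3.0, -4.0)       # negative: not worth doing
after_evidence = action_value(0.8, 3.0, -4.0)     # positive once W looks more likely

print(at_even_odds, after_evidence)
```

The same action flips from harmful to worthwhile as evidence for W accumulates, which is the kind of comparison the muddled half-and-half compromise could not express.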
