1. Inductive Learning as Search: Version Spaces
    1. The Candidate Elimination Algorithm
    2. Pros and Cons

Inductive Learning as Search: Version Spaces

Consider the following examples:

  Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
  1        Sunny  Warm     Normal    Strong  Warm   Same      Yes
  2        Sunny  Warm     High      Strong  Warm   Same      Yes
  3        Rainy  Cold     High      Strong  Warm   Change    No
  4        Sunny  Warm     High      Strong  Cool   Change    Yes

From these examples, can you come up with a description of days on which you enjoy playing your favorite water sport?

Is the answer that you enjoy playing water sports on sunny days?

Is it that you enjoy playing water sports on warm days?

Or is it sunny and warm days?

Or perhaps it's sunny and warm days with strong wind?

In Version Spaces, we look for all answers that are consistent with the examples.

• an incremental learning method

• makes use of positive and negative examples

• will keep track of all possible definitions that are consistent with the examples seen so far (but without storing them all explicitly)

Rather than literally storing all possible answers, we store two sets of information:

G: the most general classification rules consistent with the examples.

S: the most specific classification rules consistent with the examples.


The Candidate Elimination Algorithm

Initialize G to be the set containing the maximally general classification rule: "Every day is a good day to enjoy water sports." or { <*, *, *, *, *, *> }

The left-hand side of a classification rule is a conjunction of attribute values, represented by a vector <a1, a2, ..., an>, where each ai specifies the value of one attribute. The value * signifies "don't care".
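
As a rough illustration, a rule and a day can both be written as Python tuples, and the helper below is one way to test whether a rule covers a day (the attribute names follow the table above):

    # A rule and a day are both tuples of six attribute values, in the order
    # (Sky, AirTemp, Humidity, Wind, Water, Forecast); "*" means "don't care".

    def matches(rule, day):
        """A rule covers a day if every attribute is either "*" or equal."""
        return all(r == "*" or r == d for r, d in zip(rule, day))

    # matches(("Sunny", "*", "*", "*", "*", "*"),
    #         ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"))  -> True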

Initialize S to be the set containing the maximally specific classification rule: "No day is a good day to enjoy water sports." or { <∅, ∅, ∅, ∅, ∅, ∅> }, where ∅ means that no value is acceptable for that attribute.
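
In the same Python sketch, the two boundary sets can start out as single-element lists, with None standing in for the ∅ symbol:

    # G starts with the one rule that accepts every day; S starts with the one
    # rule that accepts no day (None plays the role of the ∅ "no value" symbol).
    G = [("*", "*", "*", "*", "*", "*")]
    S = [(None, None, None, None, None, None)]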

Let's trace through the algorithm for the examples given in the table above:

The first example is positive, so our S set is too specific: its only rule covers no days at all. We generalize it minimally to cover the example, so that S now is

S = {<Sunny, Warm, Normal, Strong, Warm, Same>}

The next example is also positive, and our S set is again too specific. We generalize it minimally, replacing the one attribute value that conflicts (Humidity) with *.

S = {<Sunny, Warm, *, Strong, Warm, Same>}
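
This minimal generalization step can be sketched in the same Python style: attribute values that already agree are kept, and the ones that conflict become * (the all-∅ rule is simply replaced by the positive example itself, as in the first step above):

    def min_generalize(rule, positive_day):
        """Return the least generalization of a specific rule that covers the day."""
        if all(v is None for v in rule):       # the "covers nothing" rule
            return tuple(positive_day)
        return tuple(r if r == d else "*" for r, d in zip(rule, positive_day))

    # min_generalize(("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),
    #                ("Sunny", "Warm", "High",   "Strong", "Warm", "Same"))
    # -> ("Sunny", "Warm", "*", "Strong", "Warm", "Same")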

The next example is negative. This time it is our G set that must be modified, as it is too general: its single rule covers the negative example. We replace it with the minimal specializations that exclude this example while remaining more general than S.

G = {<Sunny,*,*,*,*,*>, <*,Warm,*,*,*,*>,<*,*,*,*,*,Same>}
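
Minimal specialization works the other way around: each "don't care" position is pinned, one at a time, to a value the negative example does not have. The sketch below assumes the attribute domains shown in the comments (they are not listed in these notes); the full algorithm then keeps only those specializations that are still more general than some member of S, which is why only three rules survive in the step above:

    # Assumed attribute domains, in the order (Sky, AirTemp, Humidity, Wind, Water, Forecast).
    ATTRIBUTE_VALUES = [
        ("Sunny", "Cloudy", "Rainy"),
        ("Warm", "Cold"),
        ("Normal", "High"),
        ("Strong", "Weak"),
        ("Warm", "Cool"),
        ("Same", "Change"),
    ]

    def min_specializations(rule, negative_day):
        """All rules obtained by pinning one "*" to a value the negative day lacks."""
        out = []
        for i, r in enumerate(rule):
            if r == "*":
                for value in ATTRIBUTE_VALUES[i]:
                    if value != negative_day[i]:
                        out.append(rule[:i] + (value,) + rule[i + 1:])
        return out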

And now we see our final example. The example is positive, so we might have to modify our S set. In fact, we do: the Water and Forecast values no longer match, so both become *.

S = {<Sunny, Warm, *, Strong, *, *>}

In addition, we need to modify the G set, because it contains an element that is not consistent with the example we've just seen: the rule requiring Forecast = Same rejects this positive example, so it is removed.

G = {<Sunny,*,*,*,*,*>, <*,Warm,*,*,*,*>}

These two sets (along with everything in between, for instance <Sunny, *, *, Strong, *, *>) describe all classification rules that are consistent with the examples we've seen.

The objective of the Candidate Elimination Algorithm is to find all describable rules that are consistent with the observed training examples.
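
Putting the pieces together, one compact sketch of the whole loop (using the matches, min_generalize, and min_specializations helpers above) might look like the following. It simplifies the general algorithm, e.g. it does not prune G members that are less general than other G members, but it reproduces the trace above:

    def candidate_elimination(training_examples):
        G = [("*", "*", "*", "*", "*", "*")]
        S = [(None, None, None, None, None, None)]
        for day, positive in training_examples:
            if positive:
                # drop general rules that fail to cover the positive example
                G = [g for g in G if matches(g, day)]
                # generalize the specific rules just enough to cover it
                S = [min_generalize(s, day) for s in S]
            else:
                # drop specific rules that cover the negative example
                S = [s for s in S if not matches(s, day)]
                new_G = []
                for g in G:
                    if not matches(g, day):
                        new_G.append(g)            # already excludes the example
                        continue
                    # too general: replace by minimal specializations that still
                    # cover (are more general than) some member of S
                    for h in min_specializations(g, day):
                        if any(matches(h, s) for s in S):
                            new_G.append(h)
                G = new_G
        return S, G

    # Running it on the four examples from the table:
    training_data = [
        (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   True),
        (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   True),
        (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
        (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
    ]
    S, G = candidate_elimination(training_data)
    # S -> [("Sunny", "Warm", "*", "Strong", "*", "*")]
    # G -> [("Sunny", "*", "*", "*", "*", "*"), ("*", "Warm", "*", "*", "*", "*")]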


Pros and Cons

Pros:
  1. a least commitment strategy - the algorithm modifies the S and G sets as little as possible when accommodating new examples.
  2. performs an exhaustive search of the space of all possible classification rules.
  3. don't have to store in memory every rule consistent with the examples - only the S and G sets.

Cons:

  1. a least commitment strategy - because it never commits to a single rule, the algorithm may never narrow the version space down to one answer unless the training data forces it to.
  2. performs an exhaustive search of the space of all possible classification rules - this completeness comes at a cost, since the boundary sets, especially G, can grow very large.
  3. does not tolerate any noise: a single mislabeled example can make the G and S sets "pass" each other, leaving no consistent rules at all.