Posted by Big Joe. Published on 27 January 2009
Now that we have a method of determining which records are most similar to the new, unclassified record, we need to establish how these similar records will combine to provide a classification decision for the new record. That is, we need a combination function. The most basic combination function is simple unweighted voting.
Simple Unweighted Voting
1. Before running the algorithm, decide on the value of k, that is, how many records will have a voice in classifying the new record.
2. Then, compare the new record to the k nearest neighbors, that is, to the k records that are of minimum distance from the new record in terms of the Euclidean distance or whichever metric the user prefers.
3. Once the k records have been chosen, then for simple unweighted voting, their distance from the new record no longer matters. It is simply one record, one vote.
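The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the original text: the function names `k_nearest` and `simple_vote`, the `(point, label)` record layout, and the sample data are all assumptions made for the example, and Euclidean distance is used as the metric.

```python
import math
from collections import Counter

def k_nearest(records, new_point, k):
    """Step 2: return the k (point, label) records closest to new_point.

    Uses Euclidean distance; any other metric could be swapped in here.
    """
    return sorted(records, key=lambda r: math.dist(r[0], new_point))[:k]

def simple_vote(records, new_point, k):
    """Step 3: one record, one vote. Distances are ignored once the
    k neighbors have been chosen."""
    neighbors = k_nearest(records, new_point, k)
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical training records: (coordinates, class label)
records = [
    ((0.0, 0.0), "A"),
    ((0.0, 1.0), "A"),
    ((1.0, 0.0), "B"),
    ((5.0, 5.0), "B"),
    ((6.0, 5.0), "B"),
]

# Step 1: k is fixed before the algorithm runs.
winner = simple_vote(records, (0.2, 0.1), k=3)
print(winner)
```

Note that `most_common` simply returns one of the top labels when the vote is tied, so a real implementation would want to check for ties explicitly.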
We observed simple unweighted voting in the examples for Figures 5.4 and 5.5. In Figure 5.4, for k = 3, a classification based on simple voting would choose drugs A and X (medium gray) as the classification for new patient 2, since two of the three closest points are medium gray. The classification would then be made for drugs A and X, with confidence 66.67%, where the confidence level is the count of records with the winning classification, divided by k (here, 2/3).
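The confidence calculation can be checked with a short sketch. The helper name `vote_with_confidence` is made up for illustration, and the label "other" is only a placeholder for the third neighbor, whose actual class is not stated here.

```python
from collections import Counter

def vote_with_confidence(neighbor_labels):
    """Return the winning label and its confidence (winning count / k)."""
    k = len(neighbor_labels)
    label, count = Counter(neighbor_labels).most_common(1)[0]
    return label, count / k

# Two of the three nearest neighbors are medium gray (drugs A and X);
# "other" is a placeholder label for the remaining neighbor.
label, confidence = vote_with_confidence(["medium gray", "medium gray", "other"])
print(f"{label}: {confidence:.2%}")  # medium gray: 66.67%
```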
On the other hand, in Figure 5.5, for k = 3, simple voting would fail to choose a clear winner since each of the three categories receives one vote. There would be a tie among the three classifications represented by the records in Figure 5.5, and a tie may not be a preferred result.
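A tie like this is easy to detect in code by collecting every label that shares the top vote count. The following is a minimal sketch; the function name `top_labels` and the three category names are hypothetical stand-ins for the classes in the figure.

```python
from collections import Counter

def top_labels(neighbor_labels):
    """Return every label that received the highest number of votes.

    A list with more than one entry means simple voting produced a tie.
    """
    counts = Counter(neighbor_labels)
    top = max(counts.values())
    return sorted(label for label, c in counts.items() if c == top)

# Hypothetical labels: each of the three categories gets one vote (k = 3).
winners = top_labels(["light gray", "medium gray", "dark gray"])
if len(winners) > 1:
    print("Tie among:", winners)
else:
    print("Winner:", winners[0])
```

How to break such a tie (for example, by changing k) is a separate design decision that this sketch leaves open.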