Euclidean Distance Score
October 30th, 2008 | Published in Machine Learning, Statistics
I am currently reading Toby Segaran’s book “Programming Collective Intelligence”, and one of the first topics it covers is how to determine how similar two people are.
One approach is to use the Euclidean Distance Score. Arun Vijayan C has an excellent PowerPoint presentation on this topic, “Finding more people like you”.
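For reference, the score treats each person as a point on two rating axes and measures the straight-line distance between them. A worked example, using the (x, y) values for John (1.5, 4) and Ravi (4.5, 1.5) that appear in the code in this post; the symbols p and q are just labels chosen here for the two people:

```latex
\[
d(p, q) = \sqrt{(x_p - x_q)^2 + (y_p - y_q)^2}
\]
\[
d(\text{John}, \text{Ravi})
  = \sqrt{(1.5 - 4.5)^2 + (4 - 1.5)^2}
  = \sqrt{9 + 6.25}
  = \sqrt{15.25}
  \approx 3.91
\]
```

A smaller distance means the two people rated things more alike.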
I then wrote up some quick code in C# that uses the values in Arun’s presentation:
// Euclidean Distance Score
using System;
using System.Collections.Generic;

namespace ConsoleApplication1
{
    internal class Program
    {
        private static void Main()
        {
            // Load People and Values
            var myP = new List<People>
            {
                new People("John", 1.5, 4),
                new People("Ravi", 4.5, 1.5),
                new People("Kiran", 1, 3.5),
                new People("Deepti", 3, 5)
            };

            // Print header
            Console.WriteLine("People And Scores");
            Console.WriteLine("###################");
            Console.WriteLine();

            // Loop through people and values
            foreach (People people in myP)
            {
                Console.WriteLine(people.Name + "\t" + people.xScore + "\t" + people.yScore);
            }

            // Print Distance And Value Headers
            Console.WriteLine();
            Console.WriteLine("Distance Comparison");
            Console.WriteLine("###################");
            Console.WriteLine();

            // Loop through every unique pair of people:
            // myCount is always i + 1, so each pair is compared only once
            int myCount = 1;
            for (int i = 0; i < myP.Count; i++)
            {
                for (int j = myCount; j < myP.Count; j++)
                {
                    // Euclidean Distance Score
                    // Sqrt( (x1-x2)^2 + (y1-y2)^2 )
                    Console.WriteLine(myP[i].Name + "\t" + myP[j].Name + ":\t" +
                        Math.Sqrt(Math.Pow(myP[i].xScore - myP[j].xScore, 2) +
                                  Math.Pow(myP[i].yScore - myP[j].yScore, 2)).ToString("0.##"));
                }

                // Skip to the next guy
                myCount++;
            }

            // Print Closer
            Console.WriteLine();
            Console.WriteLine("Press enter to continue...");
            Console.ReadLine();
        }
    }

    // A person with a name and two scores (x and y)
    internal class People
    {
        public string Name;
        public double xScore;
        public double yScore;

        public People(string _Name, double _xScore, double _yScore)
        {
            Name = _Name;
            xScore = _xScore;
            yScore = _yScore;
        }
    }
}
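The program above only prints raw distances, where smaller means more alike. If you want a number where bigger means more similar, one common trick is to map the distance d to 1 / (1 + d). The sketch below is my own illustration and not part of the program above: the class name EuclideanSketch and the helper Similarity are hypothetical, and using 1 / (1 + d) as the scoring form is an assumption.

```csharp
using System;

// Hypothetical helper class, separate from the Program class above.
internal static class EuclideanSketch
{
    // Straight-line distance between the points (x1, y1) and (x2, y2).
    public static double Distance(double x1, double y1, double x2, double y2)
    {
        double dx = x1 - x2;
        double dy = y1 - y2;
        return Math.Sqrt(dx * dx + dy * dy);
    }

    // Assumed scoring form: maps a distance in [0, infinity) to (0, 1],
    // so identical points score 1 and far-apart points approach 0.
    public static double Similarity(double distance)
    {
        return 1.0 / (1.0 + distance);
    }

    private static void Main()
    {
        // John (1.5, 4) versus Ravi (4.5, 1.5), the same values as above
        double d = Distance(1.5, 4, 4.5, 1.5);
        Console.WriteLine("Distance:\t" + d.ToString("0.##"));
        Console.WriteLine("Similarity:\t" + Similarity(d).ToString("0.##"));
    }
}
```

For John and Ravi the distance is the square root of 15.25, roughly 3.91, which gives a similarity of roughly 0.2. Adding 1 to the denominator keeps the score defined when two people have identical values.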

