Show me who your friends are, and I'll show you who you are...

It's not the data that's important, it's the links between them that matter.

That's one of my policies about data, especially now with relational databases allowing businesses and people to see the links between people. Advertisers don't care who you are; they just care that one person has gone to places X, Y, and Z and so must enjoy reading about C, which means they can advertise C to you. Police don't really care who you are, so long as they can identify you buying a gun, then holding up a liquor store, then going to a residence where they can arrest you. Doctors don't need to know your name to know that you have the flu because you have symptoms D, E, and F. In most circumstances where data is collected, the name of the person is irrelevant. It's all about tying the data together.

Computers have allowed us to rapidly collect data, organize it into a standardized format, and (most importantly) perform analysis on it. The Internet, essentially a vast collection of information and links to other pieces of information, is the crowning achievement of interconnected data. Wikipedia, which makes linking between data dangerously easy, has been the source of many a curious perusal. Facebook, in particular, has been an excellent repository of links between people, pictures, messages, status updates, demographic data, and more! While Facebook has always made this information usable for advertisers, it has now revealed a new feature that is genuinely helpful (if a little scary) to users.

Facebook calls it "People You May Know" (you must be logged into Facebook to do anything with it), and it's a very compelling feature in my opinion. Basically, Facebook takes your friends' friends and finds the people who show up the most often. Their logic is that if many of your friends are friends with someone, then that person has a good chance of being your friend too. This would be a very tedious thing to work out by hand. Done naively, you take every single person I have as my friend, then for each of them you take every one of their friends and check whether that person appears in each of the other friends' lists. That's something along the lines of n^3 list checks, where n is the number of friends each person has. Assuming I have 50 friends and everyone on my friends list has 50 friends, that's anywhere from 100,000 to 125,000 checks (assuming a good amount of efficiency on the programmer's part about removing people from the other friends' lists once they've been matched). It isn't a hard problem in the computer science sense, though: it's polynomial, nowhere near NP-complete, and simply keeping a running tally of how often each name shows up takes only about n^2 steps (2,500 in this example). Of course, I currently have 282 friends, and I know of about 5 people on my friends list with over 400 friends, so doing this by hand would take a LONG time.
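For the curious, here's a minimal Python sketch of the friends-of-friends tally described above. It's only my guess at the general idea, not Facebook's actual method, and the function name, the `friends_of` dictionary, and the sample people are all made up for illustration. It walks each friend's list once and counts how often each stranger shows up, so the work grows with the total number of entries in my friends' lists rather than with every possible pairing of lists.

```python
from collections import Counter


def people_you_may_know(user, friends_of, top=5):
    """Rank friends-of-friends by how many mutual friends they share with user.

    friends_of is a hypothetical dict mapping each person to a set of friends.
    """
    my_friends = friends_of.get(user, set())
    tally = Counter()
    for friend in my_friends:
        for candidate in friends_of.get(friend, set()):
            # Skip myself and anyone I'm already friends with.
            if candidate != user and candidate not in my_friends:
                tally[candidate] += 1
    # Highest mutual-friend counts first.
    return tally.most_common(top)


# Made-up sample data: dan knows all three of my friends, eve knows two.
friends_of = {
    "me": {"ann", "bob", "cat"},
    "ann": {"me", "dan", "eve"},
    "bob": {"me", "dan", "eve"},
    "cat": {"me", "dan"},
    "dan": {"ann", "bob", "cat"},
    "eve": {"ann", "bob"},
}

suggestions = people_you_may_know("me", friends_of)
print(suggestions)  # [('dan', 3), ('eve', 2)]
```

Using sets for the friend lists makes the "am I already friends with this person?" check a single lookup, which is what keeps the whole thing to one pass.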

Whatever crazy amount of processing power or super-efficient algorithms Facebook is using to find these common friends, it's impressive. I've used the feature to add 5 new friends from high school and college. Of course this increases the work Facebook has to do when calculating my new "People You May Know", but that's not my problem :). As soon as Facebook starts guessing my political philosophy based on the friends I keep (and they can, it's easier than doing what they just did), then I may need to get out. But I don't think I can very easily...