Over The Counter Culture

Staring at the sun

Is Google using your brain as you browse?

I just stumbled across a research paper, written by a Google employee and a Microsoft employee, entitled “A Case for Usage Tracking to Relate Digital Objects”. I have no idea who Elin Rønby Pedersen is, but she’s published both on this and on Google’s much-vaunted foray into organising your health data.

The paper highlights an interesting idea, potentially just as important to Future Google as PageRank has been to Google so far. It’s not groundbreaking - you already see it on Amazon, for example. But it’s worth thinking about what happens when it is applied to the whole web.

The idea is that related objects - and I use the term extremely loosely here - can be identified because you looked at them during the same session of Internet browsing. You start with one, and your later browsing takes you to related objects: blog posts or news articles on the same or a related subject, similar videos, etc. Your brain does the hard work of deciding which objects you’re looking for. Pool that with similar data from other people and Google has a pretty damn good idea of which objects on the web are related, no matter what format they take. An object could be visual, textual, a Flash game, a picture - they could all be related in some way that a machine could never decipher the way a brain can. The beauty of this is that the Google machine doesn’t HAVE to understand.
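To make that concrete, here is a minimal sketch of how "looked at in the same session" could be turned into a relatedness score. It is my own illustration, not the method from the paper: the function names `co_visit_counts` and `related_to`, the made-up object identifiers, and the choice of simple pair counting are all assumptions.

```python
from collections import Counter
from itertools import combinations


def co_visit_counts(sessions):
    """Count how many sessions each unordered pair of objects shares."""
    counts = Counter()
    for session in sessions:
        # Deduplicate and sort so (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(set(session)), 2):
            counts[pair] += 1
    return counts


def related_to(obj, counts, top_n=3):
    """Rank other objects by how often they shared a session with `obj`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == obj:
            scores[b] += n
        elif b == obj:
            scores[a] += n
    return scores.most_common(top_n)


# Hypothetical browsing sessions; each list is one person's session.
sessions = [
    ["blog/reprap", "video/3d-printer-demo", "news/open-hardware"],
    ["blog/reprap", "video/3d-printer-demo"],
    ["blog/reprap", "news/football-transfers"],
]
counts = co_visit_counts(sessions)
print(related_to("blog/reprap", counts))
```

Notice that nothing in the code inspects what the objects are. A video and a blog post end up linked purely because people visited both, which is the whole point.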

Evidently, there’s a lot of ‘noise’ in the data, since people can be quite random when browsing, or wander off to an unrelated page. The answer to noisy datasets is to aggregate more of them and average, so that the random detours wash out and the consistent patterns remain. Google definitely has access to a lot of data: through google.com, but also through the emails you send with Gmail, the content you share through Google OpenSocial apps, the IP address it registers each time you view its ads on any of the sites you visit, the sites visited by anyone with a Google Toolbar installed, etc.
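A rough sketch of the "aggregate and average" step might look like the following. Again, this is an assumption-laden illustration rather than anything Google is known to do: `co_visit_rate`, the `min_sessions` cut-off and the toy data are all hypothetical.

```python
from collections import Counter


def co_visit_rate(sessions, anchor, min_sessions=3):
    """Share of sessions containing `anchor` that also contain each other object.

    Objects seen alongside `anchor` in fewer than `min_sessions` sessions are
    dropped as probable noise (the threshold is an arbitrary example value).
    """
    with_anchor = [set(s) for s in sessions if anchor in s]
    seen = Counter()
    for s in with_anchor:
        seen.update(s - {anchor})
    total = len(with_anchor)
    return {obj: n / total for obj, n in seen.items() if n >= min_sessions}


# Hypothetical pooled data: ten sessions that all include the same article.
sessions = (
    [["article", "follow-up"]] * 6               # the consistent pattern
    + [["article", "follow-up", "cat-video"]]    # one random detour
    + [["article", "webmail"]] * 2               # an unrelated habit
    + [["article"]]
)
print(co_visit_rate(sessions, "article"))
```

With only ten sessions the cat video and the webmail visits already fall below the cut-off, while the follow-up page survives. The more sessions you pool, the more confidently you can set that cut-off.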

This is more top-down “semantics”, and only a few companies have the capability to track Internet users all around the web. Google is in an almost unique position because it has such a large share of search, email and ads (you could argue that the DoubleClick merger approval really missed the significance of the move, with huge privacy and antitrust concerns going unnoticed). Two additional categories of players present themselves: your browser and your OS. The OS could (controversially) monitor the websites you visit. So could your browser. I see huge potential in Mozilla Weave. At the moment I send it my web visit history so that it syncs my data between my computers. If, with my approval, it also processed that data (looking at what I did during my browsing sessions) and pooled it with that of others, it could infer relationships between objects and recommend related ones in a sidebar.

Technorati Tags: Google, search engines, tracking, web 3.0, implicit web, semantics, data portability, relational web

This entry was posted on Saturday, April 12th, 2008 at 6:58 pm and is filed under Musings.
