I picked up on an interesting post about attention data from Damien Mulley’s blog. Whilst writing a comment on his post, I realised it was turning into an epic. It presented me with an opportunity to talk about Segala’s Semantic Web Firefox Trust extension too, so I’ve decided to write here and link to Damien’s blog instead of posting a comment on his.
The most applicable point for me in Damien’s post was:
It was in a past blog post here where I said that if we controlled our activity data, we could actually make money from search engines and the likes of Microsoft HealthVault, so there’s potential there. So I was quite interested when Mozilla announced Weave, their system which will store your Firefox preferences on their servers and when you install a new Firefox on a new computer, it can go to the Mozilla servers and download all your preferences and bookmarks.
According to Mozilla:
The idea behind Weave is that all your personal information — bookmarks, passwords and account names, for example — are synced to your Mozilla account via Firefox. If you lose your computer, you can download Firefox, log into your account and you can restore all that information. You can do some of this today if you use Google Browser Sync and Dot Mac services. You can start by creating an account with Mozilla Services. You will need Firefox 3.0 or higher to get this working.
This is relevant to me as the functionality behind Mozilla Weave has been available in Glaxstar’s Firefox browser for more than two years. When I say available, I’m referring to every single last detail. Whilst Glaxstar’s Glubble browser is new, I’ve had insight into their technology for quite some time.
In my opinion, Glaxstar is possibly the only development company in the world that could build a Firefox browser to compete with Mozilla’s (Flock is a 1.0 effort compared to what these guys can do!). That’s if Ian Howard, Glaxstar’s founder, decided to take that route. As it happens, he’s just interested in helping guardians to protect their loved ones from inappropriate content.
Note that I didn’t say help to protect minors, or help governments protect people. That’s not his job. It’s not Google’s job, it’s not Segala’s job and it’s not the Government’s job either. Ian’s job is to help guardians who are responsible for deciding what’s appropriate and inappropriate for the people they’re responsible for. Technology should be perceived and used as an enabler, not a prohibiter. Furthermore, what a guardian in Germany deems appropriate is not likely to be the same as what a guardian in the UK deems appropriate, for example. This is why I’d like people to perceive Content Labels as an enabler to help mainstream search engines and browsers to provide better content discovery, not a method for policing the Web.
So, I wouldn’t be surprised if Glaxstar gave the Weave code to Mozilla given that they’ve had it for more than a couple of years and they built Mozilla’s mainstream browser extensions for companies such as Google, Yahoo!, PayPal and eBay. They also maintain spreadfirefox.com and are responsible for resolving defects in the mainstream Firefox browser. That makes Glaxstar the most qualified company in the world to build Firefox add-ons in my opinion.
Luckily for me, Ian Howard, Founder of Glaxstar, is a personal friend of mine. So, who better to build Segala’s Firefox trust extension (not plug-in, that’s something different), Search Thresher? Our extension really is based on the Semantic Web, unlike the claims made by many of the so-called Semantic Web search engines.
Sorting the wheat from the chaff
As I’ve said, Glaxstar and Segala have been working together for the past couple of years, although we haven’t updated our extension in over a year (I guess that demonstrates how ahead of the curve we’ve been). As of February though, you should expect to see regular updates for our Trust extension.
Search Thresher is just one of the pieces in our jigsaw to help demonstrate why and how we feel very confident that 2008 is the year to tell Segala’s story. You will notice me talking less about the conferences that I host and chair and more about our Semantic Web method of classifying content.
What’s with the name?
The thrashing machine, or, in modern spelling, threshing machine (or simply thresher), was a machine first invented by Scottish mechanical engineer Andrew Meikle for use in agriculture. It was invented (c.1784) for the separation of grain from stalks and husks.
For thousands of years, grain was separated by hand with flails, which was very laborious and time-consuming. Mechanization of this process took the drudgery out of farm labour.
Today, searching the Web is equally laborious. You may or may not find what you’re ‘searching’ for, and even when you do find what you want, can you trust what you find?
Think of Search Thresher as a threshing machine. It’s a Firefox extension used to demonstrate to search engines and mainstream browsers how they can (and should!) provide users with more trust on the Web using a method called Content Labelling.
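To give a rough feel for the idea, here is a minimal Python sketch. It is not how Search Thresher is implemented. The label data, the field names (`claims`, `verified_by`) and the function names are all made up for illustration, and a real Content Label would be published as machine-readable metadata, not hard-coded. The sketch simply attaches whatever label is known for each search result and moves labelled results to the top.

```python
from urllib.parse import urlparse

# Hypothetical content labels keyed by host name (illustrative data only).
CONTENT_LABELS = {
    "example.org": {
        "claims": ["accessible", "child-safe"],
        "verified_by": "example-certifier",
    },
}


def label_for(url, labels=CONTENT_LABELS):
    """Return the label recorded for the URL's host, or None if there is none."""
    host = urlparse(url).hostname or ""
    return labels.get(host)


def thresh(results, labels=CONTENT_LABELS):
    """Annotate each result URL with its label and list labelled results first.

    sorted() is stable, so results keep their original relative order
    within the labelled and unlabelled groups.
    """
    annotated = [{"url": url, "label": label_for(url, labels)} for url in results]
    return sorted(annotated, key=lambda result: result["label"] is None)


if __name__ == "__main__":
    for result in thresh(["http://unknown.test/page", "http://example.org/about"]):
        print(result["url"], "->", result["label"])
```

Unlabelled results are kept, not dropped, which fits the point that labels should act as an enabler for discovery and not as a filter imposed on the user.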
We haven’t touched the extension for over a year as we’ve been focused on other stuff that I’ll tell you about soon. If you’re a designer and would like to be recognised for your work, please feel free to volunteer your services to rebrand the Web site. Search Thresher is a non-profit, standards-based browser extension, so this may be of interest if you’re a standards enthusiast.
We’re not emotionally attached to the name Search Thresher. What do you think of it? We’re open to suggestions if you can propose something better.
Read more about Content Labels - this post also includes sample use cases.



Posted on January 2, 2008 at 9:58 pm

7 comments so far

Aidan Finn, January 3, 2008:
The real problem with having a semantic web browser, and the problem that has dogged the semantic web in general, is that almost no web sites currently attach semantic annotations to their content. And for publishers there is very little benefit in adding semantic annotations to their content, since there are no real applications that use them.
So if you’re going to use semantic annotations to filter content, you’re limited to a teeny tiny pool of sites from which to take content.
How can you overcome this problem?
Coming from a machine learning/NLP background, I think that the solution is to build more intelligent web crawlers that can recognize different types of content and generate the annotations automatically. This scales much better as it doesn’t depend on the presence of pre-existing annotations.