
Semantic Services

If you’re a blogger or have any interest in semantic/content management technologies then you may be interested in a couple of new services which have recently launched with the aim of making content creation easier by automatically suggesting contextually relevant images, links, articles and tags which you may like to include.

Tagaroo

Tagaroo is based on an initiative called Calais by Thomson Reuters to “connect the world’s content by providing automated metadata services”.

It has an extremely slick and easy-to-use UI which sits neatly below the post editor on the WordPress write page, suggesting tags and images as you type.

Beneath the interface, the magic is carried out using “natural language processing and machine learning algorithms to extract the people, organizations, companies, geographies and events hidden within it”. To do this it connects to Calais via a free API (registration required). Pictures come from Flickr under a CC license.
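To give a rough feel for what "extracting entities to suggest tags" means, here is a deliberately naive sketch. It is not how Calais works (that relies on NLP and machine learning); it simply looks up names from a tiny hand-made taxonomy. The taxonomy, the sample post and the `suggest_tags` function are all made up for illustration.

```python
import re

# Hypothetical mini-taxonomy: entity name -> entity type.
# A real service would learn or maintain something far richer.
TAXONOMY = {
    "Thomson Reuters": "Company",
    "Flickr": "Company",
    "London": "City",
}

def suggest_tags(text, taxonomy=TAXONOMY):
    """Return (entity, type) pairs found in text, in order of first appearance."""
    hits = []
    for name, kind in taxonomy.items():
        # Whole-word, case-insensitive lookup of each known entity name.
        match = re.search(r"\b" + re.escape(name) + r"\b", text, flags=re.IGNORECASE)
        if match:
            hits.append((match.start(), name, kind))
    return [(name, kind) for _, name, kind in sorted(hits)]

post = "Thomson Reuters opened an office in London and shared photos on Flickr."
tags = suggest_tags(post)
```

A plain lookup like this only finds names it already knows, which is exactly the limitation that the statistical approach described above is meant to get around.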

My tests have found it pretty reliable and an extremely quick way to tag your posts using a standard global taxonomy. At the moment the plugin is only available for WordPress and Drupal, although a number of other tools are currently under development.

Zemanta

Described as “a brilliant product for lazy or otherwise time-focused bloggers”, Zemanta is similar in many respects to Tagaroo, although perhaps a little more mature in its functionality (it’s European after all!).

The tool uses its own database of content (indexed from over 300 “top media sources”) to suggest related pictures, links, articles and tags. It has a clean UI which integrates well with whatever backend you use, and is offered either as a plugin for all the major platforms (WordPress, Blogger, TypePad and LiveJournal) or as a browser extension for IE or Firefox.

As someone who frequently links to Wikipedia in my posts, I’ve found the link suggestion component an especially quick and easy way to insert these references with virtually no effort. Although the interface for picture insertion isn’t quite as nice as Tagaroo’s, Zemanta is currently my plugin of choice.

Yahoo also have a competing offering although it’s restricted to Yahoo content only so I’ve not taken time to review it.

Implications

Whether you call it Web 3.0, the Semantic Web or the Giant Global Graph, I think these sorts of services are an important step towards the automated inference of knowledge from information. When we reach the point where machines can “understand” the content which they are parsing, the implications are massive. Aside from a whole herd of near-term applications, I can also imagine scenarios in the not-so-distant future where every piece of content on the web is automatically linked to everything else which is relevant to it without the need for human interaction - Wikipedia without the editors or boundaries (or inherent bias?).

Enjoyed this post? Please subscribe to my RSS feed to stay up to date.

Remixing The Web

The web as we know it is changing. Be these changes small or large, we have already gone way beyond a mere collection of pages linked together and are now at the stage of connecting individuals through social interaction and harnessing the web’s collective intelligence. The next step appears to be evolving towards the concept of the semantic web through the use of feeds and markup technologies (RDF, OWL, XML, Microformats etc.) to represent meanings in information, which allow us to infer and connect knowledge within and around it.

A lot of this will involve annotating information to make it machine understandable (and not just readable); we will design for re-use of information. The upshot of all this should mean that the user spends less time and effort carrying out complex tasks.
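As a rough illustration of what "machine understandable" can mean, here is a minimal sketch that stores facts as subject-predicate-object triples (the basic shape of RDF data) and infers something that was never stated directly. The facts, the predicate names and the `types_of` helper are all made up for this example; real RDF and OWL tooling is considerably richer.

```python
# Facts as (subject, predicate, object) triples.
# All names here are invented for illustration.
triples = {
    ("Tagaroo", "is_a", "WordPressPlugin"),
    ("WordPressPlugin", "subclass_of", "BloggingTool"),
    ("BloggingTool", "subclass_of", "Software"),
}

def types_of(thing, facts):
    """Return every class `thing` belongs to, following subclass_of links."""
    found = set()
    # Start from the classes stated directly with is_a.
    frontier = [o for s, p, o in facts if s == thing and p == "is_a"]
    while frontier:
        cls = frontier.pop()
        if cls in found:
            continue
        found.add(cls)
        # Walk up to any parent classes.
        frontier.extend(o for s, p, o in facts if s == cls and p == "subclass_of")
    return found

inferred = types_of("Tagaroo", triples)
```

Nobody wrote down that Tagaroo is software, yet the program can work it out from the annotations. That small step from readable to understandable is the point of the markup.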

Knowledge Evolution

Put another way (from Wikipedia):

“The Semantic Web is an evolving extension of the World Wide Web in which web content can be expressed not only in natural language, but also in a format that can be read and used by automated tools, thus permitting people and machines to find, share and integrate information more easily.”

A road map might look something like this. Interestingly, it almost exactly mirrors how information architects commonly define the process of converting Data to Information to Knowledge to Wisdom (or intelligence) in the human mind:

  • Stage 1 [DATA] - connecting information (the humble hyperlink)
    • Data on its own tells us very little
    • By observing context, we can distinguish data from information
  • Stage 2 [INFO] - connecting people (social networking) ← here & now
    • Information is derived as we organise and present data in various ways
    • Organisation can change meaning (either intentionally or unintentionally)
    • Presentation enhances existing meaning, mostly on a sensory level
  • Stage 3 [KNOWLEDGE] - connecting knowledge (semantic web)
    • Knowledge can be distinguished from information by the complexity of the experience used to communicate it
    • Design helps the user create knowledge from information by experiencing it in various ways
    • Conversations and stories are the traditional delivery mechanisms for knowledge
  • Stage 4 [WISDOM] - connecting intelligence (ubiquitous web)
    • Wisdom is the understanding of enough patterns to use knowledge in new ways and situations
    • It is personal, hard to share and reflective

Getting there will take some time, but already we are seeing major sites like Amazon and Flickr exposing their data via REST APIs, allowing it to be reused and remixed. What we are beginning to see is web sites as web services; the unstructured is becoming structured (more detail here). What you end up with is the web as one big re-mixable database platform upon which new applications will be built to manipulate data in ways not thought of before. (Potential applications)

Content Remixing

Helping this along are a number of freely available tools which make it easier to do things that only programmers could do before, by allowing anyone to scrape content from web pages or feeds and then manipulate it however they like (legal issues aside). Here are the main contenders which I have found particularly useful:

OpenKapow, Dapper and Yahoo Pipes.

Yahoo Pipes

Yahoo Pipes is an ingenious web app which provides a very intuitive GUI for remixing content without any complex syntax - you simply drag and drop the building blocks, then connect them together with pipes to control the flow and transformation of information. In one end you plug your data, and out the other end comes your choice of output (RSS, JSON, email, mobile).
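For anyone curious what that plumbing amounts to underneath, here is a minimal sketch of the same idea in code: a source block, a filter block and a sort block chained together, with JSON coming out the other end. This is not Yahoo Pipes' own implementation or API; the sample feed, the function names and the keyword are all assumptions made for the example.

```python
import json
import xml.etree.ElementTree as ET

# Made-up sample feed standing in for a real RSS source.
SAMPLE_RSS = """<rss version="2.0"><channel>
  <title>Example feed</title>
  <item><title>Semantic web tools</title><link>http://example.com/1</link></item>
  <item><title>Holiday photos</title><link>http://example.com/2</link></item>
  <item><title>Remixing the web with APIs</title><link>http://example.com/3</link></item>
</channel></rss>"""

def read_items(rss_text):
    """Source block: yield one dict per <item> in the feed."""
    root = ET.fromstring(rss_text)
    for item in root.iter("item"):
        yield {"title": item.findtext("title"), "link": item.findtext("link")}

def keep_matching(items, keyword):
    """Filter block: pass through items whose title mentions the keyword."""
    for item in items:
        if keyword.lower() in item["title"].lower():
            yield item

def sort_by_title(items):
    """Sort block: order items alphabetically by title."""
    return sorted(items, key=lambda item: item["title"])

def to_json(items):
    """Output block: serialise the remaining items as JSON."""
    return json.dumps(list(items), indent=2)

# Plug the blocks together, just like dragging pipes between boxes.
pipeline = sort_by_title(keep_matching(read_items(SAMPLE_RSS), "web"))
output = to_json(pipeline)
```

Each function only knows about the items flowing through it, so blocks can be reordered or swapped freely, which is much the same appeal as the drag-and-drop version.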


I’ve personally used Yahoo Pipes for a proof of concept at work and found it incredibly powerful yet simple to use. Whilst I would describe myself as technical, I’m not a hardcore developer and in that respect this tool hits the nail on the head perfectly; I can visually plumb things together without having to write a line of code and know that it will be error free. (More)

A cool enhancement to the RSS feeds Yahoo Pipes produces is to plug them into Feedburner so that you can take advantage of its enhancement, publishing and analysis tools.

Dapper

Dapper allows you to scrape websites using a visual interface, turning the data you select into dynamic web services (outputting to RSS, email, iCal, CSV, Google Gadgets and Google Maps). Dapper learns from the examples you feed it and then, by comparison, can create a query that turns an unstructured HTML page into a set of structured records. If the site you want data from doesn’t already provide a feed, this is where you’ll want to go. (More)
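To show what "turning an unstructured HTML page into a set of structured records" looks like in practice, here is a minimal sketch using Python's standard `html.parser`. Unlike Dapper, it doesn't learn the pattern from examples; the pattern is hard-coded. The page fragment, its class names and the `RecordScraper` class are all invented for illustration.

```python
from html.parser import HTMLParser

# Made-up page fragment; the class names are assumptions for this example.
PAGE = """
<ul>
  <li class="book"><span class="title">Weaving the Web</span><span class="price">9.99</span></li>
  <li class="book"><span class="title">Ambient Findability</span><span class="price">14.50</span></li>
</ul>
"""

class RecordScraper(HTMLParser):
    """Collect one dict per <li class="book">, keyed by each <span>'s class."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._field = None  # class name of the <span> currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "book":
            self.records.append({})  # a new record starts here
        elif tag == "span" and self.records:
            self._field = attrs.get("class")

    def handle_data(self, data):
        # Only keep text that sits inside a recognised <span>.
        if self._field:
            self.records[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None

scraper = RecordScraper()
scraper.feed(PAGE)
records = scraper.records
```

Once the page is reduced to a list of dicts like this, writing it out as RSS, CSV or anything else is the easy part.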

OpenKapow

OpenKapow is more industrial strength than the other two: more powerful, but also more complex. It uses a desktop-based visual IDE to gather data from websites, which can then be processed by different types of “robots” to create RSS feeds, REST web services or Web Clips. It seems to be aimed more at professional developers than casual users, but it’s still a pretty cool tool if you need some serious power. (More)

More tools are examined here and here.

Whilst all these tools and technologies are very good, there is still the issue of data cleanliness, as you don’t have the same level of control or constraint that you get with relational databases. No doubt this will improve over time as the services mature, but for early adopters there’s still plenty to play with. Regardless of the labels we choose to give new concepts, there is no doubt in my mind that this one is going to be big - watch this space!
