Day Dreaming About Voice Web

IBM’s “Next Five in Five” is a list of innovations that have the potential to change the way people work, live and play over the next five years.

New technology will change how people create, build and interact with information and e-commerce websites – using speech instead of text. We know this can happen because the technology is available, but we also know it can happen because it must. In places like India, where the spoken word is more prominent than the written word in education, government and culture, “talking” to the Web is leapfrogging all other interfaces, and the mobile phone is outpacing the PC.

Here is the list from IBM’s article.

  • Energy saving solar technology will be built into asphalt, paint and windows
  • You will have a crystal ball for your health
  • You will talk to the Web . . . and the Web will talk back
  • You will have your own digital shopping assistants
  • Forgetting will become a distant memory

One of my favorite hobbies is to pick one or more of these and try to figure out what we need to get there. It is a good way to dream about the near future, see where the gaps are, and make some intermediate predictions.

Here are some random, incomplete thoughts on the Voice Web. There are several starting points, depending on where your interests lie.

  • It has to be a layer on Web 1.0 and 2.0 (since a lot of useful content is already there).
  • The Web 2.0 layer may be a better starting point since some of the underlying technologies (REST-based APIs, social interfaces, mashup tools) are already available.
  • Some of the semantic technologies may help provide contextual structure and metadata over existing content. This may be an alternate starting point (using Freebase/DBpedia/Open Calais).
  • Voice recognition is one starting point. Many of the mobile providers already have something in this space, but it is not perfect yet. Voice commands on our cell phones have limited context. There is room for a lot of innovation there.
  • Voice output is another starting point. It is an easier problem than voice recognition if the input (web content) and output (voice) are in the same language.
  • If the input and output are in different languages (instructions originally written in English, spoken in Tamil to a farmer, for example), there is even more room for innovation. I am not talking about Babel Fish-style translation but something a couple of steps above that.
  • From a device point of view, hands-free operation of the cell phone may work better. This requires innovation in both audio input and output technologies, and in miniaturization.
  • Obviously, integrating search into this equation is one of the steps. There are some early attempts at this from Google; I am not sure how well they work. There are more opportunities here, such as layering voice search over meta search.

I can go on. But you get the drift. One cool way to capture all this (and the collective intelligence) is through some kind of voice annotated mind map (which in itself is another innovation waiting to happen). Your thoughts?

InfoMinder Alerts – 20th Oct 2008

Some interesting links I got through my InfoMinder Alerts today. I will just add a teaser for each entry. Some of these are blogs. Others are announcements or wiki links.

The Future of Enterprise Software

We are at the beginning of a massive shift from client-server to web-based software in the enterprise.  This move will be even more dramatic than the move from mainframe to client-server.  The move to self-service distribution will lower sales costs and make comparable technology available to enterprises of all sizes on an eat-as-you-go basis.  Having all data in a centralized repository with open interfaces will lead to geometric increases in functionality as customers munge data and functionality together themselves or through third party developers (who will also have access to self-service platforms).

Finally, I predict that Salesforce.com (CRM) will have a valuation higher than SAP (SAP) in 5 years.  Today CRM is just under $8 billion in market value and SAP is just under $68 billion in market value.

What is hot in the Semantic Web Community (discovered through Planet-RDF)

Here are some of the topics that have already been put on the wish-list for the Semantic Wiki Mini-Series (source):

  • usability vs expressivity
  • community building
  • uncovering more implementations
  • HCI: navigation of large, high-dimensional knowledge spaces
  • e-science (especially pharma research & biomedicine)
  • semantic wikis and mashups
  • recommendation and personalization in semantic wikis
  • knowledge representation (expressivity vs. simplicity)
  • how to make business subject matter experts able to enter, review and validate
    meaningful information without them having to learn new words
  • what dialect of OWL is supported
  • integration of semantic resources (Protege / OOR / MW / …)
  • content quality
  • integration of external data
  • a semantic wiki & OOR session
  • experiences with distributed collaboration
  • server-side infrastructure to support semantic wikis
  • survey of semantic wikis for vertical domains (e.g. HCLS)
  • integration with other tools / linking wiki content to other apps

Motorola and the Android Social Networking Phone

Motorola, which is recruiting as many as 350 people to work on Android phones, is gearing up to make its first one: the Android Social Smart Phone. Last week, Android Guys spotted a job posting for the project, and now BusinessWeek has more details, including a mention of the Motorola job posting pictured at left on Monster looking for an Android application developer.

Five Ways to Google Proof Your Business

Google has acquired more than 50 companies, and it’s unlikely the spending spree will stop any time soon, as many of Google’s most recognized services came through acquisitions – including AdSense, Android, AdWords, Blogger, Gmail, Google Analytics, Google Docs, Google Maps, Picasa, and, of course, YouTube Inc.

But there are millions of businesses that will not be acquired by Google. As the saying goes, “Google has plenty of money, but you won’t get any of it.” The reality is, if you’re not one of the lucky chosen, Google can be both a competitor and a phenomenon that marginalizes your business model by making alternatives easy to find, or by turning your paid products into an advertising-sponsored free-for-all.

As an optimist, I’d prefer to think of Google as another arrow in the quiver to be used to expand your Internet business, drive qualifying traffic, improve your brand, and ultimately help support an exit (if you want to be rich) or a viable business model (if you’d rather be king) — or perhaps both.

There are several actions I’d suggest for those looking to make Google a weapon for positive gain. Many of these are mutually exclusive, but some combination warrants consideration for any online business.

Design Considerations for Parallel Programming

Parallel programming has all of the correctness and security challenges of sequential programs plus all of the difficulties of parallelism and concurrent access to shared resources.
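To make the "concurrent access to shared resources" point concrete, here is a minimal Python sketch of my own (it is not taken from the linked article). The `Counter` class and `run` helper are hypothetical names. Several threads update one shared counter, and a lock makes each read-modify-write step atomic.

```python
import threading


class Counter:
    """A shared counter whose updates are guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self.value += 1


def run(n_threads=4, n_increments=10000):
    """Start n_threads workers that each increment the same counter."""
    counter = Counter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place, `run(4, 10000)` always returns the full count. Without it, the unguarded `self.value += 1` can lose updates when threads interleave, which is exactly the kind of bug a sequential program never has.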

RDB to RDF Mapping (On Demand and ETL)

We expect cases favoring on demand mapping to be characterized by any of:

  • High rate of change of the data
  • Very large volume of data
  • Relatively straightforward translation between RDF and the data
  • Relatively few RDBs being integrated

We expect cases favoring ETL to be characterized by:

  • Large number of heterogeneous sources of data
  • Complex application logic needed for transforming the data
  • RDF reasoning being performed on the mapped data
  • Queries with variables in class or predicate positions
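The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `person` table, its columns, and the `http://example.org/` URIs are all hypothetical, and real mapping tools are far more capable. ETL materializes every triple up front, while on-demand mapping translates a lookup into SQL at query time.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace for this sketch


def make_db():
    """Build a tiny in-memory relational database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    conn.executemany(
        "INSERT INTO person VALUES (?, ?, ?)",
        [(1, "Asha", "Chennai"), (2, "Ravi", "Madurai")],
    )
    return conn


def row_to_triples(row):
    """Map one relational row to (subject, predicate, object) triples."""
    pid, name, city = row
    subject = f"{BASE}person/{pid}"
    return [(subject, f"{BASE}name", name), (subject, f"{BASE}city", city)]


def etl_dump(conn):
    """ETL style: convert the whole table to triples in one batch."""
    triples = []
    for row in conn.execute("SELECT id, name, city FROM person ORDER BY id"):
        triples.extend(row_to_triples(row))
    return triples


def on_demand(conn, person_id):
    """On-demand style: translate a single lookup into SQL at query time."""
    row = conn.execute(
        "SELECT id, name, city FROM person WHERE id = ?", (person_id,)
    ).fetchone()
    return row_to_triples(row) if row else []
```

If a row changes after `etl_dump` has run, the dumped triples are stale until the next batch, while `on_demand` sees the new value immediately. That is why a high rate of change favors on-demand mapping.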

The Web’s Red Pill

I liked this video, especially the part where Harry says:

It is like a red pill for the web. When you take it, all you see is triples. The true graph nature of the universe is revealed.

While we are still some way from that red pill, the concept of seeding your web pages with a little metadata to enable a data web is a great idea.

GRDDL, bridging the interwebs? from Marcos Caceres on Vimeo.

Technology Trends – A List

Here is a small sample of trends in software. This is a work in progress, and I will keep updating it frequently. Instead of waiting until I have my full list, I thought I would publish this crude list and get some feedback. Some trends are current (like Web 2.0); some are in the future (Semantic Web). Over the next few weeks, I will revisit the list and keep adding to it. If you think something should be included here, please add a comment. If you have a blog or discussion on trends, you can add that link too. Some of these trends are great blog topics too.

Each trend is an opportunity (or several opportunities). These trends create new jobs, transform existing jobs and the way we live.

Application Development:
  • AJAX
  • Rich Internet Applications – Microsoft’s Silverlight, Adobe Flex, OpenLaszlo
  • Web Frameworks – Ruby on Rails, Django
  • Scripting Languages – Python, Ruby
  • Parallel Programming – Haskell, Erlang

Database:
  • XML databases and XML support in relational databases
  • New query languages – SPARQL
  • New query interfaces to languages – LINQ
  • Open Data – Freebase, DBpedia
  • Streaming Databases, Continuous Query Languages
  • Web Data Stores – Amazon’s SimpleDB, S3

Information Distribution:
  • Podcasting, Screencasting, VideoCasting, Blogs, Wikis, Micro-blogging, Portals, Feed Readers

Information Mining:
  • Text Analytics
  • A wiki for text analytics
  • Software Agents

Information Sharing and Collaboration:
  • Knowledge Management
  • Wikis and Portals, Social Bookmarks, Video Conferencing

Information Visualization:
  • A Periodic Table of Visualization Methods

Interaction:
  • AIML – Alicebot and others
  • Touch/Multi-touch/Surface – iPhone, Microsoft Surface

Laptops for Learning:
  • Triggered by the visionary OLPC effort, this is a broad movement that may spark several new trends in cheaper, better laptops and several innovative interfaces for interaction.
  • This leads to a broader trend in m-learning (mobile learning): learning content on cell phones.

Mashups:
  • An easy way to combine services in hours, days or weeks, triggered by Web Services
  • Watch for Enterprise Mashups, Mashup Tools, Languages for Mashups

Mobile and Wireless:
  • Open Mobile Platforms – ex: Android
  • Location-based mobile services
  • WiMax, 3G

Multi-core:
  • Intel is promising a 32 core chip by 2010. What do we do with all that power? Where are the programmers and programming tools for leveraging this trend? How can we use a multi-core chip in every device from a PDA to a computer?
  • Parallel Programming – Techniques, Tools, Research, Initiatives

Services:
  • Software as a Service
  • Publishing as a Service
  • Mentoring as a Service – Mentornet
  • Knowledge Sharing Services – Wikipedia, Wikibooks, LibriVox, WikiHow

Services Infrastructure:
  • On Demand Computing, Elastic Computing, Cloud Computing – Amazon’s ECS, Google’s AppEngine

Search:
  • Collaborative Search – like Wikia
  • Contextual Search – Yahoo’s Y!Q and Eurekster Swicki
  • Powerset

Semantic Web:
  • Semantic Wikis – a wiki on steroids
  • Linked Data – Freebase, Twine, DBpedia

Social Applications:
  • Is a social networking site a service or infrastructure? Should it be a layer on the web?
  • Social Networks – Facebook
  • Others to watch: OpenSocial, Ning, LinkedIn, Social Networks in the Enterprise, FriendConnect, OpenData, Data Portability, OpenID

Web Services and SOA:
  • Web Services are the new breed of application components. Popularized by Amazon, web services are growing at a rapid pace. You can get a list of publicly available services at Programmable Web.


Top 10 Disruptive IT Trends – CIO Insight


WebTrends Map 2008 – A clickable Map

MarkMail – a tool for parsing mailing lists and providing trend information

LinkLog: Semantic Web

Semantic Web from A Chat with Dave Beckett

Semantic Web…it’s about connecting things together, about getting the jobs done

If everyone uses the semantic web data formats, they all connect together.

I think a data-centric approach is better than an API-centric approach, because the data will live longer than the APIs.

The Semantic Web in one slide from POWDER – Smarter Navigation On The Web

• Allows machines to process the meaning of data

• Data is distributed and extensible

• Machines can automatically deduce facts and relationships

• (which means you can offer users more of what they want and less of what they don’t want)

LinkLog: Enhancing the semantics of your web pages

A simple, easy-to-understand introductory video on RDF, triples, and how to embed RDF in XHTML (known as RDFa).

  • A triple consists of a subject, a predicate and an object
  • Subjects and objects typically represent resources (people, places, events)
  • Predicates represent relationships
  • A vocabulary provides a common (shared) understanding of relationships
  • You can embed RDF triples in your web pages simply and easily, thereby enhancing the semantic content
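The points above can be sketched in plain Python. This is my own minimal illustration, not something from the video. The `example.org` resources and the helper names `objects_of` and `to_ntriples` are hypothetical, while `foaf:name` and `foaf:knows` come from the real FOAF vocabulary. The serializer is deliberately simplified: it treats any object starting with `http://` as a resource and everything else as a plain literal, with no escaping.

```python
from collections import namedtuple

# A triple is just (subject, predicate, object).
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# FOAF is a shared vocabulary: everyone who uses foaf:knows means the same thing.
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical resources under example.org, for illustration only.
ALICE = "http://example.org/alice"
BOB = "http://example.org/bob"

triples = [
    Triple(ALICE, FOAF + "name", "Alice"),
    Triple(ALICE, FOAF + "knows", BOB),
    Triple(BOB, FOAF + "name", "Bob"),
]


def objects_of(graph, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [
        t.object
        for t in graph
        if t.subject == subject and t.predicate == predicate
    ]


def to_ntriples(t):
    """Serialize one triple as an N-Triples line (simplified, no escaping)."""
    if t.object.startswith("http://"):
        obj = f"<{t.object}>"   # the object is a resource
    else:
        obj = f'"{t.object}"'   # the object is a plain literal
    return f"<{t.subject}> <{t.predicate}> {obj} ."
```

Calling `objects_of(triples, ALICE, FOAF + "knows")` follows the "knows" relationship from Alice to Bob, which is the graph navigation that embedded triples make possible.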

This video is the shortest, simplest demonstration of how to do this:


We Need Academic Efforts for Web Science

A Robert Scoble interview with Tim Berners-Lee.

It involves serious study. It involves research. We need to study it as a huge complex system.

A few snippets:

  • Social Web is just a subset of the larger semantic web
  • eBay has over 21 identity systems
  • Tim B was not sure whether the web would work
  • He is still worried about the lack of interoperability
  • SPARQL – a query language/protocol for querying a huge number of semantic endpoints, a SQL for the Web
  • Web 2.0 is exciting but is producing stove pipes. Some people are using Web 3.0 to break down the stove-pipes


bloglet: Cognitive Assistant that Learns and Organizes

CALO is in the news. I heard about it from Adam Cheyer about four years ago, when it was just starting. In fact, we were looking at integrating HyperScope into it somehow. It may still be a good idea.

CALO is a massive, four-year-old artificial-intelligence project to help computers understand the intentions of their human users. Funded by the Defense Advanced Research Projects Agency (DARPA), and coordinated by SRI International, based in Menlo Park, CA, the project brings together researchers from 25 universities and corporations, in many areas of artificial intelligence, including machine learning, natural-language processing, and Semantic Web technologies. Each group works on pieces of CALO, which stands for “cognitive assistant that learns and organizes.”

Knowledge Maps

I was browsing through Facebook today (a Facebook user visited my blog and I clicked on the incoming link to take a look). I found moneylet, a social bookmarking site for financial news. There was an item on the Forbes report on the wealthiest men. More interesting, there was a Knowledge Map of Warren Buffett. Unfortunately, you can view these only in Internet Explorer.


Knowledge Maps are a special application of mind maps. While I have seen and used mind maps before, I had not seen this particular product. IntellectSpace provides a free trial and a way to create and share your own Knowledge Maps on the web. IntellectSpace seems to be a cut above simple mind maps: you can filter entities and relationships, and navigate and browse nodes in the map.

How do you build a knowledge map? Here are a few thoughts.

  • Manually – Take all the facts and enter them into a mind mapping tool (define node types, connection types)
  • Semi-Automatic – Generate a wiki mind map from a wiki (like MediaWiki) and annotate or modify the map
  • Automatic – This may be in the future. It requires the ability to map concepts, links and additional semantics (how do you recognize people, places, things?). Microformats and Semantic Web technologies (like GRDDL) may help here.

I wonder what the next step in the evolution of these maps will be. Social Knowledge Maps? That seems like a cool social networking application waiting to happen. Imagine the ability to generate a Knowledge Map and let people share and improve it. For that we may need something more than a drawing tool. Some kind of meta language for knowledge maps may be useful.

Links: RDF, Semantic Web Tools

RDF123 – A mechanism to transform spreadsheets to RDF graphs:

RDF123 is an application and web service for converting data in simple spreadsheets to an RDF graph. Users control how the spreadsheet’s data is converted to RDF by constructing a graphical RDF123 template that specifies how each row in the spreadsheet is converted as well as metadata for the spreadsheet and its RDF translation. The template can map spreadsheet cells to a new RDF node or to a literal value. Labels on the nodes in the map can be used to create blank nodes or labeled nodes, attach an XSD datatype, and invoke simple functions (e.g., string concatenation). The graph produced for the spreadsheet is the union of the sub-graphs created for each row. The template itself is stored as a valid RDF document encouraging reuse and extensibility.
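The row-by-row idea is easy to sketch in Python. To be clear, this is not RDF123's template format or API. The dictionary-based `TEMPLATE`, the column names, and the `http://example.org/` URIs are assumptions of mine, used only to show a template turning each row into a small sub-graph, with the final graph being the union of those sub-graphs.

```python
import csv
import io

BASE = "http://example.org/"  # hypothetical namespace for this sketch

# Hypothetical template: which column names the subject, and which
# predicate each remaining column maps to.
TEMPLATE = {
    "subject_column": "id",
    "predicates": {"name": BASE + "name", "email": BASE + "email"},
}


def row_to_subgraph(row, template):
    """Convert one spreadsheet row into a set of triples."""
    subject = BASE + "row/" + row[template["subject_column"]]
    return {
        (subject, predicate, row[column])
        for column, predicate in template["predicates"].items()
        if row.get(column)  # skip empty cells
    }


def sheet_to_graph(csv_text, template):
    """The graph for the sheet is the union of the per-row sub-graphs."""
    graph = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        graph |= row_to_subgraph(row, template)
    return graph
```

For a sheet with the header `id,name,email`, each row contributes up to two triples, and a row with an empty email cell contributes only its name triple.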

Semantic Web and Related Tools (over 500)

This posting of Sweet Tools — semantic Web and related tools — is now in version 9, with 542 tools, an addition of 42 newly listed tools since the previous version. It was last updated on 6/19/07.

Over 49% of them are Java-based.

CWM – A general-purpose data processor for the semantic web

Cwm (pronounced coom) is a general-purpose data processor for the semantic web, somewhat like sed, awk, etc. for text files or XSLT for XML. It is a forward chaining reasoner which can be used for querying, checking, transforming and filtering information. Its core language is RDF, extended to include rules, and it uses RDF/XML or RDF/N3 (see Notation3 Primer) serializations as required.
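For readers new to the term, forward chaining means applying rules to known facts over and over until nothing new can be derived. Here is a minimal Python sketch of that idea. It is not how Cwm is implemented and it does not use N3 rule syntax. The `parentOf` and `grandparentOf` predicates and both function names are hypothetical.

```python
def grandparent_rule(facts):
    """If A parentOf B and B parentOf C, derive A grandparentOf C."""
    derived = set()
    for (a, p1, b) in facts:
        for (c, p2, d) in facts:
            if p1 == "parentOf" and p2 == "parentOf" and b == c:
                derived.add((a, "grandparentOf", d))
    return derived


def forward_chain(facts, rules):
    """Apply every rule repeatedly until no new facts appear."""
    known = set(facts)
    while True:
        new = set()
        for rule in rules:
            # Keep only the conclusions we did not already know.
            new |= rule(known) - known
        if not new:
            return known
        known |= new
```

Starting from two `parentOf` facts that share a middle person, `forward_chain` adds one `grandparentOf` triple and then stops, because a second pass derives nothing new. The result can then be queried or filtered like any other set of triples.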