Our First Twitter App – CheckPage

We launched our first Twitter app, CheckPage. It is currently in beta. Hopefully this will be the first of several nifty tools for finding, tracking, filtering, extracting and sharing information.

CheckPage is a Twitter-based tool that tracks a web page, detects changes and sends you a notification through Twitter. It can be used for tracking:

  1. Your partners/customers
  2. Your competitors
  3. Government sites fo

To track a page, say Hacker News, all you need to do is Tweet:

@checkpage start http://news.ycombinator.com/

CheckPage will check this page every day and send you a notification when something on it changes. The notification comes as a reply Tweet and contains a pointer to the changed page (in addition to the original page). When you go to the changed page, you will notice that all the new content is highlighted.
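For the curious, here is a minimal sketch of the general idea of comparing two snapshots of a page and highlighting what is new. It is not CheckPage's actual code; the function names and the `<mark>` markers are assumptions made up for this illustration, and it compares plain text line by line using Python's standard difflib module.

```python
import difflib


def has_changed(old_text, new_text):
    """Return True when the two snapshots differ at all."""
    return old_text != new_text


def highlight_new_lines(old_text, new_text, open_mark="<mark>", close_mark="</mark>"):
    """Return new_text with inserted or replaced lines wrapped in markers.

    The marker strings are hypothetical; any highlighting scheme would do.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    out = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" cover lines that exist only in the new snapshot.
        changed = tag in ("insert", "replace")
        for line in new_lines[j1:j2]:
            out.append(f"{open_mark}{line}{close_mark}" if changed else line)
    return "\n".join(out)


if __name__ == "__main__":
    yesterday = "Story A\nStory B"
    today = "Story A\nStory C\nStory B"
    if has_changed(yesterday, today):
        print(highlight_new_lines(yesterday, today))
```

A real tracker would also have to fetch the page, strip markup and ignore noise such as timestamps or ads, which this sketch leaves out.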

Give it a try. If you need to see other commands, they are described in this help page. Send us suggestions, either as comments to this post or by email (specified in the help page). We have a lot of features planned, but we want to hear what you think first.

Information Diffusion in Social Networks

I listened to Guy Kawasaki when he recently visited Bangalore and gave a few talks and a workshop on Twitter. One question that often pops up is how Guy manages to follow 180,000+ people. His simple answer was that he does not follow their public timeline. I understand that, because even with fewer than 2,000 people I have trouble keeping up.

What Guy actually does is track direct messages and mentions of a few phrases, including his name. This reduces the load somewhat, but it can still be considerable.
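Tracking a few phrases instead of reading everything boils down to a simple filter. Here is a rough sketch, with made-up sample messages and a hypothetical `filter_stream` helper; it is only an illustration and says nothing about the tools Guy actually uses.

```python
def matches_any(text, phrases):
    """Case-insensitive check: does the text mention any tracked phrase?"""
    lowered = text.lower()
    return any(phrase.lower() in lowered for phrase in phrases)


def filter_stream(messages, phrases):
    """Keep only the messages that mention at least one tracked phrase."""
    return [message for message in messages if matches_any(message, phrases)]


if __name__ == "__main__":
    # Sample messages invented for this example.
    timeline = [
        "Lunch was great today",
        "Loved the talk by Guy Kawasaki in Bangalore",
        "New post on the semantic web",
    ]
    print(filter_stream(timeline, ["guy kawasaki", "semantic web"]))
```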

So let us take a hypothetical scenario. I want a piece of information to reach 20 of the top tech bloggers who are actively interested in a specific subject area. I can’t see any way this can happen reliably through Twitter. We don’t know how they sample messages. We don’t know how frequently they read their public timeline. We cannot mention all of them in one Tweet. Many of them (understandably) hate being sent direct messages. So how do we really reach them?

This report on Information Diffusion provides some ideas on how information propagates through social media.

“Those who respond very quickly to e-mails, technology addicts who are always connected, are the ones responsible for spreading certain rumors or campaigns quickly via Internet,”

If information is so interesting that it reaches many people, the diffusion is faster because these people quickly forward the message. This explains why some computer viruses quickly spread via e-mail in a matter of hours, despite the fact that the email response time is one day. However, if information is not so interesting, the diffusion is slower because it is controlled by those persons who take a long time to respond; this causes some rumours or bits of information to remain dormant in social networks a long time after they are released.

Will lists alter this? Maybe. It depends on the patterns of use. I think we still have a lot to learn about how to communicate marketing messages effectively on Twitter and reach the right people.

LinkLog: InfoStreams and Embarrassingly Parallel Data Analysis Tasks

I have been interested (but have not really done anything useful yet) in large scale data analysis. Here are some personal interests:

  1. Analyze the InfoStreams I track from Twitter, blogs and our own customized feeds on programming, multi-core and semantic web topics
  2. Explore Open Linked Data, visualization, connections and analysis
  3. Apply machine intelligence to understand raw data and notifications of change, and to track the velocity of change

This has led me to dabble in the semantic encoding of data (RDF/OWL), visualization techniques (Processing), data analysis (the R language) and large scale streaming data (MapReduce, Hadoop).

So when I stumbled across Ben Lorica’s Big Data: SSD’s, R and Linked Data Streams, I could not resist reading it. A few comments and some links below:

This is how I landed on a strangely named platform called Pig, a sub-project of Apache’s Hadoop. From the wiki:

Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turns enables them to handle very large data sets.

At the present time, Pig’s infrastructure layer consists of a compiler that produces sequences of Map-Reduce programs, for which large-scale parallel implementations already exist (e.g., the Hadoop subproject). Pig’s language layer currently consists of a textual language called Pig Latin, which has the following key properties:

  • Ease of programming. It is trivial to achieve parallel execution of simple, “embarrassingly parallel” data analysis tasks. Complex tasks comprised of multiple interrelated data transformations are explicitly encoded as data flow sequences, making them easy to write, understand, and maintain.
  • Optimization opportunities. The way in which tasks are encoded permits the system to optimize their execution automatically, allowing the user to focus on semantics rather than efficiency.
  • Extensibility. Users can create their own functions to do special-purpose processing.
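To make “embarrassingly parallel” a bit more concrete, here is a small sketch in plain Python (not Pig Latin, and not how Pig works internally). It counts words in independent chunks of text and then merges the partial counts, which is the same map-then-reduce shape. The function names and the thread pool are choices made for this example only.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Map step: count words in one chunk (a list of lines), independently of the others."""
    return Counter(word.lower() for line in chunk for word in line.split())


def parallel_word_count(chunks, workers=4):
    """Run the map step on every chunk in a pool, then merge the partial counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_words, chunks):
            # Reduce step: partial counts simply add up.
            total.update(partial)
    return total


if __name__ == "__main__":
    chunks = [["pig latin", "pig"], ["hadoop pig"]]
    print(parallel_word_count(chunks).most_common())
```

Because no chunk depends on another, the work can be spread over as many workers as there are chunks. On a real cluster the chunks would live on different machines rather than in one process.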

Hope to give it a spin and try to see whether I can manage a drink from my InfoStreams firehose.

Twitter Streamgraph for Silverlight

One of the cool things about being on Twitter is the rapid rate at which you discover new resources on the web. Just a few minutes ago, I saw “6 Unique Twitter Visualizations” http://bit.ly/Fz05O from @RSS_Buzztracker. The first thing you do, of course, is to make it a favorite and retweet it so that your friends can enjoy it too.

The next thing is to go and investigate. Two of them just blew me away. I managed to grab and reproduce one of them here. It is called Twitter StreamGraph. You simply type the word or phrase and it provides a nice visualization. Since I was watching MIX09 videos yesterday (from Chennai, India) and knew that there is a lot of buzz going on around Silverlight 3.0, I decided to try it out. Here is what I got. It may be different when you try it:


The next one, called Twitter Thoughts, is a fabulous motion chart. I have been fascinated by the power of motion charts ever since I saw one in a TED Talk a couple of years ago. Here is how it works:

TwitterThoughts creates charts based on Twitter tweets in combination with lots of APIs: From a sample of 600 tweets/minute served by the Twitter Api that we send to Yahoo Pipes where it extracts all phrases from the tweet text and the latitude/longitude with use of Yahoo YQL. This Yahoo Pipe outputs serialized PHP back to our local update script that grabs every tweet and phrase and puts it in our MySql database. Daily overviews for fast rendering of the chart data are generated with a daily CRON update. Finally Google Visualization API generates an interactive flash chart based on our JSON data feed.
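The “daily overview” step in that description is easy to picture in code. Here is a rough sketch of the idea, not the TwitterThoughts implementation: the sample tweets and dates are invented, and phrase extraction is reduced to keeping lower-cased words longer than three characters, which is purely an assumption for this example.

```python
import json
from collections import Counter, defaultdict
from datetime import datetime


def daily_overview(tweets):
    """Aggregate (created_at, text) pairs into per-day word counts."""
    per_day = defaultdict(Counter)
    for created_at, text in tweets:
        day = created_at.date().isoformat()
        # Stand-in for real phrase extraction: keep words longer than 3 characters.
        per_day[day].update(word for word in text.lower().split() if len(word) > 3)
    return {day: dict(counts) for day, counts in per_day.items()}


def to_json_feed(overview):
    """Serialize the overview so a charting front end could consume it."""
    return json.dumps(overview, sort_keys=True)


if __name__ == "__main__":
    # Sample data invented for this example.
    sample = [
        (datetime(2009, 3, 20, 10, 0), "Silverlight demo at MIX09"),
        (datetime(2009, 3, 20, 11, 0), "silverlight rocks"),
        (datetime(2009, 3, 21, 9, 0), "MIX09 videos"),
    ]
    print(to_json_feed(daily_overview(sample)))
```

Precomputing the counts once a day keeps the chart fast, since the front end only reads a small summary instead of scanning every stored tweet.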

I think as we get more useful data, especially trending data, we will start finding more innovative visualizations.

A Couple of Twitter Trends

It is interesting to watch Twitter take off. Over the past few months, I have seen increasing adoption of Twitter. Here are a couple of trends worth mentioning.

eZines on Twitter

I think this is a great idea. I follow several ezines (the latest being IDG Connect). I found some of them through Twitter Search and others through links in their email alerts. Here are some advantages:

  • You learn about new articles and webcasts as soon as they are ready (right now many ezines still send their Tweets in batch-mode bursts, but hopefully that will change).
  • There is finer granularity of information. Since each article link is a separate Tweet, you can reply to or Retweet each one individually, announcing it to your own group of followers.
  • You can also selectively bookmark (favorite) individual articles.
  • If you are interested enough in that area, you can even start a group or a FriendFeed room.

Even though I used to receive the same information by email, when it comes as separate Tweets I can do more with it, and more conveniently.

Twitter Packs

A couple of weeks ago, I noticed one of my friends Tweet about Semantic Web Pack. Since it is one of my areas of interest, I followed the link to see what it was. To my pleasant surprise, I found these:

  • A Twitter pack is a collection of individuals/bots Tweeting about a certain topic. It is like the BOF (birds of a feather) groups I normally take part in at conferences.
  • This was in fact a pbwiki application where you can simply follow all the people in a pack (or selectively follow a few).
  • There were several useful and interesting packs.

It gave me one-click access to my special interest group. Even though I could have done the same (with some difficulty) using Twitter Search, this was so much easier.

Trending Topics

One of the easy ways of following hot news is to subscribe to bots that track trends. It is a nice way to follow conversations and growing interest in certain topics on Twitter. Two trending bots I follow are Trending and Real Time Trends.

Program For The Future – InfoStream

Program for the Future was fun. We set up an InfoStream in a few hours, thanks to Yahoo Pipes and my hard working colleague. We combined and filtered feeds from Twitter, del.icio.us, Flickr and YouTube. I know it can be improved. If you have any suggestions, please feel free to post a comment. We also set up a social network on Ning (public) and a group on Facebook.
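We built ours in Yahoo Pipes, but the combine-and-filter idea is simple enough to sketch in a few lines of Python. This is an illustration only, not our actual pipe: the `Item` fields, the `build_stream` function, the sample entries and the example.com links are all made up, and fetching or parsing the real feeds is left out.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Item:
    """One entry from any feed (fields are assumptions for this sketch)."""
    source: str
    published: datetime
    title: str
    link: str


def build_stream(feeds, keyword):
    """Merge several feeds, keep items mentioning the keyword, drop duplicate links."""
    seen_links = set()
    merged = []
    for feed in feeds:
        for item in feed:
            if keyword.lower() not in item.title.lower():
                continue
            if item.link in seen_links:
                continue
            seen_links.add(item.link)
            merged.append(item)
    # Newest first, like a live stream.
    return sorted(merged, key=lambda item: item.published, reverse=True)


if __name__ == "__main__":
    twitter = [Item("twitter", datetime(2008, 12, 8, 10, 0), "Program for the Future keynote", "http://example.com/a")]
    flickr = [
        Item("flickr", datetime(2008, 12, 8, 12, 0), "Photos: Program for the Future", "http://example.com/b"),
        Item("flickr", datetime(2008, 12, 8, 13, 0), "My cat", "http://example.com/c"),
    ]
    for item in build_stream([twitter, flickr], "program for the future"):
        print(item.published, item.source, item.title)
```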

You can see the feed in action here.

You can get the feed source here.

We are tracking the conversation after the event. Hopefully there will be a lot more in blogs (the Twitter traffic, being real time, may die down).

This was a good exercise. Even though I have been playing around with Yahoo Pipes a bit, this is the first time we did something useful with it.

If you need help setting up similar feeds for other events, let us know. We can send you more info (or write a blog post about it).

Many Ways I Use Twitter

I am on and off on Twitter. But of late I am more on. Here is why. I found a bunch of uses of Twitter.

  1. As a bookmarking tool (I started using it in addition to del.icio.us)
  2. As a source for instant tech info – I follow some cool dudes like Dave Winer, Robert Scoble, Jon Udell, Dion Hinchcliffe, eHub and zdnet. I get a lot more than what I can handle in any given day
  3. As a source of instant social news – I follow a bunch of friends and as the twit, I get hit
  4. As an idealog – I just started this. Throw out ideas and someone will respond. A quick validation. Previously I used to blog about them (takes too much time), write them down (forget where they are) or put them in a wiki (again takes too much time). I know that I can take this idea stream and work on some of them at some point in time.
  5. I think some really cool stuff is happening with twitter as a collaboration tool. I followed Dion and team building a social graph app on Google engine (heard it through Dave’s posts)
  6. Talk to myself. Read, pause, reflect: an eternal spiral.
  7. As a LearnLog (what did I do, why did I do it and when did I do it). Always amusing to see why work seems so much fun and after some time realize that you are kind of working but mostly having fun.

I think I will set some self imposed limits so that I do not become an addict. I haven’t figured out what they are yet. Here is my Twitter Account.

Update: The original title was Seven Ways I Use Twitter. Then I thought, why limit it? Why not keep adding new ones?

8. Purely for fun – it is kind of intoxicating

9. Just getting out “Ideas Worth Spreading” – a TED Talks tagline I love

10. As an advertisement for my blog? Mixed feelings about that.