From Data to Insights – A Nice Story of Planet Discovery

Astronomers are probably among the most data-intensive people. They use data for observation and discovery, and they use the insights from that data to find and analyze more data. Here is a classic example – From Kepler Data, Astronomers Find Galaxy Filled With More but Smaller Worlds

The new planets were culled from 3,601 candidates previously found by Kepler, using a new statistical technique known as verification by multiplicity. The method vastly reduces the need for outside telescopic observations to verify suspected planets in batches. It works only for multiple-planet systems, but as Dr. Lissauer and his colleagues pointed out, that includes about 40 percent of the Kepler candidates.

There is more to come, the astronomers said. The present results are based on a statistical analysis of only the first two years of Kepler data. There is two more years’ worth to go, and several hundred more planets are likely to be verified.

LinkLog: Ara – An Innovative Modular Phone Project from Google

From Google’s Project Ara

Ara is definitely an amazing innovation, and a project that it would be amazing to see come to fruition. It’s also massively ambitious, and not every experimental tech Google develops ends up as a proper shipping project. Modularity has a lot to potentially offer the smartphone market (and could also be very interesting when applied to tablets) but there’s a lot of ground to cover between here and selling these things in stores. Still, if anyone has the resources and runway to make it happen, Google is a pretty good candidate

The TIME profile also sheds light on some of the fundamental mechanics of how Ara works. Modules are designed to slot in to each compartment on the basic chassis interchangeably, regardless of what each does. They’re also hot-swappable, so you don’t need to power down the phone to replace individual parts. Finally, the modules are secured to the device using hardware latches, which use magnets to lock stuff in place. That lock is released using an app of the phone, so that they won’t fall out when jostled or when the phone drops.

Project Ara: Inside Google’s Modular Smartphone

A Simple Survey: If I Take Care of Your Salary …

If I take care of your salary and let you do whatever you want to do, what would you do?

This was a question I posed to a few students who signed up for my Introduction to Python class. There were seven of them (and three more expected to join later).

The answers came slowly. They didn’t know me. They were not sure how to start. But they did, after a pause.

I want to help children learn.

She did not hesitate for a moment. She pretty much knew what she wanted. I was really thrilled.

I want to teach

Great. We need more of you, I thought. She said she was passionate about teaching.

I want to make everyone know the name of my village.

This boy had a determined look. Like the others, he did not mention anything about getting a job or making money. I could see how serious he was.

I want to become an entrepreneur. And a good one.

It is rare to see this among students learning computer programming. Yet, there it was. I was glad to hear that.

I want to just travel around the world and get to know people and places.

A very different tone. A very different voice. I have a friend who does this, and I have always envied him.

This is a good starting point. I feel that I know them a bit better. I can tweak the course and the projects to help them get one step closer to their dreams. A good challenge to have. I know I am going to need lots of help.

A Few Slides on Designing Data Visualizations

It took more than an hour to watch this video on Designing Data Visualizations. Every minute was worth it. While I knew a bit about the subject, this talk walks you through the elements of visualization and how to go about designing one. There is a lot to learn from this talk and later from the book:

  • Types of Information Products
  • Difference between Infographics and Visualizations
  • Choice of Visualization elements
  • A walkthrough of a few visualizations
  • How to go about designing a visualization

Here are a few screenshots from the talk.

At a high level, you need to understand:


Then you focus on the user.


Paying attention to the data you have to work with helps.


These slides are just a small sample. They will give you an inkling about the approach. For all the good stuff, please watch the video and then get this book.



From the Preface of the book:

The path from journeyman to master is long. In the case of data visualization, the path has been well marked by many accomplished designers and cognitive scientists who have been doing great work for decades. We gladly follow in their footsteps, and we hope you will, too.

Our goal is to give you confidence as you begin your journey.


Thank God for the internet, which enables tools like YouTube, which give access to channels like LinkedIn Tech Talks, which bring you people like Noah to share their knowledge.

Knowledge Required to be an Entrepreneur

My answer on Quora to the question “What knowledge is required to be an entrepreneur or to start a new company? Do you need to study books on management for that?”:

The knowledge that you need:

  1. Some good ideas to improve or create products/services in your chosen field.
  2. The ability to build or make something that serves people in a specific marketplace (solves problems people face).
  3. Some skills in doing basic research to understand the needs of your potential customers.
  4. The ability to clearly articulate the benefits of your product/service, interact with people and listen to their problems.
  5. The willingness to learn from interactions/research and continuously improve your product/service.
  6. An understanding that being a good entrepreneur is hard work and may take a while to get right.
  7. An open mind to change some of your ideas, and the will to persevere.

Some of these skills can be learned. Some through books and others through experience.

Finding an active entrepreneur community and some advisers who have done this before can benefit you immensely.

I also suggest taking this free Udacity course on building a startup. It can get you thinking in the right direction – How To Build A Startup: The Lean Launchpad


Even though I answered it on Quora, I was hoping that this would reach some of my audience members who are entrepreneurs and others who are thinking about entrepreneurship. I also hope that it will generate some good questions/answers and reactions or encourage you to go to Quora and participate.

If you Google “things that an entrepreneur needs to know”, you will get lots of answers. Some of them are really good. But I think the most basic one is that you need to know what entrepreneurs do and what kind of problems they face.

Tools for Twenty First Century Learners

A LinkedIn discussion on Learning tools took me on a journey to find this.


Learning about learning is one of my hobbies. Once in a while I wander off from my regular work to read books, articles and blog posts on the topic.

You can download the full (8 page) brochure here (from the American Library Association). If you are interested in other learning-related tools and communities, you may also like some of these links.

It all started from this LinkedIn discussion – Open access resources for school library

Open Library is participating in our eBook lending program. Browse the growing lending library of over 200,000 eBooks!

 Tools for students to communicate, question, investigate, evaluate, collaborate, test, and create. 


Machine Learning Application: Job Classification at LinkedIn

I am fascinated by Machine Learning (ML) and keep looking for case studies where ML solves real-world problems. This talk – Machine Learning: The Basics by Ron Bekkerman (video) – provides a great overview of machine learning and how LinkedIn uses it for job analysis. LinkedIn was one of the early companies to jump into Data Science. With over 200 million subscribers, they have ample data to analyze. The data is very contextual too, and that helps build better algorithms (they claim 95% accuracy in prediction in a specific case). At one point in the talk Ron mentions that the ML study helped in building a product that generates about 6 million dollars in revenue for LinkedIn. That is a great payoff.


Why is job analysis interesting in general? It provides you with some interesting insights into the direction a specific industry is moving:

  • If you are in the (IT) staffing industry, you may want to know what kinds of jobs are in demand, and which ones are growing or shrinking.
  • If you are an outsourcing company, you may want to analyze the hiring patterns in different parts of the world
  • What kinds of skills are in demand for startups, medium sized companies and large enterprises? Lots of people from startups to training companies can use this data to build and tailor their offerings.
  • How do training companies and conference organizers meet the need for skills using job analysis?
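
The "growing or shrinking" question can be sketched in a few lines of Python. This is only a minimal illustration: the category labels and the two lists of postings are made-up sample data, and `demand_change` is a hypothetical helper name, not something from the talk.

```python
from collections import Counter

def demand_change(earlier, later):
    """Return {category: later_count - earlier_count} for every category seen."""
    before, after = Counter(earlier), Counter(later)
    return {cat: after[cat] - before[cat] for cat in sorted(set(before) | set(after))}

# Made-up sample: one category label per job posting, for two periods.
last_year = ["java", "java", "python", "cobol", "cobol"]
this_year = ["java", "python", "python", "python", "cobol"]

changes = demand_change(last_year, this_year)
growing = [cat for cat, delta in changes.items() if delta > 0]
shrinking = [cat for cat, delta in changes.items() if delta < 0]
print("growing:", growing)
print("shrinking:", shrinking)
```

With real data, the hard part is getting a reliable category label for each posting in the first place, which is where classification comes in.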

Ultimately, it is all Market Intelligence of a kind. It is fascinating that we now have large data sets to analyze and can get some glimpses into the patterns of demand/supply. So where do you get all this data from? That is a topic for another blog post.


One of our interns is working on an app to do Job Classification and automatic tagging of jobs. We were debating whether we should use some simple techniques or ML. I was going around looking for case studies and stumbled upon this video.
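
For the "simple techniques" side of that debate, here is a minimal sketch of keyword-based tagging in Python. It is not our intern's app and not LinkedIn's method. The tag vocabulary in `TAG_KEYWORDS` and the function name `tag_job` are assumptions made up for illustration.

```python
import re

# Hypothetical tag vocabulary: tag -> keywords that suggest it.
TAG_KEYWORDS = {
    "python": {"python", "django", "flask"},
    "data": {"sql", "analytics", "statistics"},
    "frontend": {"javascript", "css", "html"},
}

def tag_job(text, tag_keywords=TAG_KEYWORDS):
    """Return the sorted tags whose keywords appear as words in the job text."""
    words = set(re.findall(r"[a-z0-9+#]+", text.lower()))
    return sorted(tag for tag, keywords in tag_keywords.items() if words & keywords)

print(tag_job("Senior Python developer with Django and SQL experience"))
```

A baseline like this is easy to explain and debug, but someone has to maintain the keyword lists by hand. That maintenance cost is one reason to consider ML once there is enough labeled data.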

If all the Data on the Web were Open and Linked…

If all the data on the Web were open and linked, it would be easier to establish information systems combining different distributed data repositories. Thus, the Web of Data would enable access and sharing of data and knowledge without barriers.

The article talks about 7 things you need to know about Linked Open Data:

  1. What is Linked Data and Linked Open Data?
  2. How does it work?
  3. Who is doing it?
  4. Why is it significant?
  5. What are the downsides?
  6. Where is it going?
  7. What are the implications for institutional repositories?
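
To make "How does it work?" a little more concrete, here is a toy Python sketch of the core idea: data sets that use the same URI for the same thing can be pooled and queried together. The URIs, predicates and both data sets are invented for illustration, and plain tuples stand in for RDF triples. Real Linked Data uses RDF and is usually queried with SPARQL.

```python
# Two hypothetical data sets, each a list of (subject, predicate, object) triples.
# All URIs below are made up for illustration.
BOOK = "http://example.org/book/42"
CITY = "http://example.org/city/1"

library_data = [
    (BOOK, "publishedIn", CITY),
]
geo_data = [
    (CITY, "country", "India"),
    (CITY, "label", "Chennai"),
]

def objects(triples, subject, predicate):
    """Return every object stored for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def country_of_publication(book_uri, triples):
    """Follow book -> city -> country across the pooled triples."""
    return [
        country
        for city in objects(triples, book_uri, "publishedIn")
        for country in objects(triples, city, "country")
    ]

# "Linking" here is just pooling triples that share the same URIs.
merged = library_data + geo_data
print(country_of_publication(BOOK, merged))
```

Neither data set alone can answer "which country was this book published in?". The shared city URI is what connects them.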

A great example of Linked Open Data is DBpedia (an LOD repository extracted from Wikipedia). If you have ideas for DBpedia and want to help with the Google Summer of Code (GSoC) program, please read this.

We are still in need of ideas and mentors. If you have any improvements on DBpedia or DBpedia Spotlight that you would like to have done, please submit it in the ideas section now. Note that accepted GSoC students will receive about 5000 USD, which can help you to estimate the effort and size of proposed ideas. It is also ok to extend/amend existing ideas (as long as you don't hi-jack them). Please edit here:

DbPedia Ideas


A Data Driven Reporting Trend?

Here is a post from FiveThirtyEight looking for writers who use data journalism methods:

First, and most important, we’re looking for freelance features and articles that involve original research, analysis, or reporting — specifically those that involve statistical analysis, data mining, programming, data visualization, or other data-journalism methods. FiveThirtyEight is not the right outlet for “smart takes,” opinion pieces, or long-form essays that don’t involve some data component. We would potentially have interest in features that involve shoe-leather reporting (i.e., interviewing, first-person observation) if they are numerate as well as literate, and help our readers put data and statistics into context.

Is it the Nate Silver effect? Where do you get this new breed of journalists? Are there enough of them? Can you train writers to be data miners? Or is it the other way around? Will you find statisticians and data mining specialists and make them write?

Nice to see that articles will be written using (at least) some data. The “smart takes” and “opinion pieces” can be layered on top or live in a parallel universe. How will they meet the challenge of identifying stories by mining the data and surfacing them?

Cloud Interoperability, Cloud and Big Data, Cloud Skills, Everything As A Service, Cloud for Small Businesses

A few cloud links from TopicMinder Alerts:

  1. Coupling Big Data With Cloud Computing To Reap Finer Results!

    Cloud Hosting emerged as a pioneering concept and led to the democratization of IT sector. With expanded reach to the masses, it has brought in drastic cost reduction with ample application choices giving users the power to make the most of technology. The autonomous transformation of IT has not only …

  2. Interoperability: A Much Needed Cloud Computing Focus

    Cloud computing transitions information technology (IT) from being “systems of physically integrated hardware and software” to “systems of virtually integrated services”. This transition makes interoperability the difference between the success and failure of IT deployments, especially in the Federal government. Recent government IT failures like the healthcare portal roll out …

  3. Desperately Needed: More Cloud Training, More Cloud Skills

    Okay, just about everyone is convinced at this point — cloud computing is a good thing that can provide tremendous business value, if applied …

  4. Cloud Infrastructure Management: Companies and Solutions 2013

    This report evaluates cloud management including types of cloud computing models, challenges facing cloud computing, implementation of cloud …

  5. Cloud Security 2013: Companies and Solutions

    Cloud security is the set of security protocols and technologies that protect the cloud resources and the integrity of data stored in a cloud computing …

  6. Why Cloud 2.0 will be everything-as-a-service – Cloud Computing Intelligence

    The cloud computing market has matured significantly in the past two to three years, becoming almost synonymous with infrastructure-as-a-service …

  7. Is the Cloud Applicable to Small Businesses?

    … their business in an easier, more cost effective way. In fact, Gartner research indicates that the cloud computing market reached $150 billion in 2013!