enterprise applications, many of which are built on social media premises and therefore provide a lot of unstructured data to tackle: 70% unstructured data, according to one estimate from McKinsey. One example here is Yammer, which DataSift already uses as a data source for enterprise customers. Halstead explains it this way: “Let’s say you have 10,000 users on Yammer. What they talk about in there can flow into DataSift, where we use our processes to apply curation and context.”
“It’s not the individual data sources, but what you can do with them. Companies that work with social data are overwhelmed by it, and think they are wasting massive amounts of money. What we hear from companies is not that they want more data, but more precise data, more usable data.”
So what does this translate into? Here are a few thoughts:
- Managing both internal and external information streams (data firehoses) is a starting point
- Adding contextual metadata to these streams is the next step. This has to be an automated process, but it can use smart algorithms, including machine learning. The context can be based on topic or subject area, time (currency of information), and source rank (authenticity of sources), among others.
- Armed with data and context, you can apply a variety of analysis and visualization techniques to obtain insights
- Perhaps the last step in the process is to curate and distribute the findings to the right groups/teams in an organization
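
To make the steps above a little more concrete, here is a minimal Python sketch of the "add context, then distribute" idea. It is not DataSift's implementation or API. The `Message` class, the `TOPIC_KEYWORDS` and `SOURCE_RANK` tables, and the `add_context` and `route_by_topic` functions are all hypothetical names invented for illustration, and simple keyword matching stands in for whatever smarter algorithm (machine learning, for example) a real system would use.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical lookup tables, for illustration only.
TOPIC_KEYWORDS = {
    "pricing": {"price", "discount", "cost"},
    "support": {"bug", "outage", "ticket"},
}
SOURCE_RANK = {"internal": 0.9, "external": 0.5}


@dataclass
class Message:
    text: str
    source: str            # e.g. "internal" or "external"
    created_at: datetime
    context: dict = field(default_factory=dict)


def add_context(msg, now):
    """Attach topic, temporal, and source-rank metadata to one message."""
    words = set(msg.text.lower().split())
    # Topic context: naive keyword overlap (a stand-in for a real classifier).
    topics = sorted(t for t, kws in TOPIC_KEYWORDS.items() if words & kws)
    # Temporal context: how old the message is, in hours.
    age_hours = (now - msg.created_at).total_seconds() / 3600
    msg.context = {
        "topics": topics,
        "age_hours": age_hours,
        # Source-rank context: unknown sources get the lowest rank.
        "source_rank": SOURCE_RANK.get(msg.source, 0.0),
    }
    return msg


def route_by_topic(messages):
    """Group contextualized messages by topic so each team gets its own feed."""
    routed = {}
    for msg in messages:
        for topic in msg.context["topics"]:
            routed.setdefault(topic, []).append(msg)
    return routed


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    stream = [
        Message("Customers ask about a discount", "internal",
                datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc)),
        Message("The outage ticket is still open", "external",
                datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc)),
    ]
    enriched = [add_context(m, now) for m in stream]
    for topic, msgs in route_by_topic(enriched).items():
        print(topic, [m.text for m in msgs])
```

The analysis and visualization step is left out on purpose. The point of the sketch is only that once each message carries its own context, filtering by freshness, weighting by source rank, and routing to the right team become simple operations on that metadata.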