Data Science – A Few Tweets and Links

What is Data Science?

What is Data Science, from Wikipedia. It talks a bit about the history as well.

What is data science? – O’Reilly Radar

Data Science Courses and Recipes

Coursera Introduction to Data Science Course

RT @radar: Want to be a data wrangler? School of Data offers free online data science courses

Applications, Tools

If you are wondering about the applications of Data Science, please watch the first couple of videos from this course

RT @StartupYou: DIY Data Science – when will this happen and think of how big it will be

Data Science Tools: Tools slowly democratize many data science tasks

“Deep Learning – The Biggest Data Science Breakthrough of the Decade” – Free webcast from O’Reilly

Tim O’Reilly – “Data science is transformative. The first wave was marketing analytics, before that financial arbitrage.”

Mapping Twitter’s Python and Data Science Communities

Data science and the analytic lifecycle  by @bigdata #strataconf

Other Resources

A bitty bundle of data science blogs Collected by @hmason. via @mikeloukides Call for more http://t.co/2iFABfIl2q (look at the comments in the blog for more resources links)

What’s A ‘Data Scientist’ Anyway? Real-Time With m6d’s Claudia Perlich

LinkLog: Guidelines to Help You Design A Great News App

News apps tell stories. They’ve got much of the same structure as any news story. They’ve got the graphical equivalent of ledes and nut grafs. At their best, they help a reader to find their personal stories in a large data set and to understand the story you’ve reported using the example of themselves and their own community. A great news application lets a reader understand new concepts by relating them to their own experiences.

Here are some guidelines to help you design a great news application

LinkLog: Big data: What’s Your Plan?

From Big Data – What is Your Plan?

create a simple plan for how data, analytics, frontline tools, and people come together to create business value. The power of a plan is that it provides a common language allowing senior executives, technology professionals, data scientists, and managers to discuss where the greatest returns will come from and, more important, to select the two or three places to get started.

The plan, according to this article, contains three elements – Data, Analytics and Tools.

Plans may highlight a need for the massive reorganization of data architectures over time: sifting through tangled repositories (separating transactions from analytical reports), creating unambiguous golden-source data, and implementing data-governance standards that systematically maintain accuracy…

Integrating data alone does not generate value. Advanced analytic models are needed – A plan must identify where models will create additional business value, who will need to use them…

Intuitive tools that integrate data into day-to-day processes and translate modeling outputs into tangible business actions

This article gives a sense of several fine-grained opportunities in the big data area.

Great Reads: Cyclical Tools

Every tool should nourish the things upon which it depends.

We see this principle at varying levels in some of our tools today. I call them cyclical tools. The iPhone empowers the developer ecosystem that helps drive its adoption. A bike strengthens the person who pedals it. Open-source software educates its potential contributors. A hallmark of cyclical tools is that they create open loops: the bike strengthens its rider to do things other than just pedal the bike.

Cyclical tools are like trees, whose falling leaves fertilize the soil in which they grow.

This essay, Missions and Metrics, is a great read. It has some really great insights about metrics and their impact on development (of various kinds).

This is slightly different from the notion of “improving improvement” which Doug Engelbart talks about.

So how do we build cyclical tools? We already have two great examples to start with – the bike and open source.

Meta:

Found this via @swombat – a great resource for entrepreneurs.

A Discussion on Personal Productivity Tools

It was an unusually small gathering (about 10) for kcommunity standards. As the conversation progressed more people joined us and we grew to 15. It was an unusual event too – the audience were the speakers. To me it was one of the most productive kcommunity sessions. I learned a lot more about personal productivity tools but more important, got a peek into how people think about their work and productivity. Here is my log of the event. It won’t do justice to the event itself. For that you need to see the video recording. When it is available, I will add a link in the post.

Here is a list of productivity tools with a brief note to provide context.  The order in this list is not significant. We went around the table asking people what they use and why. I took notes and I am typing all this in the same order in which they were mentioned. Some of them were mentioned a few times but I have only one entry, for example Post-it Notes.

  • Text Expander – converting short cuts to fully editable text http://bit.ly/Mx4NoW
  • Pure Text – Paste any text to applications without formatting http://bit.ly/Mx4Sck
  • Mindmaps – Several people mentioned them. There are several free and paid ones. My favorite is http://bit.ly/Mx50bI. Lakshman mentioned Mindjet. Many of us use mindmaps on paper as a thinking and note-taking tool.
  • Todo lists were mentioned by several people (Google Tasks and Outlook)
  • Kanban was mentioned once – http://www.kanban101.com/
  • Several people use Post-its and their variants (sticky notes, for example)
  • Latex http://www.latex-project.org/intro.html
  • A couple of interesting uses of Excel came up. One research scholar uses Excel to schedule calls and also as a priority list for tasks.
  • White board (I guess the physical one) came up multiple times. It is there as a constant reminder of things to be done.
  • Mobile devices (guessing that these are smart phones) were mentioned as a multipurpose tool. I am sure the various uses of mobile devices could fill another session.
  • Expense manager
  • Evernote (a note taking tool that is available on several devices and operating systems)
  • Time recorder
  • Excel macros
  • Supermemo (http://bit.ly/Mx5PRZ) to manage memory decay. The tag line for this tool is “Forget about forgetting”. To me this was an amazing discovery. This is a tool as well as a philosophy to remember things long term.
  • Screen grabber (I use snagit)
  • YSlow
  • Automatic code review enablers in Eclipse (plugins)
  • RSS and OPML
  • Diigo – a web highlighter
  • Watson for system learning (need to get more info and links)
  • Slideshare
  • Dropbox
  • Netvibes (a portal builder)
  • iBook Authoring tools
  • One Note
  • Sharepoint calendar for announcements
  • Physical Notebook (turns out to be one of the most popular tools)
  • Markview – a tool to convert pdf files so that it is easy to flip through them
  • Stephen Covey Planner
  • Ideabooks (to jot down ideas)
  • Delicious for social bookmarks
  • Twitter as a social bookmarking tool – it is an active bookmark that pulls suggestions from fellow tweeters
  • Personal wiki (I use WikidPad, a combination of wiki and notepad)
  • Argument maps and debategraph
  • wikibook
  • Programmers Journal
  • Podcasts
  • Google docs (especially spreadsheet) to share tasks, lists
  • Lists, Lists, Lists

In addition to tools, several practices (habits) were mentioned during the conversation.

  1. A research scholar mentioned that he preserves the keywords he uses to search (for later recall). It is a neat idea. Google must be caching this somewhere but there is no tool if you use multiple search engines. A browser history can probably be extracted and tweaked.
  2. Use a mobile phone to take pictures of documents and convert them to pdf
  3. Creating automated scripts to check for availability of internet (a very Indian phenomenon), continue downloads after they are paused, etc. Requires some scripting knowledge (shell, Python, Perl, Windows PowerShell)
  4. Calendar analysis to find where a lot of energy is spent
  5. Group Whiteboard (checking items by others as completed triggers others to respond)
  6. Type phone numbers to memorize them instead of storing them in contacts
  7. Organizing everything every day to have a clean desktop
  8. Set time to do things that repeat at predictable intervals
  9. Fifteen minutes of reading everyday
  10. Taking time off from Twitter, FB and email – one day a week
  11. Use folders for management
  12. Focus drives all the patterns and usage of tools
  13. Six thinking hats
  14. Algorithms for passwords
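
As a rough illustration of practice 3, here is a minimal Python sketch of a connectivity check. Nothing in it comes from the session itself: the function names, the default host and port (8.8.8.8 on port 53, a public DNS server) and the polling interval are all assumptions chosen for the example.

```python
import socket
import time


def internet_available(host="8.8.8.8", port=53, timeout=3.0,
                       connect=socket.create_connection):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds.

    The default host and port are assumptions for this sketch; any server
    you expect to be reachable will do.
    """
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False
    conn.close()
    return True


def wait_until_online(check=internet_available, interval=30.0,
                      max_checks=10, sleep=time.sleep):
    """Poll `check` until it succeeds.

    Returns the number of checks it took, or None if the connection
    never came up within `max_checks` attempts.
    """
    for attempt in range(1, max_checks + 1):
        if check():
            return attempt
        if attempt < max_checks:
            sleep(interval)
    return None
```

Calling wait_until_online() before resuming a paused download is one way such a script could be used. The connect and sleep parameters exist only so the functions can be tried out without a real network or real waiting.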

My book recommendation to the group was 18 Minutes.

Ideas About Ideas

Do you have lots of ideas rattling around in your head? Do you mostly dismiss them? Or do you pick one and dwell on it for a while? This is a subject we will come back to again in future posts. But today, I just want to provide a few resources that you can use to play with ideas a bit. Try it for a week or a month. First, here is what I do with ideas. I am not very systematic about it but I do consciously capture most of my ideas.

I Keep an IdeaLog

This is just a list of ideas in a central place. I used to use a desktop wiki but recently shifted to Evernote. Evernote allows me to type my ideas on my mobile device and sync them with the copy on my laptop. I just make a list. Each idea is about 5-7 words.

I Select A Few Ideas and Do a 3W exercise

Once in a while, I go back to my idealog and review the ideas. Most of them look crazy but I do not delete them. I take a few and do my What, Why, Who exercise. This consists of writing down:

  • What the idea is in a few sentences
  • Why this idea seems important to me and why it may be useful to others
  • Who can use and benefit from the implementation of this idea. Sometimes, it may just be me. Sometimes it may be others like me. Sometimes, it may be someone completely unrelated. It does not matter. Forcing myself to think about the beneficiary is probably the best filter for selecting ideas for further processing.

If I get this far and am still interested in the idea, I write down a list of what questions, why questions and who questions. This is a subset of the 6W framework from The Back of the Napkin | DanRoam.com

I Try the List of 100 Approach

I pick just a couple of ideas from the pile and do a List of 100 exercise. Let me be frank. So far, I have done it successfully for only 3 ideas. This really forces you to think about the idea a lot more deeply. I find this technique of writing down 100 thoughts about a specific idea a very useful thinking exercise.

Sketch a prototype

Since most of my ideas are about software products, it is easy for me to take a few sheets of paper and sketch a user interface. I just use pen and paper. Sometimes I scan these sketches and attach them to my notes. Once I do this, I put the idea in a list of projects to try. I try to find some interns or students to take on these projects and have them build a version of the prototype. I give this to people to look at. If people find it useful, we build an MVP.

A few ideas turn into products

A few ideas turn into usable products. I need to go to the next stage and get people to pay for it. This is where most ideas die. But a few have flourished.

Here are some resources you may find useful, if you want to play around with ideas.

The first step to have great ideas is to adopt an attitude of having lots of ideas. Going further, there are some strategies we can use to dramatically increase the amount of ideas we generate. The Idea Quota is one of the simplest and most effective of them.

If the best way to get quality ideas is by creating them from a vast pool of ideas, then our job is to have as many ideas as possible. Here are six tips that can help you develop an “idea abundance” mindset

In The Medici Effect, author Frans Johansson explores one simple yet profound insight about innovation: in the intersection of different fields, disciplines and cultures, there’s an abundance of extraordinary new ideas to be explored.

The List of 100 is a powerful technique you can use to generate ideas, clarify your thoughts, uncover hidden problems or get solutions to any specific questions you’re interested in.

I have not always done this. In the initial stages, I used to filter ideas in my head and simply build a prototype. But now, I think about market validation a lot more than I used to.

LinkLog: Python and Data Handling

Pipes and Filters are a familiar pattern for people managing data. Its use has been popularized by Yahoo Pipes. I always wanted to get a programmable version of pipes and filters and felt that a mini language would help a lot.

Guess what? I found two packages for creating pipes and filters today through my Infostream alerts – FilterPype and Joblib.

Pypes and Filters is a framework for working with data. The purpose of Pypes and Filters is to make it easy to manipulate streams of data by “filtering” the data through Filters that in turn form a Pipeline, or Pype.

Here are some features from the Introduction page.

FilterPype is being used for multi-level data analysis, but could be applied to many other areas where it is difficult to split up a system into small independent parts.

Some of its features:

  • Advanced algorithms broken down into simple data filter coroutines
  • Pipelines constructed from filters in the new FilterPype mini-language
  • Domain experts assemble pipelines with no Python knowledge required
  • Sub-pipelines and filters linked by automatic pipeline construction
  • All standard operations available: branching, joining and looping
  • Recursive coroutine pipes allowing calculation of e.g. factorials
  • Using it is like writing a synchronous multi-threaded program
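
To make the pipes-and-filters pattern concrete, here is a minimal sketch using plain Python generators. It does not use FilterPype or its mini-language, and every function name in it is made up for the illustration. Each filter takes a stream and yields a transformed stream, and a pipeline is just the filters applied in order.

```python
from functools import partial


def strip_lines(lines):
    """Filter: remove surrounding whitespace from each line."""
    for line in lines:
        yield line.strip()


def drop_blank(lines):
    """Filter: skip lines that are empty."""
    for line in lines:
        if line:
            yield line


def to_int(lines):
    """Filter: convert each text line to an integer."""
    for line in lines:
        yield int(line)


def at_least(numbers, minimum):
    """Filter: keep only values greater than or equal to `minimum`."""
    for number in numbers:
        if number >= minimum:
            yield number


def pipeline(source, *filters):
    """Connect filters in order; each one consumes the previous one's output."""
    stream = source
    for apply_filter in filters:
        stream = apply_filter(stream)
    return stream


raw = ["3", " 10 ", "", "25"]
result = list(pipeline(raw, strip_lines, drop_blank, to_int,
                       partial(at_least, minimum=10)))
print(result)  # [10, 25]
```

Because generators are lazy, nothing is read from the source until the end of the pipe is consumed, so the same pipeline works on a large file as well as on a short list.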

Joblib is a set of tools to provide lightweight pipelining in Python. In particular, joblib offers:

  • transparent disk-caching of the output values and lazy re-evaluation (memoize pattern)
  • easy simple parallel computing
  • logging and tracing of the execution

Planning to give both a try. Have you used any of these?

Three Marketing Tools You Can Use

I mostly deal with startups and small and medium enterprises. No matter who you are, you can always use some help in marketing. So what are some of the marketing tools and services out there that are either free or inexpensive? Here are a few I can think of:

1. Website Grader by Hubspot

It is a simple tool to grade your website. I don’t pay that much attention to the score but I like all the things they point out for improving your site. You can check your site and your closest competitors and figure out what to do next. This would be a good starting point. BTW, HubSpot is a good company to follow. They practice what they preach (Inbound Marketing) and provide a lot of very useful content for inbound (and social media) marketing. While you are at it, you may want to check out the other graders too.

2. Using Google Suggest for Keyword Research

Google Suggest is a great tool for doing some keyword research. It basically works like this. Let us say you are interested in Cloud Computing Migration. You can actually try to find out whether people are searching for this term. A Google Suggest list for this term yields the following (as of Nov 3). The numbers are the number of searches.

cloud computing migration issues,7150000
cloud computing migration plan,1040000
cloud computing migration strategy,922000
cloud computing migration tools,1330000
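
A list like this is easier to read once it is sorted by count. Here is a minimal Python sketch that parses the "term,count" lines above and ranks them. The function name rank_suggestions is made up for the example, and the data is just the four lines shown.

```python
import csv
import io

SUGGESTIONS = """\
cloud computing migration issues,7150000
cloud computing migration plan,1040000
cloud computing migration strategy,922000
cloud computing migration tools,1330000
"""


def rank_suggestions(text):
    """Parse 'term,count' lines and return (term, count) pairs, largest count first."""
    rows = csv.reader(io.StringIO(text))
    pairs = [(term, int(count)) for term, count in rows]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


ranked = rank_suggestions(SUGGESTIONS)
for term, count in ranked:
    print(f"{count:>12,}  {term}")
```

With this data, "cloud computing migration issues" comes out on top and "cloud computing migration strategy" at the bottom.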

3. Micro Niche Finder

A micro niche is a specialized area where you can make an entry as a startup. This is a free tool that takes you through the steps of validating a market niche. If you are interested in microproducts and microniches, you may want to take a look at this post as well.

What tools do you use?  There are several tools for market automation, lead nurturing, inbound marketing, social media marketing and tracking competition and your industry. We will cover them in future posts.

cloud computing architecture,29600000
cloud computing applications,57000000
cloud computing advantages,9700000
cloud computing amazon,36700000
cloud computing articles,67000000
cloud computing apple,69800000
cloud computing act of 2011,27600000
cloud computing and virtualization,25900000
cloud computing apps,61900000
cloud computing and security,116000000
cloud computing benefits,37200000
cloud computing books,29500000
cloud computing blog,73100000
cloud computing bible,5230000
cloud computing business,136000000
cloud computing business model,6400000
cloud computing bible pdf,2510000
cloud computing bulletin,3600000
cloud computing backup,17600000
cloud computing basics,10100000
cloud computing companies,61900000
cloud computing certification,13000000
cloud computing conference,73800000
cloud computing costs,46900000
cloud computing course,30900000
cloud computing careers,14000000
cloud computing consulting,21600000
cloud computing cons,2960000
cloud computing comparison,19900000
cloud computing concepts,12500000
cloud computing definition,37800000
cloud computing disadvantages,1420000
cloud computing database,42600000
cloud computing data center,48000000
cloud computing definition nist,346000
cloud computing diagram,4880000
cloud computing development,95400000
cloud computing deployment models,7400000
cloud computing demo,9950000
cloud computing drawbacks,1330000
cloud computing examples,39500000
cloud computing etf,1210000
cloud computing expo,15300000
cloud computing explained,70200000
cloud computing events,78100000
cloud computing education,52200000
cloud computing environment,53200000
cloud computing events 2011,78300000
cloud computing economics,10100000
cloud computing experts,24100000
cloud computing for small business,33800000
cloud computing future,73000000
cloud computing for dummies pdf,1470000
cloud computing for dummies,3000000
cloud computing free,117000000
cloud computing for business,136000000
cloud computing for individuals,63900000
cloud computing for lawyers,3680000
cloud computing forum,43900000
cloud computing facts,11700000
cloud computing google,84500000
cloud computing growth,31900000
cloud computing government,34600000
cloud computing gartner,5570000
cloud computing games,65600000
cloud computing growth projections,4780000
cloud computing google docs,4060000
cloud computing gupta,1090000
cloud computing glossary,5490000
cloud computing group,134000000
cloud computing history,31900000
cloud computing healthcare,18800000
cloud computing hardware,60400000
cloud computing hype,3850000
cloud computing hipaa,952000
cloud computing hosting,69600000
cloud computing how it works,32700000
cloud computing hp,44500000
cloud computing hosting providers,44900000
cloud computing humor,5570000
cloud computing issues,75700000
cloud computing in healthcare,18500000
cloud computing interview questions,1150000
cloud computing infrastructure,43200000
cloud computing in education,51400000
cloud computing industry,85000000
cloud computing ibm,45300000
cloud computing infographic,1650000
cloud computing icloud,1980000
cloud computing introduction,26800000
cloud computing jobs,44300000
cloud computing journal,2900000
cloud computing jokes,3580000
cloud computing jobs salary,6840000
cloud computing java,30200000
cloud computing jobs in india,3070000
cloud computing jobs in us,84700000
cloud computing jobs in usa,43500000
cloud computing job opportunities,9480000
cloud computing jobs in bay area,1510000
cloud computing kansas city,2930000
cloud computing k-12,2850000
cloud computing key words,19200000
cloud computing key players,1140000
cloud computing kings,2940000
cloud computing knoxville,722000
cloud computing kpmg,681000
cloud computing kenya,6990000
cloud computing kit,18100000
cloud computing kpi,2660000
cloud computing leaders,13800000
cloud computing legal issues,2430000
cloud computing layers,14800000
cloud computing leaders 2011,47800000
cloud computing linux,48800000
cloud computing law firms,1260000
cloud computing laws,15900000
cloud computing looking beyond the cloud,51800000
cloud computing leading companies,3860000
cloud computing logo,61300000
cloud computing microsoft,80900000
cloud computing market size,5190000
cloud computing market,74500000
cloud computing magazine,29600000
cloud computing models,42000000
cloud computing meaning,13400000
cloud computing management,112000000
cloud computing mutual fund,642000
cloud computing mac,92500000
cloud computing music,38600000
cloud computing news,67600000
cloud computing nist,913000
cloud computing negatives,2300000
cloud computing network,132000000
cloud computing news 2011,134000000
cloud computing newsletter,49200000
cloud computing news articles,119000000
cloud computing names,59300000
cloud computing new york times,5670000
cloud computing ncsu,134000
cloud computing overview,52800000
cloud computing options,34400000
cloud computing open source,61100000
cloud computing operating system,35800000
cloud computing oracle,43100000
cloud computing os,60100000
cloud computing outages,1140000
cloud computing origin,15900000
cloud computing opportunities,85100000
cloud computing outline,8440000
cloud computing providers,55700000
cloud computing pros and cons,1400000
cloud computing ppt,3330000
cloud computing pdf,28300000
cloud computing presentation,17800000
cloud computing problems,40400000
cloud computing pricing,56900000
cloud computing platform,45700000
cloud computing primer,3900000
cloud computing presentation ppt,1990000
cloud computing quotes,7840000
cloud computing questions,39700000
cloud computing quickbooks,1340000
cloud computing questions to ask,9080000
cloud computing quincy washington,1390000
cloud computing q&a,12800000
cloud computing quiz,7300000
cloud computing qa jobs,7700000
cloud computing quicken,865000
cloud computing questionnaire,1040000
cloud computing risks,23500000
cloud computing reviews,64100000
cloud computing research,98400000
cloud computing research paper,8950000
cloud computing research topics,38400000
cloud computing risks and benefits,6070000
cloud computing resources,67100000
cloud computing resume,5700000
cloud computing richmond va,785000
cloud computing revenue,17800000
cloud computing stocks,8370000
cloud computing security,72700000
cloud computing services,93200000
cloud computing software,104000000
cloud computing security issues,44100000
cloud computing service providers,19000000
cloud computing security risks,3140000
cloud computing security concerns,6620000
cloud computing statistics,19400000
cloud computing storage,59600000
cloud computing training,34200000
cloud computing technology,98800000
cloud computing trends,30100000
cloud computing tools,68800000
cloud computing terms,140000000
cloud computing types,63600000
cloud computing testing,25800000
cloud computing tutorial for beginners,4420000
cloud computing tutorials,19500000
cloud computing trends 2011,34200000
cloud computing use cases,7980000
cloud computing ubuntu,13700000
cloud computing uf,1180000
cloud computing uses,46300000
cloud computing ufl,239000
cloud computing usage,70000000
cloud computing user group,58000000
cloud computing usa today,3220000
cloud computing university,43900000
cloud computing uiuc,222000
cloud computing vendors,14300000
cloud computing video,90400000
cloud computing vs saas,4760000
cloud computing vs virtualization,8040000
cloud computing virtualization,25800000
cloud computing vs grid computing,1180000
cloud computing vs client server,1780000
cloud computing vmware,20300000
cloud computing vs local computing,38900000
cloud computing video tutorial,2350000
cloud computing wiki,38200000
cloud computing white paper,8970000
cloud computing world forum,3470000
cloud computing websites,43000000
cloud computing webinar,4850000
cloud computing workshop,12400000
cloud computing web services,69100000
cloud computing windows,94800000
cloud computing with vmware vcloud director,670000
cloud computing windows 7,62800000
cloud computing xml,39100000
cloud computing xkcd,234000
cloud computing xen,3310000
cloud computing youtube,24800000
cloud computing yahoo,40300000
cloud computing yahoo answers,6150000
cloud computing youtube funny,12300000
cloud computing zone,39500000
cloud computing zoho,917000

Role of Blogging in Partner Development

In this article, Gilmore, Shipwire’s VP for marketing and business development, provides useful tips for partner development. At the end, he touches upon the payoff of blogging:

“We want to be a thought leader,” Gilmore says. “We want to be a visionary. We want to get our ideas out there. It’s also a very easy way for us to put out a position and keep our customers and partners up-to-date. It also gives us feedback from them. Blogs can start anywhere in or outside the company, and some of our best have come from our customer support team. They’ll ask, ‘Can we write a blog about how to do XYZ? A lot of customers are asking about it.’ Sure, put it up.

“Finally, the more information you put out there, the better you’ll do with search engines. The more content you have, the better.

“You do have to know who the audience is, and get relevant information out there to start or join a conversation. We had a forum for awhile, but we turned it off because it wasn’t getting a lot of traction — there weren’t enough people involved — but we might go back at some point.”

Having the customer support team blog on how to do ‘xyz’ is a great idea. This article is a great read.