Machine Learning Application: Job Classification at LinkedIn

I am fascinated by Machine Learning (ML) and keep looking for case studies where ML solves real-world problems. This talk, Machine Learning: The Basics by Ron Bekkerman (video), provides a great overview of machine learning and how LinkedIn uses it for job analysis. LinkedIn was one of the early companies to jump into Data Science. With over 200 million subscribers, they have ample data to analyze. The data is also very contextual, which helps build better algorithms (they claim 95% accuracy in prediction in a specific case). At one point in the talk, Ron mentions that the ML study helped build a product that generates about 6 million dollars in revenue for LinkedIn. That is a great payoff.


Why is job analysis interesting in general? It provides you with some interesting insights into the direction a specific industry is moving:

  • If you are in the (IT staffing) industry, you may want to know what kinds of jobs are in demand, which ones are growing, and which ones are shrinking.
  • If you are an outsourcing company, you may want to analyze the hiring patterns in different parts of the world.
  • What kinds of skills are in demand for startups, medium-sized companies and large enterprises? Many organizations, from startups to training companies, can use this data to build and tailor their offerings.
  • How do training companies and conference organizers meet the need for skills using job analysis?

Ultimately, it is all Market Intelligence of a kind. It is fascinating that we now have large amounts of data to analyze and can get some glimpses into patterns of demand and supply. So where do you get all this data from? That is a topic for another blog post.


One of our interns is working on an app to do Job Classification and automatic tagging of jobs. We were debating whether we should use some simple techniques or ML. I was going around looking for case studies and stumbled upon this video.
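To make the "simple techniques" side of that debate concrete, here is a minimal sketch of keyword-based job tagging. It is only an illustration: the tag names, the keyword lists and the function name tag_job are all hypothetical, and this is neither what our intern built nor what LinkedIn does. An ML approach would replace the hand-written keyword table with a model trained on labeled job postings.

```python
import re

# Hypothetical tag -> keyword mapping; a real list would be much longer.
TAG_KEYWORDS = {
    "data-science": {"machine learning", "statistics", "data scientist"},
    "web-development": {"javascript", "css", "html"},
    "devops": {"continuous delivery", "infrastructure", "deployment"},
}


def tag_job(description):
    """Return the sorted tags whose keywords appear in the job description."""
    # Lowercase and collapse whitespace so multi-word keywords match reliably.
    text = re.sub(r"\s+", " ", description.lower())
    tags = []
    for tag, keywords in TAG_KEYWORDS.items():
        # Match on word boundaries so "css" does not match inside another word.
        if any(re.search(r"\b" + re.escape(kw) + r"\b", text) for kw in keywords):
            tags.append(tag)
    return sorted(tags)


if __name__ == "__main__":
    print(tag_job("Looking for a data scientist with strong statistics skills."))
```

A table like this is easy to explain and debug, but it has to be maintained by hand and misses postings that describe a skill in words the table does not list. That trade-off is essentially what we were debating.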

A Few Links On IOT – From TopicMinder Alerts

Here are a few links on Internet of Things – Smarter Homes

  1. Industry Opinion: What Will Drive the Home Control Market in 2014?
    Four years on from the launch of the iPad, and here we are, with iOS and Android devices, apps galore, the Connected Home, the Cloud, the Internet of Things/Everything, Google buying Nest, and ever more smart home initiatives and ecosystems being announced. Interest in home control has never been greater, but what does this all mean to the residential custom installer?

    We asked a selection of suppliers about home control trends and what is likely to drive the market in 2014.

  2. Italians’ love of car and home turning Internet of Things into €900m market
    In the not so rosy Italian ICT landscape, there’s a glimmer of hope coming from the Internet of Things (IoT) sector, which last year showed a double-digit growth, according to a newly-published report by the School of Management of the Politecnico of Milan.

    While Italy’s ICT market declined by 4.3 percent in 2013, the country’s IoT sector grew 11 percent in value, with the number of mobile-connected objects reaching six million.

  3. I Turned My Tiny, Dark, And Overpriced New York Apartment Into A ‘Smart Home’ For Just $300
    That’s one of the problems with living in New York. I spend about a zillion dollars per month in rent, and still have a teeny tiny apartment that faces the back of a bunch of taller buildings that block the sun.

    For the last several weeks, my apartment has been programmed to light itself up. Whenever I enter my building, my apartment knows I’m home and switches on the lamp in my living room so I don’t have to fumble around in the dark. When I leave the room, the light shuts itself off.

Think About Shifting Emphasis on Smart Data

From Key Digital Trends for 2014:

Don’t just focus on Big Data; think about shifting emphasis to “Smart Data”.



So what are the jobs that let us:

  • Find a variety of useful data sources and integrate them?
  • Analyze large volumes of (unstructured) data?
  • Do intelligence monitoring?
  • Make sense of it all (the “smart data”)?
  • Gain insight and ask critical business questions?

Something to think about.


Five of My Favorite Predictions from Future Magazine

I always liked Future Magazine. I even subscribed to it for a while. Hope they have a digital version now so that I can subscribe again. Here are 10 (bold) predictions about the future. Five of my favorite ones:

  1. We will revive recently extinct species.
  2. Doctors will see brain diseases many years before they arise.
  3. Buying and owning things will go out of style.
  4. Quantum computing could lead the way to true artificial intelligence.
  5. The future of science is in the hands of crowd sourcing amateurs.

Why only five? I like all of them, but I don’t understand some, and mainly I did not want to reproduce the entire list here. Link to all the 10 predictions and details.

As you look at these predictions, it may be interesting to reflect a bit on their impact on our lives in the next few years.

From ThoughtWorks: Infrastructure as Code and Other Trends

A fascinating read from the ThoughtWorks Technology Radar on development trends:

  • Embracing falling boundaries — Whether you like it or not, boundaries are falling down around you. We choose to embrace this by examining concepts like perimeterless enterprise, development environments in the cloud, and co-location by telepresence.
  • Applying proven practices to areas that somehow missed them — We are not really sure why, but many in our industry have missed ideas like capturing client side JavaScript errors, continuous delivery for mobile, database migrations for NoSQL, and frameworks for CSS.
  • Lightweight options for analytics — Data science and analytics are not just for people with a PhD in the field. We highlight collaborative analytics and data science, where all developers understand the basics and work closely with experts when necessary.
  • Infrastructure as code — Continuous delivery and DevOps have elevated our thinking about infrastructure. The implications of thinking about infrastructure as code and the need for new tools are still evolving.

ThoughtWorks also provides suggestions on how to handle these trends – Adopt, Trial, Assess, Hold.

The free 13-page PDF is full of valuable insights and links.

Source: TopicMinder Alerts

Internet of Things – A Few Links

TI enables the Internet of Things (popularly known as IOT). Here is a white paper on the Evolution of Internet of Things from TI.

“The Internet of Things (IoT) is quickly growing with the expectation of 50 billion connected devices by 2020 to provide smart, invisible technology that works for you based on your preferences. With the industry’s broadest portfolio of embedded wireless connectivity technologies, microcontrollers, processors and analog solutions, Texas Instruments offers many cloud-ready system solutions for the IoT. From high performance home, industrial and automotive applications to battery-powered or energy harvested wireless sensor nodes; TI makes developing applications easier with hardware, software, tools and support to get anything connected within the IoT.”

Monetizing M2M (Machine to Machine) aka IOT.

“There is a burst of creative ideas emerging in the Machine-to-Machine (M2M) space or what has become more affectionately known as the Internet-of-things world. With so many opportunities around the globe to connect devices and assets, many companies are starting to stake out claims in the M2M space”

The Internet of Things is here.

Drew Turney of WA Today recently wrote, “Today your smartphone knows your location, so everything from the local weather to nearby Facebook friends is available. What about tomorrow when your jacket can measure your vital signs or a hat can extrapolate your mood from your brain activity? Connect it with information on your schedule (from your calendar), spatial information such as whether you’re running or at rest, the time of day and a hundred other factors, and machines everywhere can decide on, find and present the information they think you need.”

IOT is moving from Industry to Consumers – Sooner than you think.

Devices connected to the internet — everything from coffee makers to toys — are going to become a widespread consumer phenomenon sooner than you expect, even though Europeans and Americans for now regard the technology in different ways.

Until now, smart machines connected to the internet have largely been the province of industry and governments. In the view of two executives, however, such devices will soon become ubiquitous at a consumer level, and everything from coffee machines to toys will have at least a brief life on the internet.

What is this Internet of Things? What jobs will it create? What jobs will it destroy? What is the impact on your life or career? Is it good or bad? Do we lose more control as human beings and become hopelessly dependent?

These are a few of the fascinating questions. I know you probably have even more. It will be exciting to watch the developments, adoption rate and the accelerating change this technology and applications will bring into our lives.

Data Science – A Few Tweets and Links

What is Data Science?

What is Data Science, from Wikipedia. It covers a bit of the history as well.

What is data science? – O’Reilly Radar

Data Science Courses and Recipes

Coursera Introduction to Data Science Course

RT @radar: Want to be a data wrangler? School of Data offers free online data science courses

Applications, Tools

If you are wondering about the applications of Data Science, please watch the first couple of videos from this course

RT @StartupYou: DIY Data Science – when will this happen and think of how big it will be

Data Science Tools: Tools slowly democratize many data science tasks

“Deep Learning – The Biggest Data Science Breakthrough of the Decade” – Free webcast from O’Reilly

Tim O’Reilly – “Data science is transformative. The first wave was marketing analytics, before that financial arbitrage.”

Mapping Twitter’s Python and Data Science Communities

Data science and the analytic lifecycle by @bigdata #strataconf

Other Resources

A bitty bundle of data science blogs Collected by @hmason. via @mikeloukides Call for more (look at the comments in the blog for more resources links)

What’s A ‘Data Scientist’ Anyway? Real-Time With m6d’s Claudia Perlich

Machine Learning – A Few Links and Tweets

On Machine Learning, from a free book on ML: A First Encounter with Machine Learning by Max Welling

The first reason for the recent successes of machine learning and the growth of the field as a whole is rooted in its multidisciplinary character. Machine learning emerged from AI but quickly incorporated ideas from fields as diverse as statistics, probability, computer science, information theory, convex optimization, control theory, cognitive science, theoretical neuroscience, physics and more.

The second, perhaps more important reason for the growth of machine learning is the exponential growth of both available data and computer power. While the field is built on theory and tools developed in statistics, machine learning recognizes that the most exciting progress can be made to leverage the enormous flood of data that is generated each year by satellites, sky observatories, particle accelerators, the human genome project, banks, the stock market, the army, seismic measurements, the internet, video, scanned text and so on.

On why this book was written

Much of machine learning is built upon concepts from mathematics such as partial derivatives, eigenvalue decompositions, multivariate probability densities and so on. I quickly found that these concepts could not be taken for granted at an undergraduate level.

“Machine learning will be one of the most important tech trends over the next three to five years for innovation”

Startups making machine learning an elementary affair

Use Cases Machine Learning on Big Data for Predictive Analytics #ml usecases

A startup journey, the improvement in Python’s data science capabilities and hosted machine learning #techtrends

RT @woycheck: Zico Kolter wants to use machine learning to analyze electrical current behavior and provide details about your power bill (@…

Microsoft Research Machine Learning Summit: April 22-24, 2013

RT @siah: A free ebook by Max Welling “A First Encounter with Machine Learning”

Google Hires Brains that Helped Supercharge Machine Learning | Wired Enterprise |

RT @siah: PyMADlib: A Python wrapper for MADlib – an open source library for scalable in-database machine learning algorithms http://t.c

Peekaboo: Machine Learning Cheat Sheet (for scikit-learn)

Panels and Discussions

This is a panel from Churchill Club featuring Peter Norvig, Director of Research, Google; Gurjeet Singh, Co-founder & CEO, Ayasdi; and Jeremy Howard, President and Chief Scientist, Kaggle.


Once in a while, I go and gather my recent tweets and create a Tweet Cloud (a project developed by a student). I find some interesting topics, save the tweets and start a blog. I have written about this Linked Tweet Cloud a couple of times.
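For the curious, the counting step behind a tweet cloud can be sketched in a few lines. This is not the student's project code, just a rough illustration of the idea. The stop-word list, the tokenizing pattern and the function name tweet_term_counts are my own assumptions.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; a real one would be longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "on", "is", "rt"}


def tweet_term_counts(tweets, top_n=10):
    """Count the most frequent terms across tweets, ignoring stop words."""
    counts = Counter()
    for tweet in tweets:
        # Drop URLs, then keep words, hashtags and mentions as single tokens.
        text = re.sub(r"https?://\S+", " ", tweet.lower())
        tokens = re.findall(r"[#@]?[a-z][a-z0-9_]*", text)
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(top_n)
```

The terms with the highest counts are the ones a cloud would draw largest, and they are usually a decent hint at which topics deserve a post.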


Paul Graham on Startup Investing Trends

Paul Graham on Startup Investing Trends:

I’m going to take a shot at describing where these trends are leading. Let’s start with the most basic question: will the future be better or worse than the past? Will investors, in the aggregate, make more money or less?


I think more. There are multiple forces at work, some of which will decrease returns, and some of which will increase them. I can’t predict for sure which forces will prevail, but I’ll describe them and you can decide for yourself.


There are two big forces driving change in startup funding: it’s becoming cheaper to start a startup, and startups are becoming a more normal thing to do.