When I give talks on Machine Learning, I often get these questions:
- What is Machine Learning?
- What are some Machine Learning Applications?
- Is Machine Learning Mature?
- Who is using Machine Learning?
- How do we get started?
If you are using Google or Bing Search, if you get recommendations for books or other products from Amazon, if you are getting hints for the next word to type on a mobile keyboard, you are already using Machine Learning.
Here is a sample list of Machine Learning applications.
From Apple’s Core ML Brings AI to the Masses:
- Real Time Image Recognition
- Sentiment Analysis
- Search Ranking
- Speaker Identification
- Text Prediction
- Handwriting Recognition
- Machine Translation
- Face Detection
- Music Tagging
- Entity Recognition
- Style Transfer
- Image Captioning
- Emotion Detection
- Text Summarization
From Seven Machine Learning Applications at Google:
- Google Translate
- Google Voice Search
- Gmail Inbox Smart Reply
- Google Photos
- Google Cloud Vision API
Also, see – How Google is Remaking Itself as a “Machine Learning First” Company.
While Apple, Google, Facebook, Amazon, IBM, and Microsoft are the most visible companies in the AI space, take a look at business applications of Machine Learning.
What is Machine Learning? I get asked this a lot. I wanted to find a simple, intuitive definition. After doing a few Google searches, I settled on this one from Arthur Samuel (in 1959):
“[Machine Learning is the] field of study that gives computers the ability to learn without being explicitly programmed.”
It is a field of study. I like that. I picked this after Googling and finding over 100 descriptions. Here is a shorter curated list of results from this Google Search. From this list, you may find that Machine Learning is:
- A technique
- A field of study
- An application
- A method
- A type of AI
- A sub-field of AI
- A general term
- A cure-all for all human problems (just kidding)
- A data-based application generator
- A statistical method of learning from data
- A mapping function of inputs to outputs
So, what do you think Machine Learning is?
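To make the "mapping function of inputs to outputs" description concrete, here is a minimal sketch in plain Python. It fits a straight line to a few example points by ordinary least squares and then uses the learned line on an input it has not seen. The data points and the names `fit_line` and `predict` are made up for illustration; nothing here comes from the definitions listed above.

```python
def fit_line(xs, ys):
    """Learn slope and intercept from example (x, y) pairs by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned mapping to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data that happens to follow y = 2x + 1.
# The rule itself is never written into the program; it is recovered from the examples.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

model = fit_line(xs, ys)
print(model)
print(predict(model, 10))
```

The point is Samuel's: nobody typed "multiply by two and add one" into the code. The program was given examples and worked out the mapping itself.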
Artificial Intelligence (aka AI) will have a deep impact on our lives – both positive and negative. Like any other tool or technology, a lot depends on how we use it. I often get asked these questions:
- What is AI?
- What is good about it?
- Will it destroy jobs?
- Will it take over humanity?
- What do we need to do to leverage AI?
AI traditionally refers to an artificial creation of human-like intelligence that can learn, reason, plan, perceive, or process natural language. These traits allow AI to bring immense socioeconomic opportunities, while also posing ethical and socio-economic challenges.
Right now the opportunities are in research, technology development, skill development and business application development.
The technologies that power AI, such as neural networks, Bayesian probability, and statistical machine learning, have been around for several decades; some date back to the late 1950s. The availability of Big Data is bringing AI applications to life.
There are concerns about misuse of AI and a worry that it may result in uncontrolled proliferation, killing jobs in its wake. Other worries include unethical uses, unintended biases, and other problems. It is too early to take one side or the other.
Please take a look at Artificial Intelligence and Machine Learning: Policy Paper. It looks at AI from a variety of lenses.
Anitha sent me a link and asked for my opinion about this article – Artificial Intelligence or Intelligence Augmentation. What’s in a name?
She likes everything in brief – ideally 100 words. Me, I like to pontificate, take my time (in words, I mean), and ramble a bit.
There are three reasons why I think of AI as Augmenting Human Intelligence:
- Humans have to be in the loop to teach AI. In supervised learning, they design the training sets, do the feature engineering and make other tweaks. In reinforcement learning, they provide the feedback through reinforcement signals.
- Humans will figure out where to apply AI, how to apply AI and how to interpret and improve the results.
- There may be some situations when the AI may be autonomous – like in space robots or in some hazardous situations where humans cannot get involved in real time.
As AI learns more and discovers new insights, humans will use those insights to move to the next level. In my opinion, humans and AI co-evolve. This is the process of Augmenting Human Intelligence.
A few links on Machine Learning and Software Engineering. The first one talks about how to explain machine learning to a software engineer and why software professionals need to pay attention to ML. It is both a tool and a bit of a threat.
The second article compares the way we build software and how it differs from building ML applications.
How to Explain Machine Learning to a Software Engineer
Software engineering is about developing programs or tools to automate tasks. Instead of “doing things manually,” we write programs; a program is basically just a machine-readable set of instructions that can be executed by a computer.
Now, machine learning is all about automating automation! Instead of coming up with the rules to automate a task such as e-mail spam filtering ourselves, we feed data to a machine learning algorithm, which figures out these rules all by itself. In this context, “data” shall be representative sample of the problem we want to solve — for example, a set of spam and non-spam e-mails so that the machine learning algorithm can “learn from experience.”
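Here is a minimal sketch of that spam-filter idea, assuming a tiny hand-made set of labeled e-mails. It uses a naive Bayes classifier with add-one smoothing, which is my choice for illustration and not something the quoted article prescribes. The example messages and the function names `train` and `classify` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count words per label from (text, label) pairs. No spam rules are hand-written."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def classify(text, model):
    """Pick the label with the highest naive Bayes log-probability."""
    word_counts, label_counts = model
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        # Prior: how common this label was in the training data.
        score = math.log(label_counts[label] / total_docs)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training data: a "representative sample" of spam and non-spam.
EXAMPLES = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

model = train(EXAMPLES)
print(classify("free prize money", model))     # spam
print(classify("monday team meeting", model))  # ham
```

The "rules" live in the word counts, not in the code. Feed it different e-mails and the same program becomes a different filter.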
Software Engineering vs Machine Learning Concepts
Not all core concepts from software engineering translate into the machine learning universe. Here are some differences I’ve noticed.
A few thoughts:
- ML and Software development will co-evolve. Software will be used to build tools for building ML. ML will automate automation. Since software is the current tool for automation, ML will replace many of the software activities. Does this pose a threat to the software profession?
- Do we need a different mindset for building ML apps, compared to building software? What principles of software development can be reused while building ML apps?
- Can ML help us build better software by improving the building process?
- The software industry is one of the heaviest users of tools for automating its own work. Various low-level (assembly), high-level (Java, C++, C#) and very high-level (Python, Ruby) languages and their associated tool chains have simplified building applications. Now we have tools for not only building software, but debugging, profiling, optimizing, and managing it. Is ML going to be another one of these tools? Will this new class of ML apps take software as input and produce better software as output?
A nice blog post from Asana on How to Start Small and Scale Over Time:
Recently, we’ve made a series of changes to our data infrastructure that have all proven extremely valuable:
- Investing in monitoring, testing, and automation to reduce fire-fighting
- Moving from MySQL to Redshift for a scalable data warehouse
- Moving from local log processing to Hadoop for scalable log processing
- Introducing Business Intelligence tools to allow non-experts to answer their own data questions
Got this from Hadoop Weekly, Issue #95, 9 November 2014. It has many other valuable articles on scaling.
Twitter is a rich source of useful information. It is a great tool for:
- Researching Needs (for early customer development),
- Tracking Trends (in your industry),
- Watching Competition,
- Finding Influencers in your industry segment
We have been dabbling in some tools for mining Tweets and I am always on the lookout for more.
There are a few kindred spirits who seem to be interested in similar topics. Here is one, scooped by Jose C Gonzalez: An Introduction to Text Mining using Twitter Streaming API and Python by Adil Moujahid.
Twitter data constitutes a rich source that can be used for capturing information about any topic imaginable. This data can be used in different use cases such as finding trends related to a specific keyword, measuring brand sentiment, and gathering feedback about new products and services.
This tutorial teaches an approach to mining tweets, analyzing them, and visualizing them using simple open source tools. You will learn:
- How to decode JSON, returned by the Twitter searches
- How to use a Python library called Pandas to analyze Twitter Streams
- How to use another Python library (matplotlib) to plot the results of analysis
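The first two steps can be sketched with nothing but the Python standard library. This is a rough sketch under assumptions, not the tutorial's code: it assumes the tweets were saved one JSON object per line, and the sample lines and the helper names `parse_tweets`, `count_by_language` and `count_mentions` are made up. With Pandas you would load the same dictionaries into a DataFrame and count from there.

```python
import json
from collections import Counter

# Hypothetical sample of saved stream output: one JSON object per line,
# with the occasional blank or broken line mixed in.
SAMPLE_LINES = [
    '{"text": "Learning Python for data analysis", "lang": "en"}',
    '{"text": "Aprendiendo JavaScript", "lang": "es"}',
    '',
    '{"text": "python and pandas are great", "lang": "en"}',
    'not valid json',
]


def parse_tweets(lines):
    """Decode each line as JSON, skipping blanks and lines that do not parse."""
    tweets = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(tweet, dict):
            tweets.append(tweet)
    return tweets


def count_by_language(tweets):
    """Count tweets per language code."""
    return Counter(t.get("lang", "unknown") for t in tweets)


def count_mentions(tweets, keyword):
    """Count tweets whose text contains the keyword, ignoring case."""
    keyword = keyword.lower()
    return sum(1 for t in tweets if keyword in t.get("text", "").lower())


tweets = parse_tweets(SAMPLE_LINES)
print(count_by_language(tweets))
print(count_mentions(tweets, "python"))
```

The counts that come out of this are exactly what you would hand to matplotlib for a bar chart.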
IoT devices offer huge potential for electronic component manufacturers, but this is clearly not where the value will stop. Most of the added value in IoT solutions will come from the processing of the generated data. In fact, the ratio between electronic components and data processing can reach 1:50 in certain long-term cases!
Technologies & Sensors for the Internet of Things mentions three types of companies that will benefit from IoT initially.
The IoT is a multi-billion dollar market emerging from several different markets (i.e. industrial sensors, wearable electronics and home automation) which will see strong convergence in the next five years. Three industrial and service sectors will be integral to the valorization of this new market:
- The electronics industry, which will manufacture the sensing devices
- The communication and cloud data storage industry, which will handle data transmission, storage and processing
- Service companies, which will valorize the data either through processing or by selling to a third party
This got me interested in finding jobs in the IoT space. I wanted to find out who is hiring, what kinds of jobs are being offered and where they are hiring. I did a simple Data Journalism style experiment.
- Tried a job search in Indeed and SimplyHired for IOT jobs (Indeed and SimplyHired are job aggregators)
- Got the list of jobs (wrote a small Python script to extract the job list).
- Wrote another program to take the list of jobs and extract entities using the Open Calais API. Open Calais is a tool for extracting entities from text. They have a free API (with certain rate limits) that you can use to automate the entity extraction.
- A final tiny Python program took the entity file (generated from the previous step) and extracted the cities, job positions and companies.
- Here is the output of the first run of steps 1-4, after some minor edits.
Companies:
- Cypress Semiconductor Corporation
- IOT INFOTECH INC
- Product Development Company
- Sasken Communication Technologies
Job positions:
- Senior Application Engineer
- Manager – PTC
- Host Protection Architect
- IOT Solution Architect
- Director of Evangelism
- Princ Research Engineer
- Manager – Internet
- iOS WiFi IOT Engineer
- Home Research Scientist
This is a small sample and just a tiny peek into the industry.
The cool thing about languages like Python is that you can write about 10-20 lines of code to extract, clean and generate useful data. In this specific instance, I reused some code and wrote a couple of simple scripts. I need to clean up the code and turn it into a more usable tool.
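To give a feel for how small that last step can be, here is a rough sketch of the final script. It is not my original code. It assumes the entity file holds one entity per line as `type<TAB>name`, which is a made-up format, and the type labels `City`, `Company` and `Position` are assumptions too. The function name `group_entities` is hypothetical, and "Example City" and the `Person` line are placeholders, not real output.

```python
from collections import defaultdict

# Entity types we care about (assumed labels).
WANTED_TYPES = {"City", "Company", "Position"}


def group_entities(lines, wanted=WANTED_TYPES):
    """Group entity names by type, dropping duplicates and unwanted types."""
    grouped = defaultdict(list)
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2:
            continue  # skip malformed lines
        entity_type, name = (p.strip() for p in parts)
        if entity_type in wanted and name and name not in grouped[entity_type]:
            grouped[entity_type].append(name)
    return dict(grouped)


# Sample input in the assumed "type<TAB>name" format.
SAMPLE_ENTITIES = [
    "Company\tCypress Semiconductor Corporation",
    "Position\tIOT Solution Architect",
    "Position\tDirector of Evangelism",
    "Position\tIOT Solution Architect",
    "Person\tSomeone Irrelevant",
    "City\tExample City",
    "a line without a tab",
]

for entity_type, names in group_entities(SAMPLE_ENTITIES).items():
    print(entity_type)
    for name in names:
        print("  " + name)
```

In a real run you would read the lines from the entity file instead of a list in the script.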
Meta: Writing a data blog is fun. It took me a couple of hours of exploration and writing and testing scripts, but I feel it is worth the time I spent. I also learned a bunch of things I did not know before.
“Most people who choose NoSQL as their primary data storage are trying to solve two main problems: scalability and simplifying the development process,”
From Number of NoSQL options grow