Early Signals

“What are you doing?” my wife asked. Most of the time I grunt something; I think of this question as more of a greeting than a question. This time, however, I said enthusiastically, “I am reading about the Wiki Wiki Web.” This cracked up the entire family (two kids included). Wiki wiki means quick in Hawaiian. This was in the 90s, before the days of Wikipedia. I was browsing through the c2 wiki.

Wikis became mainstream when Wikipedia became popular, and now my family does not laugh at me anymore.

There are times when you have a hunch about certain trends. I felt strongly about databases in the 80s, wikis and application components in the 90s, XML and Python in the early 2000s, and now ML and chatbots. This resulted in my working on an SQL database engine in the mid-80s, database components in the 90s, an XML chip in the 2000s, and on Python since 2006. Now it is ML, chatbots, and Natural Language Processing.

Not everything I was excited about became mainstream. RDF and the Semantic Web, OLE from Microsoft, Domain-Specific Languages, and Pattern-Oriented Languages did not go very far.

Over the years, I have built a few rules of thumb for paying attention to early signals in emerging technologies.

  1. What is the research behind the technology, and how long has it been going on? For example, neural networks and ML are several decades old, and AI has gone through several difficult periods.
  2. What is the volume and velocity of research papers?
  3. Are governments funding research in this space? (The Internet, the PageRank algorithm, self-driving cars, and many others started as government-funded research.)
  4. Who are the major companies involved in the early adoption of these technologies? For XML it was Microsoft, Sun Microsystems, and several others.
  5. What pilot projects are being done for commercialization, and who is working on them?
  6. Who is hiring in this space?
  7. Which business publications are covering topics about this space?
  8. What companies are getting funded? Funding is both a leading and lagging indicator depending on who is funding and why they are doing it.
  9. How is the information about this space being propagated? Who is propagating it?
  10. What conversations are going on on Twitter?
  11. Are there books on the subject? Books are most of the time lagging indicators.
  12. Are these technology topics being covered in conferences?
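
As a toy illustration of how these rules might be used together, one could keep a weighted score per technology. Everything below – the signal names, the weights, the example – is my own hypothetical sketch, not a real scoring system:

```python
# A toy way to keep score while watching a space: weight each signal
# and sum the ones you observe. All names and weights are hypothetical.
SIGNALS = {
    "decades_of_research": 3,      # rule 1: long research history
    "rising_paper_volume": 3,      # rule 2: volume/velocity of papers
    "government_funding": 2,       # rule 3
    "major_company_adoption": 3,   # rule 4
    "commercial_pilots": 2,        # rule 5
    "hiring_activity": 2,          # rule 6
    "business_press_coverage": 1,  # rule 7
    "startup_funding": 1,          # rule 8 (leading and lagging)
    "active_conversations": 1,     # rules 9-10
    "books_published": 1,          # rule 11 (a lagging indicator)
    "conference_coverage": 1,      # rule 12
}

def trend_score(observed):
    """Sum the weights of the signals observed for a technology."""
    return sum(SIGNALS[s] for s in observed)

# ML today might tick boxes like these:
ml = {"decades_of_research", "rising_paper_volume", "government_funding",
      "major_company_adoption", "hiring_activity", "conference_coverage"}
print(trend_score(ml))  # → 14; a higher score means more signals firing
```

The weights encode the earlier point that some indicators (research history, corporate adoption) carry more signal than lagging ones (books).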

Some of these indicators are easy to find; others you need to dig for.

Restarting My Read Log

I am restarting my Read Log. A read log is a blog listing things I read and find useful. I tweet some of them, but tweets have a short half-life.

The inspiration for Read Log comes from different sources – Four short links by Nat, Brain Pickings from Maria, Farnam Street by Shane, and a few others.

Some of the best bloggers I know work hard at writing their posts and sometimes I feel like I am cheating. But these posts are worth sharing and if I am lucky, some of them may even start conversations.

Half Life of Knowledge

How long is your knowledge relevant? In other words, what is the half-life of your knowledge?

Wikipedia has a nice description of the half-life of knowledge:

The half-life of knowledge or half-life of facts is the amount of time that has to elapse before half of the knowledge or facts in a particular area is superseded or shown to be untrue. These coined terms belong to the field of quantitative analysis of science known as scientometrics.

Here are a few things to think about:

  • What is the half-life of entrepreneur knowledge? Can we take lessons from the past and use them today?
  • What is the half-life of knowledge about software architecture and design?
  • What is the half-life of knowledge about sales and marketing techniques?

Some knowledge may have a shorter half-life than others. To stay relevant in your industry, you need to figure out how much of your knowledge is still useful.
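
The half-life analogy can be made concrete with the standard decay formula. A minimal sketch – the half-life values here are hypothetical:

```python
# Radioactive-decay analogy for knowledge: after t years, the fraction of
# knowledge in a field that is still relevant is (1/2) ** (t / half_life).
def fraction_still_relevant(years_elapsed, half_life_years):
    return 0.5 ** (years_elapsed / half_life_years)

# If software-architecture knowledge had a (hypothetical) 10-year half-life:
print(fraction_still_relevant(10, 10))  # → 0.5  (half superseded)
print(fraction_still_relevant(20, 10))  # → 0.25 (three quarters superseded)
```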

Why I Retweet

There are several reasons. In no particular order:

  • I like the message in the tweet; it resonates with me.
  • I like the link – typically a pointer to good reading material.
  • It provides a different point of view.
  • I use it as a marker in my life – a part of my daily log.
  • It is a hat tip to an author who made me pause and think.
  • I think the tweet deserves recognition, and I would like to spread the idea.
  • It may be part of a discussion; I jump in and add my two bits.
  • It may be an event I want to share (a picture, a quote, a sound bite).
  • The same reason I tweet – to start a conversation.
  • The same reason I tweet – to ask a question.

Applied ML – How Uber Uses Machine Learning at Scale

This is one of the most comprehensive engineering blog posts on how Uber uses Machine Learning (ML) at scale. It covers:

  • Uber’s ML platform – Michelangelo
  • Uber’s research and production efforts and how they interrelate
  • How Uber achieves model-developer velocity

I made a list of a few terms and concepts from the article:

  • ML deployment use cases
  • Pervasive deployment of ML in several applications
  • Distributed training of ML
  • Aligning ML applications with Uber’s priorities
  • ML tools across the company (where and what)
  • Internal events like – ML conferences, ML reading groups, talk series
  • Data Science Workbench (a tool to build and iterate ML models)
  • ML Platform team and how they work to support ML development inside Uber
  • Technology stacks – Spark, Cassandra, Python and others
  • Experiments with external tools both open source and commercial
  • Uber’s open source contributions

It is nice to know how a dynamic company uses Machine Learning. There is a lot to learn here. If you are thinking about building and deploying ML applications, Scaling Machine Learning at Uber with Michelangelo | Uber Engineering Blog is a must-read. I may go back and read it again.

A Great CTO Talk about Technology at Walmart

OrangeScape has this new initiative called CTO Talks. I think it is a brilliant idea. “While there are a lot of conversations taking place at the software development level, there are none at the CTO level,” says Suresh. I agree. We need conversations on technology at different levels.

I enjoyed the talk on Technology at Walmart – a Few Glimpses. I hope to see more comprehensive blog posts. I was looking for uses of Machine Learning at Walmart, and I was not disappointed.

Here is a list of uses of Machine Learning (ML) at Walmart:

    • Competitive Intelligence and Analytics
    • Crawl frequency prediction (how frequently you can crawl certain sites for price information – too many crawls and you will be blocked; too few and you can miss useful information; different sites update information at different intervals)
    • Natural Language Processing (NLP) of product catalogs
    • Bossa Nova robots roaming the aisles at Walmart locations, checking for out-of-stock items, mislabeled shelf tags, and incorrect prices
    • IoT at Walmart – monitoring temperatures of refrigerators in real time
    • Visual inspection and spoilage prediction
    • Predictive analytics of future equipment failures
    • Predicting attrition (they have 2.3 million associates in over 11,000 locations)
    • Predicting absenteeism based on weather patterns – an HR application (and an important contributor to maintaining service levels in their stores)
    • Hari briefly touched upon blockchain; they are looking at using it for tracing grocery items from source to customer.

Walmart is one of the leading indicators of technology adoption in retail. Hari mentioned that they were the first to introduce satellite dishes at their store locations, barcode scanning, the use of RFID, and a direct view of store items for their suppliers.

It was a great talk. It was no wonder that we had an amazing turnout (more than 250 registrations). Hari answered all the questions patiently and in depth.

Discovering Information with a Little Help from Tweet Assistant

If you noticed a spike in my Twitter posts today, there is a reason. I have been searching #machinelearning using Tweet Assistant (more on Tweet Assistant in a future post).

These tweets, hashtags, mentions and popular posts provide a gold mine of information.  However, there is more to analyzing tweets than getting news and opinions. Within minutes, I was following some cool people and started discovering awesome tweets and visualizations.

Machine Learning – Top 10 Mentions


Machine Learning – Top 5 Hashtags

I started retweeting some of the posts I found but felt there were far too many, and a discussion of them could fill a couple of blog posts. So that is what I am going to do.


Applications of ML – Improving Depth detection, Night Sight, Video Play and Notifications

Here are a few fascinating applications of ML. I mostly track business applications of ML, so I was pleasantly surprised to see how unsupervised learning and reinforcement learning (two ML techniques that do not get much coverage) were being used by two of the biggies in the AI and ML space.

The first two – depth detection and Night Sight – are posts from the Google AI Blog. The following concepts are covered in these posts:

  • PSL (positive shutter lag)
  • Motion metering
  • Exposure stacking
  • Astrophotography
  • Super-resolution
  • Auto white balancing and a learning-based AWB algorithm
  • Ultra long shot photography
  • Optical Image stabilization
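
Exposure stacking, at its core, is averaging: merge N noisy frames of the same scene and the noise standard deviation drops by roughly √N. This toy one-pixel simulation is my own illustration, not Google’s actual pipeline:

```python
import random

# Simulate one pixel of a scene captured with noisy sensor readings.
random.seed(0)
true_value = 100.0   # the "real" brightness of the pixel
noise_sigma = 10.0   # per-frame sensor noise (standard deviation)

def noisy_frame():
    return true_value + random.gauss(0, noise_sigma)

single = noisy_frame()                                # one exposure
stacked = sum(noisy_frame() for _ in range(16)) / 16  # stack 16 exposures

print(abs(single - true_value))   # error of a single frame
print(abs(stacked - true_value))  # typically about 4x smaller (sqrt(16))
```

This is why Night Sight can capture dark scenes: many short noisy exposures, aligned and merged, behave like one long clean exposure.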

The Facebook Horizon article covered how they were improving user experience with Reinforcement Learning:

  • Improving video playing using Reinforcement Learning (RL)
  • Improving user notifications using Reinforcement Learning

Links to articles:

  1. Under the hood of the Pixel 2: How AI is supercharging hardware – Google AI 
  2. Google AI Blog: Night Sight: Seeing in the Dark on Pixel Phones
  3. Horizon: An open-source reinforcement learning platform – Facebook Code 

Little Bits of History of my Programming Journey

1972 – I wrote my first program in PDP-8 assembly language

1973-74 – Diagnostics for a clone of the PDP-11 called TDC-16, and early device drivers (they were called IOCS – input/output control systems). Early programs were written in machine language (coded in octal) since no assembler was available; we keyed programs into the console using toggle switches (as binary code) and debugged them

1974 – Worked with paper tape – an ASR-35, later a high-speed reader and mylar tapes

1974 – Debugged device drivers for magnetic tapes and disks, and wrote memory diagnostics that detected noise in core memory (which required shielding)

1975 – My first commercial program, in assembly language, for the Bombay Stock Exchange, matching buys and sells of stocks. The records were punched on cards, fed to the computer, stored on magnetic tapes, and matched. The memory configuration was a whopping 16KB.

1976 – Taught RSX-11M (a real-time operating system for the PDP-11) at Tata Electric. Wrote my first set of PDP-11 programs on RSX-11M

1978 – Learned operating systems (RT-11, RSX-11M, IAS, RSTS/E), all PDP-11

1978-79 – Built the first soap survey program on RSTS/E in Basic-Plus (for IMRB)

1979 – Wrote my first commercial applications in Cobol (mostly for training others) and several small Basic-Plus utilities. Worked on performance tuning of the RSTS/E operating system.

Patched corrupted RSTS/E disks with programs written in Basic-Plus

1980-81 – Wrote commercial programs in Cobol while consulting at Ashok Leyland

1983 – Developed benchmarks in Cobol for Wipro

1984 – First C program (a database schema analyzer in Decus C)

1984 – My first Comdex in Las Vegas

1985 – First relational database metadata design as part of Integra SQL development and wrote small C programs mostly for testing the database

1986 – Integra SQL Version 1, with no nested selects, designed and built entirely by reading C.J. Date’s book on relational database systems

1986 – Licensed Integra SQL to SCO (The Santa Cruz Operation)

1987-1990 – C-Trieve (an ISAM file management system); Objectrieve – C-Trieve extended to support BLOBs; licensed C-Trieve to the Whitewater Group (they called it WinTrieve)

1991 – Objectrieve/VB was born and exhibited at Comdex May 1991

1992 – DbControls a set of custom controls for building database applications

1993 – Integra VDB – the first set of relational database components. Got covered in BYTE magazine.

1994-1996 – Layered SQL on top of dBase, Paradox, and Btrieve (the last was a project for Varian Systems). Most of the coding was writing small examples in C and VB.

1996-2008 – Coding winter

2009-Now – Dabbling in Python, little bits of ML, and chatbots

Tim O’Reilly: It is up to us

Listening to this conversation with Tim O’Reilly was one of the most rewarding experiences. In this conversation, Tim and Byron discuss several topics that make you think. I list a few here.

  1. Fitness functions (see the quote below)
  2. On Amazon’s use of robots for business
  3. About Cognitively Augmented workers
  4. The law of conservation of attractive profits
  5. On Intelligence (human and artificial)
  6. How to pair humans with machines
  7. Step changes and their impact
  8. On anticipating and countering the worst fears
  9. The Robustness Principle
  10. Agreement Protocols
  11. On Platforms and Eco-systems
  12. On doing meaningful work

I will pick a few of the topics (1-5) and take the liberty of quoting from the transcript. My goal is to kindle your interest enough to read the article and then the book.

On fitness functions and how they focus companies on delivering value with technology (including AI):

If you look at Google; their fitness function on both the search and the advertising side is relevance. You look at Facebook; loosely it could be described as engagement.

On Amazon’s use of robots in their business:

…an analysis of Amazon. In the same 3 years in which they added 45,000 robots to their factories, they’ve added hundreds of thousands of human workers.

About Cognitively Augmented workers.

Then, Sidecar and Lyft figured out the other piece of the equation, because Uber was just black cars. They figured out that in order to have enough drivers to really fill out the marketplace, other than a small segment of well-off people, you’d get ordinary people to supply their cars. And you could do that because those drivers are cognitively augmented. It used to be that you had to be a professional driver, because [when] somebody says, “I want to go to such and such an address,” you’d need to really know the city. You [would] need to have a lot of experience to know the best routes. Well, guess what, with these apps [like] Google Maps and Waze, anybody can do it. So I started looking at that and [saw that] we have a marketplace of small businesses managed by algorithms to help them match up with customers.

Law of conservation of attractive profits:

Clay Christensen, back in 2004  talked about  “the law of conservation of attractive profits,” and that’s what helped me get from open source to web 2.0—[is] when one thing becomes a commodity, something else becomes valuable. So if self-driving cars commoditize driving, you have to ask yourself, what becomes valuable. And I think it’s going to be new kinds of augmentation for humans, new kinds of services that you’ll put on top of driving.

On Intelligence (artificial):

And so what would be something that we have today that would qualify or come close to qualifying as that in your mind?

You mean, in terms of machines?

Yes.

Nothing.

And why do you think that?

I’m with Gary Marcus on this, you know. He kind of talked about how the frontier of AI right now is deep learning, and it’s great, but you still have to train it by showing it a gazillion examples of something, and after you show it a gazillion examples, it can figure stuff out. That’s great, but it can’t figure that out without being exposed to those examples. So, we’re a long way from kind of just flicking the switch, having a machine take in its experience of the world, and basically come to conclusions about it.

Tim ends with this appeal and an inspiring message.

“Hey, we have a lot of things to worry about, we have enormous new powers, let’s put them to work, in the right way, tackling the hard problems.”

I have been a big fan of Tim’s “Work on stuff that matters” and have quoted him often. It framed some of my decisions on what to spend time on.

Meta:

I have been following Tim for a couple of decades. More about that in a later post. I recently started listening to Byron’s One Minute AI podcast and the GigaOm AI podcast.