Deep learning
1. Deep learning
This page is a brief look at deep learning.
Most pattern recognition tasks require some form of supervised machine learning, such as (the older) neural network technology or (the newer) machine learning technology.
Note: The distinction is not exact, but neural network technology has essentially become part of the bigger and more general machine learning technologies.
2. Supervised learning
Supervised learning means that one must provide labeled examples of what one is looking for (i.e., the training set) and then test the system on other examples (i.e., the test set).
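The train/test split described above can be sketched in a few lines of Python. This is a minimal illustrative sketch, not any particular library's method; the function name and the 25% test fraction are assumptions for the example.

```python
# Hypothetical sketch of the supervised-learning setup: labeled examples
# are split into a training set (what the learner sees) and a held-out
# test set (how the learner is evaluated afterwards).
import random

def train_test_split(examples, test_fraction=0.25, seed=42):
    """Shuffle labeled examples and split them into train and test lists."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

examples = [(i, i % 2) for i in range(20)]  # (feature, label) pairs
train, test = train_test_split(examples)
```

The shuffle matters: evaluating on examples the learner has already seen would overstate how well it generalizes.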
3. Feature extraction
However, it is well known in the machine learning field that one does not simply provide raw examples to the machine.
As leading researcher Andrew Ng of Stanford says, with a laugh, on one of his YouTube videos: (paraphrased) "We all know that what we do is a lot of tediously developed feature extraction and only then can the machine learning take place".
4. Andrew Ng
Andrew Ng has been focusing on a field he helped create and develop - deep learning.
For example, he (and some of his students) found a way to get 3D perspective from one 2D image. Since his office is within walking distance of the Google campus, this might be a technology used by Google. Ng himself, in the past year or so, has been working for Baidu, the main Chinese search engine.
5. Feature extraction
Rectangle feature extraction (e.g., license plates, solar panels, roofs, swimming pools, etc.) involves image processing: smoothing, edge detection, angle and corner detection, etc. Only then can the machine learning take place. An image found online depicts one way to do this process.
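The edge detection step mentioned above can be sketched with a simple gradient operator. This is a cut-down, pure-Python illustration (a simplified form of the Sobel idea), not the full pipeline; the function name and the toy image are assumptions for the example.

```python
# Hypothetical minimal sketch of edge detection as a feature-extraction
# step: compute horizontal and vertical intensity changes at each interior
# pixel of a tiny grayscale image (a list of lists of brightness values).

def edge_magnitude(img):
    """Return a map of gradient magnitudes for interior pixels."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]   # horizontal change
            gy = img[y + 1][x] - img[y - 1][x]   # vertical change
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

# A 5x5 image with a bright vertical stripe: edges show up at the
# stripe's borders, not inside the uniform stripe itself.
img = [[0, 0, 9, 0, 0] for _ in range(5)]
edges = edge_magnitude(img)
```

Real systems would follow this with smoothing, thresholding, and corner detection before handing features to the learner.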
6. Neural networks
Any neural network (i.e., machine learning) system would use this feature extraction as input to the recognizer.
7. YouTube
An interesting YouTube video on deep learning (and machine learning) is the following (from about 2015).
https://www.youtube.com/watch?v=n1ViNeWhC24
It is by Andrew Ng, a leader in the field of machine learning, at Stanford.
He (Ng) starts to get technical after 20 minutes, but the first 20+ minutes might stimulate some ideas.
He presents a good overview that puts into context the big picture, where the field has been, and where it is going. This appears to be the basis of the technology that Google is using for image recognition.
He makes offhand references to Google. Since his office is within walking distance of Google headquarters, Google does a lot with machine learning (much of which it does not make public), and, of course, both Google founders went to school at Stanford, I would assume that Google makes use of him (and his students).
8. Cats
Ng led the Google Brain project. Instead of the usual 10 million neuron connections, they used 1,000 million (1 billion) connections and are expanding to 10,000 million (10 billion) connections.
With access to millions of YouTube videos, which included over 10,000 cat videos, the Google Brain learned to recognize the concept of a "cat", without, of course, knowing much else but that there are "cats" on the Internet.
They used 16,000 computers running for 3+ days in the experiment. The media hyped this, but the Google Brain also learned to recognize faces, etc.
Ng sees one of the many near-future uses of Deep Learning as that of natural language understanding. He has worked with Chris Manning (from Stanford, a leader of the field of natural language processing; I talked to him for a few minutes at the Computational Linguistics conference in Baltimore) on some of these ideas.
9. Future
The concept of "edge detection" appears to be important in separating what is less important from what is more important, but after that some patterns still need to be recognized. Looking for search terms, etc., appears to be a form of "feature extraction" that can be used to identify true positives while balancing false negatives (which lower recall) and false positives (which lower precision) in the search process.
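The recall/precision bookkeeping above reduces to two simple ratios. The function name and the example counts below are illustrative assumptions.

```python
# Sketch of the precision/recall accounting: precision is penalized by
# false positives (irrelevant results returned), recall is penalized by
# false negatives (relevant results missed).

def precision_recall(true_pos, false_pos, false_neg):
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall

# A search that returns 8 relevant hits, 2 irrelevant ones, and misses 4:
p, r = precision_recall(true_pos=8, false_pos=2, false_neg=4)
# p = 8/10 = 0.8, r = 8/12 (about 0.667)
```

The two measures pull in opposite directions: returning more results tends to raise recall while lowering precision.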
So the "Deep Learning" technique is a (deeply) layered neural network approach that recognizes and categorizes patterns. A neural network, loosely based on an analogy to the neurons in the brain, consists of layers of "neurons" with forward and backward feedback mechanisms that help in recognizing patterns. Such pattern recognition approaches fall under the general umbrella term of machine learning.
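The layered structure described above can be sketched in miniature. This is a toy forward pass only (no training, and the weights are made-up illustrative numbers): each "neuron" takes a weighted sum of the previous layer's outputs and passes it through a nonlinearity.

```python
# Minimal sketch of a layered ("deep" in miniature) neural network:
# two inputs -> three hidden neurons -> one output neuron.
import math

def sigmoid(x):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected layer: weights holds one row per neuron."""
    return [sigmoid(sum(w * i for w, i in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

hidden = layer([0.5, -0.2],
               [[0.1, 0.4], [-0.3, 0.8], [0.5, 0.5]],
               [0.0, 0.1, -0.1])
output = layer(hidden, [[0.7, -0.6, 0.2]], [0.05])
```

A deep network simply stacks more such layers; training (adjusting the weights from examples) is the part this sketch omits.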
The neural network technique is similar to linear and nonlinear regression techniques, whereby one hypothesizes a linear or nonlinear equation that fits the data. The difference is that instead of fitting data to a pre-defined equation, a neural network separates patterns into groups without requiring a pre-defined equation - the "patterns" are "learned" by the network.
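The regression side of the comparison above can be made concrete. Here is a minimal sketch of ordinary least squares fitting the pre-chosen equation y = a*x + b; the function name and data are assumptions for the example.

```python
# Sketch of the contrast drawn above: regression fits a pre-defined
# equation (here a straight line) to data, whereas a neural network
# learns its shape from the data. Closed-form least-squares fit.

def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - a * mean_x
    return a, b

# Exactly linear data recovers the generating equation y = 2x + 1.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

The equation's form is fixed in advance here; a neural network's flexibility comes from not committing to such a form.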
The described "Deep Learning" appears to be an operational methodology of what Jeff Hawkins, inventor/developer of the Palm Pilot handwriting system, describes qualitatively in his book "On Intelligence" (Hawkins, J. and Blakeslee, S. (2005). On Intelligence. St. Martin's Press), a very interesting read.
Every new advance in technology makes possible new economies that were not previously possible or feasible. One goal, then, is to identify the economies that can use these new technologies.
A classic example: years ago, when Toshiba announced a tiny hard drive, Steve Jobs (and Apple engineers) decided that consumers would want their own enormous collection of songs on a small personal device, even though no consumer had asked for, or even knew they would want, such a device.
10. End of page