Data Science, AI, Machine Learning, Deep Learning? Huh?
Article written by Tijn van der Zant PhD, CEO Datamaister
Finding the best model in machine learning is much like finding the highest peak in a mountain range. The challenge is: how do you find the highest peak and not merely some local hill?
Quite often, when someone asks me what I do, I say that I make computers smarter. Then I give some examples, like a talking computer assistant, an autonomous car, or an elderly care robot. When someone wanted to know more, I used to explain the history of Artificial Intelligence (AI) and where we are now. Most of the time people start to zone out when I talk about Alan Turing in the ’40s. By the time I reach the Dartmouth Summer Research Project on Artificial Intelligence in 1956 (not that I am that old), I have lost most of the audience. On the surface, that history seems to have little connection with what is often heard in the media, although on a deeper level it does… So I started explaining the buzzwords from the media instead, and that does connect with a lot of folks.
In a series of articles, I will explain the differences and similarities between some of these buzzwords. An initial selection is useful, and the one I typically make is in the title of this article. These terms describe parts of the field I have been working in for more than two decades. They do not describe the type of solutions that are possible. That would be a whole ’nother kind of article. This article flows from the broadest of the fields (Data Science) and zooms in on sub-fields, only to loop back when we arrive at Automated Machine Learning (AutoML) and Edge AI.
In Machine Learning (ML) you use data plus an algorithm to create a model. This model can then be used to make classifications or predictions on data similar to the data you used to create it.
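To make "data plus an algorithm creates a model" concrete, here is a minimal sketch in plain Python. It assumes a very simple algorithm (a nearest-centroid classifier) and a made-up dataset of animal sizes; the function names `fit` and `predict` and all the numbers are illustrative assumptions, not something prescribed by a particular library.

```python
# Minimal sketch: data + algorithm -> model (nearest-centroid classifier).
# The dataset and all names are hypothetical, chosen for illustration only.

def fit(samples, labels):
    """The 'algorithm': compute one centroid (mean point) per class."""
    sums, counts = {}, {}
    for point, label in zip(samples, labels):
        total = sums.setdefault(label, [0.0] * len(point))
        for i, value in enumerate(point):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    # The 'model' is nothing more than the learned centroids.
    return {label: [v / counts[label] for v in total]
            for label, total in sums.items()}


def predict(model, point):
    """Use the model: return the label of the closest centroid."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(point, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


# Made-up training data: (height in cm, weight in kg) -> animal
samples = [(25, 4), (30, 5), (28, 4.5), (60, 30), (65, 35), (70, 32)]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

model = fit(samples, labels)                 # create the model from data
prediction_small = predict(model, (27, 5))   # classify new, similar data
prediction_large = predict(model, (62, 31))
print(prediction_small, prediction_large)
```

The model only knows what the training data showed it, which is why it should be used on data similar to what it was created from.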
There are many different algorithms that can create models, and each has its strengths and weaknesses. Some do well on small datasets but poorly on big data, whereas others require large amounts of data before they create good models. One of the tasks of a Data Scientist is to find the best algorithm to create a model that can do what you want it to do on your data.
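One common way to compare algorithms is to train each on the same data and score the resulting models on data they have not seen. The sketch below assumes two toy candidates (a majority-class baseline and a 1-nearest-neighbor classifier) and a made-up dataset; the names and numbers are hypothetical and only illustrate the idea.

```python
# Sketch: pick the best of several algorithms using held-out validation data.
# The candidates and the data are hypothetical, chosen for illustration only.

def train_majority(samples, labels):
    """Baseline algorithm: always predict the most common training label."""
    most_common = max(set(labels), key=labels.count)
    return lambda point: most_common


def train_nearest_neighbor(samples, labels):
    """1-nearest-neighbor: predict the label of the closest training point."""
    def model(point):
        distances = [sum((a - b) ** 2 for a, b in zip(point, s))
                     for s in samples]
        return labels[distances.index(min(distances))]
    return model


def accuracy(model, samples, labels):
    """Fraction of samples for which the model predicts the true label."""
    correct = sum(model(s) == y for s, y in zip(samples, labels))
    return correct / len(labels)


train_x = [(1, 1), (2, 1), (1, 2), (8, 8), (9, 8), (8, 9), (9, 9)]
train_y = ["low", "low", "low", "high", "high", "high", "high"]
valid_x = [(2, 2), (0, 1), (7, 9), (9, 7)]   # data not used for training
valid_y = ["low", "low", "high", "high"]

candidates = {"majority": train_majority,
              "nearest_neighbor": train_nearest_neighbor}
scores = {name: accuracy(train(train_x, train_y), valid_x, valid_y)
          for name, train in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "->", best)
```

In practice a Data Scientist weighs more than a single score (training time, data volume, interpretability), but the principle of comparing candidates on unseen data stays the same.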
Some algorithms in ML are inspired by nature. For instance, Deep Learning and Neural Networks are inspired by how the brain works. There are also many algorithms that use principles from how evolution functions, with genomes, reproduction, selection, etc. Another popular family of algorithms is based on Pavlovian conditioning: in ML it is called Reinforcement Learning. It is the combination of Reinforcement Learning with Deep Learning that was used to build AlphaGo and AlphaZero.
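The core idea of Reinforcement Learning, learning from rewards which actions are worth repeating, can be sketched with a very small example. The code below assumes a two-armed "bandit" problem and an epsilon-greedy strategy; the function name `run_bandit`, the reward probabilities, and the parameter values are illustrative assumptions, and this is far simpler than the systems mentioned above.

```python
import random

# Sketch: learning from reward with an epsilon-greedy agent on a bandit.
# All names and numbers are hypothetical, chosen for illustration only.

def run_bandit(reward_probabilities, steps=2000, epsilon=0.1, seed=0):
    rng = random.Random(seed)          # seeded so runs are repeatable
    n_actions = len(reward_probabilities)
    estimates = [0.0] * n_actions      # learned value of each action
    counts = [0] * n_actions           # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(n_actions), key=lambda a: estimates[a])
        # The environment gives a reward of 1 with the action's probability.
        reward = 1.0 if rng.random() < reward_probabilities[action] else 0.0
        counts[action] += 1
        # Nudge the estimate toward the observed reward (running average).
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


estimates, counts = run_bandit([0.2, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(estimates, counts, best_action)
```

The agent is never told which action is better; it finds out by trying actions and keeping track of the rewards that follow.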
You might be wondering, “isn’t this what AI does?”. Well, yes and no. The reason is that ML is a subset of AI, which brings us to the second buzzword: Artificial Intelligence.
Artificial Intelligence, or AI, is a very broad field that encompasses many disciplines, including Machine Learning, Deep Learning, and AutoML. At the Dartmouth Workshop in 1956, widely considered the birth of AI, there were two focal points: creating intelligent machines, and understanding (human) cognition and putting that understanding into algorithms. The first led to areas such as machine learning and artificial neural networks (the basis of what is now called Deep Learning), and the second led to expert systems and reasoning systems. Although the first is widely accepted nowadays as being useful, the reasoning field is less well known.
As mentioned above, Machine Learning (ML) is a subset of Artificial Intelligence. Nowadays, at least in most cases, the words AI and ML are used interchangeably. That is because AI is mostly discussed in the setting of Advanced Data Analytics, which is where the algorithms in ML shine. AI does bring more to the table though. It aims at creating intelligent systems such as autonomous robots and self-driving cars, which are more than a well-tuned ML model with a data pipeline. In other cases it focuses on multiple agents cooperating to achieve a goal, although this is still largely a research topic and not common outside the labs. But from time to time the research done under the flag of AI becomes commonplace. Take language translation, for instance. For decades AI researchers have worked on creating good automated translators. Nowadays we can all enjoy the fruits of their labor and read foreign menus using our phones. It is so common that we rarely think of it as AI anymore. Some people in the AI community joke that AI does not exist, since once AI scientists figure something out and turn it into an algorithm, it becomes computer science.
Deep Learning is arguably the best-known family of ML algorithms. Deep Learning is a part of Machine Learning, just as Machine Learning is a part of AI. Deep Learning is inspired by how brains and their individual neurons work. As early as 1943, neurophysiologist Warren McCulloch and mathematician Walter Pitts wrote a paper on how neurons might work. They modeled a simple neural network using electrical circuits. These models were later formalized further and put into computer code.
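A simplified threshold unit in the spirit of that early neuron model fits in a few lines. The sketch below is an assumption-laden illustration, not a reproduction of the 1943 paper: it uses a negative weight for inhibition as a simplification, and the function names are made up for this example.

```python
# Sketch: a simple threshold neuron, loosely in the spirit of the
# McCulloch-Pitts model. Names and weights are illustrative assumptions.

def threshold_neuron(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of inputs reaches the threshold."""
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation >= threshold else 0


def logical_and(a, b):
    # Both inputs must be active to reach the threshold of 2.
    return threshold_neuron([a, b], [1, 1], threshold=2)


def logical_or(a, b):
    # One active input is enough to reach the threshold of 1.
    return threshold_neuron([a, b], [1, 1], threshold=1)


def logical_not(a):
    # A negative (inhibitory) weight keeps the neuron silent when a is 1.
    return threshold_neuron([a], [-1], threshold=0)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", logical_and(a, b), "OR:", logical_or(a, b))
print("NOT 0:", logical_not(0), "NOT 1:", logical_not(1))
```

A single unit like this can only compute very simple functions; the interesting behavior comes from connecting many of them into a network.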
These Artificial Neural Networks were a popular research topic in the ’80s and ’90s, but interest faded in the first decade of this century due to their limited capability. It was just exceedingly difficult to get an Artificial Neural Network of any decent size to work on a regular computer. And if the networks are too small, they do not do a lot of interesting or useful things.
For several reasons, Deep Learning managed to overcome that hurdle. Cheap computation power became widely available as GPUs became commonplace, and data became abundant thanks to a combination of cheap storage and the internet (ImageNet, for example). Last but not least, thanks to several innovations such as ReLU activation functions, pooling layers, and the dropout mechanism, networks became cheaper to compute, generalized more robustly, and were less prone to overfitting.
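The three innovations named above are each simple operations. The sketch below shows one-dimensional versions in plain Python; the function names, the window size, and the example numbers are assumptions for illustration, and real Deep Learning libraries implement these on multi-dimensional arrays.

```python
import random

# Sketch: ReLU, max pooling, and dropout on a plain list of numbers.
# Names and values are illustrative assumptions, not a library API.

def relu(values):
    """ReLU: keep positive values, replace negative ones with zero."""
    return [max(0.0, v) for v in values]


def max_pool(values, size=2):
    """Max pooling: keep the largest value in each non-overlapping window."""
    return [max(values[i:i + size]) for i in range(0, len(values), size)]


def dropout(values, rate, rng):
    """Inverted dropout (training time): zero each value with probability
    `rate` and scale the survivors by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


activations = [-2.0, 3.0, 0.5, -1.0, 4.0, 1.0]
rectified = relu(activations)   # [0.0, 3.0, 0.5, 0.0, 4.0, 1.0]
pooled = max_pool(rectified)    # [3.0, 0.5, 4.0]
dropped = dropout(pooled, rate=0.5, rng=random.Random(0))
print(rectified, pooled, dropped)
```

ReLU is cheap to compute, pooling shrinks the amount of data passed to later layers, and dropout forces the network not to rely on any single value, which is one way to reduce overfitting.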
This resulted in much larger networks with quicker training procedures. Suddenly we were able to solve all sorts of difficult problems, resulting in the rise of Deep Learning in the 2010s. Nowadays the hype is a bit behind us. It has become a regular technology that is being used to solve real-world problems, such as detecting pedestrians, writing simple songs, predicting inventories or consumer behavior, anomaly detection, and predictive maintenance, to name a few.
Data Science is quite a separate entity from the previously discussed buzzwords. It is a very applied field in which a Data Scientist uses the tools of AI, Machine Learning, Deep Learning, AutoML, etc. to build solutions based on data. In addition to using these algorithms to create models, a Data Scientist also applies data cleaning procedures, uses visualization tools, talks with end-users, does research, and creates interesting reports for people in other departments. A good Data Scientist not only knows which algorithms and tools are available but is also proficient in using the right tool for the right job. Senior Data Scientists not only solve data-related problems but also understand the business for which they are creating solutions. That is why there are so few good senior Data Scientists: they must be technically savvy, have good social skills, and describe solutions in a manner that other people understand, sometimes even at a glance.
I hope you enjoyed reading this article and that it clarified some of the concepts for you. In the next article I will discuss topics such as Edge AI, AutoML, Quantum AI and Quantum ML. See you then!