With loudly proclaimed promises faltering and new challenges on the horizon, is it time to embrace a new paradigm for Artificial Intelligence?
The current top dogs: Big Data and Deep Learning
For much of the past few years, Big Data and Deep Learning have been at the forefront of the debate among practitioners of Data Science and Data Engineering. They have also received a lot of attention from the broader public, with promises of automating and transforming central aspects of our everyday life, as in self-driving cars or autonomous service robots.
While Big Data and Deep Learning represent a set of very impactful technologies, they are not without pitfalls, and most of the confidently set goals are still far out of reach.
Big Data is the endeavor to gather massive amounts of all sorts of data from various sources, and to structure and store them accordingly. Although resource-intensive, it is regarded as a crucial enabler of Deep Learning, as it provides the necessary base input for feeding the powerful, but data-hungry neural networks during their initial training periods.
Deep Learning essentially is a way of training a network of artificial neurons to recognize patterns in the data we feed into it. In general, it is at its best when there is an abundance of data suited for training and the task at hand can be narrowed down into a question of rather simple logic, like “Is there a person or a car in this image?”.
Yet, even with seemingly simple questions, the fanciest neural network architectures, lots of data and limitless computing power, there is still plenty of room for false estimates. That’s because AI in its current state is generally incapable of actual contextualization, and even big data sets often cannot capture all the natural variance found in the distribution of the “real” data the model will be confronted with in production. For that reason, the web is buzzing with memes jokingly alluding to its sometimes-comical failures.
These shortcomings of Deep Learning can be regarded as mildly amusing or a mere annoyance in non-critical consumer-facing applications, such as chat bots or recommender systems. However, they can introduce new risks when the technology is applied uncritically to critical decisions in sectors such as Aviation, Health Care or Law Enforcement.
The natural complexity of big neural models often makes these errors unforeseeable and hard to debug. Despite recent breakthroughs in the field of explainable AI, there is still a considerable trade-off between the performance of an AI model (achieved through increasing complexity) and its interpretability. Due to the inherent fuzziness of the technology, a non-negligible residual risk will probably always remain, even if we adhere to a methodologically sound approach. Such an approach, by the way, is not followed nearly as often as one might hope, as seen in most of the futile efforts to use AI for diagnosing COVID-19.
Another reason for concern is that these erratic tendencies can be actively exploited by malicious actors, who can degrade a model’s performance through targeted manipulation of the data presented to it (a so-called adversarial attack).
A typical example is an adversarial attack on an image recognition AI of a similar type to those used in self-driving cars: adding a minor amount of pixel noise, completely imperceptible to the human eye, leads to a complete misinterpretation of what is shown in the picture.
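To make the mechanism more tangible, here is a minimal sketch in plain Python. It is not a real image recognition model: the linear "classifier", its weights and the 100 "pixels" are made up for illustration, and the perturbation only follows the spirit of gradient-sign attacks. Still, it shows how a change of at most 0.02 per pixel can flip the decision.

```python
# Toy linear "image" classifier (hypothetical): score > 0 means "stop sign",
# score <= 0 means "something else". Weights and pixels are invented.

def score(weights, pixels):
    """Linear decision score: dot product of weights and pixel values."""
    return sum(w * p for w, p in zip(weights, pixels))

def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)

def perturb(weights, pixels, epsilon):
    """Shift every pixel by at most epsilon in the direction that lowers the score.

    For a linear model the gradient of the score with respect to the input
    is simply the weight vector, so the attack direction is sign(weight).
    """
    return [p - epsilon * sign(w) for w, p in zip(weights, pixels)]

# 100 "pixels" with alternating positive and negative weights.
weights = [0.5 if i % 2 == 0 else -0.5 for i in range(100)]
pixels = [0.51 if i % 2 == 0 else 0.49 for i in range(100)]

adversarial = perturb(weights, pixels, epsilon=0.02)

print("original score:   ", score(weights, pixels))       # positive
print("adversarial score:", score(weights, adversarial))  # negative
```

Each individual change is tiny, but because all of them push in the same direction, they add up across many pixels. Real attacks on deep networks are more involved, yet they exploit the same accumulation effect.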
So even under the most favorable circumstances, Deep Learning on its own only provides a valid solution when approximate results are all we need and a lack of absolute precision or reliability is tolerable.
If we want to unlock the potential of Artificial Intelligence for more challenging scenarios, where data and computing power are scarce or specific errors must be avoided by any means, then we need to think beyond the currently prevalent paradigm of Big Data and Deep Learning. To advance into these areas, we must consider alternative techniques instead of just focusing on the increasingly intricate optimization of a narrow set of tools, which sometimes might not even be the best fit for a given purpose.
The (not so) new kids: Smart Data and Hybrid Models
Of course, this does not mean that Deep Learning is out of the equation. We can use many of its undeniably impressive achievements of recent years as a foundation to build upon, particularly in cases where the initial amount of data would be insufficient for training a new model from scratch. Thanks to the strong spirit of open source and knowledge sharing in the community, a lot of pre-trained models are readily available today for a vast number of use cases.
First, we select an appropriate model. Then we make slight adjustments and fine-tune it with a smaller but well-designed data set to fit a specific purpose, e.g. running an image-based Quality Assurance check in manufacturing or adding awareness of specific vocabulary to a model for analyzing sentiment in texts. This way, only minor changes to the codebase are required, and most of the value is generated by thoughtfully putting together a data set that contains just enough of exactly the right kind of data to enable the model to fulfill our requirements. This approach can be referred to as Smart Data or data-centric AI.
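The fine-tuning idea can be sketched in a few lines of plain Python. This is an illustration under strong assumptions, not a production recipe: `pretrained_features` is a hypothetical stand-in for a frozen pre-trained backbone, and the six labeled examples are invented. Only the small "head" on top is trained.

```python
import math

def pretrained_features(x):
    """Hypothetical stand-in for a frozen, pre-trained backbone.

    It is never updated during fine-tuning; we only reuse its output.
    """
    return [x[0] + x[1], x[0] - x[1]]

def predict(head, x):
    """Probability of the positive class from a logistic head on frozen features."""
    features = pretrained_features(x)
    z = head["bias"] + sum(w * f for w, f in zip(head["weights"], features))
    return 1.0 / (1.0 + math.exp(-z))

def fine_tune(head, dataset, learning_rate=0.5, epochs=200):
    """Train only the head with plain stochastic gradient descent on log-loss."""
    for _ in range(epochs):
        for x, label in dataset:
            features = pretrained_features(x)
            error = predict(head, x) - label  # gradient of log-loss w.r.t. z
            head["weights"] = [w - learning_rate * error * f
                               for w, f in zip(head["weights"], features)]
            head["bias"] -= learning_rate * error
    return head

# A deliberately small, hand-curated data set (invented values):
# label 1 = "part OK", label 0 = "part defective".
small_dataset = [
    ([0.9, 0.8], 1), ([0.8, 0.9], 1), ([1.0, 0.7], 1),
    ([0.1, 0.2], 0), ([0.2, 0.1], 0), ([0.3, 0.0], 0),
]

head = fine_tune({"weights": [0.0, 0.0], "bias": 0.0}, small_dataset)
print(round(predict(head, [0.85, 0.85]), 3))  # close to 1
print(round(predict(head, [0.15, 0.15]), 3))  # close to 0
```

With a real framework, the frozen part would be a downloaded pre-trained network and the head its last layer or two. The division of labor stays the same: the code barely changes, while the quality of the small data set decides the outcome.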
Depending on the problem we are trying to solve, we can get astoundingly good results by combining data-centric AI with approaches from other disciplines, such as statistical inference or first-principles models (physics, biology, chemistry…). To guarantee a proper baseline for our model, we can provide it with guardrails in the form of predefined rules and thresholds, or enrich it with heuristics derived from the valuable input of experienced domain experts.
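As a rough sketch of what such a combination could look like, the following Python snippet wraps a first-principles baseline (the ideal gas law) around a data-driven correction and applies two guardrails. Everything specific here is an assumption for illustration: the function names, the thresholds and the two stand-in "learned" models are hypothetical.

```python
R = 8.314  # universal gas constant in J/(mol*K)

def ideal_gas_pressure(temperature_k, volume_m3=0.025, moles=1.0):
    """First-principles baseline: pressure in Pa from the ideal gas law."""
    return moles * R * temperature_k / volume_m3

def clamp(value, lower, upper):
    """Restrict value to the interval [lower, upper]."""
    return max(lower, min(upper, value))

def hybrid_predict(temperature_k, residual_model,
                   max_correction=2000.0, lower=0.0, upper=150000.0):
    """Physics baseline plus a learned correction, kept in check by guardrails."""
    baseline = ideal_gas_pressure(temperature_k)
    # Guardrail 1: the learned part may only nudge the physics, not replace it.
    correction = clamp(residual_model(temperature_k), -max_correction, max_correction)
    # Guardrail 2: hard limits defined by domain experts (assumed values).
    return clamp(baseline + correction, lower, upper)

def well_behaved_model(temperature_k):
    """Stand-in for a trained model that predicts a small sensor offset."""
    return -500.0

def misbehaving_model(temperature_k):
    """Stand-in for a model that produces a wildly wrong output."""
    return 1e9

normal = hybrid_predict(300.0, well_behaved_model)
guarded = hybrid_predict(300.0, misbehaving_model)
print(round(normal), round(guarded))
```

Even when the learned component fails badly, the prediction stays within 2,000 Pa of the physical baseline. That bounded worst case is the practical benefit of the guardrails described above.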
Another promising approach can be found in Neural-symbolic Cognitive Reasoning, which systematically extends AI with the algebraic manipulation of symbolically represented entities. Although their hybrid nature often goes unrecognized, models built on composite architectures are ubiquitous and hugely successful. Even Google Search, presumably the world’s most popular search engine, is based on a pragmatic mixture of symbol-manipulating techniques and Deep Learning.
What does that mean if you operate a small or medium-sized enterprise?
Good news: The basic idea of Smart Data and Hybrid Models is that you can tap into the potential benefits of AI, without the absolute necessity of having petabytes of ML-ready data or putting all your trust into an unpredictable black box system.
As an AI Venture Builder that specializes in Co-Innovation for SMEs in Europe, DDG has a long history of navigating these territories. Most of our partners neither sit on a ton of data nor have the resources to build expensive Big Data architectures for training supersized Deep Learning models just to test the feasibility of a possible venture case. Therefore, DDG has a field-tested set of methods to identify use cases and requirements, and to determine which approaches need to be combined for the highest chance of success.
So stay tuned! In the next post of this series, we will take a closer look at how to systematically explore the potential of hybrid models and how to benefit from them in real-world applications.