The Pandemic & Its Implications on the Future of Machine Learning
For a moment, let’s ignore the abject tragedy of the Covid Pandemic and the demoralizing conditions through which we continue to persevere. Instead, let’s examine the state of affairs from a dispassionate position of scientific observation. Seismic shifts in human behavior are now underway, as the burden of the Pandemic forces societal transformation. Crippling inefficiencies in industry, and volatile projections of markets have led to unprecedented uncertainty. We have found ourselves in a hard reboot of civilization.
To fully address any one of the plethora of challenges we face would exceed the scope of this medium. However, the role Machine Learning (ML) will play in our circumstances is fascinating and worthy of discussion, as the field will undoubtedly contribute to the impending metamorphosis. There are three periods of time to consider: the time before the Pandemic (BP), the time during the Pandemic (DP) and the time after the Pandemic has *subsided* (AP).
Central to our understanding of these three epochs are the two approaches to modeling human behavior through machine learning. We’ll call the first class of techniques Canonical ML (CML), as it represents traditional approaches in pattern recognition derived from highly structured and labeled data. This technique uses statistical models that are, on the whole, explainable to describe the state of a system or predict its future state. It depends fully on engaged scientists to put the model’s output into context. For example, CML could be used to predict the weather in a particular region by analyzing the features of the region over time (e.g. temperature, humidity, pressure, cloud cover), but each of those features in isolation fails to fully predict a future state. Furthermore, without ascribing context to each component of the model, a valuable output is not achievable (it will rain tomorrow, or it won’t rain tomorrow). When CML models are contextualized we’re able to explain exactly why predictions are made, i.e. “we know it will rain in Austin tomorrow given the regression and ensemble models we’ve developed for various weather features”.
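To make the weather example concrete, here is a minimal sketch of what I mean by CML, written with the Python standard library only. Everything in it is an assumption for illustration: the data is synthetic, the feature names and the rule that generates "rain" are invented, and a hand-rolled logistic regression stands in for the richer regression and ensemble models a real forecasting team would use. The point is only that the fitted weights can be read and explained.

```python
import math
import random

# Hypothetical, normalized weather features (each in the range 0..1).
FEATURES = ["humidity", "cloud_cover", "pressure"]


def make_weather_data(n, seed=0):
    """Generate synthetic days; the rain rule below is an invented assumption."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x = [rng.random() for _ in FEATURES]
        score = 2.0 * x[0] + 1.5 * x[1] - 2.0 * x[2] + rng.gauss(0, 0.2)
        rows.append(x)
        labels.append(1 if score > 0.8 else 0)
    return rows, labels


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=1.0, epochs=500):
    """Fit logistic regression with full-batch gradient descent."""
    weights = [0.0] * len(FEATURES)
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(FEATURES)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return dict(zip(FEATURES, weights)), bias


def predict_rain(weights, bias, x):
    z = bias + sum(weights[name] * xi for name, xi in zip(FEATURES, x))
    return sigmoid(z) > 0.5


train_x, train_y = make_weather_data(400, seed=0)
test_x, test_y = make_weather_data(200, seed=1)
weights, bias = fit_logistic(train_x, train_y)
accuracy = sum(
    predict_rain(weights, bias, x) == bool(y) for x, y in zip(test_x, test_y)
) / len(test_y)

# Each weight is directly interpretable: its sign says whether the feature
# pushes the forecast toward rain or away from it.
for name, value in weights.items():
    print(f"{name}: {value:+.2f}")
print(f"held-out accuracy: {accuracy:.2f}")
```

The signs of the learned weights are what a scientist would contextualize: in this toy setup humidity and cloud cover push toward rain and pressure pushes away from it, which is the kind of explanation CML affords.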
We’ll call the second class of techniques Reinforcement ML (RML), as it deploys a very different method for modeling the complexities of human behavior and is derived from the field of reinforcement learning. Reinforcement learning deploys agents that learn by trial and error, adjusting their individual actions to maximize a cumulative reward. These models operate much more autonomously than CML and fully embrace early failures in favor of long-term self-healing. An example of RML is DeepMind’s AlphaGo. In 2016 this program defeated Lee Sedol, one of the world’s strongest Go players, a feat once thought impossible for a machine. While the models at the core of the technology are definable, the weighting and structure of the individual decision nodes within AlphaGo are not explainable. It is as though the DeepMind team set up an environment for a program to play the game Go, defined victory for the game, and let the machine try different combinations of moves until success was achieved. This is not the complete story, as AlphaGo also relied heavily on training data from human games, but the concept is correct. It is thought by many in my field that RML represents a step towards a General Intelligence; however, that is not my contention, nor is that discussion relevant to this piece. Only recently has RML become tenable for industrial deployments, and it is still met with much trepidation due to unexplainable methods and little accountability. In other words, when RML models are correct it is nearly impossible to fully explain their precision, while techniques in CML sacrifice some precision but are often more explainable. Throughout my career the application of CML has represented the overwhelming majority of successful solutions, while RML has only begun to infiltrate the industrial world.
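The idea of defining victory and letting the machine try moves until it succeeds can be sketched in a few lines. What follows is a toy, not AlphaGo: AlphaGo combined deep neural networks with tree search, whereas this sketch uses tabular Q-learning on a made-up six-cell corridor. The environment, the reward values and the hyperparameters are all assumptions chosen for illustration.

```python
import random

N_STATES = 6           # corridor cells 0..5; cell 5 is the goal ("victory")
GOAL = N_STATES - 1
ACTIONS = (-1, +1)     # index 0 = step left, index 1 = step right


def step(state, action):
    """Move along the corridor; walls clamp the agent inside it."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else -0.01  # small cost per wasted move
    return next_state, reward, next_state == GOAL


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap episode length
            explore = rng.random() < epsilon
            if explore or q[state][0] == q[state][1]:
                a = rng.randrange(2)  # explore, or break ties at random
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrap from the best known value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


q_table = train()
policy = ["right" if row[1] > row[0] else "left" for row in q_table[:GOAL]]
print(policy)
```

Nobody tells the agent that moving right is correct. It is told only what success pays, and the preference for "right" emerges from early failures. Even in this tiny case the learned table of numbers is less self-explanatory than a regression coefficient, which hints at why explainability becomes hard at scale.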
In 2019 (BP), CML was at its zenith. There were hardly any industries that hadn’t been disrupted to some degree by machine learning. From financial services to healthcare to automotive design to defense, leaders in every sector had embraced the capabilities of a robust data science solution capitalizing on CML. Several years before the Pandemic, a colleague shared the findings of an experimental program developed at an international bank. The program sought to predict the emotions and behavioral predilections of the bank’s customers by interpolating between their transactions, their online presence and third-party data. The models were fascinating, adept at even the most complex behavior, and the project became seminal research for financial data science. At the time, their eerie precision was derived from CML. Techniques in ensemble classification, hidden Markov models, and columnar and relational databases were all avant-garde for their time, but the technology needed to materialize the fledgling capabilities of RML and complex Bayesian networks simply didn’t exist. This case study was emblematic of many manifestations of CML before the Pandemic. Troves of historical data and heuristic behavioral modeling contributed to measurable ROI on data science initiatives. Many companies had deployed CML, and some had begun to experiment with RML. Companies around the world were transforming their industries through CML on their own terms. On the surface, the union between science and industry was thriving.
And then the planet was enveloped by a pandemic...
The virus spread aggressively across multiple vectors, and by the time the gravity of the situation became apparent, we realized how completely untenable the status quo would be. Abstracting from machine learning for a moment, we saw an incredible change in behavior at the outset of the Pandemic. Constraints on infrastructure shifted from the roadways to suburban power grids. Grocers swapped luxuries for staples. Retail went from omnichannel to digital only. Preventative healthcare and elective surgery came to a complete standstill. Digital-first leaders in the entertainment industry suddenly became single-source interaction points for their users, causing constraints on their services. While these observations are trite and intuitive for everyone reading this, what is less obvious are the implications this had for machine learning.
Overnight, the Pandemic obliterated the efficacy of millions of models. Models that had been developed for everything from recommendation engines to predictive network maintenance were now obsolete. Every sector that had benefited from CML faced a difficult choice: either trust that the ML models their business depended on would correct over time, or reverse course and manually drive the components of their business that CML had automated. For example, the aforementioned bank that had adopted CML to predict emotions likely now has a completely useless model, given the dramatic shifts in behavior brought on by the Pandemic. The bank can no longer string together a collection of inferences across data sets that describe the actions of its customer personas, as it has no training data for user behavior under these circumstances. How can you describe the mood of a customer when your model depends on how much they spent at a coffee shop, a clothing retailer and on gasoline? In all likelihood the bank merely sees a series of transactions with online retailers which are opaque enough that inferring emotion is futile. It is my contention that CML has failed many businesses across many industries, and we won’t be fully aware of these failures for quite a long time.
The companies that adopted RML before the Pandemic, however, may have an advantage over their peers, as techniques in RML are not dependent on highly tuned conditions from a scientist, but rather seek to optimize for success as defined by said scientist. While RML requires exorbitant amounts of data for training, the increase in the frequency of data collection has eased that challenge in some cases. Had the bank adopted RML techniques, it is very possible that there would have been a delay while the RML adapted to the new conditions, but the algorithm would have self-healed and adjusted to provide more accurate predictions.
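The self-healing behavior described above can be illustrated with a small simulation. This is a sketch under loud assumptions, not a model of any real bank: two hypothetical offers ("in-store" and "online") pay off with invented probabilities, and those probabilities flip halfway through to stand in for the Pandemic. One agent is frozen at the moment of the shift, much like a model trained once on historical data, while the other keeps learning with an epsilon-greedy rule and a constant step size so that old evidence fades.

```python
import random

# Invented success probabilities for (in-store offer, online offer).
PRE_SHIFT = (0.8, 0.2)    # before the shift, in-store performs best
POST_SHIFT = (0.1, 0.7)   # after the shift, online performs best
STEPS_PER_PHASE = 2000


def run(keep_learning, epsilon=0.1, alpha=0.05, seed=0):
    """Return the total reward earned after the shift.

    keep_learning=False freezes the value estimates at the shift and acts
    greedily on them; keep_learning=True keeps exploring and updating.
    """
    rng = random.Random(seed)
    estimates = [0.0, 0.0]
    post_shift_reward = 0
    for t in range(2 * STEPS_PER_PHASE):
        shifted = t >= STEPS_PER_PHASE
        probs = POST_SHIFT if shifted else PRE_SHIFT
        learning = keep_learning or not shifted
        if learning and rng.random() < epsilon:
            arm = rng.randrange(2)                       # explore
        else:
            arm = 0 if estimates[0] >= estimates[1] else 1  # exploit
        reward = 1 if rng.random() < probs[arm] else 0
        if learning:
            # A constant step size weights recent outcomes more heavily,
            # which is what lets the agent recover after the shift.
            estimates[arm] += alpha * (reward - estimates[arm])
        if shifted:
            post_shift_reward += reward
    return post_shift_reward


frozen_reward = run(keep_learning=False)
adaptive_reward = run(keep_learning=True)
print("frozen agent, reward after shift:  ", frozen_reward)
print("adaptive agent, reward after shift:", adaptive_reward)
```

In this toy world the adaptive agent stumbles briefly after the shift and then recovers, while the frozen one keeps recommending the offer that used to work. Real deployments are far messier, but the contrast is the one I am drawing between the two approaches.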
We are never going back to precisely how things were before the Pandemic, and that’s a good thing. Industry has identified far too many inefficiencies, which firms and markets are now strategically capitalizing on. A return to normalcy will never occur; rather, we will soon find ourselves in a world that is comfortably familiar and yet more efficient.
In this familiar, optimized world, companies that have adopted RML with major investments and strategic rigor will be rewarded, as they will quickly optimize their models for the third phase of consumer behavior. This will happen with little additional effort from their science teams, as the definition of “success” for their KPIs has likely remained the same, and the models will now adapt for a second time in 18 months. The Post-Pandemic era will likely resemble the Pre-Pandemic era but with a heavier slant toward digital behavior, and with customers who have adopted new habits and efficiencies identified during the Pandemic. Once industry realizes the advantage that has been afforded the firms that adopted RML before the Pandemic, there will be a race to catch up. Those that are successful in the adoption will survive and be competitive; those that are not will likely struggle.
But what of the explainability of RML? It is my contention that the pressures of the Pandemic will greatly increase industry’s willingness to deploy unexplainable or opaque models, or “black box models” as they’re often called. As the advantage for the few through RML becomes clear, many firms will forgo the accountability of CML in favor of RML’s precision and adaptability. It is the AI equivalent of the adoption of tele-health or remote work during the Pandemic, and is arguably much more impactful. Scientists must now work to ensure that the RML techniques that are deployed are responsible and accountable, or the integrity of their business could be compromised.
As we enter into the Post-Pandemic era, there are many reasons to be excited for the next frontier of civilization. We’ve upended our priorities, shifted relationships and in my experience we’ve removed tedium and distraction. ML will continue to play an integral role in industrial transformation, and as it adapts to the changes we’ve made in our own lives, I trust my colleagues and peers across industry to ensure that we develop this capability in a way that is inspirational, dynamic and responsible.