Machine Learning? Deep Learning? What Happened to Human Learning?

By: Michael Housman

February 24, 2017

The most exciting advance in data science over the past few years is the rapid emergence of deep learning algorithms coupled with the availability of cheap but high-powered GPUs able to support them. These models take an even more prominent step in the direction of “black box” approaches where machines are able to make increasingly accurate predictions but provide very little intuition as to how they arrived at their conclusions.

Although these deep learning models are gaining traction quickly because they require virtually no feature engineering and make better predictions, they miss a big piece of the puzzle: causal inference. Because they do not expose the causal mechanism underlying their predictions, they are unable to offer insight to human beings looking to make better decisions. In a “human in the loop” process where humans and machines together are able to achieve an outcome better than either one alone, this is a critical problem.

Does that mean that deep learning techniques are fundamentally flawed? Should we return to our econometric roots and stick with older models that produce coefficients and standard errors? Or is there a combination of transparent and opaque data science techniques that can satisfy our desire for accuracy and simultaneously provide guidance around our own decision making?

At RapportBoost.AI, we believe the answer isn’t either/or but rather a combination of older and newer data science techniques that produces a better outcome from humans working together with machines. It’s a combination of data science to understand what predicts the outcome, econometric analysis to build some intuition around the relationships that exist, an engine that surfaces these recommendations to humans in real time so they can A/B test different approaches, and the learning that occurs when we observe how those strategies influence the predicted outcome.

Let me offer up a relatively straightforward example. For one of our clients, we ingested over 2.8M of their inbound and outbound customer service messages and applied a variety of algorithms to perform message clustering, keyword extraction, and timing calculation. In predicting the feedback score that visitors gave the chat agent, many of the results confirmed the client’s expectations and some surprised them. One of the most surprising findings was that when agents sent a message containing any of the following keywords, the conversation was 56% more likely to end with a dissatisfied customer (p<0.05): inconvenience, apologize, sorry, delay, confusion, apologies, difficulties, hassle, trouble, caused.
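To make the keyword finding above concrete, here is a minimal sketch of how such a lift could be computed. It is an illustration under stated assumptions, not the pipeline described in this article: the function names, the data layout (a list of agent messages plus a dissatisfied flag per conversation), and the toy conversations are all hypothetical. Only the keyword list comes from the text. The sketch measures association only; it does not compute a p-value and says nothing about causation.

```python
import re

# Keyword list taken from the article.
APOLOGY_KEYWORDS = {
    "inconvenience", "apologize", "sorry", "delay", "confusion",
    "apologies", "difficulties", "hassle", "trouble", "caused",
}


def contains_keyword(agent_messages):
    """Return True if any agent message contains a keyword as a whole word."""
    for message in agent_messages:
        words = set(re.findall(r"[a-z']+", message.lower()))
        if words & APOLOGY_KEYWORDS:
            return True
    return False


def relative_lift(conversations):
    """Relative increase in the dissatisfaction rate for keyword conversations.

    `conversations` is a list of (agent_messages, dissatisfied) pairs, where
    `dissatisfied` is a bool. A return value of 0.56 would mean "56% more
    likely to end dissatisfied". This layout is an assumption for the sketch.
    """
    with_kw = [bad for msgs, bad in conversations if contains_keyword(msgs)]
    without_kw = [bad for msgs, bad in conversations if not contains_keyword(msgs)]
    if not with_kw or not without_kw:
        raise ValueError("need conversations both with and without keywords")
    rate_with = sum(with_kw) / len(with_kw)
    rate_without = sum(without_kw) / len(without_kw)
    if rate_without == 0:
        raise ValueError("baseline dissatisfaction rate is zero")
    return rate_with / rate_without - 1.0


# Made-up toy data for illustration only; not the client data from the article.
TOY_CONVERSATIONS = [
    (["We're sorry for the delay."], True),
    (["I apologize, let me check on that."], True),
    (["Sorry about that, it is fixed now."], False),
    (["Thanks for flagging the confusion."], False),
    (["Your order has shipped."], False),
    (["Happy to help with that!"], False),
    (["Here is your tracking link."], True),
    (["Anything else I can do?"], False),
]

if __name__ == "__main__":
    print(f"relative lift: {relative_lift(TOY_CONVERSATIONS):.2f}")
```

A number like this is only a starting point. As the next paragraph argues, a human still has to interpret why the association exists before acting on it.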

What would a machine deduce from this finding on its own? My guess is that a random forest or recurrent neural net would discover that a conversation with an apology produces a negative outcome, and it might avoid apologizing at all costs. But as human beings, we know that it’s probably not the apology that caused the customer to walk away unhappy; it’s probably the fact that they were upset in the first place, which necessitated an apology. Human beings can instead use this intuition (apologies are a bad sign) to A/B test different responses to the situation. What if the agent offers the customer free shipping? What if the agent empathizes about how frustrating the situation is? How might the right reaction depend on the type of customer that’s on the other side of the chat window?

As with most artificial intelligence technologies, a combination of humans and machines, something we like to call “Human in the Loop,” yields a better outcome than either one alone. After Deep Blue beat Kasparov at chess roughly 20 years ago, we discovered that humans working with a computer outperformed either one alone. The machine is capable of analyzing mountains of data to generate heuristics about what works best in any given situation, while the human uses this information to make the ultimate decision and can provide a sanity check against actions that may seem optimal numerically but don’t necessarily conform with the optimal strategy. This approach leverages the biggest strengths of each party, and our data suggests that it yields a far better outcome than pitting one against the other.
