About the AI hype

As neural networks, the archetypal black-box algorithm, take over image and natural language processing, the hype around near-human-level AI is unprecedented. Driverless cars and human-level chatbots seem to show that the future of human history is approaching at a phenomenal speed, and that all the old-school modeling techniques, such as mathematics and statistics, are dead. Is it the end of modeling? Is it the end of human intelligence? As I studied neural networks more and more, especially through two projects, the Kaggle ARC competition and an experiment applying transformers to time series, it became clear to me that the hype is hoping for an ‘unrealistic fantasy’. Here are some of the reasons I learned from those experiments.

Deep learning has not been successful at simple logical inference!

  • The Abstraction and Reasoning Challenge (ARC) is a Kaggle competition hosted by François Chollet, creator of the Keras neural networks library. The competition introduced simple logical inference problems, mixed with computer vision, that humans can solve easily but computers cannot. Participants needed to create an AI that can solve reasoning tasks it has never seen before. The result is not surprising: no one was able to come up with an AI that solves the problem. The winner, icecuber, used hand-crafted pattern transformations coded in more than 10k lines of C++, yet that solution solved only 21% of the test tasks. Many participants, including me, tried CNNs, decision trees, and so on, but none worked better than icecuber's solution, which hard-coded predefined pattern transformations recognized by a human (a toy sketch of this approach follows below). We humans are far more advanced than any computer at abstraction and reasoning. The ARC task is ‘too uncertain’ for computers to learn and perform well.
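
To make the ‘hand-crafted transformation’ idea concrete, here is a toy Python sketch. The grids, the transform library, and the task below are all invented for illustration; icecuber's real solution searches a far larger hand-built DSL of transformations, but the brute-force spirit is the same.

```python
import numpy as np

def flip_horizontal(grid):
    """Mirror the grid left-to-right."""
    return np.fliplr(grid)

def recolor(grid, mapping):
    """Replace colors according to a {src: dst} mapping."""
    out = grid.copy()
    for src, dst in mapping.items():
        out[grid == src] = dst
    return out

# A made-up training pair whose hidden rule is a horizontal flip.
train_input = np.array([[1, 0, 0],
                        [2, 2, 0]])
train_output = np.array([[0, 0, 1],
                         [0, 2, 2]])

# Brute-force a tiny library of hand-coded transforms against the pair:
# in spirit, a search over a hand-built DSL of grid transformations.
candidates = {
    "identity": lambda g: g,
    "flip_horizontal": flip_horizontal,
    "swap_colors_1_2": lambda g: recolor(g, {1: 2, 2: 1}),
}
for name, fn in candidates.items():
    if np.array_equal(fn(train_input), train_output):
        print("rule that fits the training pair:", name)
```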

Language can be memorized, but time series cannot!

  • From a transformer experiment, we can confirm that neural networks in general do not outperform the classical wisdom of decomposing a time series. In the experiment, one consistently sees ARIMA outperform neural networks (see the decomposition-plus-ARIMA sketch after this list). So what is going on?
    1. Why does deep learning fail to help us? There are several reasons.
      • The bottom line is that learning language is very different from learning to predict a time series. Time series data come naturally from the dynamics of some real-world system; they are usually continuous and volatile, which makes the pattern hard to capture completely. Language, on the other hand, is a human creation that can be learned and characterized completely: with minimal degradation, we encode language with letters and symbols so we can communicate over long ranges in both time and space. Thanks to the very large capacity of a transformer-type neural network, the model can simply memorize this encoding and make ‘human-natural’ predictions of language. A time series does not have the luxury of ‘memorizability’ and usually exhibits chaotic behavior.
      • Continuous vs. discrete. Language is naturally discrete and can easily be embedded into a low-dimensional space; it is possible to memorize the separation of discrete signals. A continuous time-series signal, on the other hand, is better suited to decomposition-type modeling such as Fourier and wavelet transforms.
    2. Are classical statistical methods the only way to model time series?
      • The answer is no. When the dataset is large and complex, classical methods are too simple and naive and lose much of the detail in the data. An alternative is to handcraft features and use tree-type structure-discovering models, for example gradient boosting trees; a sketch of this approach also follows below.
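
As a concrete illustration of the classical approach mentioned above, here is a minimal sketch that decomposes a synthetic series and fits a small ARIMA model, comparing it against a naive last-value baseline. The series, the ARIMA order, and all numbers are made-up assumptions for illustration; this does not reproduce the post's actual experiment.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.seasonal import seasonal_decompose

rng = np.random.default_rng(0)
t = np.arange(400)
# Trend + weekly-style seasonality + noise: the kind of structure
# classical decomposition captures directly.
series = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 0.5, t.size)

train, test = series[:350], series[350:]

# Decompose the training series into trend/seasonal/residual parts
# (parts.trend, parts.seasonal, parts.resid) for inspection.
parts = seasonal_decompose(train, model="additive", period=7)

# Fit a small ARIMA on the raw series and forecast the holdout window.
model = ARIMA(train, order=(2, 1, 2)).fit()
forecast = model.forecast(steps=test.size)

mae_arima = np.mean(np.abs(forecast - test))
mae_naive = np.mean(np.abs(train[-1] - test))  # last-value baseline
print(f"ARIMA MAE: {mae_arima:.3f}  naive MAE: {mae_naive:.3f}")
```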
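
And a hedged sketch of the gradient-boosting alternative: handcraft lag and rolling-mean features, then fit a gradient-boosted tree on them. The feature set and hyperparameters here are illustrative assumptions, not a prescription from the experiment.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
t = np.arange(600)
series = 0.03 * t + np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.3, t.size)

def make_features(y, n_lags=12):
    """Stack lagged values plus their rolling mean as a design matrix."""
    X, targets = [], []
    for i in range(n_lags, len(y)):
        lags = y[i - n_lags:i]
        X.append(np.concatenate([lags, [lags.mean()]]))
        targets.append(y[i])
    return np.array(X), np.array(targets)

X, y = make_features(series)
split = 500  # train on the first 500 feature rows, test on the rest
model = GradientBoostingRegressor(n_estimators=200, max_depth=3)
model.fit(X[:split], y[:split])
preds = model.predict(X[split:])
print("GBT MAE:", np.mean(np.abs(preds - y[split:])))
```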

Where is deep learning going, and where will near-human AI come from?

  • Deep learning has to make itself understandable in order to improve.
    • Deep learning typically uses large neural networks with complex structure involving a huge number of parameters. It requires feeding in a large amount of data and training for a very long time. Then, with good luck (initialization, good regularization, etc.), a well-predicting black box is obtained from all this hard work. But do we really understand the prediction? Do we have a sense of what the model really can and cannot do? NO! The answer is a hard NO! Right now, deep learning is a black box: there is no way to understand what each parameter really does, or why the model sometimes performs and sometimes fails. We have only solved the original prediction problem partially, and in fact very superficially. We don't really understand what helped solve the problem and what will lead to a failure.
    • So far, deep learning has produced only a promising engineering prototype, without the science behind the tool; major theoretical developments are still awaited. Deep learning today is very similar to the Wright brothers' milestone manned flight: partial engineering success encouraged more and more talented minds to jump into the creation of aerospace engineering and related fields, and many lives were lost testing and improving aviation technology over the last century. There may be a similar path from deep learning to near-human AI. But without true understanding, that dream can only hang in the air.

References:

  1. Time series
  2. Kaggle ARC


Written on June 14, 2021