Deep Learning: A Critical Appraisal on ShortScience.org

Deep Learning: A Critical Appraisal
Gary Marcus
arXiv e-Print archive - 2018 via Local arXiv
Keywords: cs.AI, cs.LG, stat.ML, 97R40, I.2.0; I.2.6

Summaries/Notes 1

Summary by Pavan Ravishankar 5 years ago

Deep Learning has a number of shortcomings.

(1) Requires a lot of data: Humans can learn abstract concepts from far less training data than current deep learning systems need. For example, once we are told what an “adult” is, we can answer questions such as “How many adults are at home?” or “Is he an adult?” without many examples. Convolutional networks build in translation invariance, but handling other transformations requires much more data, more filters, or a different architecture.
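The remark about convolution can be made concrete with a minimal sketch. It assumes a toy 1-D "network" made of one hand-picked filter followed by global max pooling; the filter and the input signals are hypothetical and chosen only for illustration. Shifting the pattern leaves the pooled feature unchanged, while stretching it (a different transformation) does not.

```python
def conv1d(signal, kernel):
    """Valid 1-D cross-correlation: slide the kernel over the signal."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def conv_then_max_pool(signal, kernel):
    """One convolution filter followed by global max pooling."""
    return max(conv1d(signal, kernel))


# Hypothetical filter and inputs (assumptions for illustration only).
kernel = [1, 2, 1]
pattern_left = [1, 2, 1, 0, 0, 0, 0, 0]       # pattern at the start
pattern_right = [0, 0, 0, 0, 1, 2, 1, 0]      # same pattern, shifted
pattern_stretched = [1, 1, 2, 2, 1, 1, 0, 0]  # same shape, stretched 2x

feature_left = conv_then_max_pool(pattern_left, kernel)
feature_right = conv_then_max_pool(pattern_right, kernel)
feature_stretched = conv_then_max_pool(pattern_stretched, kernel)

# The shifted input yields the same feature; the stretched one does not.
print(feature_left, feature_right, feature_stretched)
```

The invariance to shifts comes from the architecture itself (weight sharing plus pooling), not from data. Nothing in this structure gives the same guarantee for stretching, so that case has to be covered by extra examples, extra filters, or a different design.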

(2) Lack of transfer: Most claims that deep RL helps with transfer are ambiguous. Consider DeepMind's claim that its Breakout agent learned concepts such as digging a tunnel through the wall, which was soon contradicted by Vicarious experiments that added a wall in the middle and raised the Y coordinate of the paddle. Current approaches to transfer rely on correlations between the training sequences and the test scenario, which are bound to fail when the scenario is tweaked.

(3) Hierarchical structure is not learnt: Deep learning learns correlations, which are non-hierarchical in nature. A sentence like “Salman Khan, who was an excellent driver, died in a car accident” is therefore not represented as a main clause (“Salman Khan died in a car accident”) with an embedded subordinate clause (“who was an excellent driver”). RNNs cannot capture subtleties like these, even though hierarchical RNNs try to capture obvious hierarchies such as letters -> words -> sentences. If hierarchies were captured in deep RL, transfer in Breakout would have been easy, which is not the case.

(4) Poor inference in language: Deep learning treats sentences with subtle differences, such as “John promised Mary to leave” and “John promised to leave Mary”, as the same. This causes major problems at inference time, because questions that require combining several sentences fail.

(5) Not transparent: Knowing why a neural network made a particular decision would help with debugging and would be valuable in medical diagnosis systems, where it is critical to explain the reasoning behind a result. Deep networks do not readily provide such explanations.

(6) No priors or commonsense reasoning: Humans rely on commonsense reasoning (if A is B's father, A is older than B) and priors (the laws of physics). Deep learning is not designed to incorporate these, and the strong interest in end-to-end learning from raw data has discouraged attempts to do so.

(7) Correlation, not causation: Deep learning does not deal with causality, analogical reasoning, or other abstract concepts associated with the left brain.

(8) Lacks generalization outside the training distribution: Deep learning fails when the nature of the data keeps changing, e.g. stock prediction.

(9) Easily fooled: E.g. parking signs mistaken for refrigerators, or a turtle mistaken for a rifle.
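The "easily fooled" point (9) can be illustrated with a small sketch. It uses a linear classifier as a stand-in for a deep network, which is an assumption made to keep the example short; the weights, the input, and the value of `epsilon` are all hypothetical. Each feature is nudged by a small amount in the direction given by the sign of the gradient, and the predicted label flips.

```python
def score(weights, x):
    """Linear score; the gradient of the score w.r.t. x is just `weights`."""
    return sum(w * xi for w, xi in zip(weights, x))


def predict(weights, x):
    """Class 1 if the score is positive, otherwise class 0."""
    return 1 if score(weights, x) > 0 else 0


def sign(v):
    return (v > 0) - (v < 0)


def perturb(weights, x, epsilon):
    """Shift every feature by epsilon against the current prediction,
    following the sign of the gradient of the score."""
    direction = -1 if predict(weights, x) == 1 else 1
    return [xi + direction * epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical model and input (assumptions for illustration only).
weights = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
x = [0.6, 0.4, 0.6, 0.4, 0.6, 0.4, 0.6, 0.4]
epsilon = 0.15

x_adv = perturb(weights, x, epsilon)
largest_change = max(abs(a - b) for a, b in zip(x, x_adv))

# The label changes although no feature moved by more than epsilon.
print(predict(weights, x), predict(weights, x_adv), largest_change)
```

Many small per-feature changes add up across dimensions, which is one commonly given intuition for why high-dimensional inputs such as images can be pushed across a decision boundary by perturbations that look negligible.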

These shortcomings can be addressed by:
(1) Unsupervised learning: Build systems that can set their own goals, use abstract knowledge (priors, affordances such as the fact that objects can be used in many ways, etc.), and solve problems at a high level (like symbolic AI).

(2) Symbolic AI: Deep learning does what the primary sensory cortex does: it takes raw inputs and converts them into low-level representations. Symbolic AI builds abstract concepts such as causal and analogical reasoning, which is what the prefrontal cortex does. Humans make decisions based on these abstract concepts.
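As a rough sketch of what explicit symbolic knowledge looks like, the father/older example from shortcoming (6) can be written as two rules and applied by simple forward chaining. The function name, the rule encoding, and the people named below are hypothetical; this is not a method taken from the paper.

```python
def derive_older(father_of):
    """Derive every (older, younger) pair implied by father_of facts.

    Rule 1: father_of(a, b)            -> older(a, b)
    Rule 2: older(a, b) and older(b, c) -> older(a, c)
    """
    older = set(father_of)  # Rule 1 applied to every fact
    changed = True
    while changed:          # apply Rule 2 until nothing new is derived
        changed = False
        for a, b in list(older):
            for c, d in list(older):
                if b == c and (a, d) not in older:
                    older.add((a, d))
                    changed = True
    return older


# Hypothetical facts (assumptions for illustration only).
father_of = {("Arun", "Bala"), ("Bala", "Chitra")}
derived = derive_older(father_of)

# ("Arun", "Chitra") is never stated but follows from the rules.
print(sorted(derived))
```

The conclusion about Arun and Chitra holds for any names plugged into the rules and needs no training examples, which is the kind of abstraction the summary attributes to symbolic approaches.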
