Response to Norvig and Chomsky debate

This debate is, in my opinion, rather fruitless, because it comes down to the difference in how these two men define the work of science. Chomsky believes that to do science is to gain insight into the world, while Norvig holds that we must first use models, which capture that insight indirectly, and only then deduce the insightful theory from them. In my opinion, it is a no-win situation in which their arguments could go on forever.

First of all, the dispute over "insight" versus "description" is somewhat hollow, because Chomsky himself declared in the interview that "some of the modules may be computational, others may not be," which runs counter to his own theory. Some parts of science may yield direct insight, while others may not; for the latter, scientists need a substitute, something that helps them observe as accurately as possible on the way to the final insight, and that is exactly where "description" comes in. The other direction is easier: Norvig himself acknowledges the importance of insight in science and merely wants Chomsky to admit the corresponding importance of description.

Secondly, it is quite clear that, even though learning language is an innate ability, it is also a process of absorbing knowledge. Many studies suggest that it is better to learn a language through listening and reading: we do not learn a language by rote study and application, or by memorizing and reproducing it. Rather, we try to understand the knowledge we take in, through comprehensible input gathered by interacting with the world, and we develop our vocabulary and grammar on that basis. That is why, in my opinion, the difference between an adult and a child is not how their brains treat the language (as a puzzle or as a language, in Chomsky's terms) but their learning environments. It is true that children are better at mimicking and are not preoccupied by previously learned languages the way adults are, but those advantages mainly give them an edge in pronunciation, which is irrelevant to computers. Adults, on average, face more demanding environments than most children do. A child, from birth, has little to do but learn a language, while an adult has other concerns that compete with language learning for his life. The devotion is much less while the expectation is higher: an adult's standard of fluency is far above a child's, which leads to the presumption that a child is better at learning a language. A computer, likewise, acts like an infant demanding feeds of knowledge, the set of parameters, before it is able to communicate.

The question, then, is to find the human learning process that we can mimic in a computer. In my experience, most of my friends who live in English-speaking countries score better than foreign students on grammatical multiple-choice questions, yet seem less able to explicitly explain the reasons behind their choices. The answer most often given is "because they/it sound(s) familiar." Where does that "familiar" come from? Learning a language naturally while growing up is a process of hearing (data input) and recognizing when heard again (internal processing of that data). Thus it is not wrong for Chomsky to criticize the ineffectiveness of the old Markov chain, because it fails to recognize the familiar patterns that help us differentiate the grammatical from the ungrammatical. However, as Norvig said, it is a fifty-year-old mechanism that I believe, in this fast-changing world, will soon be replaced by better probabilistic and statistical mechanisms. On the other hand, Norvig has also shown that even with the old Markov chain, probabilistic models are better than Chomsky's theory at judging the sensibility of a sentence, while Chomsky's theory is better at differentiating the grammatical from the ungrammatical. The irony here is that the algorithmic system (the probabilistic model) is used to judge an intricate set of words for its meaning, while the unsystematic approach (Chomsky's theory) is used to judge how well a sentence follows the set of rules laid down by a language.
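To make that "sounds familiar" intuition concrete, here is a toy sketch of the kind of Markov model the debate is about (the corpus and numbers are made up for illustration): a first-order bigram model trained on a few sentences assigns a higher probability to a familiar word order than to the same words scrambled, without ever consulting a grammar rule.

```python
import math
from collections import defaultdict

# Tiny toy corpus standing in for the "comprehensible input" a learner hears.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

# Count bigram transitions: how often word b follows word a.
bigrams = defaultdict(lambda: defaultdict(int))
unigrams = defaultdict(int)
for sentence in corpus:
    words = ["<s>"] + sentence.split() + ["</s>"]
    for a, b in zip(words, words[1:]):
        bigrams[a][b] += 1
        unigrams[a] += 1

def score(sentence, alpha=1.0, vocab_size=20):
    """Smoothed log-probability of a sentence under the bigram model."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    logp = 0.0
    for a, b in zip(words, words[1:]):
        # Add-alpha smoothing gives unseen transitions small, nonzero mass.
        p = (bigrams[a][b] + alpha) / (unigrams[a] + alpha * vocab_size)
        logp += math.log(p)
    return logp

# A "familiar" word order scores higher than the same words scrambled.
print(score("the dog sat on the mat") > score("mat the on sat dog the"))  # True
```

Note what the model does and does not do: it captures familiarity from raw exposure, which is Norvig's point, but it has no notion of grammaticality beyond frequency, which is the gap Chomsky criticizes.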






3 thoughts on “Response to Norvig and Chomsky debate”

  1. It is interesting that you mention that the debate seems irrelevant and moot. The Chris Anderson “Wired” article seems to assert that both sides of the argument no longer matter, and all that is important at this stage is drawing conclusions from a large mass of data. According to the article, J. Craig Venter did not follow the scientific method in his sequencing work; rather than sequencing one organism at a time, he sequenced whole ecosystems. Instead of restricting variables, he threw caution to the wind, tried to handle and classify big data, and from that was successful in discovering many new species. Not only was the method faster, it accomplished a lot more, and according to the article “Venter has advanced biology more than anyone else of his generation.” Venter also didn’t consider why these organisms exist, but rather seemingly worked backwards: he reached his conclusion first, and the why became an afterthought. According to Chomsky, this is the problem with modern science. It has all of these theories and ideas, but not a strong foundation of why. The era of big data analysis seems to go against both Norvig and Chomsky, as both a proper method and discovering the “why” are overlooked in the name of efficiency.

  2. On Sam’s comment:

    Anderson’s article posits that Chomsky’s stance is invalid because sequencing merely applies all probabilistic outcomes through brute force. The human mind is incapable of conceiving immaculate spectra*; hence any human construction is represented, as Chomsky notes of linguistics, in discrete units. The probabilistic outcomes of discrete units are finite, and sequencing is the iteration of these finite possibilities.

    On Tri’s response:

    You posit that insight is superfluous because the brute force of our contemporaneity can achieve impeccable mimicking technology, such as a computer that can mimic human learning.

    Suppose a human is a black box and the learning process is hidden from any external observer. A computer is built to imitate this learning process by polling the inputs and outputs of that black box. Algorithmically, if the complexity of the learning process inside the black box is O(n)***, the complexity of the imitation would be O(n^n), superpolynomial, since the imitating computer must compare all viable outcomes.
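    A rough toy illustration of this black-box argument (the hidden rule and the numbers here are invented for the example): evaluating a known rule is cheap, but reconstructing it from the outside by polling every input grows exponentially with the input size.

```python
from itertools import product

def black_box(bits):
    # The hidden "learning process": here, simply the parity of the input bits.
    return sum(bits) % 2

def imitate_by_brute_force(n):
    """Recover the black box's full behavior by polling every n-bit input.

    The number of probes alone is 2**n, before any comparison of candidate
    hypotheses, so the imitation scales exponentially with n even though
    evaluating the hidden rule itself is linear in n.
    """
    table = {}
    probes = 0
    for bits in product([0, 1], repeat=n):
        table[bits] = black_box(bits)
        probes += 1
    return table, probes

table, probes = imitate_by_brute_force(4)
print(probes)  # 16 = 2**4 probes for just 4 binary inputs
```

    With insight into the rule (here, "it is parity"), the answer for any input is computed directly; without it, the imitator must exhaust the input space, which is the inefficiency the comment describes.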

    The computers of our contemporaneity can tolerate this utter inefficiency for the limited input they are presented with. But if these computers are to be deployed in real-time applications, the inefficiency will inevitably become fatal, because real-life applications are self-perpetuating: as with any commodity released onto a real market, the technology must adapt to ever-increasing input size.

    Only insight can efface this inefficiency. Every optimization of an algorithm is an application of insight, and as n goes to infinity, only insight will be able to survive. In accord with Chomsky’s stance, relinquishing the aim of insight would thus be the perdition of human development, as humans would be bound to be puppeteered by the self-actualizing accumulation of data.


    * Derived from Tolstoy’s romantic** narrations in War and Peace
    ** Romantic, as in literary romanticism
    *** This is big-O notation for time complexity, which denotes the order of relation between growth of input and the growth of computing time.

  3. While I agree that the debate over the importance of experiment and theory seems irrelevant, because both are required for scientific progress, I disagree that big data analysis will radically change the way scientific research is done. I think it is more accurate to look at big data as a new tool for experiment rather than as a replacement for it. What Venter did was use big data to get more exact readings of an ecosystem; although he did this without conforming to the traditional trappings of experiment (few variables, controlled conditions, etc.), the results are the same: data was gathered and new clues to the inner workings of a system were collected.

    It is tempting to say that big data analysis will replace the old theory-based system of science. After all, why bother to create complicated predictive theories about what will happen to X when Y happens in a system when you can just make it happen and see? But one must remember that a similar pattern of thought becomes prevalent every time a new advance in experimental techniques occurs; the tunneling microscope, for example, was supposed to give us a unifying model of atomic and subatomic interactions.

    In reality, what we are seeing is the lag between experiment and theory. This new technique in experiment will soon be followed by new practices in theory that even out the two halves of science once more. While I decline to make predictions as to what this new practice will look like, I speculate that big data analysis will play a part, providing theorists a new tool, just as it did for experimental scientists.
