Next: 3. Machine Translation Roentgenized Up: Machine Translation in Practice Previous: 1. Introduction Contents

2. Common Misunderstandings about Language

2.1 Bloomfield's Secondary and Tertiary Responses to Language

I'd like to start with a quotation from Bloomfield.

Utterances about language may be called SECONDARY RESPONSES to language. [...] our culture maintains a loosely organized but fairly uniform system of pronouncements about language. Deviant speech forms in dialects other than the standard dialect are described as corruptions of the standard forms ('mistakes', 'bad grammar') or branded as entirely out of bounds, on a par with the solecisms of a foreign speaker ('not English'). The forms of the standard dialect are justified on grounds of 'logic'. [...]

[Bloomfield 1944]

Bloomfield mentions several widespread misconceptions about language. The problem with understanding language is that everybody believes they understand it. But using something all day does not mean that one really understands it. There is a gap between theory and application, and it is very difficult to work back from application to theory, which, in my opinion, is the only way to approach language². On closer acquaintance with language and linguistics, the following statements almost always turn out to be plain wrong.

Language A is more ...... than language B ('logical', 'profound', 'poetic', 'efficient', etc. fill in the blank yourself).

The structure of language C proves that it is a universal language, and everyone should learn it as a basis for studying other languages.

Language D and language E are so closely related that all their speakers can always easily understand each other.

Language F is extremely primitive and can only have a few hundred words in it.

Language G is demonstrably 'better' than languages H, J and L.

The word for '......' (choose almost any word) in language M proves scientifically that it is a worse -- better, more 'primitive' or 'evolved', etc. -- language than language N.

Any language is easy to master, once you learn the basic structure all languages are built on.

[Gross 1992, pp. 106-7]

It looks like languages cannot be categorized easily. It takes years to study a single language and build a theory about it, let alone theories about several or even all languages.

2.2 The Chomskiyan Approach

Less than ten years after Bloomfield's death, new observations were made which "shared a certain philosophical groundwork with computational linguistics, constitute the credo of the Chomskiyan approach." [Gross 1992, p. 107] Bloomfield would probably have regarded some of them as secondary responses, since they aim at generalizations rather than at the study of individual languages.

1. All languages are related by a 'universal grammar'.
2. It is possible to delineate the meaning of any sentence in any language through knowledge of its deep structure and thereby replicate it in another language.
3. A diagram of any sentence will reveal this deep structure.
4. Any surface-level sentence in any language can easily be related to its deep structure, and this in turn can be related to universal grammar in a relatively straightforward manner through a set of rules.
5. These and related statements are sufficient to describe not only the structure of language but the entire linguistic process of development and acculturation of infants and young children everywhere and can thus serve as a guide to all aspects of human language, including speech, foreign-language training and translation.
6. The similarity of these deep- and surface-level diagrams to the structure of computer languages, along with the purported similarity of the human mind to a computer, may be profoundly significant.

[Gross 1992, p. 108]

The Chomskiyan approach to language seems overly optimistic. Points 4-6, at least, are highly questionable. The first point can be assumed to be true, since the "universal grammar" only needs to be large enough, but this has no practical significance: the resulting grammar would probably be useless for computers³.
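Footnote 3 explains why a sufficiently powerful grammar can be useless to a computer: membership may only be semi-decidable. The following is a minimal sketch of such a semi-decision procedure, not a definitive implementation. It searches breadth-first through everything a set of rewrite rules can derive. The function name `semi_decide`, the toy rule set `RULES` and the `max_steps` cut-off are assumptions made for illustration; the cut-off exists only so that the example terminates. The toy grammar (the language a^n b^n) is context-free and therefore actually decidable; it merely keeps the example small, while the search itself is the generic one that also applies to unrestricted grammars.

```python
from collections import deque


def semi_decide(rules, start, target, max_steps=None):
    """Search breadth-first through all forms derivable from `start`.

    Returns True as soon as `target` has been derived.
    Returns False only if the set of derivable forms is finite and exhausted.
    Returns None if `max_steps` forms were expanded without an answer;
    None means "unknown", not "no". Without `max_steps` the loop may
    run forever for a target that is not in the language.
    """
    queue = deque([start])
    seen = {start}
    steps = 0
    while queue:
        if max_steps is not None and steps >= max_steps:
            return None
        form = queue.popleft()
        steps += 1
        if form == target:
            return True
        # Apply every rule at every position where its left-hand side occurs.
        for lhs, rhs in rules:
            pos = form.find(lhs)
            while pos != -1:
                new_form = form[:pos] + rhs + form[pos + len(lhs):]
                if new_form not in seen:
                    seen.add(new_form)
                    queue.append(new_form)
                pos = form.find(lhs, pos + 1)
    return False


# Hypothetical toy grammar: S -> aSb | ab
RULES = [("S", "aSb"), ("S", "ab")]

in_language = semi_decide(RULES, "S", "aabb")
not_in_language = semi_decide(RULES, "S", "aab", max_steps=1000)
```

For "aabb" the search finds a derivation and stops. For "aab" no derivation exists, and since the grammar keeps producing longer and longer forms, only the artificial cut-off ends the search, and all it can report is that no answer was found yet.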

Regarding point 6: the human mind is not similar to a computer. The two are fundamentally different. A computer usually executes one instruction after another, computes the result and stores it somewhere. It works with discrete values and is mathematically exact. The human brain, by contrast, works in a massively parallel fashion -- millions of neurons simultaneously receive incoming signals, process them and possibly emit signals themselves. The signals are analog, and each neuron is connected to various others, sometimes across half of the brain. This "chaotic" structure should not be called similar to the orderly architecture of a computer⁴.

Footnotes

... language ²: Approaching language from theory to practice seems to work for languages in general but not for natural languages in particular.
... computers ³: Chomsky did theoretical research on grammars and languages which is important for computer science. His classification of languages -- the Chomsky hierarchy -- is part of theoretical computer science. It has been proven that there are grammars whose languages are only semi-decidable. For these grammars, a computer can confirm that a sentence is valid for the given grammar, but it cannot in general determine that a sentence is not valid (in that case the program may run forever).
... computer ⁴: Research in Artificial Intelligence (AI) led to so-called neural networks. They imitate the way the human brain works, but their capacity is orders of magnitude lower than that of the brain. It has been found that humans do not seem to be able to fully understand a neural network, although the basic principles are not too complex. Neural networks "memorize" information the same way the human brain does -- distributed over the whole network -- whereas computers store information at discrete locations. It looks like the human brain might never be able to completely understand itself, as it has problems with processing distributed external(!) information.
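The distributed storage described in footnote 4 can be illustrated with a small associative memory. The sketch below uses a Hopfield-style network, chosen here purely for illustration; the text does not name a specific network type, and the names `train`, `recall`, `PATTERN_A` and `PATTERN_B` are assumptions. Both patterns are stored in the same weight matrix, so no single weight "contains" either pattern, yet each one can be recovered from a corrupted copy.

```python
def train(patterns):
    """Hebbian learning: every pattern adds a contribution to every weight."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, state, sweeps=5):
    """Update the units one after another until nothing changes
    or the given number of sweeps is used up."""
    state = list(state)
    n = len(state)
    for _ in range(sweeps):
        changed = False
        for i in range(n):
            total = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if total >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two hypothetical patterns of +1/-1 values, stored in one shared matrix.
PATTERN_A = [1, 1, 1, 1, -1, -1, -1, -1]
PATTERN_B = [1, -1, 1, -1, 1, -1, 1, -1]
WEIGHTS = train([PATTERN_A, PATTERN_B])

# Corrupt one unit of each pattern and let the network settle.
noisy_a = [-1] + PATTERN_A[1:]
noisy_b = PATTERN_B[:-1] + [1]
restored_a = recall(WEIGHTS, noisy_a)
restored_b = recall(WEIGHTS, noisy_b)
```

In this small example both corrupted inputs settle back to the stored patterns. A conventional program would instead keep each pattern at its own address and look it up there, which is the contrast the footnote draws.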


Tino Schwarze, 2001