Next: 3. Machine Translation Roentgenized
Up: Machine Translation in Practice
Previous: 1. Introduction
I would like to start with a quotation from Bloomfield.
Utterances about language may be called SECONDARY RESPONSES to
language. [...] our culture maintains a loosely organized but fairly uniform
system of pronouncements about language. Deviant speech forms in dialects
other than the standard dialect are described as corruptions of the standard
forms ('mistakes', 'bad grammar') or branded as entirely out of bounds, on a
par with the solecisms of a foreign speaker ('not English'). The forms of
the standard dialect are justified on grounds of 'logic'. [...]
Bloomfield mentions several widespread misconceptions about language. The
problem with understanding language is that everybody believes they understand
it. But using something all day does not imply really understanding it. There
is a gap between theory and application, and it is very difficult to reason
from application back to theory, which, in my opinion, is the only way to
approach language [2].
The following statements almost always turn out to be plain wrong on closer
acquaintance with language and linguistics.
- Language A is more ...... than language B ('logical', 'profound',
'poetic', 'efficient', etc.; fill in the blank yourself).
- The structure of language C proves that it is a universal language,
and everyone should learn it as a basis for studying other languages.
- Language D and language E are so closely related that all their speakers can
always easily understand each other.
- Language F is extremely primitive and can only have a few hundred words in it.
- Language G is demonstrably 'better' than languages H, J and L.
- The word for '......' (choose almost any word) in language M
proves scientifically that it is a worse (or better, more 'primitive' or
'evolved', etc.) language than language N.
- Any language is easy to master, once you learn the basic structure all
languages are built on.
It looks like languages cannot be categorized easily. It takes years to
study one language and build a theory about it, not to mention theories
about several or even all languages.
Less than ten years after Bloomfield's death, new observations were made
which "shared a certain philosophical groundwork with computational
linguistics, constitute the credo of the Chomskiyan approach." [Gross 1992, p.
107] Some of them would probably have been regarded as secondary
responses by Bloomfield as they do not emphasize studying languages but aim
for generalizations.
1. All languages are related by a 'universal grammar'.
2. It is possible to delineate the meaning of any sentence in any language
through knowledge of its deep structure and thereby replicate it in another
language.
3. A diagram of any sentence will reveal this deep structure.
4. Any surface-level sentence in any language can easily be related to
its deep structure, and this in turn can be related to universal grammar in
a relatively straightforward manner through a set of rules.
5. These and related statements are sufficient to describe not only the
structure of language but the entire linguistic process of development and
acculturation of infants and young children everywhere and can thus serve as
a guide to all aspects of human language, including speech, foreign-language
training and translation.
6. The similarity of these deep- and surface-level diagrams to the
structure of computer languages, along with the purported similarity of the
human mind to a computer, may be profoundly significant.
The Chomskyan approach to language seems overly optimistic. Points 4-6, at
least, are highly questionable. The first point can be assumed to be true,
since the "universal grammar" only needs to be large enough, but this is of no
practical significance because the resulting grammar would probably be useless
for computers [3].
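Footnote 3 attributes this to grammars for which membership is only
semi-decidable. The following is a minimal Python sketch of such a one-sided
test, not a definitive implementation. The function name `derives`, the rule
format (pairs of strings) and the `max_steps` budget are assumptions made for
illustration. The toy grammar is deliberately tiny and in fact context-free,
so smarter parsers that always terminate exist for it; the point is only that
a blind search through all derivations can answer "yes" but does not know when
to give up.

```python
from collections import deque


def derives(rules, start, target, max_steps=None):
    """Breadth-first search over everything the grammar can derive.

    rules: list of (lhs, rhs) string pairs with non-empty lhs (assumed format).
    Returns True once `target` is derived, False if the finite search space
    is exhausted, and None if the optional step budget runs out (unknown).
    With max_steps=None the call may never return for a non-member.
    """
    queue = deque([start])
    seen = {start}
    steps = 0
    while queue:
        if max_steps is not None and steps >= max_steps:
            return None  # budget exhausted: we still do not know
        form = queue.popleft()
        steps += 1
        if form == target:
            return True
        for lhs, rhs in rules:
            # Apply the rule at every position where its left side occurs.
            i = form.find(lhs)
            while i != -1:
                nxt = form[:i] + rhs + form[i + len(lhs):]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                i = form.find(lhs, i + 1)
    return False  # only reachable when finitely many forms exist


# Toy grammar for a^n b^n: S -> aSb | ab
toy_rules = [("S", "aSb"), ("S", "ab")]
print(derives(toy_rules, "S", "aabb", max_steps=1000))  # True
print(derives(toy_rules, "S", "abab", max_steps=1000))  # None (gave up)
```

Without the `max_steps` budget, the second call would keep generating longer
and longer forms and never return, which is the behaviour the footnote
describes.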
Regarding point 6, the human mind is not similar to a computer. The two are
fundamentally different: a computer usually executes one instruction after
another, computes the result and stores it somewhere. It works with discrete
values and is mathematically exact. The human brain, by contrast, works in a
massively parallel fashion: millions of neurons simultaneously receive
incoming signals, process them, and possibly emit signals themselves. The
signals are analog, and each neuron is connected to various others, sometimes
across half of the brain. This "chaotic" structure should not be called
similar to the orderly architecture of a computer [4].
Footnotes
[2] Approaching language from theory to practice seems to work for languages
in general, but not for natural languages in particular.
[3] Chomsky did theoretical research on grammars and languages which is
important for computer science. His classification of languages, the Chomsky
hierarchy, is part of theoretical computer science. It has been proven that
there are grammars (the unrestricted ones) for which membership is only
semi-decidable. For these grammars, a computer can confirm that a sentence is
valid for the given grammar, but in general it cannot confirm that a sentence
is not valid (the program may run forever in this case).
[4] Research in Artificial Intelligence (AI) led to so-called neural networks.
They imitate the way the human brain works, but they have a capacity orders of
magnitude lower than the brain. Humans do not seem to be able to understand
what a trained neural network has learned, although the basic principles are
not too complex. Neural networks "memorize" information in a way similar to
the human brain, distributed over the whole network, whereas computers store
information at discrete locations. It looks like the human brain might never
be able to completely understand itself, as it has problems with processing
distributed external(!) information.
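The idea of "distributed" storage can be made concrete with a small
associative memory. The sketch below uses a Hopfield-style network with
Hebbian weights, which is a choice made here for illustration and is not
claimed to be how the brain works; the function names `train` and `recall`
and the two example patterns are assumptions. No single weight holds a
pattern: every weight mixes contributions from all stored patterns, yet a
damaged input is pulled back to the stored pattern.

```python
def train(patterns):
    """Build a symmetric weight matrix from +1/-1 patterns (Hebbian rule)."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, state, sweeps=5):
    """Update the units one after another until the state settles."""
    state = list(state)
    n = len(state)
    for _ in range(sweeps):
        for i in range(n):
            total = sum(w[i][j] * state[j] for j in range(n))
            state[i] = 1 if total >= 0 else -1
    return state


# Two example patterns (hypothetical data, chosen to be orthogonal).
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train([p1, p2])

damaged = [-1] + p1[1:]               # flip the first element of p1
print(recall(weights, damaged) == p1)  # True
```

A conventional program would instead store `p1` and `p2` in two separate
memory locations; here both live in the same 8 x 8 weight matrix.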
Tino Schwarze, 2001