Syntactic Structures
Mouton de Gruyter
Berlin • New York
Syntactic Structures
by
Noam Chomsky
Second Edition
With an Introduction by David W. Lightfoot
Mouton de Gruyter
Berlin • New York 2002
Mouton de Gruyter (formerly Mouton, The Hague)
is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.
First edition published in 1957. Various reprints.
∞ Printed on acid-free paper which falls within the guidelines
of the ANSI to ensure permanence and durability.
ISBN 3-11-017279-8
Bibliographic information published by Die Deutsche Bibliothek
Die Deutsche Bibliothek lists this publication in the Deutsche
Nationalbibliografie; detailed bibliographic data is available in the
Internet at <http://dnb.ddb.de>.
© Copyright 1957, 2002 by Walter de Gruyter GmbH & Co. KG, 10785
Berlin.
All rights reserved, including those of translation into foreign languages. No
part of this book may be reproduced in any form or by any means, electronic
or mechanical, including photocopy, recording, or any information storage and
retrieval system, without permission in writing from the publisher.
Printing & Binding: Werner Hildebrand, Berlin.
Cover design: Sigurd Wendland, Berlin.
Printed in Germany.
Introduction*
Noam Chomsky's Syntactic Structures was the snowball which began
the avalanche of the modern "cognitive revolution." The cognitive perspective originated in the seventeenth century and now construes modern linguistics as part of psychology and human biology. Depending
on their initial conditions, children grow into adults with various language systems, some variety of Japanese if raised in Tokyo and Cornish English if raised in the village of Polperro. Linguists seek to describe the mental systems that Japanese or Cornish people have, their
language "organs." These systems are represented somehow in human
mind/brains, are acquired on exposure to certain kinds of experiences,
and are used in certain ways during speech comprehension or production and for a variety of purposes: communication, play, affects,
group identity, etc. Linguists also specify the genetic information, common to the species, which permits the growth of mature language organs in Cornish, Japanese, Dutch, Kinande and Navaho children.
The snowball has gained bulk and speed along these naturalistic
lines over the last fifty years. The variety of languages, the developmental patterns manifested by young children, the ways in which mature systems are underdetermined by childhood experience, have provided a wealth of discoveries and of empirical demands for theories to
meet, opening the prospect for more empirical demands as we begin
to understand the brain mechanisms that might be involved in understanding and producing speech. This kind of work on the growth of
an individual's language capacity has influenced people studying other
aspects of human cognition, where the empirical demands on theories
are partially different and where it is harder to tease apart the contributions of nature and nurture. Philosophers, psychologists, neuroscientists and even immunologists (see Jerne 1967 and his 1985 Nobel
Prize address) have engaged with this work. That is the avalanche and
it has affected many parts of the cognitive mountainside; Anderson
and Lightfoot (2002) provides a recent survey of some of the work set
in motion and Chomsky (2000) gives his current views.
It is interesting to look back from here on the snowball. Snowballs
always begin small and few people write books of just 118 short pages.
However, what is striking about this little book is that it contains
VI
Introduction by David W. Lightfoot
nothing on cognitive representations, nothing on grammars as mental
systems triggered by childhood exposure to initial linguistic experiences. Chomsky arrived at some conclusions and developed some lines
of thought which naturally provoked a radical re-thinking of the status
of grammatical descriptions, but, to judge from the earliest texts, it
appears that he did this without any particular concern for cognitive
representations.
The best discussion of these early years is the introduction to the
version of the dissertation which was published in 1975 (The logical
structure of linguistic theory, henceforth LSLT). There Chomsky, writing in 1973, said that there would have been little notice of Syntactic
Structures in the profession if Robert Lees had not written an extensive
review, published in Language more or less simultaneously with the
appearance of the book. But that review touched only briefly on the
matter of mental representations. Furthermore, Chomsky's judgement
may be too modest; the book was well-received in a number of reviews, and the work had a major impact quickly, notably through
Chomsky's presentation at the Third Texas Conference in 1958 (published as Chomsky 1962), although not everything went smoothly: the
dissertation was rejected for publication by the Technology Press of
MIT and an article was rejected by the journal Word.
Syntactic Structures itself consisted of lecture notes for undergraduate classes at MIT, which C. H. van Schooneveld offered to publish
with Mouton, "a sketchy and informal outline of some of the material
in LSLT" (Chomsky 1975: 3). So these are the three central texts from
this period: LSLT, Syntactic Structures, and Lees' review. It is also
useful to look at the earliest analytical work of the new paradigm:
Klima (1964), Lees (1960) and Lees and Klima (1963), for example.
However, one should keep in mind in addition that Chomsky was
working on his review of Skinner's acclaimed Verbal behavior (Chomsky 1959); the review was submitted in 1957 and sketched a way of
looking at psychological behavior quite different from the prevailing
orthodoxies.
The book was "part of an attempt to construct a formalized general
theory of linguistic structure ... by pushing a precise but inadequate
formulation to an unacceptable conclusion, we can often expose the
exact source of this inadequacy and, consequently, gain a deeper understanding of the linguistic data" (p.5). This was a foretaste of a
strategy that Chomsky has pursued throughout his career, always willing to formulate proposals in accurate detail in order to see where
the weaknesses lie, then reformulating, sometimes in radical fashion,
moving much snow on the mountainside; one thinks of the filters of
Chomsky and Lasnik (1977), the indexing conventions in the appendix
of Chomsky (1980), and the features of Chomsky (1995: ch.4), which
sought precision over elegance and biological plausibility and then
gave rise to major reformulations of linguistic theory. The immediate
goal of the new work was to formulate precise, explicit, "generative"
accounts, free of intuition-bound notions.
The fundamental aim in the linguistic analysis of a language L is to separate the grammatical sequences which are the sentences of L from the ungrammatical sequences which are not sentences of L. The grammar of L
will thus be a device that generates all of the grammatical sequences of L
and none of the ungrammatical ones. (p. 13) 1
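The notion of a grammar as a device that generates all and only the grammatical sequences can be made concrete with a toy example. The sketch below is an illustration only and is not taken from Syntactic Structures: the rewrite rules, the vocabulary, and the names RULES, expand and is_grammatical are hypothetical, and the grammar is deliberately non-recursive so that its sentences form a finite set that can be listed. A grammar of a natural language would need recursive rules and would generate an unbounded set.

```python
from itertools import product

# Hypothetical toy rewrite rules (an assumption for illustration).
# Any symbol that is not a key of RULES is treated as a terminal word.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"]],
    "N": [["man"], ["book"]],
    "V": [["took"], ["read"]],
}


def expand(symbol):
    """Return every terminal string (a tuple of words) derivable from symbol."""
    if symbol not in RULES:
        return [(symbol,)]
    results = []
    for rhs in RULES[symbol]:
        # Expand each symbol on the right-hand side, then combine alternatives.
        parts = [expand(s) for s in rhs]
        for combo in product(*parts):
            results.append(tuple(word for part in combo for word in part))
    return results


# The language generated by the toy grammar: all and only these strings.
SENTENCES = {" ".join(words) for words in expand("S")}


def is_grammatical(sequence):
    """True if the word sequence is generated by the toy grammar."""
    return sequence in SENTENCES


if __name__ == "__main__":
    for sentence in sorted(SENTENCES):
        print(sentence)
    print(is_grammatical("the man read the book"))
    print(is_grammatical("man the book read the"))
```

Because the grammar defines the set of sentences directly, the same rules serve both to enumerate the language and to separate grammatical from ungrammatical sequences.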
Lees and others were impressed with the outline of what they took to
be a truly scientific perspective, and these were days of much concern
about the nature of science. Lees viewed the book as
one of the first serious attempts on the part of a linguist to construct within
the tradition of theory-construction a comprehensive theory of language
which may be understood in the same sense that a chemical, biological
theory is ordinarily understood by experts in those fields. It is not a mere
reorganization of the data into a new kind of library catalog, nor another
speculative philosophy about the nature of Man and Language, but rather
a rigorous explication of our intuitions about language in terms of an overt
axiom system, the theorems derivable from it, explicit results which may
be compared with new data and other intuitions, all based plainly on an
overt theory of the internal structure of languages. (Lees 1957: 377-8)
Chomsky begins Syntactic Structures, then, by aiming to construct
a grammar that can be viewed as a device of some sort for producing the
sentences of the language under analysis. More generally, linguists must be
concerned with the problem of determining the fundamental underlying
properties of successful grammars. The ultimate outcome of these investigations should be a theory of linguistic structure in which the descriptive
devices utilized in particular grammars are presented and studied abstractly, with no specific reference to particular languages. One function of
this theory is to provide a general method for selecting a grammar for each
language, given a corpus of sentences of this language. (p. 11)
The issue of selecting a grammar in this formulation was one for analysts comparing theories, not for children. The celebrated discussion in
chapter 6 about the goals of linguistic theory, the distinction between
discovery, decision and evaluation procedures, is often cited as a discussion about what a child might be expected to do in the process of
acquiring his/her grammar. However, the text concerns the goals of an
analyst and combats the structuralist goal of seeking a discovery
method for grammars, whereby an analyst would follow the prescriptions of a manual, "mechanical procedures for the discovery of grammars" (p.55, n.6), and arrive at the correct description of some language. Chomsky argued, in contrast, that it was too ambitious to expect such a methodology and that the most realistic goal was to find
a way of comparing hypotheses for generating a particular corpus of
data. No talk of children but an effort to thwart the positivist notion
that one could discover a predefined path to scientific truth (cf. Popper
1959). "One may arrive at a grammar by intuition, guess-work, all
sorts of partial methodological hints, reliance on past experience, etc
... Our ultimate aim is to provide an objective, non-intuitive way to
evaluate a grammar once presented" (p.56).
In particular, there was no reason to expect a discovery method
whereby a successful phonetic analysis would permit a successful phonemic analysis, which would allow a good morphological analysis and
then a good syntactic analysis.
Once we have disclaimed any intention of finding a practical discovery
procedure for grammars, certain problems that have been the subject of
intense methodological controversy simply do not arise. Consider the
problem of interdependence of levels. (p.56)
If units are defined by taxonomic procedures, then they need to be
constructed on lower levels before higher-level units are constructed
out of those lower-level units. However, once the goals are restricted
to achieve an evaluation procedure, one may have independent levels
of representation without circularity of definitions. Indeed, Chomsky
argued that analysis at higher levels (of syntax) might influence lower
(e. g. morphological) levels of analysis, and therefore that work on
syntax could proceed even though there may be unresolved problems
of phonemic or morphological analysis (p.59), perhaps to the advantage of the phonemic analysis.
This was the major METHODOLOGICAL innovation and the claim to a
genuinely scientific approach was based on the rigor of the formal,
explicit, generative accounts and on the move away from seeking a
discovery procedure in favor of an evaluation procedure for rating
theories.
Any scientific theory is based on a finite number of observations, and it
seeks to relate the observed phenomena and to predict new phenomena by
constructing general laws in terms of hypothetical constructs such as (in
physics, for example) "mass" and "electron." Similarly, a grammar of English is based on a finite corpus of utterances (observations), and it will
contain certain grammatical rules (laws) stated in terms of the particular
phonemes, phrases, etc., of English (hypothetical constructs). (p.49)
The TECHNICAL innovation was to motivate different levels of analysis
and representation, which were related to each other formally by the
device of a "transformational rule." That involved various claims
about the nature of grammars, that their primitives were independently defined, not a product of more basic semantic, functional or
notional concepts (chapter 2), that they could not be formulated
through finite-state Markov processes (chapter 3), and that restricting
rule schemas to those of phrase structure grammars yielded clumsiness
and missed insights and elegance which would be facilitated by operations relating one level of analysis to another, the so-called transformations (chapters 4, 5 and 7).
Chapter 5 offered three arguments for extending the expressive
power of grammars beyond that of unadorned phrase structure grammars, one relating to conjunction reduction, another relating to active-passive relations. The showcase analysis, however, the part of Syntactic Structures that excited the author (LSLT, 30-31) and had the
greatest effect on readers, was the new treatment of English auxiliary
verbs (section 5.3). Chomsky proposed a simple Auxiliary Transformation, later dubbed "affix hopping," whereby an affix like -ing, -en or
an abstract tense marker could be moved to the immediate right of
an adjacent verb (29.ii). This ingenious transformation, mapping one
abstract level of representation into another (not sentences into other
sentences), avoided hopelessly complex phrase structure rules and
yielded an elegant account for the distribution of the "periphrastic
do," which could be characterized now as occurring with "stranded"
affixes, which had no adjacent verb to hop over (p.62). He observed
that "the grammar is materially simplified when we add a transformational level, since it is now necessary to provide phrase structure directly only for kernel sentences" (p.47).
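The effect of the Auxiliary Transformation and of the "stranded" affix can be sketched in a few lines of code. What follows is a simplified illustration under stated assumptions, not Chomsky's own formulation: the token inventories AFFIXES and VERBS, the function name affix_hop and the "verb+AFFIX" output notation are hypothetical, and the sketch supplies do for any stranded affix, whereas the original analysis is stated in terms of word boundaries and tense.

```python
# Hypothetical inventories for illustration; not the rule system of the book.
AFFIXES = {"PAST", "PRES", "EN", "ING"}
VERBS = {"have", "be", "will", "read"}


def affix_hop(tokens):
    """Attach each affix to the verb immediately to its right.

    An affix with no adjacent verb is 'stranded'; this sketch then
    supplies a supporting 'do' to carry it.
    """
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in AFFIXES:
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            if nxt in VERBS:
                # The affix hops over the adjacent verb: Af + v -> v + Af.
                out.append(f"{nxt}+{tok}")
                i += 2
                continue
            # Stranded affix: no adjacent verb to hop over.
            out.append(f"do+{tok}")
            i += 1
            continue
        out.append(tok)
        i += 1
    return out


if __name__ == "__main__":
    # Underlying string for "John had been reading".
    print(affix_hop(["John", "PAST", "have", "EN", "be", "ING", "read"]))
    # Negation separates the tense marker from the verb: "John did not read".
    print(affix_hop(["John", "PAST", "not", "read"]))
```

A single rule applied to an abstract string covers the whole auxiliary sequence, and periphrastic do falls out as the treatment of whatever affix the rule could not attach.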
Chapter 7, entitled Some transformations in English, extended transformational analysis to negative, interrogative and other sentence-types, yielding further simplifications over pure phrase structure grammars. The transformations themselves provided evidence for certain
constituents (p.83) and the same units recurred in a number of operations, suggesting that genuine generalizations were being captured.
The fundamental aspects of the analysis of auxiliaries have survived
extensive discussion of almost 50 years. In current formulations a
central parameter of grammatical variation lies in how verbs may be
connected to their tense markers, either as a result of a syntactic operation raising a verb to a higher functional category containing the tense
marker (as in French, cf. Emonds 1978) or what is now seen as a
morphological operation lowering the tense marker on to an adjacent
verb (as in modern English, cf. Lightfoot 1993, Baker 2002), Chomsky's (1957) Auxiliary Transformation; Lasnik (2000) offers detailed
discussion of this distinction and its relation to the proposals of Syntactic Structures.
Always concerned to formulate as precisely as possible, Chomsky
pointed out that the analysis entailed an ordering in the application
of transformational rules (p.44) and a distinction between optional
and obligatory rules (p.45). That yields precision but also problems, if
one views grammars as acquired by children exposed only to primary
data. If two rules must be ordered or if a rule needs to be classified as
obligatory, then somebody viewing grammars as aspects of mature
cognition would wonder how that could be triggered in a child. If the
two rules are misordered or if an obligatory rule applies optionally,
the grammar would generate non-occurring sentences or structures.
Those non-occurring sentences constitute the linguist's evidence for the
ordering or the obligatory character of rules, but that evidence is not
available to young children. If the sentences don't occur, they can't be
part of the primary data, not part of what a child experiences, and
we have a grammar which cannot be triggered and is "unlearnable".
However, this was not an issue in 1957, when precision was the overriding goal and matters of learnability had not yet been raised explicitly.
The last substantive chapter of Syntactic Structures deals with syntax and semantics, a relationship which has been widely misunderstood. Chomsky argued that grammars are autonomous and independent of meaning in the sense that their primitives are not defined in
semantic terms (p. 17). That "should not, however, blind us to the fact
that there are striking correspondences between the structures and elements that are discovered in formal, grammatical analysis and specific
semantic functions" (p.101). So the units of syntactic analysis, syntactic constituents, are almost identical to the units of semantic analysis:
the ambiguity of I saw the man with a telescope is reflected in two
syntactic structures, one where a man with a telescope is a constituent
and one where it is not. The work assumes a use-theory of meaning, that
grammars are embedded in a broader semiotic theory which USES the
grammar to determine the meaning and reference of expressions. There
are striking correspondences between syntactic and semantic properties and the study of "the structure of language as an instrument may
be expected to provide insight into the actual use of language" (p. 103);
to argue that syntactic primitives are not defined semantically is not
to deny connections between form and meaning (for discussion, see
LSLT, 18-23 and Lees 1957: 393-5).
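The correspondence between the two readings of the telescope sentence and two syntactic structures can be checked mechanically. The sketch below counts the analyses that a small phrase structure grammar assigns to a string, using a standard chart (CKY-style) method. It is an illustration under assumptions of my own: the rules in BINARY, the word classes in LEXICON and the function name count_parses are hypothetical and are not drawn from the book.

```python
from collections import defaultdict

# Hypothetical binary phrase structure rules: (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # the phrase with a telescope modifies the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # the phrase with a telescope is inside the noun phrase
    ("PP", "P", "NP"),
]

# Hypothetical lexicon assigning one category to each word.
LEXICON = {
    "I": "NP", "saw": "V", "the": "Det", "a": "Det",
    "man": "N", "telescope": "N", "with": "P",
}


def count_parses(words, root="S"):
    """Count the distinct structures the grammar assigns to the word list."""
    n = len(words)
    # chart[(start, end, category)] = number of analyses of words[start:end]
    chart = defaultdict(int)
    for i, word in enumerate(words):
        chart[(i, i + 1, LEXICON[word])] += 1
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in BINARY:
                    chart[(start, end, parent)] += (
                        chart[(start, mid, left)] * chart[(mid, end, right)]
                    )
    return chart[(0, n, root)]


if __name__ == "__main__":
    print(count_parses("I saw the man with a telescope".split()))
    print(count_parses("I saw the man".split()))
```

The ambiguous sentence receives two structures, one in which the man with a telescope forms a constituent and one in which it does not, while the shorter sentence receives exactly one.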
Syntactic Structures, of course, reflected the ambient intellectual culture of the mid-1950s in some ways. Chomsky offered operational definitions of well-formed sentences of a kind that a behaviorist psychologist could understand: they did not need to be "meaningful" or "significant" in any semantic sense (p. 15), not statistically frequent; they
could be read with normal intonation contours, recalled readily, and
learned quickly. He carried over the notion of KERNEL sentences from
his teacher Zellig Harris (1951), reformulating the notion as referring
to sentences which had undergone no optional, only obligatory transformations (p.45); LSLT (41-45) offers detailed discussion of the relation between Harris' transformations and Chomsky's early work. Indeed, one might argue that Syntactic Structures reflected existing practice in its silence on matters of cognition: there is reason to believe that
structuralists were concerned with matters of cognition and wanted
analyses which were psychologically plausible, but the concern was
implicit.
The methodological innovations have endured, and likewise many
of the technical proposals. Chomsky (1995) has revived the distinction
between singulary and generalized transformations, the former affecting single structures and the latter embedding clauses within other
structures. That distinction was abandoned after Syntactic Structures,
replaced in Chomsky (1965) by the principle of the cyclic application
of rules, affecting most deeply embedded domains first, and then
working up through less deeply embedded domains sequentially.
One can identify three phases in work on generative grammar. The
first phase, initiated by Syntactic Structures and continuing through
Aspects of the theory of syntax (Chomsky 1965), elaborated the expressive power of grammars to include different levels of representation
(Syntactic Structures) and a lexicon (the major technical innovation of
Chomsky 1965). The second phase, beginning in the 1960s and culminating in Government and Binding models, sought to constrain the
expressive power of derivations, so that operations became very general, along the lines of "Move something," and general principles of the
theory of grammar ("Universal Grammar" by the 1960s) constrained
such operations to apply appropriately. The third phase has sought
substantive economy principles beyond the methodological clippings
of Ockham's razor, under the Minimalist Program of Chomsky (1995).
The technical advances have been considerable (far too considerable
to outline in a brief introduction) and we have learned vast amounts
about the properties of individual languages, about the developmental
stages that children go through, about the kind of variation that grammatical parameters permit (Baker 2001 insightfully analogizes the
parametric approach to work on the periodic table of elements which
underlie all chemical substances).
These developments have taken place as linguists have taken seriously the cognitive perspective, viewing grammars as mental systems
which grow in children on exposure to primary data, subject to the
prescriptions of a genetically specified theory of grammars which permits only certain structures and options. And they have taken place
as linguists have taken their mathematical models seriously. A clear
example of this was an ad hoc principle which was required with the
introduction of a distinct Binding Theory (Chomsky 1981). The ungrammaticality of *They expected that each other would leave was ascribed to Principle A of the new Binding Theory, leaving no account
for the non-occurrence of *They were expected would leave, featuring a
displaced they; this had formerly been attributed to the same indexing
conventions which had excluded the sentence with the reciprocal each
other. The response was a handful of snow to block the gap, precisely
shaped and called the RES-NIC, the residue of the earlier Nominative
Island Constraint, which accounted for the mismoving they. This irregular snowball gave rise, in turn, to its own avalanche, the Empty Category Principle, a condition on the positions from which elements
might be displaced. That principle yielded many new discoveries about
a host of languages over the course of the 1980s (see, for example,
Rizzi 1990). It is not clear whether single ideas can be extrapolated
from general systems and be awarded a prize for being the most productive, but the ECP would be a candidate and, in any case, illustrates
the productiveness of taking the models seriously as predictive mechanisms.
What is remarkable about Syntactic Structures is how easily its
claims were translatable into claims about human cognition, as
Chomsky was to make explicit in his review of Skinner and then, famously, in the first chapter of Chomsky (1965). There he redefined the
field in more fundamental fashion and linked it to work on human
psychology; from then on, matters of acquisition became central to
linguistic theorizing. The easy translation is the reason that the little
book, having no discussion of matters of cognition, is nonetheless
plausibly seen as the snowball which started it all. The discussion
about the goals of linguistic theory, for example, was straightforwardly translated point-for-point into criteria for a theory of language
acquisition by children: the theory provides an evaluation metric by
which children rate the success of candidate grammars for the
purposes of understanding some finite corpus of data embodied in
their initial linguistic experiences, converging ultimately on the most
successful grammar.
Before he wrote the introduction to LSLT, Chomsky had come to
view grammars as representations of fundamental aspects of the
knowledge possessed by a speaker-hearer, i.e. as claims about psychology (LSLT, 5). Furthermore, there was a precise analog between the
methodological claims of Syntactic Structures and LSLT and psychological claims about human cognition.
The construction of a grammar of a language by a linguist is in some
respects analogous to the acquisition of language by a child. The linguist
has a corpus of data; the child is presented with unanalyzed data of language use. (LSLT, 11)
The language learner (analogously, the linguist) approaches the problem
of language acquisition (grammar construction) with a schematism that
determines in advance the general properties of human language and the
general properties of the grammars that may be constructed to account for
linguistic phenomena. (LSLT, 12)
We thus have two variants of the fundamental problem of linguistics, as
it was conceived in this work: under the methodological interpretation, the
problem is taken to be the justification of grammars; under the psychological interpretation, the problem is to account for language acquisition.
Under the methodological interpretation, the selected grammar is the linguist's grammar, justified by the theory. Under the psychological interpretation, it is the speaker-hearer's grammar, chosen by the evaluation procedure from among the potential grammars permitted by the theory and
compatible with the data as represented in terms of the preliminary analysis. (LSLT, 36)
The "psychological analog" to the methodological problem of constructing linguistic theory was not discussed in LSLT, but Chomsky
wrote that it lay in the immediate background of his own thinking:
"To raise this issue seemed to me, at the time, too audacious" (LSLT,
35). So, the "realist" position was taken for granted but not discussed,
because too audacious. Hence the ease of taking the general theory as
a psychological theory that attempts to characterize the innate human
"language faculty." 2
There is plenty of internal evidence that Chomsky was interacting
at this time with Eric Lenneberg, who was to pursue the biological
perspective on language earliest and furthest (Lenneberg 1967). However, the only explicit discussion of the cognitive/biological perspective
in our central texts was the final section of the review, where Lees
raised these issues in a kind of mystified optimism.
Perhaps the most baffling and certainly in the long run by far the most
interesting of Chomsky's theories will be found in their cohesions with the
field of human psychology. Being totally incompetent in this area, I shall
allude to only one possible consideration, but one which I find extremely
intriguing. If this theory of grammar which we have been discussing can
be validated without fundamental changes, then the mechanism which we
must attribute to human beings to account for their speech behavior has
all the characteristics of a sophisticated scientific theory. (Lees 1957: 406)
... If we are to account adequately for the indubitable fact that a child by
the age of five or six has somehow reconstructed for himself the theory of
his language, it would seem that our notions of human learning are due
for some considerable sophistication. (Lees 1957: 408)
And considerable sophistication there has been. Most subsequent work
on language learning has followed Syntactic Structures in seeking theories which evaluate grammars relative to a corpus of utterances. That is true of Chomsky (1965), Chomsky and Halle (1968), and of more
modern work. Robin Clark's Fitness Metric measures precisely the fitness of grammars with respect to a set of sentences (1).
1. Fitness Metric (Clark 1992)
$$
\frac{\left(\sum_{j=1}^{n} v_j + b \sum_{j=1}^{n} s_j + c \sum_{j=1}^{n} e_j\right) - \left(v_i + b s_i + c e_i\right)}{(n-1)\left(\sum_{j=1}^{n} v_j + b \sum_{j=1}^{n} s_j + c \sum_{j=1}^{n} e_j\right)}
$$
where
v_i = the number of violations signaled by the parser associated with a given
parameter setting;
s_i = the number of superset settings in the counter; b is a constant superset
penalty > 1;
e_i = the measure of elegance (= number of nodes) of counter i; c < 1 is a
scaling factor
The central idea here is that certain grammars provide a means to
understand certain sentences and not others; that is, they generate certain sentences but not others. The Fitness Metric quantifies the failure
of grammars to parse sentences, the "violations", v. The sum term,
sigma, totals all the violations of all grammars under consideration,
perhaps five grammars with a total of 50 failures or violations. We
then subtract the violations of any single grammar and divide by the
total violations (multiplied by n - 1). This provides a number which
grades candidate grammars. For example, if one candidate has 10 violations, its score is 50 - 10, divided by some number; if another candidate has 20 violations, its score is 50 - 20, divided by that number, a
lower score. (There are two other factors involved in the equation, a
superset penalty s and a measure of elegance e, but they are subject to
a scaling condition and play a lesser role, which I ignore here.) I have
sketched Clark's Fitness Metric because it is the most sophisticated
and precisely worked out evaluation measure that I know. What it and
other such evaluation measures do is rate grammars against a set of
data, as outlined in 1957.
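The arithmetic behind the Fitness Metric can be made concrete with a short sketch. The Python function below evaluates (1) for a set of candidate grammars. The function name, the argument names, and the default values chosen for b and c are illustrative assumptions rather than part of Clark's presentation, and the superset and elegance counts are set to zero to match the simplified discussion above.

```python
from typing import List, Sequence


def fitness_scores(
    violations: Sequence[float],
    supersets: Sequence[float],
    elegance: Sequence[float],
    b: float = 2.0,  # assumed superset penalty (the text only requires b > 1)
    c: float = 0.5,  # assumed scaling factor (the text only requires c < 1)
) -> List[float]:
    """Score each candidate grammar i according to formula (1)."""
    n = len(violations)
    if n < 2 or not (len(supersets) == len(elegance) == n):
        raise ValueError("need at least two candidates and equal-length inputs")
    # Weighted total over all candidates: sum(v) + b*sum(s) + c*sum(e).
    total = sum(violations) + b * sum(supersets) + c * sum(elegance)
    if total == 0:
        raise ValueError("metric is undefined when the weighted total is zero")
    scores = []
    for v, s, e in zip(violations, supersets, elegance):
        own = v + b * s + c * e  # this candidate's own weighted count
        scores.append((total - own) / ((n - 1) * total))
    return scores


# Five candidate grammars with 50 violations in total, as in the text.
v = [10, 20, 5, 8, 7]
zeros = [0] * len(v)
scores = fitness_scores(v, zeros, zeros)
print(scores[0])  # (50 - 10) / (4 * 50)
print(scores[1])  # (50 - 20) / (4 * 50), a lower score
```

Because each candidate's own count is subtracted from the total, fewer violations yield a higher score, and the scores of all candidates sum to 1.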
Although most work on learnability has followed the goals laid out
for linguistic theories in Syntactic Structures, future work may permit
a different approach to acquisition, providing a kind of discovery procedure: children may seek certain elements of grammatical structure,
cues. These cues would be prescribed at the level of UG in a kind of
menu from which children select. Some cues represent parameters of
variation (i.e. occurring in some grammars but not others) and all
cues would be found in the mental representations which result from
parsing phrases and sentences that children are exposed to (Dresher
1999, Fodor 1998, Lightfoot 1999). Some of those phrases and sentences would "express" cues, requiring the cued structure for understanding. Under this view, children acquire a mature grammar by accumulating cued structures and children (and linguists) have a kind of discovery procedure.
Nearly fifty years ago Chomsky argued for explicit rigor, for various
levels of representation provided by a theory of grammar, and for
seeking a precise evaluation metric to compare grammars. After almost fifty years of enrichment, we can revisit this matter and many
others in the light of subsequent work developing theories of grammar
and spelling out the details of Universal Grammar, seen now as defining the language faculty. The ongoing investigation of this part of
human cognition has been immensely productive; we have witnessed
the "considerable sophistication" that Lees predicted. We have learned
a vast amount about many languages and we have learned different
kinds of things: generativists adopting the biological, cognitive perspective have focused on where generalizations break down, because
that reveals poverty-of-stimulus problems which illustrate information
conveyed by the linguistic genotype, UG (see The Linguistic Review
19.1, 2002, for extensive discussion of the role of poverty-of-stimulus
arguments). The avalanche has also given rise to new kinds of studies
of language acquisition by young children, new experimental techniques, explorations of language use in various domains, new computer models, new approaches to language history, all shaped by the
core notion of a language organ. It has also spawned new approaches
to old philosophical questions, notions of meaning and reference, and
Chomsky has taken the lead in this area (for example, Chomsky 2000).
The cognitive analysis of language was re-energized by this remarkable
snowball and it will continue for much longer, because much remains
to be done now that the perspective has been widened.
Georgetown, July 2002
DAVID W. LIGHTFOOT
Notes
* I am grateful to colleagues who have commented on an earlier draft of this introduction: Noam Chomsky, Morris Halle, Norbert Hornstein, and Neil Smith. They are all
old enough to remember 1957, appearances to the contrary.
1 Bear in mind that these were notes for undergraduate classes. Students were not familiar with Markov processes and needed to be introduced to string generating mechanisms. This material does not feature in LSLT, presumably because it was remote
from empirical issues. The idea was to show that much richer systems were inadequate
for natural language and that the goal should be the strong generation of structures,
there being no grammatical-ungrammatical distinction, as elaborated in LSLT (Chapter 5).
2 Linguists who restrict themselves to mathematical descriptions might be said, generously, to be following the example of Syntactic Structures, hoping that their analyses
will be as automatically translatable into psychological claims as the ideas of 1957.
However, they deprive themselves of the wealth of empirical demands and guidance
which come specifically from the psychological dimension, matters of learnability, etc.
It is worth noting in the shade of a footnote that linguists pursuing the cognitive
perspective postulate genetic information needed to solve acquisition problems. By
discovering more and more about the genetic information required, we hope one day
to learn something about the appropriate form of that information, a different and
more ambitious matter. Hypotheses abound but nobody is under any illusion that
current ideas come anywhere near what will be needed to unify our hypotheses with
what is known about genetic structure (if that is ever possible). It may meanwhile be
correct to say that certain information is required, but that just means that at this
stage of understanding we are doing Mendelian genetics.
References
Anderson, S. R. and D. W. Lightfoot
2002
The language organ: Linguistics as cognitive physiology. Cambridge:
Cambridge University Press.
Baker, M. A.
2001
The atoms of language. New York: Basic Books.
2002
Building and merging, not checking: The nonexistence of (Aux)-S-V-O
languages. Linguistic Inquiry 33.2: 321-8.
Chomsky, N.
1959
Review of B. F. Skinner Verbal Behavior. Language 35: 26-57.
1962
A transformational approach to syntax. In A. Hill, ed., Proceedings of
the Third Texas Conference on Problems of Linguistic Analysis in English.
Austin: University of Texas Press.
1965
Aspects of the theory of syntax. Cambridge, MA: MIT Press.
1975
The logical structure of linguistic theory. New York: Plenum.
1980
On binding. Linguistic Inquiry 11.1: 1-46.
1981
Lectures on government and binding. Dordrecht: Foris.
1995
The Minimalist Program. Cambridge, MA: MIT Press.
2000
New horizons in the study of language and mind. Cambridge: Cambridge
University Press.
Chomsky, N. and M. Halle
1968
The sound pattern of English. New York: Harper and Row.
Chomsky, N. and H. Lasnik
1977
Filters and control. Linguistic Inquiry 8.3: 425-504.
Clark, R.
1992
The selection of syntactic knowledge. Language Acquisition 2: 83-149.
Dresher, B. E.
1999
Charting the learning path: Cues to parameter setting. Linguistic Inquiry
30.1: 27-67.
Emonds, J.
1978
The verbal complex V'-V in French. Linguistic Inquiry 9: 151-75.
Fodor, J. D.
1998
Unambiguous triggers. Linguistic Inquiry 29.1: 1-36.
Harris, Z. S.
1951
Methods in structural linguistics. Chicago: University of Chicago Press.
Jerne, N. K.
1967
Antibodies and learning: Selection versus instruction. In G. C. Quarton,
T. Melnechuk and F. O. Schmitt, eds., The neurosciences: A study program.
New York: Rockefeller University Press.
1985
The generative grammar of the immune system. Science 229: 1057-1059.
Klima, E. S.
1964
Negation in English. In J. Fodor and J. Katz, eds., The structure of
language. Englewood Cliffs, NJ: Prentice Hall.
Lasnik, H.
2000
Syntactic Structures revisited: Contemporary lectures on classic transformational theory. Cambridge, MA: MIT Press.
Lees, R.
1957
Review of Syntactic Structures. Language 33.3: 375-408.
1960
The grammar of English nominalizations. Bloomington: Indiana University Press.
Lees, R. and E. S. Klima
1963
Rules for English pronominalization. Language 39: 17-28.
Lenneberg, E. H.
1967
The biological foundations of language. New York: John Wiley.
Lightfoot, D. W.
1993
Why UG needs a learning theory: triggering verb movement. In C.
Jones, ed., Historical linguistics: Problems and perspectives. London:
Longman [reprinted in I. Roberts and A. Battye, eds., Clause structure
and language change. Oxford: Oxford University Press].
1999
The development of language: Acquisition, change and evolution. Oxford:
Blackwell.
Popper, K. R.
1959
The logic of scientific discovery. London: Hutchinson.
Rizzi, L.
1990
Relativized minimality. Cambridge, MA: MIT Press.
PREFACE
This study deals with syntactic structure both in the broad sense
(as opposed to semantics) and the narrow sense (as opposed to
phonemics and morphology). It forms part of an attempt to construct a formalized general theory of linguistic structure and to
explore the foundations of such a theory. The search for rigorous
formulation in linguistics has a much more serious motivation than
mere concern for logical niceties or the desire to purify well-established methods of linguistic analysis. Precisely constructed models
for linguistic structure can play an important role, both negative
and positive, in the process of discovery itself. By pushing a precise
but inadequate formulation to an unacceptable conclusion, we can
often expose the exact source of this inadequacy and, consequently,
gain a deeper understanding of the linguistic data. More positively,
a formalized theory may automatically provide solutions for many
problems other than those for which it was explicitly designed.
Obscure and intuition-bound notions can neither lead to absurd
conclusions nor provide new and correct ones, and hence they fail
to be useful in two important respects. I think that some of those
linguists who have questioned the value of precise and technical
development of linguistic theory may have failed to recognize the
productive potential in the method of rigorously stating a proposed
theory and applying it strictly to linguistic material with no attempt
to avoid unacceptable conclusions by ad hoc adjustments or loose
formulation. The results reported below were obtained by a
conscious attempt to follow this course systematically. Since this
fact may be obscured by the informality of the presentation, it is
important to emphasize it here.
Specifically, we shall investigate three models for linguistic
structure and seek to determine their limitations. We shall find that
a certain very simple communication theoretic model of language
and a more powerful model that incorporates a large part of what
is now generally known as "immediate constituent analysis" cannot
properly serve the purposes of grammatical description. The investigation and application of these models brings to light certain
facts about linguistic structure and exposes several gaps in linguistic
theory; in particular, a failure to account for such relations between
sentences as the active-passive relation. We develop a third,
transformational model for linguistic structure which is more powerful than the immediate constituent model in certain important
respects and which does account for such relations in a natural way.
When we formulate the theory of transformations carefully and
apply it freely to English, we find that it provides a good deal of
insight into a wide range of phenomena beyond those for which it
was specifically designed. In short, we find that formalization can,
in fact, perform both the negative and the positive service commented on above.
During the entire period of this research I have had the benefit of
very frequent and lengthy discussions with Zellig S. Harris. So
many of his ideas and suggestions are incorporated in the text
below and in the research on which it is based that I will make no
attempt to indicate them by special reference. Harris' work on
transformational structure, which proceeds from a somewhat
different point of view from that taken below, is developed in
items 15, 16, and 19 of the bibliography (p. 115). In less obvious
ways, perhaps, the course of this research has been influenced
strongly by the work of Nelson Goodman and W. V. Quine. I have
discussed most of this material at length with Morris Halle, and
have benefited very greatly from his comments and suggestions.
Eric Lenneberg, Israel Scheffler, and Yehoshua Bar-Hillel have read
earlier versions of this manuscript and have made many valuable
criticisms and suggestions on presentation and content.
The work on the theory of transformations and the transformational structure of English which, though only briefly sketched
below, serves as the basis for much of the discussion, was largely
carried out in 1951-55 while I was a Junior Fellow of the Society of
Fellows, Harvard University. I would like to express my gratitude
to the Society of Fellows for having provided me with the freedom
to carry on this research.
This work was supported in part by the U.S.A. Army (Signal
Corps), the Air Force (Office of Scientific Research, Air Research
and Development Command), and the Navy (Office of Naval
Research); and in part by the National Science Foundation and
the Eastman Kodak Corporation.
Massachusetts Institute of Technology,
Department of Modern Languages and
Research Laboratory of Electronics,
Cambridge, Mass.
August 1, 1956.
NOAM CHOMSKY
TABLE OF CONTENTS
Introduction to Second Edition by David W. Lightfoot
Preface
1. Introduction
2. The Independence of Grammar
3. An Elementary Linguistic Theory
4. Phrase Structure
5. Limitations of Phrase Structure Description
6. On the Goals of Linguistic Theory
7. Some Transformations in English
8. The Explanatory Power of Linguistic Theory
9. Syntax and Semantics
10. Summary
11. Appendix I: Notations and Terminology
12. Appendix II: Examples of English Phrase Structure and Transformational Rules
Bibliography
1
INTRODUCTION
Syntax is the study of the principles and processes by which sentences are constructed in particular languages. Syntactic investigation
of a given language has as its goal the construction of a grammar
that can be viewed as a device of some sort for producing the
sentences of the language under analysis. More generally, linguists
must be concerned with the problem of determining the fundamental underlying properties of successful grammars. The ultimate
outcome of these investigations should be a theory of linguistic
structure in which the descriptive devices utilized in particular
grammars are presented and studied abstractly, with no specific
reference to particular languages. One function of this theory is to
provide a general method for selecting a grammar for each language,
given a corpus of sentences of this language.
The central notion in linguistic theory is that of "linguistic level."
A linguistic level, such as phonemics, morphology, phrase structure,
is essentially a set of descriptive devices that are made available for
the construction of grammars; it constitutes a certain method for
representing utterances. We can determine the adequacy of a
linguistic theory by developing rigorously and precisely the form of
grammar corresponding to the set of levels contained within this
theory, and then investigating the possibility of constructing simple
and revealing grammars of this form for natural languages. We
shall study several different conceptions of linguistic structure in
this manner, considering a succession of linguistic levels of increasing complexity which correspond to more and more powerful
modes of grammatical description; and we shall attempt to show
that linguistic theory must contain at least these levels if it is to
provide, in particular, a satisfactory grammar of English. Finally,
we shall suggest that this purely formal investigation of the structure
of language has certain interesting implications for semantic
studies.1
1 The motivation for the particular orientation of the research reported here
is discussed below in § 6.
2
THE INDEPENDENCE OF GRAMMAR
2.1 From now on I will consider a language to be a set (finite or
infinite) of sentences, each finite in length and constructed out of a
finite set of elements. All natural languages in their spoken or written
form are languages in this sense, since each natural language has a
finite number of phonemes (or letters in its alphabet) and each
sentence is representable as a finite sequence of these phonemes (or
letters), though there are infinitely many sentences. Similarly, the
set of 'sentences' of some formalized system of mathematics can be
considered a language. The fundamental aim in the linguistic analysis of a language L is to separate the grammatical sequences
which are the sentences of L from the ungrammatical sequences
which are not sentences of L and to study the structure of the
grammatical sequences. The grammar of L will thus be a device
that generates all of the grammatical sequences of L and none of the
ungrammatical ones. One way to test the adequacy of a grammar
proposed for L is to determine whether or not the sequences that it
generates are actually grammatical, i.e., acceptable to a native
speaker, etc. We can take certain steps towards providing a behavioral criterion for grammaticalness so that this test of adequacy can
be carried out. For the purposes of this discussion, however,
suppose that we assume intuitive knowledge of the grammatical
sentences of English and ask what sort of grammar will be able to
do the job of producing these in some effective and illuminating
way. We thus face a familiar task of explication of some intuitive
concept — in this case, the concept "grammatical in English," and
more generally, the concept "grammatical."
Notice that in order to set the aims of grammar significantly it is
sufficient to assume a partial knowledge of sentences and non-sentences. That is, we may assume for this discussion that certain
sequences of phonemes are definitely sentences, and that certain
other sequences are definitely non-sentences. In many intermediate
cases we shall be prepared to let the grammar itself decide, when the
grammar is set up in the simplest way so that it includes the clear
sentences and excludes the clear non-sentences. This is a familiar
feature of explication.1 A certain number of clear cases, then, will
provide us with a criterion of adequacy for any particular grammar.
For a single language, taken in isolation, this provides only a weak
test of adequacy, since many different grammars may handle the
clear cases properly. This can be generalized to a very strong condition, however, if we insist that the clear cases be handled properly
for each language by grammars all of which are constructed by the
same method. That is, each grammar is related to the corpus of
sentences in the language it describes in a way fixed in advance for
all grammars by a given linguistic theory. We then have a very
strong test of adequacy for a linguistic theory that attempts to give a
general explanation for the notion "grammatical sentence" in terms
of "observed sentence," and for the set of grammars constructed in
accordance with such a theory. It is furthermore a reasonable
requirement, since we are interested not only in particular languages,
but also in the general nature of Language. There is a great deal
more that can be said about this crucial topic, but this would take
us too far afield. Cf. § 6.
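The test of adequacy described in this section can be sketched in a few lines. In the Python below, a grammar is modeled simply as a predicate over word sequences. The names `handles_clear_cases` and `toy_grammar`, the tiny vocabulary, and the example sequences are all hypothetical illustrations and are not proposed anywhere in the text.

```python
from typing import Callable, Iterable, Tuple

Sentence = Tuple[str, ...]
Grammar = Callable[[Sentence], bool]  # True if the grammar generates the sequence


def handles_clear_cases(
    grammar: Grammar,
    clear_sentences: Iterable[Sentence],
    clear_non_sentences: Iterable[Sentence],
) -> bool:
    """Pass only if every clear sentence is generated and no clear non-sentence is."""
    return all(grammar(s) for s in clear_sentences) and not any(
        grammar(s) for s in clear_non_sentences
    )


def toy_grammar(sentence: Sentence) -> bool:
    """Hypothetical grammar: 'the' + noun + verb, with number agreement."""
    pairs = {("man", "comes"), ("men", "come")}
    return len(sentence) == 3 and sentence[0] == "the" and sentence[1:] in pairs


clear_yes = [("the", "man", "comes"), ("the", "men", "come")]
clear_no = [("the", "man", "come"), ("comes", "man", "the")]

print(handles_clear_cases(toy_grammar, clear_yes, clear_no))

# Intermediate cases are not stipulated in advance; the grammar itself decides them.
print(toy_grammar(("the", "men", "comes")))
```

As the text notes, many different predicates would pass this check for a single language, which is why the test becomes strong only when every grammar must be constructed by the same method.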
2.2 On what basis do we actually go about separating grammatical
sequences from ungrammatical sequences? I shall not attempt to
1 Cf., for example, N. Goodman, The structure of appearance (Cambridge,
1951), pp. 5-6. Notice that to meet the aims of grammar, given a linguistic
theory, it is sufficient to have a partial knowledge of the sentences (i.e., a
corpus) of the language, since a linguistic theory will state the relation
between the set of observed sentences and the set of grammatical sentences,
i.e., it will define "grammatical sentence" in terms of "observed sentence,"
certain properties of the observed sentences, and certain properties of grammars.
To use Quine's formulation, a linguistic theory will give a general explanation
for what 'could' be in language on the basis of "what is plus simplicity of the
laws whereby we describe and extrapolate what is". (W. V. Quine, From a
logical point of view [Cambridge, 1953], p. 54). Cf. § 6.1.
give a complete answer to this question here (cf. §§ 6, 7), but I would
like to point out that several answers that immediately suggest
themselves could not be correct. First, it is obvious that the set of
grammatical sentences cannot be identified with any particular
corpus of utterances obtained by the linguist in his field work. Any
grammar of a language will project the finite and somewhat accidental corpus of observed utterances to a set (presumably infinite)
of grammatical utterances. In this respect, a grammar mirrors the
behavior of the speaker who, on the basis of a finite and accidental
experience with language, can produce or understand an indefinite
number of new sentences. Indeed, any explication of the notion
"grammatical in L" (i.e., any characterization of "grammatical in
L" in terms of "observed utterance of L") can be thought of as offering an explanation for this fundamental aspect of linguistic behavior.
2.3 Second, the notion "grammatical" cannot be identified with
"meaningful" or "significant" in any semantic sense. Sentences (1)
and (2) are equally nonsensical, but any speaker of English will
recognize that only the former is grammatical.
(1) Colorless green ideas sleep furiously.
(2) Furiously sleep ideas green colorless.
Similarly, there is no semantic reason to prefer (3) to (5) or (4)
to (6), but only (3) and (4) are grammatical sentences of English.
(3) have you a book on modern music?
(4) the book seems interesting.
(5) read you a book on modern music?
(6) the child seems sleeping.
Such examples suggest that any search for a semantically based
definition of "grammaticalness" will be futile. We shall see, in fact,
in § 7, that there are deep structural reasons for distinguishing (3)
and (4) from (5) and (6); but before we are able to find an explanation for such facts as these we shall have to carry the theory of
syntactic structure a good deal beyond its familiar limits.
2.4 Third, the notion "grammatical in English" cannot be identified in any way with the notion "high order of statistical approximation to English." It is fair to assume that neither sentence (1) nor
(2) (nor indeed any part of these sentences) has ever occurred in an
English discourse. Hence, in any statistical model for grammaticalness, these sentences will be ruled out on identical grounds as
equally 'remote' from English. Yet (1), though nonsensical, is
grammatical, while (2) is not. Presented with these sentences, a
speaker of English will read (1) with a normal sentence intonation,
but he will read (2) with a falling intonation on each word; in fact,
with just the intonation pattern given to any sequence of unrelated
words. He treats each word in (2) as a separate phrase. Similarly,
he will be able to recall (1) much more easily than (2), to learn it
much more quickly, etc. Yet he may never have heard or seen any
pair of words from these sentences joined in actual discourse. To
choose another example, in the context "I saw a fragile—," the
words "whale" and "of" may have equal (i.e., zero) frequency in the
past linguistic experience of a speaker who will immediately recognize that one of these substitutions, but not the other, gives a grammatical sentence. We cannot, of course, appeal to the fact that sentences such as (1) 'might' be uttered in some sufficiently far-fetched
context, while (2) would never be, since the basis for this differentiation
between (1) and (2) is precisely what we are interested in determining.
Evidently, one's ability to produce and recognize grammatical
utterances is not based on notions of statistical approximation and
the like. The custom of calling grammatical sentences those that
"can occur", or those that are "possible", has been responsible for
some confusion here. It is natural to understand "possible" as
meaning "highly probable" and to assume that the linguist's sharp
distinction between grammatical and ungrammatical2 is motivated
by a feeling that since the 'reality' of language is too complex to be
described completely, he must content himself with a schematized
2 Below we shall suggest that this sharp distinction may be modified in favor
of a notion of levels of grammaticalness. But this has no bearing on the point
at issue here. Thus (1) and (2) will be at different levels of grammaticalness even
if (1) is assigned a lower degree of grammaticalness than, say, (3) and (4); but
they will be at the same level of statistical remoteness from English. The same is
true of an indefinite number of similar pairs.
version replacing "zero probability, and all extremely low probabilities, by impossible, and all higher probabilities by possible."3 We
see, however, that this idea is quite incorrect, and that a structural
analysis cannot be understood as a schematic summary developed
by sharpening the blurred edges in the full statistical picture. If we
rank the sequences of a given length in order of statistical approximation to English, we will find both grammatical and ungrammatical sequences scattered throughout the list; there appears to be no
particular relation between order of approximation and grammaticalness. Despite the undeniable interest and importance of semantic
and statistical studies of language, they appear to have no direct
relevance to the problem of determining or characterizing the set of
grammatical utterances. I think that we are forced to conclude that
grammar is autonomous and independent of meaning, and that
probabilistic models give no particular insight into some of the
basic problems of syntactic structure.4
3 C. F. Hockett, A manual of phonology (Baltimore, 1955), p. 10.
4 We return to the question of the relation between semantics and syntax in
§§ 8, 9, where we argue that this relation can only be studied after the syntactic
structure has been determined on independent grounds. I think that much the
same thing is true of the relation between syntactic and statistical studies of
language. Given the grammar of a language, one can study the use of the
language statistically in various ways; and the development of probabilistic
models for the use of language (as distinct from the syntactic structure of
language) can be quite rewarding. Cf. B. Mandelbrot, "Structure formelle des
textes et communication: deux études," Word 10.1-27 (1954); H. A. Simon,
"On a class of skew distribution functions," Biometrika 42.425-40 (1955).
One might seek to develop a more elaborate relation between statistical and
syntactic structure than the simple order of approximation model we have
rejected. I would certainly not care to argue that any such relation is unthinkable, but I know of no suggestion to this effect that does not have obvious flaws.
Notice, in particular, that for any n, we can find a string whose first n words may
occur as the beginning of a grammatical sentence S1 and whose last n words may
occur as the ending of some grammatical sentence S2, but where S1 must be
distinct from S2. For example, consider the sequences of the form "the man
who ... are here," where ... may be a verb phrase of arbitrary length. Notice
also that we can have new but perfectly grammatical sequences of word classes,
e.g., a sequence of adjectives longer than any ever before produced in the
context "I saw a — house." Various attempts to explain the grammatical-ungrammatical distinction, as in the case of (1), (2), on the basis of frequency of
sentence type, order of approximation of word class sequences, etc., will run
afoul of numerous facts like these.
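The point about statistical approximation in § 2.4 can be illustrated with a crude first-order (bigram) count. The sketch below rests on stated assumptions: the three-sentence corpus is invented, and counting adjacent word pairs is only one simple stand-in for "order of statistical approximation", not a model discussed in the text. The function names are likewise hypothetical.

```python
from collections import Counter
from typing import List


def bigram_counts(corpus: List[str]) -> Counter:
    """Count adjacent word pairs across a list of sentences."""
    counts: Counter = Counter()
    for sentence in corpus:
        words = sentence.lower().split()
        counts.update(zip(words, words[1:]))
    return counts


def attested_bigrams(sentence: str, counts: Counter) -> int:
    """Number of adjacent word pairs in `sentence` that occur in the corpus."""
    words = sentence.lower().split()
    return sum(1 for pair in zip(words, words[1:]) if counts[pair] > 0)


# Invented miniature corpus standing in for a speaker's past experience.
corpus = [
    "the book seems interesting",
    "have you a book on modern music",
    "the child sleeps",
]
counts = bigram_counts(corpus)

s1 = "colorless green ideas sleep furiously"
s2 = "furiously sleep ideas green colorless"

# Neither sentence shares any word pair with the corpus, so a purely
# statistical criterion of this kind cannot tell (1) from (2).
print(attested_bigrams(s1, counts), attested_bigrams(s2, counts))
```

Both sentences come out equally remote from the corpus, although only the first is grammatical. That is the situation the argument turns on.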
3
AN ELEMENTARY LINGUISTIC THEORY
3.1 Assuming the set of grammatical sentences of English to be
given, we now ask what sort of device can produce this set (equivalently, what sort of theory gives an adequate account of the
structure of this set of utterances). We can think of each sentence
of this set as a sequence of phonemes of finite length. A language is
an enormously involved system, and it is quite obvious that any
attempt to present directly the set of grammatical phoneme sequences would lead to a grammar so complex that it would be practically
useless. For this reason (among others), linguistic description
proceeds in terms of a system of "levels of representations."
Instead of stating the phonemic structure of sentences directly, the
linguist sets up such 'higher level' elements as morphemes, and
states separately the morphemic structure of sentences and the
phonemic structure of morphemes. It can easily be seen that the
joint description of these two levels will be much simpler than a
direct description of the phonemic structure of sentences.
Let us now consider various ways of describing the morphemic
structure of sentences. We ask what sort of grammar is necessary to
generate all the sequences of morphemes (or words) that constitute
grammatical English sentences, and only these.
One requirement that a grammar must certainly meet is that it be
finite. Hence the grammar cannot simply be a list of all morpheme
(or word) sequences, since there are infinitely many of these. A
familiar communication theoretic model for language suggests a
way out of this difficulty. Suppose that we have a machine that can
be in any one of a finite number of different internal states, and
suppose that this machine switches from one state to another by
producing a certain symbol (let us say, an English word). One of
these states is an initial state; another is a final state. Suppose that
the machine begins in the initial state, runs through a sequence of
states (producing a word with each transition), and ends in the final
state. Then we call the sequence of words that has been produced a
"sentence". Each such machine thus defines a certain language;
namely, the set of sentences that can be produced in this way. Any
language that can be produced by a machine of this sort we call a
finite state language; and we can call the machine itself a finite state
grammar. A finite state grammar can be represented graphically in
the form of a "state diagram".1 For example, the grammar that
produces just the two sentences "the man comes" and "the men
come" can be represented by the following state diagram:
We can extend this grammar to produce an infinite number of sentences by adding closed loops. Thus the finite grammar of the
subpart of English containing the above sentences in addition to
"the old man comes", "the old old man comes", ..., "the old men
come", "the old old men come", ..., can be represented by the
following state diagram:
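A finite state grammar of this kind can be sketched as a transition table. The Python below is a minimal illustration under assumed conventions: the state names (`q0`, `q1`, and so on) and the dictionary encoding are hypothetical, chosen to match the sentences listed above rather than taken from the original state diagrams.

```python
from typing import Dict, List, Tuple

# Transitions: state -> list of (word produced, next state).
# State names are arbitrary labels; "q0" is initial and "q_final" is final.
TRANSITIONS: Dict[str, List[Tuple[str, str]]] = {
    "q0": [("the", "q1")],
    "q1": [("old", "q1"),  # closed loop: any number of "old"
           ("man", "q_sg"), ("men", "q_pl")],
    "q_sg": [("comes", "q_final")],
    "q_pl": [("come", "q_final")],
    "q_final": [],
}


def accepts(words: List[str], start: str = "q0", final: str = "q_final") -> bool:
    """Trace a path through the states, consuming one word per transition."""
    state = start
    for word in words:
        for produced, nxt in TRANSITIONS[state]:
            if produced == word:
                state = nxt
                break
        else:
            return False  # no transition from this state produces the word
    return state == final


def generate(max_words: int) -> List[str]:
    """Enumerate every sentence of at most `max_words` words by following all paths."""
    results = []
    stack: List[Tuple[str, List[str]]] = [("q0", [])]
    while stack:
        state, words = stack.pop()
        if state == "q_final":
            results.append(" ".join(words))
            continue
        if len(words) == max_words:
            continue  # length bound reached before a final state
        for produced, nxt in TRANSITIONS[state]:
            stack.append((nxt, words + [produced]))
    return sorted(results)


print(generate(4))
print(accepts("the old old men come".split()))
print(accepts("the man come".split()))
```

The table itself is finite, but the loop on `q1` means that raising `max_words` keeps yielding new sentences, which is the sense in which closed loops extend the grammar to an infinite language.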
1 C. E. Shannon and W. Weaver, The mathematical theory of communication
(Urbana, 1949), pp. 15f.
Given a state diagram, we produce a sentence by tracing a path from