Generalized Phrase Structure Grammar

EJB So GPSG was undoubtedly a reaction to prevailing syntactic theory, but going back to what we were saying earlier it is not clear to me that you were always as negative to TG as you clearly were by 1979. ``Heavy parentheses wipe out rules, OK'' (Gazdar 1978a) starts with a very Chomskyan passage - it's tongue in cheek - but what is interesting to me about it is that you start off setting up finite state grammar as a kind of joke for use in linguistics. You make an analogy between the way that Katz & Langendoen (1976) argue about pragmatics and the assumption that FSG is completely inappropriate as a tool for use in linguistics: the notion of semantic presupposition has about as much relevance in language as does FSG. Then in the same paper there is some stuff about Conjunction Reduction which endorses Conjunction Reduction as a rule of syntax.

GG Oh, I don't think it does. I'll react to the FSG point in a moment but I think your second point is wrong. What I say is ``since (14) is standardly derived from the same underlying representation as that of (13) via Conjunction Reduction''. This (true) observation is simply part of the argument - I am drawing attention to an assumption that Karttunen (and Katz & Langendoen) would have signed up to at that time and showing that they are misrepresenting Karttunen's analysis of these examples. There is no personal endorsement of Conjunction Reduction. I don't know exactly what my views on Conjunction Reduction were when that paper was written, but I imagine that they were not particularly enthusiastic since it was the failings of Conjunction Reduction that led to GPSG, at least in my mind.

EJB That's why I picked that out.

GG But, apropos the first point on FSG, I agree with you. The opening paragraph is indeed an ironic parody but it does presuppose that FSG is inadequate as a theory of natural language grammars. That was my view then, as now. The mathematical linguistic arguments that lead to that conclusion only relate to well-formedness - they do not entail that FSGs cannot be sensibly used in NLP applications. Thus I don't think there is any conflict between those arguments and the use of FSG in NLP today, for example.

When Geoff Pullum and I became aware that the arguments relating to CFG and well-formedness, arguments which were around then and which had been accepted for ten or fifteen years almost without question, were basically junk, we did then go back and review the FSG arguments as well. We thought that, if there was one load of junk out there, there might well be another. We wrote some papers on it. One of them appeared in a Japanese computing journal (Gazdar & Pullum 1985) and reviewed all those questions but we concluded that some of the arguments against the adequacy of FSGs for natural language well-formedness were indeed correct.

EJB Sticking with the intellectual history and attitudes to generative grammar, in the review (Gazdar 1976) of The Form of Language (Sampson 1975), you endorsed the methodology, sometimes pejoratively known as the `armchair methodology', of generative research work. Much of that review is an attack on Sampson for adopting a behaviourist but ultimately incoherent stance. Would you also still stick with that?

GG Yes, oh yes. That is not to say that I think that corpus work can't be useful, even in theoretical syntax. For example, the discovery by Partee of a crossing coreference sentence in the Los Angeles Times was a classic case where attestation provided considerable additional support for the existence and grammaticality of what would otherwise be a very exotic class of example. If you don't have an attestation for an example like that then, unfortunately, even though such examples are clearly grammatical you leave the way open for various sorts of linguistic nutter to claim that they are not and that humans have no intuitions about them. If you can find one, one ought to embrace it.

EJB It is the more liberal view that intuition is a very good starting point but not the only way to do this.

GG Yes. But if you are concerned with John loves Mary then there is just no point in searching a corpus for John loves Mary or any of its counterparts - you'll find thousands of them. And if you are working on an exotic language when you know nothing about the grammar at all, you need to know about the John loves Mary sentences and the way to find out about them is to talk to an informant. My view on that is essentially the traditional one - this is how pre-Chomsky linguists would work with informants. There is nothing novel about it - Chomsky didn't invent the methodology. He simply gave a defence of it in terms of the superior status of intuitive views of one's own language and so on. My defence of it is much more pragmatic really - or it would be now.

EJB When you came into the field you obviously read Chomsky and were taught generative grammar of that era, and you've essentially remained a generative grammarian throughout your career. But I think you have referred in several places to the fall from grace of Chomsky, from the early years in which the formalization of the grammars was taken seriously to the later years when hand waving was substituted for formalization. So, was Chomsky quite an important influence in terms of you embarking on linguistics and at what point did you really start to part company?

GG I don't know whether he was or not actually. If one was starting linguistics one had to read a lot of Chomsky and it is true that I found the earliest stuff of interest and there was material in Aspects (Chomsky 1965) that was of interest to me. There is very little, if any, after Aspects. My interest in Chomsky stopped, not in terms of my history, but in terms of his history, in 1965.

EJB But in 1975 you presumably had read ``Remarks on nominalization'' (Chomsky 1970) and at that stage was your attitude that he had already gone off, or was that something that came somewhat later in your intellectual history?

GG I'm afraid ``Remarks'' had left me unmoved. Actually, I was never even an enthusiast for Syntactic Structures (Chomsky 1957), at least not for the descriptive component of it. It has always seemed to me that that analysis of auxiliary verbs was basically barmy.

EJB Baroque.

GG And I've never been able to understand why that analysis was such a selling point. I cannot figure out why anyone would think that an analysis like that was one to sell you a whole view of language, which is what it seems to have been.

EJB Well, presumably the truth is that it wasn't really that analysis so much as the prospects for doing syntax in a different way that sold it. Most people coming into the field were probably far less sophisticated than you were about formalism, and so the possibilities given by a framework like that, baroque though it may be, were just very exciting at a more substantive level. I don't know.

GG You have to remember that in the late 1960s/early 1970s there was the whole generative semantics thing. My ideological sympathies, to the extent that I had them, were really with the generative semanticists - I found their work much more exciting to read than the Chomskyan stuff. I was very familiar with the work of Georgia Green, the Lakoffs, Jim McCawley, Jerry Morgan, Paul Postal, Haj Ross and Jerry Sadock. In terms of the way that I thought about language in the generative tradition, then it was those people whose work I found interesting.

EJB And that's clear in papers like the one on truth-functional connectives (Gazdar & Pullum 1976) and the general semantic orientation of your research - it put semantics at the centre of the picture.

GG Yes, my substantive thoughts on syntax were really all driven by coordination. I found coordination of interest because the semantics was easy. We had a good theory of the semantics of conjunction and disjunction courtesy of a hundred or more years of logicians' activities. But, if you tried to apply that semantics to English, then it only seemed to work for the sentential case - specifically declarative sentences. Most coordination, on any way of counting, is not sentential - there is all this other coordination going on. It didn't seem to me intuitively that there was any problem about it semantically, but the logicians' standard account just didn't cover it. If you looked at what Montague did - it's sort of mixed - in PTQ, he actually does something analogous to Conjunction Reduction (1973, T12 & T13). His own analysis of coordination really wasn't very interesting. But, nevertheless, if you thought about how Montague did semantics more generally, then actually the semantics for nonsentential coordination is trivial. You just use union and intersection - fiddling with it a bit because you've got functions rather than sets.

EJB So, it is all implicit in the lambda calculus.

GG Yes, you can just do it all very easily and I wrote a tiny paper spelling it out, that appeared in Linguistics & Philosophy in 1980 (Gazdar 1980a). Independently, Keenan & Faltz (1985) had essentially the same idea but expressed in boolean algebras. And Partee & Rooth (1983) offered yet another version of the same insight. There were three of us at the same time - there was no sense in which the other work depended on mine. So, obviously, it was not a big deal - it was an insight sitting there waiting to be had. Three of us had it in slightly different ways. But it has momentous consequences for one's thinking about syntax because if NPs, VPs, APs, and PPs can be directly interpreted when they are coordinated then there is no semantic motivation for Conjunction Reduction at all. Well, there are some ambiguous cases where, if you were really pushing for Conjunction Reduction, you would say then there are two derivations. There is a Conjunction Reduction derivation and there is a base generated derivation. These are for things like Kim and Sandy lifted the piano and the like. But this is clutching at straws.

EJB Even if you try to salvage it on the basis of such examples, you get into problems with cases like Some man smiles and snores. You don't want there to be a Conjunction Reduction derivation because there is no valid interpretation involving two distinct men, analogous to Some man smiles and some (other) man snores. Even at that sort of level of analysis, it is very hard to tell a coherent story. But people had known about those sorts of examples and those sorts of problems with the syntax/semantics interface since the late 1960s, surely?
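The two points just made, that nonsentential coordination can be interpreted directly by applying conjunction pointwise to functions, and that Some man smiles and snores has no two-men reading, can be checked in a toy model. What follows is only a minimal sketch under loud assumptions: the individuals, the predicate extensions and the helper names (conj, disj, name, some_man) are all hypothetical illustrations, and the code is not the formulation given in Gazdar (1980a), Keenan & Faltz (1985) or Partee & Rooth (1983).

```python
# Toy model. Every name and extension below is a hypothetical illustration.
MEN = {"sandy", "lee"}


def smiles(x):
    # VP denotation as a characteristic function: kim and sandy smile.
    return x in {"kim", "sandy"}


def snores(x):
    # kim and lee snore.
    return x in {"kim", "lee"}


def conj(f, g):
    """Generalized 'and': truth values conjoin directly,
    functions conjoin pointwise (intersection, lifted to functions)."""
    if isinstance(f, bool) and isinstance(g, bool):
        return f and g
    return lambda x: conj(f(x), g(x))


def disj(f, g):
    """Generalized 'or': the pointwise analogue of union."""
    if isinstance(f, bool) and isinstance(g, bool):
        return f or g
    return lambda x: disj(f(x), g(x))


def name(individual):
    # NP denotation as a generalized quantifier: a function from
    # predicates to truth values.
    return lambda p: p(individual)


def some_man(p):
    # 'some man' as a generalized quantifier over the toy set MEN.
    return any(p(x) for x in MEN)


# Coordinated VP and coordinated NP, each interpreted directly.
smiles_and_snores = conj(smiles, snores)
kim_and_sandy = conj(name("kim"), name("sandy"))

# 'Some man smiles and snores' with the VP coordination interpreted directly.
direct = some_man(smiles_and_snores)

# The sentential source a Conjunction Reduction derivation would start from:
# 'Some man smiles and some man snores'.
reduced = conj(some_man(smiles), some_man(snores))
```

In this model one man smiles and a different man snores, so the sentential source comes out true while the directly interpreted VP coordination comes out false. The two structures are therefore not semantically equivalent, which is the problem a Conjunction Reduction derivation of such examples would face.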

GG But my point about this was that, until the three bits of work that I just mentioned, there was no demonstration that you didn't need Conjunction Reduction for semantic reasons. The generative semanticists had assumed that you had to have Conjunction Reduction because the only semantics available for coordination was the standard logical semantics for declarative sentences. Once those bits of work that I just mentioned were done, that whole motivation disappeared so there was no longer any real semantic justification for Conjunction Reduction. In addition, the fact that you could do the semantics so naturally for nonsentential constituent coordination, even for ``derived constituents'' like coordinated passive VPs, made it bizarre to describe these things by Conjunction Reduction. But, if you look at TG, either of the Aspects variety or any of the post-Aspects varieties, then you cannot not have Conjunction Reduction. It is absolutely crucial - if you remove it nothing else works. For example, if you want to coordinate a passive verb phrase with an active verb phrase then you have to have Conjunction Reduction. There is no other way to go about it because passive has to be sententially derived. TG was a house of cards in which Conjunction Reduction was the card right at the bottom of the house. If you pulled it out, then everything else broke. It seemed to me, as of about 1978, that it had to be pulled out. It was aesthetically and methodologically intolerable to live with Conjunction Reduction. Basically, GPSG sprang from that insight.

EJB So, a house of cards - once you pull Conjunction Reduction out, then the whole system needs reconfiguring.

GG By the late 1970s, Brame and Bresnan had shown that the cyclic rules were not an issue. They argued that the cyclic rules were lexical and you should just base generate all that stuff. The residual case was unbounded dependency constructions.

EJB Right, so these days we would call the cyclic rules bounded dependencies, and those are all now treated as lexically governed.

GG So I didn't need to do any work there, in a sense. It had been done by Brame and Bresnan.

EJB Although Bresnan's work had some influence on the mainstream, Brame was really somewhat written off as some kind of eccentric and ignored. I know that you wrote a long review of Brame's (1978, 1979) books in Journal of Linguistics (Gazdar 1982) trying to recommend them to a wider readership, but I don't think they really did have much of an impact except perhaps post hoc when the GPSG papers came out and referenced Brame's work.

GG They may not have had much of an impact but what mattered to me were the arguments. The arguments were available to me in Brame's and Bresnan's work - I didn't care about the impact. It was just a question of whether the work on the cyclic rules had been done and the work had been done.

I can remember talking to Geoff at that time and he said ``well, the cyclic rules, they aren't an issue'', which I guess I half knew but he confirmed it. He said ``what you've got to worry about is the unbounded dependencies - how on earth could they possibly be done without transformations?'' So that was the key question.

EJB Which the two of you set out to answer? Or you independently?

GG I don't think we set out to answer it. I just worried about it.

EJB There was also the tangential question of the arguments against CFG.

GG Yes, that's what Geoff and I worried about.

EJB And you worked on that with Geoff and that came prior to the work on the unbounded dependencies?

GG Pretty much in parallel, as I recall. I think that at the beginning it wasn't obvious to me that CFG was the way to go. I had always liked CFG because it is such a straightforward formalism particularly if you throw away that ridiculous string mapping interpretation and substitute McCawley's tree-based semantics. That just makes it conceptually much cleaner and you are dealing with the right sort of objects: trees not phrase markers. So I liked CFG and I was predisposed to go that route, but I also knew quite a lot about categorial grammar and dependency grammar.

EJB Aravind Joshi published the first tree adjoining grammar (TAG) papers in the 1970s, although I'm not sure whether he had his extended locality approach to unbounded dependencies at that stage. But certainly the notion of treating grammars in terms of tree admissibility conditions was there in the early work. Were you aware of TAG?

GG Oh, yes. I was familiar with the 1970s ``tree adjunct grammar'' work and I met Aravind in 1978 and kept in touch with him. The earliest of the unpublished pre-GPSG papers, ``Constituent Structures'', actually proposed rules that introduced chunks of tree directly. It wasn't exactly a TAG - they weren't intermediate bits of tree, they were terminal tree fragments. They were abandoned in subsequent work because their use isn't consistent with a rational theory of coordination.

EJB I was going to say that the weakness of that view is that you are forced to embrace some version of Conjunction Reduction again.

GG You could restrict use to idioms but even for idioms it was not a good idea because if you have an idiom where the verb inflects then a terminal tree fragment is no help as you still have to get the inflection on the verb. Anyway, yes, I did know all about TAG, but I think I wasn't particularly interested in it because at that stage they didn't have what I took to be a satisfactory solution to unbounded dependencies.

EJB OK. So that wasn't there then.

GG It was in the context of the work that I was doing with Ewan on comparatives that the slash categories stuff arose. I'd got a grant from ESRC to employ Ewan. Ewan had run out of his studentship and he didn't have a job, but I had a regular lectureship post and was therefore in a position to apply for a grant. So we wrote a grant proposal together to do the syntax and semantics of comparatives. It was basically a vehicle to employ him at Sussex for two or three years. He wanted to work on the semantics of comparatives but we said that we would do both the syntax and the semantics. Of course, comparatives involve unbounded dependencies, although that's not something that people normally think about. So I was under a grant requirement to do something about them. We could just have said that there was a Bresnanian unbounded movement rule and given its specification. We could have fulfilled our mandate by doing that. But, by that stage, such a move was anathema to me since I had concluded that such rules were inconsistent with the only rational theory of coordination. It was Ewan's role to work on the semantics so it fell to me to do something about the syntax. And the only obviously difficult bit of the syntax was the unbounded dependency construction.

EJB The work on comparatives was also quite innovative, wasn't it, because there wasn't, even within the old framework, a particularly satisfactory story about comparatives. A lot of GPSG involved recapitulation, a restatement of analyses of known data which was pretty well described, but the comparative work was pushing the range of things covered - at least in anything but the most hand waving kind of way.

GG I think that's right. Comparatives is a lovely topic because there are all sorts of exotica lurking in there including the Bowers (1975) paired complementizer observation, which is of a kind not seen anywhere else in English and which I only had something to say about much later (Gazdar 1988). But, at that stage, we were primarily concerned with the straightforward facts (Gazdar 1980). Even so, the literature on the comparatives was very distributed and very much at the margins of syntactic theorizing. I think that we pulled that together and that Ewan did useful things on the semantics of adjectives and thus of comparatives (Klein 1981).

EJB The other substantive way in which GPSG extended the database of things that could be analysed sensibly was really in the interaction of unbounded dependencies and coordination. The old story simply did not work there.

GG It was just a patch - Ross's constraints were essentially a collection of patches. He had brilliant insights about the data but the story as a whole was not plausible or coherent.

EJB And otherwise it was really just a question of reconstructing what was known. So the GPSG story about passive doesn't really extend the range of constructions covered, but attempts to cover everything that was purportedly covered by the transformational treatment reconstructing it as a phrase structure treatment. Or is that unfair?

GG That's fair.

EJB I'm putting this up as something that you can knock down if you want to.

GG No, that was exactly it. I was not primarily a syntactician and I wasn't much interested in discovering new facts. It seemed to me that there were very good syntacticians out there like Bresnan and, in the earliest single-authored GPSG papers, I looked at their work and renotated it, essentially. I couldn't see any reason to do otherwise. I wasn't much interested in debating the empirical details of the analysis because my line on it was ``well, I took some expert grammarian's analysis and reconfigured it, so if you have problems with the analysis then you have problems with someone else''. There are always problems with analyses so that didn't seem like an interesting aspect of the enterprise. No analysis in linguistics is immune to problems - it's just par for the course. The issue for me was whether the apparatus one has does a good job of encoding the analysis.

EJB So your emphasis was methodological. If you are going to propose a formalism it should not be patched at every point and it should have a clear semantics - no notation without denotation.

GG Absolutely, that was certainly my perspective. However, when the full four person GPSG enterprise began to roll and the four of us started consciously collaborating, then Ivan and, to a lesser extent, Geoff would go data hunting. Ivan would want to find things that our analyses could do that nobody else's analyses could do. Then he could go and sell them to linguists on that basis, because that is the only sales pitch that many linguists recognize and Ivan knew that extremely well. I never found that sort of sales pitch very plausible. I did get interested when things popped up in relation to the interaction of coordination and unbounded dependencies. One would say ``hey, look doesn't that predict that ...? yes, it does, is that right? yes, it does seem to be right''. Occasionally that would happen and that would be interesting but it wasn't primary for me. It was a bonus if it happened.

EJB There is much in the Chomskyan methodological discussion that says that is essentially what you are trying to do with generative grammar. You are trying to explore the predictive consequences of a theory, but I guess that the truth was by that stage in mainstream TG that it predicted almost anything and the game was to invent a constraint which stopped it predicting most things. So there was a big methodological lesson in that ...

GG But not one which has been learnt.

EJB What has been the lasting effect of GPSG on the field?

GG God knows. GPSG and related and largely contemporaneous activities or programmes clearly had a lasting impact in NLP, notwithstanding the subsequent statistical revolution. But as for linguistics proper, well, I'd be quite hard pushed to think of any lasting effects, I guess. But I don't follow linguistics anymore.

EJB I was struck reading Ivan Sag & Tom Wasow's textbook (1999), which, in many ways, is an excellent book, that the legacy was almost more the substantive analyses rather than the methodology. The textbook is essentially a recapitulation of many of the analyses of GPSG: the analysis of auxiliaries is identical and the analysis of unbounded dependencies is essentially identical. But the methodological message is not too well instantiated. Possibly for pedagogical reasons they are very loose about their definition of what a context free grammar is - in fact, they get it wrong. They are also pretty loose about the formal underpinnings of their chosen theory: they use phrase structure rules and notation which involves unbounded enumeration over categories in describing coordination. This kind of stuff has no formal underpinning in a constraint based, typed feature structure system. So one might almost say that the lasting impact has been more in terms of the substantive linguistics than in terms of the methodological points. Does that upset you? At the time, as a PhD student, I felt that the thrust of the public relations exercise, the persuasive attempts to get people to look at GPSG, and to perhaps move to doing analyses in it were primarily to raise the methodological game and to get people to realize what a genuinely formal treatment was. And not, as you say, to get them to adopt a particular analysis of passive.

GG Yes, that was the thrust and it wasn't really our innovation. It may have been something of an innovation to have pushed it in the linguistics community, but it was just something that Montague, for example, took for granted. He did it in PTQ and in ``English as a formal language''. It didn't occur to him that one could do syntax sloppily. He had to do the syntax and he wrote down a set of formal rules. Although we were using a different formalism, it was exactly the same methodological message.

EJB But Montague was never a linguist and presumably it never entered his head to try and persuade anybody to do anything at an institutional or research programme level.

GG But I don't think that we added anything methodologically to what Montague simply did naturally.

EJB No, indeed. And, in many, many other fields what you were trying to say would have been taken as given.

GG Montague was important because he was doing it for natural language. Although linguists were often rude about his analyses, I think they were only rude for the sociological reason that he wasn't a linguist. I don't think his analyses are at all bad if you compare them with the rather low standards that linguistics has set. They are very traditional.

EJB Although his choice of example, and the genre that he chose to work on, are bizarre: the theorem such that I proved yesterday is hardly likely to inspire someone who has been brought up on the non-prescriptive view of grammar.

GG You asked me if I was disappointed?

EJB Yes.

GG There was a point around 1980 where the four of us thought that we might actually be able to change the field. Ivan probably thinks that even to this day. I'm sure Geoff doesn't think it any longer but he thinks it his duty to act as if he still thought it.

EJB Fighting an acerbic rearguard action, perhaps. It seems a lot less positive than it was for a period - say from 1979 through to 1985.

GG I had despaired of linguists some time before the book was published. I also despaired of the politics of the relevant people in the field coalescing together to achieve a change by weight of numbers. There was an unwillingness by people who were basically sympathetic to one another theoretically and methodologically to sink their differences in public. It seemed to me that the methodological similarities between LFG, the various extended categorial grammars that were emerging, the Karttunen & Peters version of PSG, and so forth should have been sufficient for a common programme that could have been offered to the field. The technical differences that existed, such as whether you do unbounded dependencies one way rather than another, could have been aired at workshops and so on where people could argue sensibly with each other. What was not sensible was to go out to some much more general forum and rubbish competing analyses by people you largely agreed with. That struck me as foolish, basically.

EJB So there was a point, which perhaps coincided with the end of generative semantics, where there could have been a linguistic paradigm, involving monostratalish syntax with a proper logical representation used for the semantics and everything carefully formalized, that could have brought together a wide range of linguists around 1980. But that never happened.

GG No, it didn't happen. And it wasn't down to me that it didn't happen.

EJB How did you wind up marketing GPSG as a branded grammar framework?

GG I never really wanted to initiate a grammar framework. My view of things around 1979/1980 was that Geoff and I were having some interesting things to say about the arguments relating to context free languages - that was one thing - and that I and subsets of the others either had interesting technical innovations to offer, like the slash category stuff, or interesting reanalyses, like the auxiliary verbs paper. I thought these were a bunch of things that stood on their own and one could give interesting papers about them but not wrap them into a package and say ...

EJB ...here's a new framework - with all that comes with that.

GG Yes, I was very resistant to doing that but it was rather taken out of my hands. Not by the other three, but rather by the field because linguists like to put a name on things. The field is not happy with a bunch of techniques. It is different from computing. There, you don't have to sell a package, you can just say ``here is a neat algorithm''. Linguistics doesn't work that way. It wants a whole package like a package holiday and people started calling what we were doing ``Gazdar Grammar''. I don't like having my rather strange surname booted around more than it has to be anyway. But this, in particular, was really gross - I certainly didn't want reference books to contain the phrase ``Gazdar Grammar'' ten years hence. Besides, it was manifestly unfair to the other three because we were working as a team by then. Anyway, Emmon Bach gave a talk at a conference in Holland called ``Generalized Categorial Grammar'' and at least three of us, possibly all four of us, were there in the audience sitting in a row, as we sometimes did, and one of us looked at the others and said ``if we're going to have to have a name then why don't we use `Generalized Phrase Structure Grammar' - we can just copy Emmon''. Which is what we did. So we said ``if you want to give what we do a name, call it GPSG''. Emmon's own title, ``Generalized Categorial Grammar'', sank without trace but our bit of plagiarism survived.

EJB It sank so successfully that, in Mary McGee Wood's (1993) book on categorial grammar, she used that phrase as a generic term for all of the categorial grammars that go beyond the classical AB categorial grammar.

GG Well, that is probably a good use for ...

EJB Probably a good use for the word ``generalized'' but it must have been a little upsetting for Emmon.

GG Anyway, that's the history of how we came to market a package. I guess Ivan was quite comfortable marketing a package but I really wasn't. However, I could see that, given the way the linguists work, we had no option but to do that, so then I threw myself into it.

EJB I was a first year PhD student here in Cambridge when you ran a workshop at UCL in 1981, I think it was. I remember Nigel Vincent saying ``I'm going to this, do you want to come?'' and I came. I think that was probably the first time I saw you in action. I had already read some of the papers - and I was already learning what linguists were like - I had a reasonable understanding of the formalism, probably a better understanding of the formalism than I had of syntactic theory at the time, actually. And I was surprised to see that others' understanding of it was quite the opposite. They didn't grasp details of the formal stuff that I thought were quite easy and straightforward. But when I came to that seminar, what struck me then was the enormous amount of effort that was being made to market GPSG, and to market it in a way that would be acceptable to such people. It was a new insight for me into how academia might work and I was very struck by that. At the time it was a remarkably successful endeavour: a lot of very good linguists worked within the broad framework.

In the UK, people like Bob Borsley, Ronnie Cann, Connie Cullen, Steve Harlow, Geoff Horrocks, Graham Russell, Larry Trask, Nigel Vincent and Anthony Warner. And outside the UK, Jan Anward, Mike Barlow, Chris Culy, Sandy Chung, Mary Dalrymple, Donka Farkas, Dan Flickinger, Takao Gunji, Erhard Hinrichs, Tom Hukari, Mark Johnson, Naneko Kameshima, Bob Levine, Joan Maling, Jim McCloskey, Michael Moortgat, John Nerbonne, Jessie Pinkham, Mamoru Saito, Peter Sells, Susan Stucky, Hans Uszkoreit, Annie Zaenen, Arnold Zwicky and many more. For a while, it seemed like that strategy was working. It did work, but by 1985 it seemed that the heart had gone out of continuing to promote it in that way, to continue to spend a lot of time in California and to zip round the landscape doing those kinds of seminars. So it is not obvious to me that it was failing in the early 1980s but, by 1985, it seemed like the core people, the four of you, had had enough of that.

GG Well, we certainly had had enough of the book (Gazdar et al. 1985), which was extremely painful to produce. It was the kind of standard software scenario where the more programmers you have, the longer the program takes and the worse the bugs. In the end, getting that book out was gruesome. The other thing was that HPSG was beginning to emerge via the Hewlett Packard NLP project. So there was, as it were, a successor product around and certainly Ivan's many marketing skills were going to be devoted to that product. Geoff had worked at Hewlett Packard so he had a bit of ownership of it but not very much, and Ewan and I didn't really have any. But we couldn't very well go on marketing GPSG since one of us had now introduced HPSG. Also a lot of the reasons for keeping GPSG the way it was had been the context free claim and that claim had been falsified. There was thus no longer a reason to keep GPSG context free equivalent. We couldn't go and say ``but ours is context free equivalent'' to an audience because they would immediately say ``but we now know that natural languages are not context free''. HPSG had jumped that barrier so they were going to allow themselves to do whatever they needed to do. I wasn't very interested in that. From my point of view, it was just another unification grammar. It was one I was quite sympathetic to as it took over so much from GPSG. But, to the extent that HPSG embodied GPSG technology, then I had made my contribution - I didn't particularly want to produce some more technology for a particular brand of unification grammar. And, since I was not primarily a descriptive grammarian, I did not see much point in me developing analyses within HPSG. I just thought it would be more sensible if I did something else.

Of the four of us, we kind of paired up. Geoff and I constituted a pair and Ewan and Ivan constituted a pair. Geoff has always had many strings to his bow, numerous things that he does: South American Indian languages, the phonology-syntax interface, mathematical linguistics, and a whole lot else, so Geoff was never short of things to do. Ewan got involved in various developments at Edinburgh - unification categorial grammar and so on. We went our separate ways. There was no animosity. Ewan and Geoff and I never came out and attacked HPSG or anything. We thought that HPSG was the successor to GPSG and that Ivan was in charge of it, and that was that.

EJB A rather honourable thing to do - almost unique in the history of syntactic theory to actually say that this theory has served its purpose and we will move on.

GG I'm not sure we said it - but that was what we thought.

EJB So you felt that the book was painful because there were a lot of programmers or cooks. It also presented a version of the stuff which was significantly more complex formally than the earlier papers had been. I guess the goal was to produce something that was not only correct in the details but also declarative, and that made it hard to understand. One reason you might formalize something is because it allows the predictive consequences to be worked out. But if the formalization is so complex that people find it hard to actually work out those predictive consequences, it becomes difficult to see what the practical value of the formalization is. Is that another reason why the book was painful? Was it a good point to stop because it was time to step back and do a rational reconstruction of what was worthwhile?

GG I did actually do a bit of rational reconstruction. I gave some lectures at the Max Planck Institute in Nijmegen on a formal semantics for the GPSG formalism. The lectures provided a rational reconstruction of the various components like defaults, feature cooccurrence restrictions and so on. That was me doing some cleaning up for my own mental hygiene, after the fact. I still have the lecture overheads somewhere but they never got written up or published. The formalism in the book is a mess.

EJB But the earlier papers were very much cleaner, and still arguably perfectly formal, and the predictive consequences were very very clear, by comparison with your syntactic competitors, anyway.

GG The book is a mess. I really don't want to have to say this in print, but I'll say it to answer your question - we may have to put a sanitized version in anything that gets printed. Ivan always wanted to be able to deal with every example that someone might conceivably raise when he gave a talk. The areas he mostly worked on were the verb phrase, and things to do with control and agreement. Ivan would find a counterexample to every formulation we came up with - about a week after we thought we were done with it and could move on. Ivan is a brilliant syntactician; I have never known anyone like him for thinking through the consequences of an analysis. Once he had found a counterexample, he would then insist that the existing formalization be changed to deal with it. So the control agreement principle is dreadful - it embarrasses me that I have my name on a book with that in it. It is just a complete kludge. There are other bits that are difficult, like the formalization of FSDs. The formalization of that is inelegant, but that is due to formal incompetence rather than anything inherently ugly about what we were trying to formalize. There is something there that is quite formalizable and what we were doing was not conceptually difficult. But if you try to formalize it and you're not very good at formalizing nonmonotonic feature theory, then you can end up with the kind of hard to grasp formulation that there is in the book. So that was a prime candidate for rational reconstruction. There are no hacks or kludges in it - it is just cumbersomely done. Similar remarks could be made about the head feature convention which became defaulty for the first time in the book. Again there are no kludges in that, unless you think that the whole enterprise of making it defaulty is a kludge. It is quite hard to understand but, conceptually, there is nothing problematic about it. If there are faults in the formalization of the HFC, they are just to do with it being ineptly formalized.
The problems with the CAP formalization weren't to do with competence - they are due to the fact that the conceptual basis of it collapsed as it was repeatedly hacked about to cover newly arising data. It was horrendous, we just couldn't get a final version of it. I think it was even changed in proof. There was another bit of the book that was changed in proof that was thereby scrambled: there were two pages of formal nonsense in the middle of that book.

EJB Let me read you something from Neil Smith's contribution which I think is quite an interesting remark. ``So my concentration on Chomskyan syntax was replaced by GPSG for a couple of years. I taught it to all undergraduates and MA students for two years until I became convinced that its concentration on descriptive rigour was bought at the cost of a lack of explanatory insight'' and he goes on in the vein of `things that I did other than push Chomsky's latest theory'. The lack of explanatory insight is the usual kind of rhetoric but I think in the first part he was reflecting a fairly general comment that might be made by someone like Steve Harlow, who also taught GPSG for a while. So in retrospect do you think that you got it right? If you were doing it all again and there still were areas of the formalization that were hard to get right on all fronts would you change the way that you played it?

GG I'm not quite sure what the question is. On the explanatory thing I have never understood what linguists mean and my suspicion is that, when you have things fully formalized with all your undefended assumptions laid bare, then once somebody understands it, the mystery goes out of it. They see how, when you turn the handle, that makes some claim. But because they can see all the workings, it somehow loses its ability to explain things for a linguist. It is no longer magic. Whereas when the machinery is hidden from you and somebody says ``ah, the so-and-so principle covers this fact'', when the so-and-so principle has never even been written down in a single piece of connected English prose that everyone signs up to, then linguists say ``wow!''. But when somebody formalizes the so-and-so principle, all the magic drains away and things are no longer being explained. There are no more magic moments. I think that is purely a matter of linguist psychology, not a scientific matter.

EJB And the other part of it ... that the formalization got in the way?

GG Oh, I'm sure that the formalization gets in the way of the book from a pedagogical perspective. Horrendous - except for some graduate students who were truly dedicated. One would have to be a masochist or mad to teach from that book. Although, really, people didn't have much choice for a while until several good textbooks appeared. With linguistics undergraduates, any formalization of any framework gets in the way because, as you say, they come from language backgrounds.

EJB And yet the way that you enter into syntactic theory appears to be that you are taught a particular notation and then you work within it and that makes you an LFGer or GBer.

GG You are not even taught a notation in GB as it has no notation. Just a collection of buzz words whose appropriate use you are socialized into.

EJB At the time you were developing GPSG, Bresnan & Kaplan were talking about things like the strong competence hypothesis, the idea that the grammatical or syntactic framework should be embeddable in some account of performance. I guess that phrase didn't emerge till 1980 or 1982 but the general idea was around in Bresnan's earlier work, and there is discussion on issues like parsability and CFness in some of the GPSG papers. How important was that to what you were doing? It's a slight extension to the notion of what syntactic theory is setting out to achieve.

GG The four of us were not of one mind on this. I had essentially become a Platonist - by the time the book was finished I was a raving Platonist. Ivan wasn't. Geoff was showing strongly Platonist tendencies but not as extreme as mine. I'm not sure where Ewan stood. So, in terms of the cognitive science side of Bresnan & Kaplan's desiderata, I wasn't really interested in that. It seemed to me that English and other languages were interesting abstract objects that it was a linguist's duty to describe. What the brain did with these abstract objects, or their concrete counterparts, was a separate question for me. So the four of us couldn't have had a consensus view on that which could have emerged from the book because Ivan and I wouldn't have agreed. We didn't argue about it but we knew that the problem existed. Geoff and I basically wrote the introduction to the book and we had to be rather careful as we didn't want to say anything that Ivan couldn't sign up to. Actually we could easily have got away with it since Ivan wouldn't have read the draft introduction if we hadn't held it under his nose and made him read it. But if we had taken advantage of that, then Ivan would just have spent the rest of his career going round saying that he'd been cheated by a couple of sleazy limeys. We didn't want that to happen. On the computational side, I guess my views floated about a bit. On the one hand, if you have a grammar formalism that's equivalent to some sort of CFG then, at least at first pass, you have something that is a pretty good candidate for natural language processing by machines. On the other hand, over time I was persuaded by Shieber that it was more important for NLP purposes to have a decent formalism that was well founded and so on, like PATR, than it was to worry about Turing equivalence.

EJB So, following up on the question about strong competence and parsability, you'd probably say much the same about learnability?

GG Yes. Which is not to say that there aren't interesting and important questions to ask about natural language learnability in the light of Angluin (1983), Gold (1967, 1978), Valiant (1984) and other work of that general ilk. On the other hand, I find it hard to get excited by learnability work that is predicated upon Chomsky's bizarre assumption that there are only a finite number of possible human languages.