Dr. Dobb's TechNetcast

TechNetCast @ SD'98 East


An Interview with Allen Holub

We catch up with Allen Holub as he is working on a three-volume writing project that will focus on compiler design in Java.  In this interview he speaks his mind about Java VMs and interpreters, C++, object-orientation, generic programming, Java threading and Microsoft.

Allen Holub is a design consultant, programmer, educator and author specializing in Object-Oriented Design, Java, C++, and systems programming (NT, Windows95, and UNIX). He has worked in the computer field since 1979—as an independent consultant since 1983—on diverse projects ranging from operating systems and compilers to various applications programs.   Allen is also a regular contributor to many computer magazines. He has written for Dr. Dobb’s Journal, Microsoft Systems Journal, Programmers Journal, BYTE, Windows Tech Journal, Mac Tech Journal, C Gazette and others.  He was a contributing editor for Dr. Dobb’s Journal, Dr. Dobb's Sourcebook, and Programmer's Journal. His popular "C Chest" column, which appeared monthly in Dr. Dobb's Journal from 1983 to 1987, provided many people with their first introduction to C. His many books include Compiler Design in C (Prentice Hall, 1990).
 

This interview is part of a series of discussions with leading programmers broadcast live from Software Development 98 East, August 18-20 1998. Interview by Philippe Lourier.

Post and read comments about this interview on the TechNetCast forum
 


 
 

TNC: Hello Allen, welcome to the program. I just mentioned some of your books. I actually spotted the most recent, "Enough Rope to Shoot Yourself in the Foot", at the show bookstore. They have a small selection, but it's there…

AH: Oh, good. Glad to hear it…

TNC: But programmers are probably more familiar with one of your other books, "Compiler Design in C".

AH: That's right. I'm rewriting it right now as "Compiler Design in Java".

TNC: And I read that three volumes were announced… Three volumes?

AH: Three volumes, yeah. Well, the last one was almost a thousand pages, which is getting a little overboard. We actually needed more material in the Java version because we need to talk about VMs and other topics, and I just couldn't make a 1500-page book…

TNC: Will all three volumes come out at the same time?

AH: No, they're going to be strung out over about a year. I'm just about finished with the first one now, which I'm co-writing with my wife, Deirdre. It's going to be a full Java compiler implementation in Java, using JavaCC -- Sun's yacc in Java.

TNC: What is the point of writing this compiler? Is it a proof of concept?

AH: Well, compilers are just generally useful programs, first of all. The technology used by compilers is present everywhere, particularly character recognition and related technologies, and also parsing technology and that sort of thing. The vast majority of programs have to figure out some kind of meaning when given some kind of string that somebody types in, and that's what compilers do. Compilers are also translators, of course, probably the programs that weren't covered by the first batch are all translators. They translate one kind of input into another kind of output.

One of the points of the book was that I don't know of any good examples of well-done, realistic object-oriented designs actually taken all the way through to implementation. One secondary objective of the whole project is to show good OO design of a non-trivial application implemented in Java so people can see some examples of how the language is used.

TNC: Did you find that Java was a good language to implement object-oriented design?

AH: Absolutely, yes. I've been doing nothing else for about three years now. I look at C++ as a bad dream gradually receding into memory…

TNC: Of course, you have quite a history with C++. How do you look back on C++ now in the light of your experience with Java?

AH: Well, I never really liked C++. I used it because there weren't really any other options. C++ is too complicated for what it does. Stroustrup is well-intentioned, but he's a mathematician. And when he was designing the language he tended to always provide general solutions to problems which often didn't need general solutions. Operator overloading is a case in point. Operator overloading is in C++ because you need an operator= function. But Stroustrup then generalized and said, "Okay, if we're going to do operator=, we should do all of them," and I disagree with that. I don't think we should have done all of them. And the end result, particularly after the ANSI committee got finished with it, was a language that was just too big and ungainly to be usable.

TNC: Have you kept up with the language and with the standard? In some ways, C++ is now being promoted as a new language --safer, more efficient.

AH: Yeah, but I don't buy it. You know, it's more complicated than it ever was, and the over-complexity of it was one of its main drawbacks. I really am a hard-core OO design and programming kind of guy, but C++ was just giving too much grief. I found that I can get everything done in Java that I would do in C++, without any problem at all, and I can do it in half the time.

TNC: Has performance been an issue for you? Many users with a C++ background criticize Java for its speed.

AH: The first thing I come up with when somebody asks that is the question, "Have you benchmarked it?" Because I have. And in my own benchmarks I haven't seen much of a speed hit. This depends admittedly on what the code is doing. But I've found that heavily algorithmic code that uses primarily primitive type objects and does not use too many virtual functions runs, if anything, faster than C++ does. 

TNC: How is that possible? 

AH: It's possible because of the way a JIT compiler works. Java has not been interpreted in about two years now. The original 1.0 VM was an interpreter, but now we have the Symantec, Asymetrix and other JIT-style VMs. They are not interpreters at all. They are compiler back ends. They take the byte code as input, regenerate a parse tree from it, traverse the parse tree and generate native instructions. So once you've gotten over the initial burble caused by the optimization phase, what you are executing is native instructions for the native platform.

It's exactly the same code that you would get out of the back end of a compiler. The main problem is that the JIT VM walks a tightrope. If it optimizes as aggressively as it possibly can, that would slow down the load time on the classes. If it does not optimize enough, then it doesn't get enough speed. So, typically JITs compromise and the problems are mostly not that they couldn't do better but, rather, they're not willing to do better because of the time it would take to do the optimizations. Sun's new Hotspot VM should fix that if it ever comes out, but -- well, I'm sure it will come out eventually, but --

TNC: When does the just-in-time compiler come in? When the class is loaded?

AH: No, it's the virtual machine itself. "Just-in-time compiler", that's really a misnomer, it's marketing hype. It's really part of the class loading process. What happens is that instead of just loading the byte code in and interpreting it directly, the byte code is transformed into native instructions that can be executed more efficiently.

TNC: And the JIT is called again to produce code for classes that are loaded dynamically as the application runs.

AH: Right. That's the big difference between Sun's proposed Hotspot VM and the existing batch. The existing batch of JIT VMs just optimize everything. The idea behind the Hotspot VM is that it would profile the application at run-time, as it is executing, and then only optimize that part of the application that is actually running. There would be no time wasted optimizing code that never executes.

TNC: Would it also cut down on compile time and not compile parts of the code that never execute-- functions, for example, that are never called?

AH: Yeah, exactly. Hotspot should fix that, so I'd expect some improvement there. The main improvement I think is that since Hotspot will not waste time optimizing everything, it can optimize what it optimizes much more aggressively. So I'm expecting better optimizations out of it.
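The profile-then-optimize idea described above can be illustrated with a toy sketch. This is not how Hotspot is implemented; the class `HotMethodTracker`, its threshold, and the method names are all hypothetical. It only shows the bookkeeping: count invocations, and treat code as worth optimizing once it has run often enough.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration only: counts calls per method name and flags a method
// as "hot" once it reaches a threshold. All names here are hypothetical.
class HotMethodTracker {
    private final int threshold;
    private final Map<String, Integer> callCounts = new HashMap<>();

    HotMethodTracker(int threshold) {
        this.threshold = threshold;
    }

    // Records one invocation. Returns true only on the call that reaches
    // the threshold, the point where a profiling VM might decide to optimize.
    boolean recordCall(String methodName) {
        int count = callCounts.merge(methodName, 1, Integer::sum);
        return count == threshold;
    }

    // True once the method has been called at least `threshold` times.
    boolean isHot(String methodName) {
        return callCounts.getOrDefault(methodName, 0) >= threshold;
    }

    public static void main(String[] args) {
        HotMethodTracker tracker = new HotMethodTracker(3);
        for (int i = 1; i <= 5; i++) {
            if (tracker.recordCall("parseExpression")) {
                System.out.println("parseExpression became hot on call " + i);
            }
        }
        tracker.recordCall("printUsage"); // called once, never becomes hot
        System.out.println("printUsage hot? " + tracker.isHot("printUsage"));
    }
}
```

Code that never reaches the threshold costs only a counter update, which is the trade-off the interview describes: effort is spent where the program actually spends its time.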

TNC: Now, let's talk a little about the book. Three volumes. What will take up so much space?

AH: Remember, there are a thousand pages in ["Compiler Design in C"], so it's not that much space. The first volume will be a full Java implementation of Java. So it's going to talk about class files, and compilers and parsing and grammars --the basics.

The second volume will have all the compiler tools in it. And this time, instead of just doing a dumb implementation of yacc, or an exact implementation of yacc, I'm just designing a decent tool to begin with that will produce honest-to-god classes that define parser objects and language objects -- but in an automated way so that you could still be inputting your grammar into the system but what would come out is a set of legitimate classes that could be used in an OO system to do a parser. So it will include all the tool functionality that was in the first book, it will include a version of Lex and cover expression parsing, regular expressions and compiler tools.

The third volume will be a VM implementation. And we haven't really decided on the language yet for the third volume. The first two volumes will be entirely Java, but real VM implementations are written in C. Since the point of the book is not so much to write a VM as it is to teach people how VMs work, I might write the third book in Java too, just to make it easier to read the code.

TNC: One interesting evolution in programming tools recently is the growing importance of component architectures. One characteristic of Java is that it is built to integrate into a networked, distributed environment.

AH: The Java packages are vastly better than anything in C++, certainly better than Microsoft's MFC, which is an incredible dog. I've wasted too much of my life farting around with MFC, trying to get simple things done. I used MFC mostly for OLE programming and its linking and embedding support. I could program just about anything else much faster in C than messing around with MFC. MFC was just too poorly designed.

TNC: MFC is very idiosyncratic.

AH: It's not idiosyncratic, it's bad. There's a difference. It was not designed poorly, it was not designed at all. And they kept adding word after word after word on top of it and it just made it worse and worse and worse and more and more complicated. Bugs were never fixed properly, it was just a mess. And I compare that to the Java packages, which are designed really, really well. I teach a lot of OO design classes, for example, and the design pattern movement is a big component of that. I can pull examples of all of the design patterns out of the Java packages without any problem at all. I look at Microsoft's stuff and it's just, it's just garbage.

TNC: I assume you would say that is true of most Microsoft APIs.

AH: There's two points here. One of them is bad design, though that's not always the case. I just read this great book actually, called "Barbarians Led by Bill Gates," written by a couple of ex-Microsoft people, which I assume informs their attitudes a little bit towards what they're talking about. There's a quote in there from Nathan Myhrvold, Microsoft's visionary guy, and what he says I think is really telling. He says that Microsoft in order to keep its competitive advantage has to continuously change the set of APIs to Windows.

The notion, I suppose, is two-fold: it's hard to clone Windows if Windows is a moving target, because by the time somebody gets it cloned then it will have changed and the clone won't be worth anything. The other issue has to do with the applications division and the cost of entry or the barriers of entry in terms of programming. It takes a team of 15-20 people to put together even a simple Windows application. So what Microsoft is essentially doing is setting up a situation where in order to compete with them in the applications realm, you've got to have a pretty good-sized company. And it takes a long time to learn what you need to learn and it's a very expensive process.

Now, working in Java, I can put together a really nice, fully-functional Windows application in a few days. The same would easily take me a month to put together if I was working with MFC and C++.

Part of this is sloppiness. This stuff is not designed. It just kind of grows. It's really a good example of why you should be designing things -- you could end up with a mess like Windows.

TNC: Now of course COM has become the preferred interface into Windows. Have you worked with COM at all?

AH: I have. It's too complicated. It's more complicated than necessary. It's poorly documented. COM is really an example of a design pattern that's called an abstract factory. The basic notion is that you instantiate an instance of some class without knowing exactly what the class is. And then you talk to that object through well-defined public interfaces. And that is good design. I see nothing wrong with that at all. What I don't like about COM is that the interfaces themselves are not particularly well designed. They are very poorly documented, which means that it's very difficult to get anything done because there's a lot of trial and error programming just to try and see if you can figure out what the documentation actually means. So the fundamental notion of COM is not particularly bad --what I don't like are the interfaces themselves.

Now, DCOM is another matter. DCOM, of course, is nothing but a layer around DCE. And my general feeling is that DCOM adds zero functionality to what DCE supports --so you should just use DCE and get it over with. I don't see that DCOM provides any extra stuff at all.
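The abstract-factory notion mentioned in this answer (create an object without knowing its concrete class, then talk to it only through a public interface) can be sketched in plain Java. This is not COM code; it is a pared-down form of the pattern with a single product type, and every name in it (`Shape`, `ShapeFactory`, and so on) is hypothetical.

```java
// Hypothetical names throughout; this shows the pattern in plain Java,
// not COM itself.
interface Shape {
    double area();
}

// The factory interface hides which concrete class gets instantiated.
interface ShapeFactory {
    Shape create(double size);
}

class Square implements Shape {
    private final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

class Circle implements Shape {
    private final double radius;
    Circle(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}

class SquareFactory implements ShapeFactory {
    public Shape create(double size) { return new Square(size); }
}

class CircleFactory implements ShapeFactory {
    public Shape create(double size) { return new Circle(size); }
}

class FactoryDemo {
    // The client sees only the two interfaces, never a concrete class.
    static double areaFrom(ShapeFactory factory, double size) {
        return factory.create(size).area();
    }

    public static void main(String[] args) {
        System.out.println(areaFrom(new SquareFactory(), 2.0));
        System.out.println(areaFrom(new CircleFactory(), 1.0));
    }
}
```

Swapping `SquareFactory` for `CircleFactory` changes what gets built without touching `areaFrom`, which is the property the interview singles out as good design.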

TNC: Now that we have JavaBeans, it is interesting to compare both technologies --COM and Beans-- and see how they offer different solutions and abstractions to the same problem.

AH: Everything in JavaBeans is built into Java, the language itself. All the junk that you have to do with type libraries, all of the garbage interfaces that you've got to deal with in order to introspect an object and figure out what it does, that's all built into Java.
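As a rough illustration of that built-in introspection, the sketch below asks the standard `java.beans.Introspector` for the properties of a small bean. The `Thermostat` class and the `propertyNames` helper are hypothetical and exist only for this example; `Introspector`, `BeanInfo`, and `PropertyDescriptor` are part of the standard library.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

class BeanIntrospectionDemo {
    // Hypothetical bean used only for illustration. It is public so the
    // introspector's reflective lookup can see its getter and setter.
    public static class Thermostat {
        private int targetTemperature;
        public int getTargetTemperature() { return targetTemperature; }
        public void setTargetTemperature(int value) { targetTemperature = value; }
    }

    // Returns the property names the introspector discovers from the
    // getXxx/setXxx naming convention.
    static List<String> propertyNames(Class<?> beanClass) {
        try {
            // Stop at Object so the inherited "class" property is excluded.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor descriptor : info.getPropertyDescriptors()) {
                names.add(descriptor.getName());
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("could not introspect " + beanClass, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(propertyNames(Thermostat.class)); // prints [targetTemperature]
    }
}
```

No type library or registration step is involved: the property is discovered from the class itself at run time.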

TNC: In general, the abstraction model in Java is much simpler and more intuitive than it is in COM.

AH: That's just because the language is simpler. COM is nothing but a way to simulate virtual functions in languages that don't support them.

TNC: Now, what other projects are you working on? I guess the book is taking up most of your time?

AH: Right now, because I'm desperate to get it finished. I've been doing a lot of teaching. If anybody is interested I've got details up on the Web site.

I've been mostly teaching Java-related subjects for very advanced programmers --not introductory level material but really hard-core programming things, and a lot of it OO. I find that with OO design in particular there's a click factor. People take classes or they read books but they still can't do it because they've never applied it.

TNC: You’re big on object-oriented design. How about generic programming and generic algorithms? Generic programming is an interesting programming model whose aims sometimes overlap with OO --in its attempt to maximize code reuse, for example.

AH: It is interesting but I haven't done much with it. I've read some books but I haven't done much with it.

TNC: And it is true that C++ --because it's a multi-paradigm language or because it doesn't impose any style of programming, for good or for bad -- naturally became a good language to use generic programming in.

AH: This is good, but I like structure.

TNC: What's going on with generics and Java? I think there's a generic keyword.

AH: There is a generic keyword -- it's not implemented, though.

Templates provide a useful mechanism, and Java of course doesn't have them. But I'm not sure that Java needs templates so much as it needs to be able to do the sorts of things that templates can do. I had a talk about this with Jim Gosling a year and a half ago now at a JavaOne. He asked, and I think this is a sensible approach, "What do you use templates for?" The answer is that more often than not you need a version of a function that has a different return value than some other version of the same function, and the language doesn't support that. 

So he's saying, "I think it would make more sense to add a facility to the Java language that would allow different overloads of functions to return different types than it would to actually introduce templates into the language, because it's too easy to misuse templates." And I think he's got a point.

TNC: Well, the idea behind generic programming is that you want to specify and isolate behavior --a sort algorithm, for example-- and apply it to different object types. 

AH: Right. Of course, any object-oriented system is generic in that sense. But there are a lot of problems there. If you apply a sort algorithm to a hash table, it's going to have to [behave] in a very different way than if you supply a sort algorithm to a tree. And a tree doesn't have to do anything. Or a hash table is not inherently sortable. You have to turn it into an entirely different data structure in order to be able to sort the thing. I'm not a strong STL fan. The whole notion of separating the algorithms from the data structure looks nice on paper but it's hard to imagine that you can make it efficient enough to actually pull it off. You can do a lot with iterators, but there is a limit to what you can do with [them].
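The hash-table-versus-tree point can be made concrete with the standard Java collections. This is only a sketch; the class and method names (`SortingContainersDemo`, `sortedKeysOfHashTable`, `keysOfTree`) are hypothetical. A `HashMap` has no useful order, so its keys must be copied into another structure before sorting, while a `TreeMap` already keeps its keys in order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SortingContainersDemo {
    // A hash table is not inherently sortable: copy the keys into a
    // list (a different data structure) and sort that instead.
    static List<String> sortedKeysOfHashTable(Map<String, Integer> table) {
        List<String> keys = new ArrayList<>(table.keySet());
        Collections.sort(keys);
        return keys;
    }

    // A tree-based map keeps its keys ordered, so there is nothing to sort.
    static List<String> keysOfTree(TreeMap<String, Integer> tree) {
        return new ArrayList<>(tree.keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> table = new HashMap<>();
        table.put("gamma", 3);
        table.put("alpha", 1);
        table.put("beta", 2);

        System.out.println(sortedKeysOfHashTable(table));        // prints [alpha, beta, gamma]
        System.out.println(keysOfTree(new TreeMap<>(table)));    // prints [alpha, beta, gamma]
    }
}
```

Both calls produce the same ordered keys, but the work behind them differs: the hash table needs a copy plus a sort, and the tree only needs a traversal.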

TNC: Did you attend Gosling's keynote speech?

AH: No, I didn't have a chance unfortunately.

TNC: He talked about Hotspot. What do you know about Hotspot?

AH: Just what's been in the Sun press stuff. Did he say when it was going to come out? That's the piece of information I want to have.

TNC: No, I'm reading the piece here in the Dr. Dobb's daily. "'Hotspot is basically done,' said Gosling, adding that his team is really paranoid about quality assurance and is holding off release until critical bugs are eliminated."

AH: Well, I don't mind that. There's been too much buggy software being foisted on the market lately, and I -- there's this notion in the software industry that first to market is more important than anything, and I don't buy it. It's the Microsoft approach but Microsoft can get away with that. Microsoft has an advertising budget larger than the gross national products of most countries. And the issue there is that if you're going to compete with Microsoft, the only way that you can compete with them is by doing things that they are incapable of doing as a company. And I put software quality into that category, and I put good UI design into that category. Quicken is a good example of the latter. In terms of the former, I think that if somebody could come out with a word processor that had the sort of basic functionality that people need in a word processor but was rock solid reliable, that they'd have a competitive edge over Microsoft.

And with Java in particular, Java is at a place right now in its life where buggy stuff is just not going to work, in that a lot of people are really deciding right now whether to use it. And if Hotspot comes out and Hotspot is buggy, a lot of people are just going to decide not to use Java at all, and I think it's a good thing not to release it until it's ready.

TNC: Faster synchronization is one of the optimizations that will be included in Hotspot.

AH: I'm doing a seven-month-long series on threads for JavaWorld. If anybody doesn't know about JavaWorld they should. Java provides the primitive operations that you need to develop a decent threading library, but it doesn't develop it [further].

TNC: So you're not stuck with a threading model. You can actually build on top of the built-in primitives. You can build your own monitors, for example, and other concurrency models.

AH: Yeah, if you're going to do anything real you have to. The stuff that's in Java, the language itself, is so simplistic as to be unusable for anything more complicated than spreading your logic around.
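To show what building on the language primitives can look like, here is a minimal counting semaphore written with nothing but `synchronized`, `wait`, and `notifyAll`. It is a sketch under simple assumptions (no fairness, no timeouts), the class name `SimpleSemaphore` is hypothetical, and it is not taken from Holub's JavaWorld series.

```java
// A minimal counting semaphore built only from the language primitives.
// Hypothetical class for illustration; no fairness or timeout support.
class SimpleSemaphore {
    private int permits;

    SimpleSemaphore(int initialPermits) {
        if (initialPermits < 0) {
            throw new IllegalArgumentException("permits must not be negative");
        }
        permits = initialPermits;
    }

    // Blocks until a permit is available, then takes it.
    synchronized void acquire() throws InterruptedException {
        while (permits == 0) { // the loop guards against spurious wakeups
            wait();
        }
        permits--;
    }

    // Takes a permit only if one is available right now.
    synchronized boolean tryAcquire() {
        if (permits == 0) {
            return false;
        }
        permits--;
        return true;
    }

    // Returns a permit and wakes any threads blocked in acquire().
    synchronized void release() {
        permits++;
        notifyAll();
    }

    synchronized int availablePermits() {
        return permits;
    }

    public static void main(String[] args) throws InterruptedException {
        SimpleSemaphore ready = new SimpleSemaphore(0);
        Thread worker = new Thread(() -> {
            System.out.println("worker: finished setup");
            ready.release();
        });
        worker.start();
        ready.acquire(); // blocks until the worker calls release()
        worker.join();
        System.out.println("main: permits left = " + ready.availablePermits());
    }
}
```

The same three primitives are enough to build other constructs, such as bounded queues or reader/writer locks, which is the layering the exchange above describes.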

TNC: Concurrency is not built into C++, for example, because it was decided that doing so would limit developers to certain fixed solutions.

AH: On the other hand I want, as much as possible, to have a platform independent way to do multi-threading, and Java does let me do that if I program with my eyes open. Threads are one place where Java's notion of platform independence breaks down. In order to get parallelism as compared to the appearance of parallelism, in other words, in order to use multiple CPUs in a program for multiple threads, you really have to go through to the underlying operating system's native thread model. As a result, different applications will run differently on different platforms.

TNC: For example, NT uses a worker pool model and Unix uses a totally different model.

AH: Well, that varies with the Unix environment. The article that I just put up on JavaWorld talks about threading architectures on various systems. The Solaris one is an extreme [case] because you can have cooperative and preemptive threads all running in the same application. It's a very complicated model. The point is that Java can't address any of that. So you can write platform-independent multi-threaded stuff in Java, but just because it's written in Java does not make it platform independent in terms of behavior. And so in order to really get control of that, you have to provide a layer around the outside that will let you have architectural solutions to problems that can't be solved by the language itself.

TNC: You mention platform compatibility. Is it important right now that Java is maybe not as platform-independent as claimed or is speed more of an issue? 

AH: Well, Java is extremely platform-compatible in many ways. The networking stuff has always been platform-independent, and I've had no problems with that at all.

TNC: Well, a lot of the GUI stuff is not.

AH: Swing should fix that. Swing has problems, mostly in the speed department in some of the controls. The JTable, which is their grid control, is the biggest one that has problems. But I'm hopeful there because I've seen it speed up. The early Beta for Swing was slow and then a Beta came out and it was much better than the early one. At JavaOne I talked to Amy Fowler about that, and she said essentially that they decided at some point to abandon the speed-up efforts in favor of just getting everything working. Once they get everything working they will go back to speeding it up again. Now, it remains to be seen whether they'll be able to speed it up enough. However, I have seen it speed up once so I'm hopeful that they'll be able to get it worked out.

On the other hand, the cool thing about Swing of course is that 100 percent of the rendering is done in Java. You could write a full-blown Windows application with dockable toolbars and drag and drop and all of that stuff in Java using Swing, and zero of that code [will be Microsoft Windows code].

TNC: That's what Microsoft is afraid of.

AH: That's absolutely what they're afraid of, for good reason. But on the other hand, from my point of view, it's wonderful. It means I can write the application once and it literally will run anywhere. What Microsoft should really be doing with respect to Java is not try to kill it, but try to do the best possible implementation of it on the Windows platform. They should be modifying Windows so that it is the best possible host for Java. In other words, they should work with it. By trying to kill it I think all they're going to do in the long term is kill themselves.

TNC: They don't buy the core value proposition.

AH: Yeah, but I do. Microsoft has done things that are just wrong. They have defined UI as operating system functionality, for instance, and I don't buy it. I don't think the UI is the operating system. The operating system has to do with virtual memory, it has to do with the network layer, it's got to do with the threading model, it has nothing to do with UIs. And because of that they feel threatened by anything that does a UI, and I think that's a really shortsighted way to think about that. If NT5 is not rock-solid reliable when it comes out, Microsoft will lose whatever credibility it has in the enterprise, which is not much. They like to crow about how NT is being adopted by the enterprise but that's bull. It's being adopted on the desktop as a workstation platform but every company that I know that has done an NT project at the server level has ended up abandoning NT because it was too. Linux, on the other hand, costs $45.00, is rock-solid reliable, and it runs a server just fine. Microsoft is going to have to address that reliability issue if they're going to get anywhere. The only way they're going to be able to do that is if they give up on this UI junk and just focus on the real operating system, the core operating system.

TNC: Actually, it's interesting that they still have not separated the UI from the internals of the operating system.

AH: No, quite the contrary, they've put them in the kernel. Which is unstable.

TNC: To make it faster. 

AH: Well, to make it faster, so now I can run MYST on NT but NT crashes three times as often as it used to.

TNC: Okay, Allen, thank you very much for joining us today. We'll look out for your book coming maybe not so soon in bookstores.

AH: No, I'm supposed to be done with the writing in about a month and a half, so say beginning of the year.

TNC: Thanks, Allen.
 



Copyright (c) 1998, Dr. Dobb's TechNetCast