TechNetCast 1998-08-07
C++: New and Improved!
In his keynote speech at C++ World this week Bjarne Stroustrup urged
users to take a new look at the language
in light of its recent adoption by ANSI and ISO.
With C++ standards committee member Nathan Myers we examine
why Standard C++ is a fundamentally different language and we look at techniques
and programming styles developers can use to take advantage of its new features.
This program is part of the State of C++ series: interviews with Stan Lippman, Nathan Myers and Dan Saks.
Transcript

TNC: I attended Bjarne Stroustrup’s C++ World keynote last week and took away a few interesting points. He reiterated some of the ideas that he previously developed in both the third edition of The C++ Programming Language and an interview published recently in IEEE Computer magazine. The main point of the speech was that C++ is fundamentally a new language and that users should go and have a look at the new features and start programming using these new concepts. The idea of course is that C++ supports different styles, but to really use the language efficiently you should use what he calls "new style C++".

NM: The language as standardized is quite a lot more powerful than the language that was described even in the second edition of his book. It now supports programming approaches that really weren’t practical before. And I am pretty excited about that.

TNC: Well, he was certainly very upbeat about the language and also about the fact that it is closer to his ideals than any previous incarnation of the language. He also stated it was very close to his original design objectives for the language. [Ed. Note: These objectives are presented in "The Design and Evolution of C++".] Does the committee still exist now that the standard has been approved?

NM: Yes. We are meeting again in October and have a couple of things on the list. Of course there are bugs in the standard and we are going to hammer those. There are places where users have discovered that it’s not clear what the standard says. They have come up with alternative interpretations and they want to know which one the committee thinks is right.

TNC: Ambiguities that need to be clarified.

NM: And we have plans for the future too. For instance, there has been a request that the committee study and make recommendations for implementers who are doing compilers intended for embedded systems work.

TNC: What kind of features would need to be added to the language to support embedded systems?
NM: The main issue is that a lot of embedded systems programmers want to have features eliminated. They typically have very tight memory constraints. And some of them feel that they can’t afford to have exceptions. Or that they can’t afford to have the full library.

TNC: They want a smaller feature set.

NM: They want help from the compiler vendors to be able to make the programs smaller. And so part of the work of the committee is [to produce and evaluate] recommendations for how an implementer can produce a compiler that produces smaller executables. The easiest or the most trivial way to do that is to leave parts out. You can have a flag to turn off exceptions[, for example]. That [results in] a smaller executable. A more competent or better informed implementer can just as well implement exceptions more efficiently. Or have the compiler leave out parts of the library that aren’t being used: instead of shipping a library that lacks those parts, they just have to not link them in.

TNC: Are the executables produced with this version significantly larger than executables that were produced by earlier releases?

NM: The standard doesn’t say they have to be.

TNC: But do the new features cause the compiler to generate significantly more code?

NM: If you write the code in the simplest possible way to meet the standard requirements, then, yes, things will be bigger. So compiler vendors have a responsibility to implement things cleverly, in a way that doesn’t make things [significantly larger].

TNC: So dropping features will then necessarily make executables smaller?

NM: Right. The thing is that optimization is something that takes intelligence and hard work. And that is something you expect your compiler vendor to do.

TNC: Rather than leave exceptions out entirely, the compiler should be smart enough, if the application does not use exceptions or run-time facilities, not to generate code for these.

NM: Mainly features like run-time type information.
If you are not using those, why link them in?

TNC: Okay, let’s get back to the standard... Because we finally do have a standard. One of the claimed benefits of Standard C++ is that it allows users to approach the language on a higher level. Would you agree? And what are some examples of that?

NM: The most significant advance in the language is the array of template features. We have had templates for quite a long time, but now we have member templates and we have partial specialization. We have template function overloading.

TNC: Nathan, can you just very quickly define partial specialization?

NM: In general specialization, you can define a special version of a class or part of the class for a particular type. Now partial specialization means that the specialization itself can be parameterized. So, if a vehicle for example is a template, I can specialize on "vector of vehicle" and leave whatever the vehicle is parameterized on unspecified. That wasn’t possible with the old specialization. This becomes really powerful. Suppose that you can implement a set of strings differently than you would a set of some other type. You don’t want to have to say this is specialized for string of char only. You want to be able to specify that you want to do this special operation on any kind of string. And to do that you need partial specialization. Template function overloading is the counterpart to that for functions.

TNC: How would that work?

NM: As an example, suppose you have the general function template copy in the library, which will copy from any range of iterators to an output iterator.
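
[Ed. Note: A minimal sketch of the partial specialization and function template overloading described above. The names Describe, describe and usage_example are hypothetical, chosen only for illustration; they do not come from the interview or from the standard library.]

```cpp
#include <cassert>
#include <string>
#include <vector>

// Primary template: the general case for any type T.
template <typename T>
struct Describe {
    static std::string name() { return "some type"; }
};

// Full specialization: the type is fixed completely.
template <>
struct Describe<int> {
    static std::string name() { return "int"; }
};

// Partial specialization: still parameterized on T, so it covers
// std::vector<T> for every element type T.
template <typename T>
struct Describe<std::vector<T> > {
    static std::string name() { return "vector of " + Describe<T>::name(); }
};

// Function templates cannot be partially specialized; overloading
// plays the same role. The second overload is more specialized and
// is selected for any std::vector<T>.
template <typename T>
std::string describe(const T&) { return "general"; }

template <typename T>
std::string describe(const std::vector<T>&) { return "vector"; }

// Small self-check showing which definition is selected in each case.
inline void usage_example() {
    assert(Describe<double>::name() == "some type");
    assert(Describe<std::vector<int> >::name() == "vector of int");
    assert(describe(42) == "general");
    assert(describe(std::vector<int>()) == "vector");
}
```

The partial specialization leaves the element type open, which is the point Nathan makes about "vector of vehicle": the special case applies to a whole family of types rather than to one.
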
If you happen to know something about those iterators, for instance that this is a stream iterator, then instead of using the generic copy definition, which would for example get a single character from the iterator and write it out to the other iterator, you can specify that this is a stream iterator and take a whole range of characters or objects from that stream and write them out all at once, without having to go through the regular loop.

TNC: All these features are documented in "The C++ Programming Language".

NM: That’s right. They are in Stroustrup’s 3rd edition too.

TNC: What other language features facilitate a higher level approach?

NM: The other features include namespaces. Namespaces are a lot more powerful than we thought at first.

TNC: Namespaces are a way to avoid name collisions. But they are also a way of organizing variables and classes.

NM: Right. Namespaces let you group them. It turns out that they are useful for two other things. This was not really obvious at first. If you put all your code in a namespace, you are protected against what other people dump into the global namespace. And that helps. So it’s not just a matter of being able to have, say, the standard library and the other libraries you are using each in a namespace, and being able to get those names out. You can actually use namespaces in a program that you don’t expect to release as a library. Secondly, namespace names become part of the linkage name of any object that’s in them. This is really, really handy. It means that if you compile a module against a library that’s in a namespace then you can, if you were clever when you first named that namespace, have a later version of that same library linked in at the same time. So this allows you to transition from one version of the library to another and still have the old library accessible.

TNC: Were there any discussions among the committee to have distributed namespaces?

NM: You are thinking of the equivalent of Java.
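
[Ed. Note: A rough sketch of the namespace versioning idea mentioned above, assuming the "clever" naming means putting a version in the namespace name. The library name shapes and every function here are hypothetical and exist only for illustration.]

```cpp
#include <cassert>
#include <string>

// Hypothetical library, first release. The version is part of the
// namespace name and therefore of every linkage name inside it.
namespace shapes_v1 {
    inline std::string describe(int sides) {
        return sides == 3 ? "triangle" : "polygon";
    }
}

// A later release with changed behavior. Its names do not collide
// with shapes_v1, so both releases can live in one program.
namespace shapes_v2 {
    inline std::string describe(int sides) {
        if (sides == 3) return "triangle";
        if (sides == 4) return "quadrilateral";
        return "polygon";
    }
}

// Client code selects a release through an alias and can move to a
// newer one by changing this single line.
namespace shapes = shapes_v2;

// Code written against the old release keeps calling it explicitly.
inline std::string legacy_label(int sides) { return shapes_v1::describe(sides); }
inline std::string current_label(int sides) { return shapes::describe(sides); }

// Both releases are usable side by side during a transition.
inline void usage_example() {
    assert(legacy_label(4) == "polygon");
    assert(current_label(4) == "quadrilateral");
}
```

Because the two namespaces produce distinct linkage names, a module compiled against the old release and one compiled against the new release can be linked into the same executable.
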
TNC: Right.

NM: You know, that’s going to cause the Java people a real problem. Because the domain names change and the domain names get assigned from one company to another. Project names change, company names change. If there is a company name in the domain, when the company changes its name, somebody can come along and say you’ve got to change that and all these class header files.

TNC: That could also happen with C++ namespaces.

NM: Right. That means that you should be very, very careful never to put a company name or a division name or a trademarked product name in a namespace. Because those things get changed. They get transferred to other people. For reasons that are not really technically sound they get changed, and then people get ordered to break all the code that depends on that name. And this has happened already in Java, by the way.

TNC: Let’s talk about implementation. What compilers out there support the C++ standard, if any, at this point?

NM: Well, the situation in UNIX land is quite a bit better than in PC land. Most of the compilers on UNIX are built on top of the Edison Design Group front end. It basically implements all the features of the standard, with one exception: exported templates. I can explain later what that is about. Hewlett-Packard has a compiler that traces its heritage back to Taligent that also implements the full language. And there is a new compiler coming out from Cygnus; it’s free. And it’s generally a free software project, so you can get involved too. Version 1.0.3 is out now. It supports a lot of the new features, and 1.1 is coming out this month.

TNC: We should just mention that you actually work for Cygnus.

NM: I am on contract for Cygnus now. And I am working on a free implementation of the standard library.

TNC: What is the name of that compiler?

NM: We call it the egcs compiler. You can find out more about it at egcs.cygnus.com. The latest snapshot version implements everything except exported templates.
You can get the snapshot version from the web page. It’s not the officially released version yet, but it’s thoroughly tested and that’s the one I use.

TNC: What are exported templates?

NM: Exported templates were one of the last major issues resolved by the committee. It’s about being able to use a template without actually showing the compiler the sources to the template member functions inline. [Previously,] the only portable way you could do templates was to put the entire template definition in a header file and then include the entire header file everywhere you use the template.

TNC: And that results in code replication.

NM: It means that if you have 100 files, then you have to show all your templates to the compiler 100 times over. Exported templates are a feature that [makes it possible to] export a template declaration. The class body and the actual member function definitions are to be found elsewhere and you don’t have to show those to the compiler.

TNC: We talked a bit about available compiler implementations. How about STL? What is the state of STL implementations? The original implementation was the HP implementation.

NM: Right. We started with HP. Basically everybody copied that one and then started making changes to it. Alex Stepanov, who invented the STL along with Meng Lee, moved over to Silicon Graphics and released another version there. And it has since been worked on mainly by Matt Austern at SGI. They have released several versions. The current version from SGI is version 3.11 and you can get that off of SGI’s web page. The better commercial implementations are based on that one. Some of the commercial implementations are based on previous versions of that one. Thread safety is one of the places where they tend to vary. Thread safety is new in SGI version 3.11. The others have more or less thread safety.

TNC: There was an article in C++ Report recently that compared different STL implementations and tested them for thread safety.
And I think the result was that the Rogue Wave implementation had some problems. The Microsoft implementation was slow because of how it used critical sections. But apparently that’s been fixed.

NM: Yes. Rogue Wave says their implementation is now thread safe. I don’t know if Microsoft is now faster. I haven’t tried it. The other major area where they tend to differ is in memory management. The HP implementation had fairly rudimentary memory management. And the Rogue Wave/Plauger/Microsoft implementation made some changes to that which weren’t necessarily improvements. The current SGI implementation does quite good memory management. But I don’t know how much of that has been copied to the other ones. So generally you’ll get the best results if you don’t use the one that is built in to your compiler and instead go out and get the SGI one.

TNC: Performance is certainly going to improve as different releases are put out. But how do these implementations compare as far as compliance to the standard?

NM: They are all pretty much there as far as what the standard requires. Some of them have various extensions. The biggest outstanding problem they have is that it is not always clear what is an extension and what is really standard. I am working with Matt on the SGI implementation to make sure that everything that is not standard is marked as being not standard. Now, it may be that in another five years, in a later version of the standard, some of these features will be adopted. But probably a lot of them won’t. And it’s a good idea to know which of the things you are using are actually standard.

TNC: Nathan, let’s talk a bit about what you are doing at Cygnus right now.

NM: There is an ongoing project to replace the C++ library that comes with the egcs compiler. The current version shipping with the egcs compiler is the same one that has been going out for two or three years.

TNC: Is this the STL? Or is it just the old C++ library?

NM: Well, the standard library includes a lot of things.
It includes the STL, it includes IO streams. It includes a string [class], and it’s got some numeric features, complex and valarray. The standard library is quite a lot more than the STL. The current egcs compiler ships with what can accurately be described as an obsolete version of the library. Cygnus has the goal of having a standard library.

TNC: What parts of the library are you working on? How do you approach the work? What do you do on a day to day basis?

NM: The project has the goal of producing a complete, conforming standard C++ library. And we have begun by adopting the most current SGI STL and we’ll be keeping up with that. Along with that, of course, come the conforming iostreams. Gabriel Dos Reis contributed a valarray implementation. This is an open free software project, what is now called an open source project.

TNC: GPL?

NM: It’s actually not GPL. It’s on a looser, freer license than that. Which means that any company that wanted to adopt this library as their standard library would have the right to do that and also to distribute locally modified versions of the library.

TNC: Specifically what parts are you working on? What difficulties do you encounter implementing some of the specifications that you and other people on the committee put into the standard?

NM: Well, it’s not very hard to write a fully conforming library. It’s a lot of work. But it’s being done. The next stage is writing an optimal library. A library that doesn’t impose unnecessary overheads on user programs. And so that’s where most of the effort is going. [Making sure] that a fully conforming, fully flexible iostream is just as fast as standard IO in a mature C compiler or in a mature C library, and produces executables that are not substantially bigger. That is going to be important to Cygnus in particular because it’s heavily involved in the embedded systems industry.
TNC: Since we are talking a bit about libraries and what belongs in the C++ library, not the STL but the C++ library, how about network sockets? What’s the case for those not being part of the library?

NM: Well, the C++ standard describes what has to be in all systems. There are an awful lot of systems that don’t have a network socket and don’t understand what a network socket is. The description of operating system facilities is something that has generally been undertaken by POSIX. This typically starts with an ISO standard and then gets extended with various facilities for memory management, sockets and files.

TNC: At the conference this week Bjarne said that the language was now at a state where he hoped people would start developing specialized libraries for specialized applications. And I guess sockets would be one of the first pieces of functionality that could be implemented, using a new coding style and new techniques. Nathan, I am sorry. We have to take a quick break.

NM: Now that there is a standard, the other standards groups can reference it and make bindings for operating system interfaces. So you could start to see POSIX standards defined in terms of C and C++ interfaces, optimized for C++ systems, at the same time. Instead of defining a function to create an object and another function to destroy it and another function to operate on it, the standard could actually have a class definition in it with constructors and a destructor. Now that we have a standard, you are more likely to see a variety of commercial and free C++ libraries. The opportunity for doing really powerful C++ libraries is outstanding. There are things you can do in C++ that just really aren’t practical to do in other languages. And I can give you an example. You have seen regular expression libraries in C.
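
[Ed. Note: A minimal sketch of the contrast drawn above between a create/operate/destroy function interface and a class with a constructor and a destructor. The channel interface below is entirely hypothetical; it is not a POSIX API and stands in for any C-style handle.]

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical C-style interface: one function each to create,
// operate on, and destroy the object.
struct channel { int sent; };
static int live_channels = 0;   // counts outstanding handles, for the demo only

channel* channel_create() {
    channel* c = static_cast<channel*>(std::malloc(sizeof(channel)));
    if (c) { c->sent = 0; ++live_channels; }
    return c;
}
int channel_send(channel* c, int bytes) { c->sent += bytes; return c->sent; }
void channel_destroy(channel* c) { if (c) { --live_channels; std::free(c); } }

// The same facility as a class: the constructor acquires the handle
// and the destructor releases it, so cleanup cannot be forgotten.
class Channel {
public:
    Channel() : handle_(channel_create()) {
        if (!handle_) throw std::bad_alloc();
    }
    ~Channel() { channel_destroy(handle_); }
    int send(int bytes) { return channel_send(handle_, bytes); }
private:
    Channel(const Channel&);             // non-copyable: declared, never defined
    Channel& operator=(const Channel&);
    channel* handle_;
};

// The handle is released automatically when ch goes out of scope.
inline void usage_example() {
    {
        Channel ch;
        assert(live_channels == 1);
        assert(ch.send(10) == 10);
    }
    assert(live_channels == 0);
}
```

The class version needs no explicit destroy call at each exit path, which is the practical difference between the two interface styles.
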
These regular expression libraries are always designed to work over only a particular kind of character and a particular kind of sequence, usually a pointer, a regular C array or a file stream. And the way that they present the results is fixed. You only have one way of looking at what you’ve got, and this makes them really hard to use.

TNC: So how would you build a C++ regular expression engine that handled different data types? Not only chars but streams of other types?

NM: A useful thing that you might want to feed into a regular expression engine would be user interface events. You know, there is no reason to restrict it to chars. Somebody entering a character from the keyboard would be one of those events. But there are lots of other kinds of events. Another example would be instructions. It’s traditional to write peephole optimizers in terms of regular expression matching. And there is no reason you couldn’t design a regular expression library which was actually useful for peephole optimization. Now that’s an example of where the interface that the regular expression engine presents is something that you would want to be able to make look like a filter. You feed a sequence of characters or symbols in one end and you get another sequence of symbols out the other end. There are also other ways that you might want to look at the results from a regular expression, such as a sequence of text positions, or regions that describe which sections matched.

TNC: The output of the regex engines in C is typically arrays of integer positions. The first integer is the start position of the match and the second integer is the length of the portion of text that is matched. What C++ features would you use to make your regular expression engine type independent?

NM: The first thing I would do is organize it to conform with the STL conventions.
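
[Ed. Note: One possible reading of "conform with the STL conventions", sketched under assumptions: the matcher takes a pair of iterators and reports its result as a pair of iterators rather than integer positions. This is not a regular expression engine; find_run is a hypothetical helper that matches only what a pattern like "p+" would, and the predicate names are likewise invented for illustration.]

```cpp
#include <cassert>
#include <cctype>
#include <list>
#include <string>
#include <utility>

// Finds the first maximal run of elements satisfying pred.
// Works on any forward range: strings, C arrays, lists of events.
template <typename ForwardIt, typename Pred>
std::pair<ForwardIt, ForwardIt> find_run(ForwardIt first, ForwardIt last, Pred pred) {
    while (first != last && !pred(*first)) ++first;   // skip to the start of a match
    ForwardIt end = first;
    while (end != last && pred(*end)) ++end;          // extend the match as far as it goes
    return std::make_pair(first, end);                // (last, last) when nothing matched
}

inline bool is_digit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }
inline bool is_negative(int n) { return n < 0; }

// The same function template serves a std::string and a std::list<int>.
inline void usage_example() {
    std::string s = "ab123cd";
    std::pair<std::string::iterator, std::string::iterator> r =
        find_run(s.begin(), s.end(), is_digit);
    assert(std::string(r.first, r.second) == "123");

    std::list<int> events;
    events.push_back(3); events.push_back(-1); events.push_back(-2); events.push_back(5);
    std::pair<std::list<int>::iterator, std::list<int>::iterator> e =
        find_run(events.begin(), events.end(), is_negative);
    assert(*e.first == -1);
    assert(*e.second == 5);
}
```

Because the element type and the sequence type are both template parameters, nothing in the matcher is tied to char, which is the type independence the question asks about.
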
This means that any old sequence, whether it’s a file or a string or a C array, will work with it. You can come up with your own sequence abstractions and just attach it to them. An editor buffer would be a good example of something that you might want to attach regular expressions to. The second thing is the output interface. You’d also want to be able to structure it so that users can plug in different kinds of abstractions. One such abstraction would look like an iterator. And you could iterate over matching sequences, or characters that matched and were replaced by other characters perhaps. This is all done with templates of course. Everything really interesting is done with templates. But the main thing that you need in a regular expression engine is speed. If it’s not fast enough then you just can’t really use it. And the way we get that in modern C++ is by specializing. So if you happen to know that you are working with an array of characters you can specialize for that case.

TNC: This makes it possible to write code that is optimized for different types.

NM: Yes. And if you architect the library properly you allow users to identify special cases for their own types and optimize for those as well.

TNC: One of the claimed benefits of modern C++ is that it’s easier to learn than regular C++. This is a point that Bjarne made during his speech. He also makes it repeatedly in the book. How true is that? Do you honestly believe that modern C++ programming is easy to learn?

NM: Modern C++ is a language for professionals.

TNC: Next week I’ll be at Software Development and I’m going to meet a lot of C++ and Java users. The Java people are going to tell me that C++ is difficult and that the learning curve is the biggest advantage of Java over C++.

NM: Learning everything about C++ is something that takes longer than for other languages, except Perl perhaps. To learn everything about Perl takes years and years.
And the reason is that most of Perl is not commonly used in most programs. The same is true of C++. C++ has RTTI features. Have I ever used RTTI features? Well, I have used, I think, one of them.

TNC: Bjarne Stroustrup says that it is possible to program in C++ without knowing the entire language. He believes users should first focus on high-level constructs and not bother, for example, with the C subset of the language. Do you believe this to be true?

NM: Yes, it is. And to make it practical requires that the people who write the libraries are careful.

TNC: Okay. We have to leave it here. I just want to mention your website, www.cantrip.org. Interesting site where you can learn, for example, that plastic cutting boards provide a nearly ideal environment for bacteria.

NM: That’s right. Somebody tried it and it happened.

TNC: And also that human sperm comes in three varieties.

NM: That’s what it says. I got that from a book called The Red Queen.

TNC: Okay, our guest has been Nathan Myers. Nathan is a member of the C++ standards committee. Nathan, thank you very much for coming back on the show today.

NM: Thanks for having me.

Copyright (c) 1998, Dr. Dobb's TechNetCast