For those who are not of the "really-boring-tech" genus, a virtual machine is a pseudo-machine that runs in software, as opposed to actual physical hardware. The Java runtime environment is an example of this, as are traditional emulators. The trick is to provide a platform on which programs execute, similar to normal hardware.
I found this book more than helpful while writing Klea, my own yet-to-be-finished virtual machine, even though our goals are very different. HEC is a much lower-level virtual machine than Klea, and is register-based, as opposed to stack-based. Still, a lot of the concepts are universal. Blunden explains everything pretty clearly, and HEC is more than I expected from an 'example' virtual machine.
The book leads with a very interesting foreword, describing the uses of virtual machines today and in the past, and why they're generally a good idea. His slightly off-topic historical notes are some of the most enjoyable parts of the book. I'd suggest this as reading for anyone who really wants a good understanding of how some of the "black magic" of today came to be.
Mr. Blunden describes each instruction meticulously. HEC's instruction set is not cluttered with meaningless once-in-a-blue-moon operations; it is a minimal, functional set that is more than adequate for its purpose as a learning tool. He also spends a few paragraphs on the implementation of the not-so-obvious instructions. He supplements this with many working examples of HEC assembly programs that demonstrate most of the functionality of HEC.
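To give a feel for what a minimal register-based instruction set looks like in practice, here is a toy fetch-decode-execute loop in C. This is only a sketch under my own assumptions: the opcodes (OP_HALT, OP_LOADI, OP_ADD), the four-register layout, and the names ToyVM and toy_run are all made up for illustration and are not HEC's actual instruction set or source code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for illustration only; NOT HEC's real encoding. */
enum { OP_HALT = 0, OP_LOADI = 1, OP_ADD = 2 };
enum { TOY_NUM_REGS = 4 };

typedef struct {
    uint32_t regs[TOY_NUM_REGS]; /* unsigned, so addition wraps instead of overflowing */
    size_t pc;                   /* index of the next byte to fetch */
} ToyVM;

/* Executes bytecode until OP_HALT.
   Returns 0 on a clean halt, -1 on a bad opcode, a bad register index,
   a truncated instruction, or running off the end without OP_HALT. */
int toy_run(ToyVM *vm, const uint8_t *code, size_t len)
{
    vm->pc = 0;
    while (vm->pc < len) {
        uint8_t op = code[vm->pc++];
        switch (op) {
        case OP_HALT:
            return 0;
        case OP_LOADI: { /* LOADI reg, imm8 */
            if (len - vm->pc < 2) return -1;
            uint8_t r = code[vm->pc++];
            uint8_t imm = code[vm->pc++];
            if (r >= TOY_NUM_REGS) return -1;
            vm->regs[r] = imm;
            break;
        }
        case OP_ADD: { /* ADD dst, src1, src2 */
            if (len - vm->pc < 3) return -1;
            uint8_t d = code[vm->pc++];
            uint8_t a = code[vm->pc++];
            uint8_t b = code[vm->pc++];
            if (d >= TOY_NUM_REGS || a >= TOY_NUM_REGS || b >= TOY_NUM_REGS)
                return -1;
            vm->regs[d] = vm->regs[a] + vm->regs[b];
            break;
        }
        default:
            return -1;
        }
    }
    return -1;
}
```

Feeding it the bytes for "LOADI r0, 2; LOADI r1, 40; ADD r2, r0, r1; HALT" leaves 42 in r2. A stack-based design would pop its operands from an operand stack rather than naming registers in the instruction, which is the main structural difference between the two styles.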
The author explains in detail how stack allocation works, in easy-to-understand terms, but is a little vague and undecided about the heap. VM design isn't an exact science, so you can live with this. Most core memory management topics are covered in depth, with a lot of nice illustrations to clarify exactly what (should) be going on.
One of the more interesting chapters concerns interprocess communication (IPC). Several mechanisms are explored and rated in terms of portability, speed, and ease of use, with a bias towards sockets. TCP/IP is dissected in enough detail to build a clean socket interface into the virtual machine. Sockets, like many "higher level" operations such as file I/O, are handled via an interrupt handler, which shows the reader a powerful but simple way to extend HEC with more advanced functionality without overcomplicating the instruction set.
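The interrupt approach is easy to picture as a table of host functions indexed by vector number. The sketch below is my own illustration of that general technique, not HEC's code: the names IvMachine, iv_register, iv_raise, and iv_host_putc, the vector count, and the register conventions are all assumptions I made up for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum { IV_NUM_REGS = 4, IV_NUM_VECTORS = 8 };

/* Hypothetical machine state: just a handful of registers. */
typedef struct {
    uint32_t regs[IV_NUM_REGS];
} IvMachine;

/* A host-side service routine; it reads arguments from and writes results
   to the machine's registers. Returns 0 on success. */
typedef int (*IvHandler)(IvMachine *m);

static IvHandler iv_table[IV_NUM_VECTORS];

/* Installs a handler for a vector. Returns -1 if the vector is out of range. */
int iv_register(unsigned vector, IvHandler h)
{
    if (vector >= IV_NUM_VECTORS) return -1;
    iv_table[vector] = h;
    return 0;
}

/* What a single "INT vector" instruction would do: look up the handler
   and call it. Returns -1 if no handler is installed. */
int iv_raise(IvMachine *m, unsigned vector)
{
    if (vector >= IV_NUM_VECTORS || iv_table[vector] == NULL) return -1;
    return iv_table[vector](m);
}

/* Example service (assumed convention): write the character in r1 to
   stdout and leave a status code in r0 (0 = ok, 1 = error). */
int iv_host_putc(IvMachine *m)
{
    int c = (int)(m->regs[1] & 0xFFu);
    m->regs[0] = (putchar(c) == EOF) ? 1u : 0u;
    return 0;
}
```

After iv_register(1, iv_host_putc), a guest program that loads a character into r1 and raises interrupt 1 gets console output without the instruction set knowing anything about I/O. Adding file or socket services then means adding table entries, not opcodes.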
Interfacing the virtual machine to native code is a complex topic. He discusses various problems with calling conventions, conversions, and type compatibility when crossing the barrier between various languages and the HEC VM. Although I thought this area was slightly over-engineered, as he used XML as the interchange format, it was informative. The important part is that it shows the basics of allowing a low-level machine to "talk" to higher-level languages through a consistent method. Unfortunately, this entire topic's implementation is worthless unless you're on Windows, or want to edit a few core HEC files. This, sadly, is one of the features that was stripped from the Linux version, as discussed later.
He then discusses debuggers, and walks through his implementation of a hardware-level debugger, embedded into the VM runtime. This is somewhat crude, but is very easy to understand from the viewpoint of someone that does not intimately know the codebase. The time spent on the debugger was pretty short and to-the-point, in a good way. I, personally, think the debugging section is the most well-written and informative chapter in the book.
The entire HEC toolchain consists of the VM, debugger, assembler, and a few assorted 'inspection' utilities. It's clear how they all work together in the system, but the example code is oftentimes redundant and inelegant, and contains enough macros to piss off your preprocessor.
A lot of wheel-reinventing goes on in this book, and I thought it was a little off-track for it to invest more than a couple of pages in common, easy tasks like command line parsing. The lack of focus sometimes hurts more than it helps. The author tends to look really in depth into a problem, and then usually solves it in a completely wrong way.
It is impossible for me to be completely impartial about some of the design decisions, as I have studied this area intensely for the last year or so, and have come to some really different conclusions. But nonetheless, these are old, ongoing holy wars, so I was not quick to judge.
I was really disappointed that he swept kludges and shortcomings under the "next version" rug, especially seeing that he has no apparent online location for errata, fixes, and code updates. I felt really short-changed: I was expecting this book to provide a lot more concrete examples, rather than string me along to the revised edition, possibly years from now. For example, a lot of the source code, even as printed in the pages of the book, has fragments of a broken threading feature, but he instructs the reader to ignore that, as it'll only be useful in future versions.
There are also times where it seems he implemented obviously unacceptable limitations (relating to dynamic allocation, for example) for the sake of brevity. It appeared he didn't want to get into the core of any real solution, usually making excuses and sour-grapes statements along the way. This is not what I expected from a book that spent 20 or so pages describing his own vector and linked-list structures, and even more over-explaining how interrupts work.
Overall, he paints a clear picture of what is involved in virtual machine design, from both a maintenance and performance point of view. He wanders off the path often, but it's readable, and pretty informative.
There are, however, a few things presented in this book that I found slightly misleading, or outright ignorant.
The first problem is its title. The HEC virtual machine is written entirely in C, so I found the title of "Implementation in C/C++" very misleading. It is technically true, I suppose, since most standard C is also valid C++. The HEC assembler, however, truly was written in "C/C++", meaning "C++ using unsafe C standard library functions".
I was amazed as he exclaimed that the project was unmanageable in C and therefore required a rewrite in C++, only to write his own std::list and std::vector (extendable array) moments later. This guy is obviously coming from a C background, and writes in sketchy C++ using cstdio and iffy re-implementations of proven (and portable) STL types. This is the point where I lost a bit of confidence in the book. I'm not an object-oriented freak, but using the improved, safe C++ library routines makes a lot of sense in C++, if for nothing else than the clarity of teaching.
In the last chapter, Blunden spouts off half-truths and blatant misinformation about Linux. I think this stems from his fear that "technology is moving out from under him", as he described in the first chapter concerning his transition between DOS and Windows. Blunden boasts of HEC's portability goals throughout the book, but I think this was mostly market-speak: I don't consider a half-working, feature-ripped Linux port true "portability" like he claims. He describes his painful attempt to port HEC to Caldera OpenLinux, and complains about "Linux's lack of wide-character support" after "man wprintf" fails. He then goes on to say that if any distribution does have wide character support, it shows a "fragmentation", which, he says, is even worse.
I don't consider what he's experiencing fragmentation, but "picked the bad apple" syndrome. There's a reason Caldera basically has nil market share now. The command "man wprintf" and associated headers are in place and working in the default install of even ancient Mandrake, RedHat, and Debian systems. I doubt he'd do a DOS port on DR-DOS rather than on the real, working DOS. Why he doesn't extend the same effort to Linux is beyond me. Porting to a failing, market-trailing, mostly broken implementation of an OS is not the best way to get started, especially if you have no experience with the platform.
He then went on and on about how Linux does not support loading arbitrary libraries at runtime, which is a lie at worst, or "wrong" at best. dlopen(), dlclose(), and dlsym() work perfectly for me, and are the direct equivalents of Windows' LoadLibrary(), FreeLibrary() and GetProcAddress() calls. These functions are neither new nor Linux-specific; they are part of the Unix98 standard, and come from Solaris. I found it odd that an "ex-Unix programmer" would be so naive as to think that Linux, or any modern Unix, doesn't support what boils down to something as simple as plugins. He then goes on to show how to build a static library, claiming it's a dynamic library, and how it fails.
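For the record, runtime loading on Linux takes only a few lines. The following is a minimal sketch, not the patch I wrote for HEC: the helper name call_from_library is my own invention, and the library name used in the usage note ("libm.so.6") is an assumption that holds on typical glibc-based systems but may differ elsewhere. On older glibc versions you also need to link with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/* The signature we expect the looked-up symbol to have. */
typedef double (*unary_math_fn)(double);

/* Loads the shared library `lib`, looks up `symbol` as a double(double)
   function, calls it with `x`, and stores the result in *out.
   Returns 0 on success, -1 if loading or lookup fails. */
int call_from_library(const char *lib, const char *symbol, double x, double *out)
{
    void *handle = dlopen(lib, RTLD_NOW);      /* cf. LoadLibrary() */
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    dlerror();                                 /* clear any stale error */
    unary_math_fn fn;
    *(void **)(&fn) = dlsym(handle, symbol);   /* cf. GetProcAddress() */
    const char *err = dlerror();
    if (err != NULL || fn == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err ? err : "symbol is NULL");
        dlclose(handle);
        return -1;
    }

    *out = fn(x);
    dlclose(handle);                           /* cf. FreeLibrary() */
    return 0;
}
```

Calling call_from_library("libm.so.6", "cos", 0.0, &result) should load the math library, resolve cos, and store its return value in result. A plugin system is the same pattern with your own .so files and your own agreed-upon entry point names.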
The best attribute I can give Mr. Blunden on this statement is "unresearched". I knew about the dl* functions years before I even saw a Linux desktop, while pondering porting a Win32 app to Linux. I don't expect authors to know every nuance of every platform available, but the book says "Includes ports of the HEC Virtual Machine for Windows and Linux" directly on the cover. In reality, you have a (possibly) working Windows implementation, and a half-assed, feature-stripped, but somewhat working Linux version, only because the author didn't put forth the effort to research the platform.
What's even more amazing is that I have implemented the missing Linux features in less than an hour (after removing more than a few ^M's from the source and build scripts), and considered sending him the changes, but his only point of contact for corrections is a physical mailing address. This is pretty unorthodox in the modern technical publishing environment. I'm not about to pay postage to ship him a CDR with a 2KB diff on it. And for some reason I doubt he'd eat his strong words and implement it anyway.
Also, I found it completely inappropriate and inflammatory to include the following passage, in a book about virtual machines, no less:
"Here is a canonical usability test. Take someone who has had minimal exposure to computers (i.e. your mother). Allow them to play with Microsoft WordPad and then give them a chance to fiddle with vi. WordPad will win every time. Normal people realize that having to memorize dozens of obscure commands is a waste of time, and they would rather interact with a tool that is intuitive and easy to work with. Naturally, there will be those members of the audience who think otherwise. They would say, "But, but, but ... vi has far more powerful features than WordPad." These are the same people that yearn for the gold old days of the 25x80 dummy terminals. The vi editor is nothing more than a relic from the pre-GUI era."
The absurdity of this remark alone led me to question the merits of the previous chapters. The first question I had was "Why not compare vi to DOS' edit?", and then I saw the real problem: "Why compare WordPad, a friendly GUI editor for Windows, to vi, an unfriendly command-oriented editor, instead of Linux's friendly GUI editors, such as KWrite, Kate, AbiWord or OpenOffice?".
This review isn't from the perspective of a rabid Linux user, but of a guy who wanted to learn about virtual machines, and bought a book with "Virtual Machine Design" in the title. I was expecting a little more information about VMs, and a lot less spouting off about false Linux problems and stupid apples-to-oranges comparisons, stemming from the author's fear of having to switch platforms again.
If you can put up with the opinionated, super-ego writing style (amazingly, only in the last few chapters), a few blatant technical errors every now and then, and more than a few unjustified (in my opinion) design decisions, I would still recommend this book, not for its actual code or implementation, but for the thought process and the overall big picture of how virtual machines work and how they can be implemented.
I know this review has come off as pretty negative, almost overly so. I guess that is what happens when you end an otherwise fine book with flamebait that insults a decent percentage of your readers, who paid real money for a book about implementing virtual machines. I assume he does realize that his primary audience includes the technically-oriented crowd, many of whom may be running Linux in some form or another.
The HEC virtual machine is a pretty primitive animal, but it gave me some invaluable information in areas I was sorta "iffy" about before. I mostly got validation of the ideas I am using in Klea, rather than ground-breaking concepts, so I don't know how it'd treat someone who hasn't been studying virtual machines for a while. That warm, fuzzy feeling of knowing I was on the right path was worth the $45 US I paid for it, anyway.