Saturday, December 15, 2007

Best Practices

I do mostly C++ and stick to the portable side. So while you can go to non-portable things like Microsoft APIs or .Net, I avoid those. I'll give a runthrough of my opinion as to things that should be in modern C++ applications of reasonable size.

When programming C++, there are books by Scott Meyers, Herb Sutter and Andrei Alexandrescu which teach you many of the best practices in the language. As far as I'm concerned, it's no longer acceptable to simply have a working program. You have to make it flexible, code for reusability and use best practices. It does show in aspects of the end product. Use of the Boost library is also very helpful for a C++ program nowadays, as it fills in many of the holes that the standard has not gotten to. Knowledge of good object-oriented techniques as well as good template metaprogramming skills makes code much more maintainable when done correctly.

The Gang of Four Design Patterns book is still not known by 9 out of 10 developers. That's insane to me. The book has been out over 10 years. Generally the very good programmers make use of design patterns. Head First Design Patterns is a great way to learn them. I see the Gang of Four book as more of a reference. I consider the GRASP patterns just as important. These are described in one of Craig Larman's books.

Use of UML for modeling and describing behavior. Most developers on my team don't know UML. Another insane thing. UML is a great tool for describing a system and designing it out. Especially sequence diagrams. If there's one thing that should be used more, it's sequence diagrams.

Test-Driven Development is great for making more flexible code and actually having tests for individual pieces of a system. What I like is that in order to write the test you have to decouple things. There are a few books on this which are great.
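To make the decoupling point concrete, here is a minimal sketch. Every name in it (Clock, FixedClock, greeting) is made up for illustration. The idea is that a function depending on the time of day can't be tested reliably against the real clock, so writing the test first pushes you to put the clock behind an interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical interface: the logic depends on this, not on the system clock.
class Clock {
public:
    virtual ~Clock() = default;
    virtual int hour() const = 0;  // 0-23
};

// Test double that always reports the same hour.
class FixedClock : public Clock {
public:
    explicit FixedClock(int hour) : hour_(hour) {}
    int hour() const override { return hour_; }
private:
    int hour_;
};

// Code under test: it has no idea where the time comes from.
std::string greeting(const Clock& clock) {
    return clock.hour() < 12 ? "Good morning" : "Good afternoon";
}

// The kind of test that would be written first in TDD.
void test_greeting() {
    assert(greeting(FixedClock(9)) == "Good morning");
    assert(greeting(FixedClock(15)) == "Good afternoon");
}
```

A production build would supply a Clock implementation backed by the real system time; the tests never need it.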

Refactoring techniques are another area not understood by many software developers. More than that, you have to understand what you eventually want the code to look like, and I think most developers see mush, see that it works and then are happy. Refactoring techniques and TDD play together very nicely, because if you have the initial test, you can make "safer" refactorings, and keep running quick tests each time you make changes.

Separation of GUI and logic has always been a great idea. You never know if someday you might run the GUI separately or want to swap out the GUI entirely and not change other code. That's the basis of all good object-oriented design, minimizing impact to the entire system, localizing impact to specific areas. Finally, as far as separation of logic, separating it further into a scripting engine with external scripts, gives a nice mix of flexibility and speed. You can then write scripts which control specific parts of a system without rewriting the engine and waiting for recompilation all day long. Furthermore, non-developers can write scripts too. Huge advantage to a development team.
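As a rough sketch of what that separation can look like (CounterView, Counter and TextView are hypothetical names, not from any real toolkit): the logic only talks to a small view interface, so the front end can be swapped without touching it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical view interface: the only thing the logic knows about the GUI.
class CounterView {
public:
    virtual ~CounterView() = default;
    virtual void show_count(int count) = 0;
};

// Application logic: no GUI toolkit headers, no drawing code.
class Counter {
public:
    explicit Counter(CounterView& view) : view_(view) {}
    void increment() {
        ++count_;
        view_.show_count(count_);
    }
private:
    CounterView& view_;
    int count_ = 0;
};

// One possible front end: records text lines instead of drawing widgets.
class TextView : public CounterView {
public:
    void show_count(int count) override {
        lines.push_back("count = " + std::to_string(count));
    }
    std::vector<std::string> lines;
};

// Usage: the same Counter would work unchanged with any other CounterView.
void example_usage() {
    TextView view;
    Counter counter(view);
    counter.increment();
    counter.increment();
    assert(view.lines.size() == 2);
    assert(view.lines.back() == "count = 2");
}
```

A real GUI front end would be just another CounterView implementation, and the text version doubles as a test harness for the logic.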

Some of these have been known as good practices for many years. They're even more publicized today as good practices. Why there is so much resistance to known good practices is beyond me. Those are my feelings anyway. There are certainly other good practices that I haven't mentioned here.

Friday, December 14, 2007

Why COM Is So Dreadful and Outdated

I have two coworkers that absolutely love COM. Let's face facts; COM was designed to solve a specific type of problem and it is not capable of solving others. COM is also very outdated at this point.

Let's go through the details:

COM was designed to work around a few things, first among them that C++ does not have a standard ABI (C does). The advantage here is that DLLs will be compatible between debug builds, release builds, and builds from different compilers, even different programming languages. Microsoft developed a whole library of components based on this (we can get into how truly bad MS's APIs are some other time).

Now, the negatives:

1) Registration. Evil, evil, evil, evil. You mean to tell me that you can dynamically change the behavior of my app from right under my nose? This is worse than your regular DLL hell because registration ensures that there is only one "registered" copy per system. Finally, with the advent of registration-free COM, MS has realized how bad this "hell" is. Nonetheless, it absolutely causes havoc. Consider a developer working on multiple builds on the same system. You can't run both builds at once because of this restriction! You can't even build one while the other runs, because they can conflict. This is an insane requirement. It is the #1 reason why COM is so terrible. Remember, global is bad. Every time you make something global, remember this rule. Anything that is global is bad. When you make something global, you're saying it's universally applicable. Everything is universally applicable until the day we discover that it isn't. That is the same day that we are screwed.

2) Hack of the C++ type system; incompatibility with the rest of C++. This is an unforgivable requirement of COM. How can you make something in C++ that is incompatible with the rest of the language? They hacked their own version of virtual tables, force internal casting all over the place, and simulate exception handling, thus preventing actual C++ exception handling. The result is incompatible with many features of C++. Yes, you can use these features internally if they don't leak out of the implementation. Wonderful. The fact that I can't pass an STL string through a COM interface makes it unsalvageable for C++. I could live with these restrictions, but only on the boundary between languages, not within the same language.

3) Casting hell and cosmic hierarchies. Barf. I know at some point, Java, COM, etc. all thought that having all classes derive from a single point was a great idea (let's not even get into IDispatch), but it encourages bad programming. In COM it's directly encouraged. Casting to/from IDispatch/IUnknown to get around bad design is poor. It's common practice in COM. Yet, ideally in object-oriented programming, you make use of design patterns to avoid such cases where casting is necessary. In COM-based programming, people think that they are following good design since there's always a base interface. Wrong.

4) Factory abuse. COM factories are an abomination. The factory pattern was never meant to be used this way; COM's version isn't the factory pattern. It's not the abstract factory pattern either. These patterns were used to make object creation generic, not to hop around the type system and use IDs to check the registry, pick which DLL to load, load it, call a method, get returned an object, do internal casts to make it the type it should be, and then return it. Abuse. This is the kind of thing that should get you arrested by the C++ police.

5) The anti-refactoring. COM code is excessively difficult to refactor. This is exacerbated by the never-change-an-interface mantra. When you release third party SDKs, you have to take care in changing interfaces. Nonetheless, it does happen. Some changes are breaking; they have to be. Breaking changes are sometimes required to progress an API. Otherwise you may be stuck with an underpowered, incomplete API. That is a poor solution to the risk of breaking code.

6) The idea that you can bring an app to a new system, run it with a newer COM object and it still works. A) Wrong. B) Wrong. C) Wrong. This doesn't always happen in practice. When it does work, that is a benefit of keeping clean interfaces; COM wasn't required to do this. However, you NEVER want to toss new code into existing code without retesting the entire thing. Thinking you can do this is naive at best, and more often, harmful. What's worse, because the actual changes are more insulated, instead of crashing (which is admittedly nasty), you instead get a subtle bug and things appear to work... until they don't.

7) Difficult to write; difficult to read. COM objects are excessively difficult to write. I find that the overhead of setting up a COM object, on top of the registration woes, is rough. Furthermore, having to convert my types into COM-friendly types back and forth is extra unnecessary effort. Writing them is annoying enough. As a developer, I hate writing COM. This is second only to how much I hate reading COM code. Ugly. What happened to clean modular code? All of this extra work, marshalling types back and forth, just to pass the COM layer. All of this extra work to catch every exception possible, to translate it into an error code, only to later translate it back into a _com_error.
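To show the shape of that round trip without dragging in any Windows headers, here is a portable sketch. It is not COM code: the Status enum is a hypothetical stand-in for HRESULT, std::runtime_error stands in for _com_error, and all function names are invented for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical error codes standing in for COM's HRESULT values.
enum class Status { Ok, InvalidArgument, Failure };

// Internal C++ code: free to use std::string and exceptions.
int parse_positive(const std::string& text) {
    int value = std::stoi(text);  // throws std::invalid_argument on non-numeric input
    if (value <= 0) throw std::out_of_range("value must be positive");
    return value;
}

// Boundary function: plain C types only, and no exception may escape.
Status parse_positive_boundary(const char* text, int* out) noexcept {
    if (text == nullptr || out == nullptr) return Status::InvalidArgument;
    try {
        *out = parse_positive(text);
        return Status::Ok;
    } catch (const std::invalid_argument&) {
        return Status::InvalidArgument;
    } catch (...) {
        return Status::Failure;  // everything else collapses into one code
    }
}

// Caller-side wrapper: turns the error code back into an exception.
int parse_positive_wrapped(const std::string& text) {
    int value = 0;
    Status status = parse_positive_boundary(text.c_str(), &value);
    if (status != Status::Ok) throw std::runtime_error("parse failed");
    return value;
}

// Usage: the good path round-trips a value through the boundary.
void example_usage() {
    assert(parse_positive_wrapped("42") == 42);
}
```

Even in this toy version, the std::string gets flattened to a const char*, and the original exception type and message are gone by the time the caller sees an error.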

There are so many reasons why COM is bad. It's just a bad technology. XPCOM has its own issues, as you can read here: COM in Mozilla

Development Thoughts

Hey guys.

One thing I was thinking about today is that the quality of the code is very undervalued in software. The code has to be maintained by developers on a daily basis. It can make things more difficult if some developers write code that makes it tougher for others to work with.

My mentality is to try to follow a set of established rules, and to create code that is both readable (self-documenting) and modular. A clear separation of GUI and logic adds more flexibility between those layers. Writing tests and trying to follow a process such as TDD can help turn difficult-to-find issues later into easy-to-fix issues now. Choosing the best tools for the job and avoiding the "golden hammer", while using known "best practices", I find extremely beneficial. I try to help things get organized so things that have been hindrances don't continue to be so. I also try to help increase communication as much as possible and share information.

There's some irony in this, because I've run into some very poorly written code before. One example off the top of my head is that I ran into this monster function that had 1000 lines of code. After tracking down the exception that was occurring based on an error log, I was astonished to find out that there were about 998 lines between that try and catch.
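For contrast, here is a small sketch of the alternative, with hypothetical function names made up for the example: each try block wraps a single step, so when something fails you know which step it was without reading a thousand lines.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical steps of a larger job; each one can fail on its own.
int load_record_count(const std::string& text) {
    return std::stoi(text);  // throws if the text is not a number
}

int average_size(int total_bytes, int count) {
    if (count == 0) throw std::domain_error("no records");
    return total_bytes / count;
}

// Each try block covers one step, so the message names the step that failed.
std::string run_job(const std::string& count_text, int total_bytes) {
    int count = 0;
    try {
        count = load_record_count(count_text);
    } catch (const std::exception&) {
        return "failed: could not read record count";
    }
    try {
        return "average size: " + std::to_string(average_size(total_bytes, count));
    } catch (const std::domain_error&) {
        return "failed: no records to average";
    }
}

// Usage: one successful run and one failure in each step.
void example_usage() {
    assert(run_job("4", 100) == "average size: 25");
    assert(run_job("x", 100) == "failed: could not read record count");
    assert(run_job("0", 100) == "failed: no records to average");
}
```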

Development has a lot of dynamics; well-written code is a pleasure to work with. Poorly written code takes twice as long, and management doesn't always understand why since it "worked" before.