Chapter 3. Intermediate

Table of Contents

Personal Skills
How to Stay Motivated
How to be Widely Trusted
How to Tradeoff Time vs. Space
How to Stress Test
How to Balance Brevity and Abstraction
How to Learn New Skills
Learn to Type
How to Do Integration Testing
Communication Languages
Team Skills
How to Manage Development Time
How to Manage Third-Party Software Risks
How to Manage Consultants
How to Communicate the Right Amount
How to Disagree Honestly and Get Away with It
Judgement
How to Tradeoff Quality Against Development Time
How to Manage Software System Dependence
How to Decide if Software is Too Immature
How to Make a Buy vs. Build Decision
How to Grow Professionally
How to Evaluate Interviewees
How to Know When to Apply Fancy Computer Science
How to Talk to Non-Engineers

Personal Skills

How to Stay Motivated

It is a wonderful and surprising fact that programmers are highly motivated by the desire to create artifacts that are beautiful, useful, or nifty. This desire is not unique to programmers nor universal but it is so strong and common among programmers that it separates them from others in other roles.

This has practical and important consequences. If programmers are asked to do something that is not beautiful, useful, or nifty, they will have low morale. There's a lot of money to be made doing ugly, stupid, and boring stuff; but in the end, fun will make the most money for the company.

Obviously, there are entire industries organized around motivational techniques some of which apply here. The things that are specific to programming that I can identify are:

  • Use the best language for the job.

  • Look for opportunities to apply new techniques, languages, and technologies.

  • Try to either learn or teach something, however small, in each project.

Finally, if possible, measure the impact of your work in terms of something that will be personally motivating. For example, when fixing bugs, counting the number of bugs that I have fixed is not at all motivational to me, because it is independent of the number that may still exist, and is also affects the total value I'm adding to my company's customers in only the smallest possible way. Relating each bug to a happy customer, however, is personally motivating to me.

How to be Widely Trusted

To be trusted you must be trustworthy. You must also be visible. If know one knows about you, no trust will be invested in you. With those close to you, such as your teammates, this should not be an issue. You establish trust by being responsive and informative to those outside your department or team. Occasionally someone will abuse this trust, and ask for unreasonable favors. Don't be afraid of this, just explain what you would have to give up doing to perform the favor.

Don't pretend to know something that you don't. With people that are not teammates, you may have to make a clear distinction between ``not knowing right off the top of my head'' and ``not being able to figure it out, ever.''

How to Tradeoff Time vs. Space

You can be a good programmer without going to college, but you can't be a good intermediate programmer without knowing basic computational complexity theory. You don't need to know ``big O'' notation, but I personally think you should be able to understand the difference between ``constant-time'',``n log n'' and ``n squared''. You might be able to intuit how to tradeoff time against space without this knowledge, but in its absence you will not have a firm basis for communicating with your colleagues.

In designing or understanding an algorithm, the amount of time it takes to run is sometimes a function of the size of the input. When that is true, we can say an algorithm's worst/expected/best-case running time is ``n log n'' if it is proportional to the size (represented by the variable n) times the logarithm of the size. The notation and way of speaking can be also be applied to the space taken up by a data structure.

To me, computational complexity theory is beautiful and as profound as physics---and a little bit goes a long way!

Time (processor cycles) and space (memory) can be traded off against each other. Engineering is about compromise, and this is a fine example. It is not always systematic. In general, however, one can save space by encoding things more tightly, at the expense of more computation time when you have to decode them. You can save time by caching, that is, spending space to store a local copy of something, at the expense of having to maintain the consistency of the cache. You can sometimes save time by maintaining more information in a data structure. This usually cost a small amount of space but may complicate the algorithm.

Improving the space/time tradeoff can often change one or the other dramatically. However, before you work on this you should ask yourself if what you are improving is really the thing that needs the most improvement. It's fun to work on an algorithm, but you can't let that blind you to the cold hard fact that improving something that is not a problem will not make any noticeable difference and will create a test burden.

Memory on modern computers appears cheap, because unlike processor time, you can't see it being used until you hit the wall; but then failure is catastrophic. There are also other hidden costs to using memory, such as your effect on other programs that must be resident, and the time to allocate and deallocate it. Consider this carefully before you trade away space to gain speed.

How to Stress Test

Stress testing is fun. At first it appears that the purpose of stress testing is to find out if the system works under a load. In reality, it is common that the system does work under a load but fails to work in some way when the load is heavy enough. I call this hitting the wall or bonking[1]. There may be some exceptions, but there is almost always a ‘wall’. The purpose of stress testing is to figure out where the wall is, and then figure out how to move the wall further out.

A plan for stress testing should be developed early in the project, because it often helps to clarify exactly what is expected. Is two seconds for a web page request a miserable failure or a smashing success? Is 500 concurrent users enough? That, of course, depends, but one must know the answer when designing the system that answers the request. The stress test needs to model reality well enough to be useful. It isn't really possible to simulate 500 erratic and unpredictable humans using a system concurrently very easily, but one can at least create 500 simulations and try to model some part of what they might do.

In stress testing, start out with a light load and load the system along some dimension---such as input rate or input size---until you hit the wall. If the wall is too close to satisfy your needs, figure out which resource is the bottleneck (there is usually a dominant one.) Is it memory, processor, Input/Output, network bandwidth, or data contention? Then figure out how you can move the wall. Note that moving the wall, that is, increasing the maximum load the system can handle, might not help or might actually hurt the performance of a lightly loaded system. Usually performance under heavy load is more important than performance under a light load.

You may have to get visibility into several different dimensions to build up a mental model of it; no single technique is sufficient. For instance, logging often gives a good idea of the wall-clock time between two events in the system, but unless carefully constructed, doesn't give visibility into memory utilization or even data structure size. Similarly, in a modern system, a number of computers and many software systems may be cooperating. Particularly when you are hitting the wall (that is, the performance is non-linear in the size of the input) these other software systems may be a bottleneck. Visibility into these systems, even if only measuring the processor load on all participating machines, can be very helpful.

Knowing where the wall is is essential not only to moving the wall, but also to providing predictability so that the business can be managed effectively.

How to Balance Brevity and Abstraction

Abstraction is key to programming. You should carefully choose how abstract you need to be. Beginning programmers in their enthusiasm often create more abstraction than is really useful. One sign of this is if you create classes that don't really contain any code and don't really do anything except serve to abstract something. The attraction of this is understandable but the value of code brevity must be measured against the value of abstraction. Occasionally, one sees a mistake made by enthusiastic idealists: at the start of the project a lot of classes are defined that seem wonderfully abstract and one may speculate that they will handle every eventuality that may arise. As the project progresses and fatigue sets in, the code itself becomes messy. Function bodies become longer than they should be. The empty classes are a burden to document that is ignored when under pressure. The final result would have been better if the energy spent on abstraction had been spent on keeping things short and simple. This is a form of speculative programming. I strongly recommend the article ``Succinctness is Power'' by Paul Graham[PGSite].

There is a certain dogma associated with useful techniques such as information hiding and object oriented programming that are sometimes taken too far. These techniques let one code abstractly and anticipate change. I personally think, however, that you should not produce much speculative code. For example, it is an accepted style to hide an integer variable on an object behind mutators and accessors, so that the variable itself is not exposed, only the little interface to it. This does allow the implementation of that variable to be changed without affecting the calling code, and is perhaps appropriate to a library writer who must publish a very stable API. But I don't think the benefit of this outweighs the cost of the wordiness of it when my team owns the calling code and hence can recode the caller as easily as the called. Four or five extra lines of code is a heavy price to pay for this speculative benefit.

Portability poses a similar problem. Should code be portable to a different computer, compiler, software system or platform, or simply easily ported? I think a non-portable, short-and-easily-ported piece of code is better than a long portable one. It is relatively easy and certainly a good idea to confine non-portable code to designated areas, such as a class that makes database queries that are specific to a given DBMS.

How to Learn New Skills

Learning new skills, especially non-technical ones, is the greatest fun of all. Most companies would have better morale if they understood how much this motivates programmers.

Humans learn by doing. Book-reading and class-taking are useful. But could you have any respect for a programmer who had never written a program? To learn any skill, you have to put yourself in a forgiving position where you can exercise that skill. When learning a new programming language, try to do a small project it in before you have to do a large project. When learning to manage a software project, try to manage a small one first.

A good mentor is no replacement for doing things yourself, but is a lot better than a book. What can you offer a potential mentor in exchange for their knowledge? At a minimum, you should offer to study hard so their time won't be wasted.

Try to get your boss to let you have formal training, but understand that it often not much better than the same amount of time spent simply playing with the new skill you want to learn. It is, however, easier to ask for training than playtime in our imperfect world, even though a lot of formal training is just sleeping through lectures waiting for the dinner party.

If you lead people, understand how they learn and assist them by assigning them projects that are the right size and that exercise skills they are interested in. Don't forget that the most important skills for a programmer are not the technical ones. Give your people a chance to play and practice courage, honesty, and communication.

Learn to Type

Learn to touch-type. This is an intermediate skill because writing code is so hard that the speed at which you can type is irrelevant and can't put much of a dent in the time it takes to write code, no matter how good you are. However, by the time you are an intermediate programmer you will probably spend a lot of time writing natural language to your colleagues and others. This is a fun test of your commitment; it takes dedicated time that is not much fun to learn something like that. Legend has it that when Michael Tiemann[2] was at MCC people would stand outside his door to listen to the hum generated by his keystrokes which were so rapid as to be indistinguishable.

How to Do Integration Testing

Integration testing is the testing of the integration of various components that have been unit tested. Integration is expensive and it comes out in the testing. You must include time for this in your estimates and your schedule.

Ideally you should organize a project so that there is not a phase at the end where integration must explicitly take place. It is far better to gradually integrate things as they are completed over the course of the project. If it is unavoidable estimate it carefully.

Communication Languages

There are some languages, that is, formally defined syntactic systems, that are not programming languages but communication languages---they are designed specifically to facillitate communication through standardization. In 2003 the most important of these are UML, XML, and SQL. You should have some familiarity with all of these so that you can communicate well and decide when to use them.

UML is a rich formal system for making drawings that describe designs. It's beauty lines in that is both visual and formal, capable of conveying a great deal of information if both the author and the audience know UML. You need to know about it because designs are sometimes communicated in it. There are very helpful tools for making UML drawings that look very professional. In a lot of cases UML is too formal, and I find myself using a simpler boxes and arrows style for design drawings. But I'm fairly sure UML is at least as good for you as studying Latin.

XML is a standard for defining new standards. It is not a solution to data interchange problems, though you sometimes see it presented as if it was. Rather, it is a welcome automation of the most boring part of data interchange, namely, structuring the representation into a linear sequence and parsing back into a structure. It provides some nice type- and correctness-checking, though again only a fraction of what you are likely to need in practice.

SQL is a very powerful and rich data query and manipulation language that is not quite a programming language. It has many variations, typically quite product-dependent, which are less important than the standardized core. SQL is the lingua franca of relational databases. You may or may not work in any field that can benefit from an understanding of relational databases, but you should have a basic understanding of them and they syntax and meaning of SQL.



[1] This term has several meanings, derived from ‘to hit’, but is particularly used by athletes to describe running out of blood sugar or some other basic resource that manifests as a sudden, rather than gradual, degradation of performance or spirit. I have been told it has an additional, and very different meaning, throughout some of the British Commonwealth.

[2] At the time of this writing Michael Tiemann is the CTO of RedHat.