Design and
code reviews promise to improve software quality, ensure compliance with
standards, and serve as a valuable teaching tool for developers. As with most
practices, there are subtle nuances surrounding how they're performed that can
dramatically affect their value. In some organizations, reviews are a valuable
aspect of the software lifecycle. In others, they are a necessary evil tainted
with political bureaucracy and big egos. Sub-optimal reviews conducted late in
the lifecycle are often misguided due to few objective guidelines that help
guide the review process. When used throughout the development lifecycle, code
and design quality metrics are valuable inputs to the review process.
Reviews
Increase Agility
Continuous
Integration. Test Driven Development. Refactoring. Pair Programming. Agile
practices are abundant, and for many teams interested in increasing their
agility, valuable energy and resources have been devoted to improving these
practices. Because of this, many teams have abandoned reviews while emphasizing
other aspects of agility.. But reviews are an important tool in the agile
toolkit.
Advertisement
A driving
principle of the Agile
Manifesto
is continuous attention to technical excellence. Another is embracing and
harnessing change as an opportunity to increase customer advantage. For
developers, change often begins and ends with modifications to the source code.
A poorly designed application with smelly code is a breeding ground for risk
that makes change incredibly difficult, and is the greatest technical inhibitor
to increased agility. Effective reviews that emphasize design quality and code
cleanliness are an important aspect of increased agility. Reviews done right
help ensure continuous attention to technical excellence. Unfortunately, not
all reviews are done right.
Review
Worst Practices
Some
development teams find reviews a healthy and valuable asset to developers and
the project team. Other teams realize little value from their review process.
There are numerous causes for painful and ineffective reviews. Some symptoms of
ineffective reviews include:
Witch
hunt reviews - Many reviews degrade quickly into attack and defend mode. This
often occurs because the developer who wrote the code feels attacked and
threatened when reviewers make direct and opinionated statements about the
code. Nothing could be less productive.
Curly
brace reviews - Some reviews emphasize formatting and comments instead of more
serious problems. Is placement of curly braces and misspelled comments really
that important? Curly brace reviews are feeding ground for the anal retentive,
and provide no real value.
Blind
reviews - Often times, reviewers walk into the review meeting having never laid
eyes on the code they are about to review.
Most of the review time is spent trying to figure out what the code
does. Spending time in the review meeting attempting to understand the code
instead of reviewing it for more serious ailments is a waste of time.
Exclusionary
reviews - Many times, the code provided for the review is only a sampling of
the code written. For example, unit tests might be excluded from the review. In
an unhealthy review environment, providing impartial and incomplete code
listings will leave the reviewers wondering how the code actually works.
Tree
killer review - If you can't baffle them by providing half of what they need to
understand the code, then maybe overwhelming them by providing thousands of
lines of code might work. Waiting until the codebase is incredibly large to
host the first review is entirely ineffective. Not only is it to difficult to
provide effective feedback on a large codebase, these reviews are often held
late in the lifecycle and do not allow the developer to improve her code based
on the feedback received.
Token
review - It's not uncommon for management to dictate that reviews be held. In
fact, that's the reason I held my first review. Fortunately, I decided that if
I was required to host them, I was going to try to make them effective. Token
reviews are typically held for political reasons. Management wants to ensure
that all code is reviewed for auditing purposes. Unfortunately, developers
realize very little value surrounding these reviews. Any problems found are not
fixed unless they are absolutely critical. Since the primary motivation is an
audit trail for management, the team has little motivation to improve the code.
World
review - I've been to some reviews that probably should have been held in an
auditorium given the number of people in attendance. This can be incredibly
intimidating for the developers whose code is being reviewed, and I'm not sure
what value it provides to invite so many people. A few developers, up to five,
should serve all the needs required of the review process. If more people want
to provide input, there are better ways.
A few simple
tactical adjustments can alleviate many of these traditional review worst
practices. Let's explore some ways to improve the value of the review process.
Driving
with Metrics - Continuous Review
I'm sure many
of us have experienced the pain surrounding ineffective code reviews.
Fortunately, a good share of the effort put into reviewing code can be
accomplished through automation and feedback obtained through source code
analysis. Design and code quality metrics help bring greater objectivity,
focus, and emphasis to the review process, and alleviate the pain brought on by
the review worst practices.
Design
quality - Design metrics turn attention
away from developer's opinion on what constitutes robust design, and toward
metrics that provide objective feedback on the quality of the design. JDepend is an example of a tool in the
Java space that offers important feedback on design quality.
Code
cleanliness - Unused imports, references to implementation types instead of
interfaces, and empty catch blocks are all common coding mistakes that can
easily be caught by static analyzers such as PMD.
Code
quality - Exceptionally large methods with complex conditionals are a
significant maintenance burden, and are incredibly difficult to test well. A
simple tool like JavaNCSS that provides a glimpse into the
cyclomatic complexity of each method is incredibly valuable.
Test
coverage - Well-tested code is less likely to contain defects. Evaluating the
percentage of code under test is an important quality that impacts how easily
code can be refactored when necessary. Emma is an example of a tool that determines test
coverage. A newer tool, called Crap4j, combines coverage metrics and
cyclomatic complexity to help identify poorly written Java code.
Incorporating
feedback tools that generate metrics into a developer's local integrated
development environment is an effective approach toward automating an aspect of
the code review process. If your team has a robust continuous integration
strategy in place, these tools can be included as part of your automated and
repeatable build process, and results can then be posted to a project
dashboard. The dashboard helps avoid blind reviews because it can be used as
part of a review meeting to offer a high level perspective on quality.
Additionally, world reviews are replaced by making the dashboard available via
a published url, and only inviting a subset of people to an actual review
meeting. By bringing greater objectivity to the review process, team leaders
can quickly thwart witch hunt and curly brace reviews by insisting that the
team emphasize the aspects of code that weren't subject to automation. Examples
might include class responsibility at specific layers within the application,
or exception propagation up the call stack.
The 20%
Review
While metrics
promise to bring greater objectivity, focus, and emphasis to the review
process, maximizing the value of reviews demands that reviews be held early in
the development lifecycle. Utilizing tools as part of the build process is a
good way to ensure source code is analyzed, but neglecting to take advantage of
the feedback provided by the analysis is a common problem.
The idea
behind the 20% review is simple: once 20% of development is complete, a review
should be held. Some teams might find it beneficial to hold the 20% review each
iteration. That's certainly effective, but I've found that if teams do a good
job using metrics for a continuous review, holding the 20% review for each
major system function is sufficient.
Often times, major system functions are broken down into smaller stories
that span iterations, so ensuring that a strong foundation is built when developing
stories in the earlier iterations is a driving force behind the 20%
review. The 20% review should focus on
initial design and code cleanliness. The metrics discussed above offer
wonderful insight into the evolution and growth of the code while the size is
still relatively manageable.
Holding a
review meeting early in development helps align the developer and review team,
and offers the review team insight into the code and its structure while the
code is relatively small. As such, the 20% review avoids the problem with
exclusionary and tree killer reviews. Since the review team and developers are
building their relationship early in the lifecycle, it's also less likely that
reviews will turn into token reviews because the code is a manageable size and
is easier to understand. Since the review committee is involved throughout the
lifecycle, developers have time to make code improvements based on feedback
provided from the reviews. Waiting until the end of the development lifecycle
to perform reviews is a lost opportunity because there is no time remaining for
change. Even after the initial 20% review, anyone can gain insight into the
evolution of the code by reviewing the metrics on the project dashboard. Any
obvious deficiencies in the design or structure of the code can be fixed before
the small problem infests the entire code base.
The 20%
review can also be used as an aid to the developer. For instance, if a
developer finds that certain methods have an especially high cyclomatic
complexity number, she can ask her peers for advice on how to best simplify the
method through refactoring.
I've found
the 20% review is a critical aspect of an effective review process. Earlier
reviews offer the reviewers an opportunity to assess a number of quality metrics,
and establish the review process for ongoing development. In cases where
development has veered off course, a 20% review offers ample time to make the
necessary corrections. In general, I've found that if the first 20% of the code
written is of high quality, the emphasis on quality will remain throughout
development.
The Rest
of the Way
Once the 20%
review is complete, remaining reviews should be scheduled as needed, or upon
request by either the developer or the review team. The 80% point has worked
well for me. Ideally, the reviewers should maintain awareness of the evolution
and growth of the code by monitoring the project dashboard. Using metrics to
help drive reviews brings greater objectivity and focus to the review process,
and avoids many common problems surrounding reviews. Additionally, reviews
early in the lifecycle offer opportunity for improvement while time still
permits. Reviews later in the lifecycle prove more productive if earlier
reviews are held because the review team is familiar with the code's evolution
and growth. The continuous attention to technical excellence will result in
increased design and code quality.
About the Author
Kirk
Knoernschild is Senior Technology Strategist at TeamSoft, where he leads based on his
firm belief in the pragmatic use of technology. In addition to his work on
large development projects, Kirk shares his experiences through courseware
development, teaching, writing, and speaking at seminars and conferences. Kirk
has provided training to thousands of software professionals, teaching courses
on UML, Java J2EE technology, object-oriented development, component based
development, software architecture, and software process.
Comments (1)
rbielby: ...
I was once in a review where the reviewers only comment was something to the effect of using an ArrayList vs. Vector. Or, something along those lines. Which was probably just an attempt to provide something of value. Which obviously wasn't of any value. I would assume because there wasn't any common understanding of what the goal of the walk through was. Or possibly he just like to hear himself talk (most likely).
Your definitions of "worst practices" is a great tool in itself for defining those goals. For, once you see you've stepped into the realm of one of the worst practices, you know at that point that you've also digressed from what the goal should really be. Ensuring quality software. Which, as far as I know, is what we all really want to be doing anyway. As with most things in life, just a general awareness of something is a huge step towards being able to accomplish most goals.
One of the biggest challenges I've faced in meeting many of the goals you've stated is due entirely to personality. Personality of the individual giving the review, the personality of those attending and the personality of the team as a whole. It takes a lot of effort and deliberate use of words to not be either an attacker or to feel you're being attacked. Which probably means that there needs to be some kind of statement upfront about how those scenarios are going to be dealt with. Which almost sounds too "touch feely" for a bunch of software geeks, doesn't it?