Code Refactoring

Code refactoring is the continuous process of improving the codebase without changing its external behavior. It is a key practice to ensure the codebase remains maintainable, extensible, and performant.

Regardless of the type of development, the engineering team always factors in code refactoring as part of the development process. In greenfield development, refactoring is a continuous process that happens during the implementation of stories. In brownfield development, however, refactoring often requires a dedicated effort to improve the codebase, separate from feature development.

Though the two processes share similarities, code refactoring and code optimization are different. Refactoring focuses on changing the code structure to make the code cleaner, more understandable, and more manageable. In contrast, optimization focuses on improving code performance, e.g., reducing memory consumption or compile time.

Principles

Standard Process

It may seem that every refactoring effort requires a different approach, but that assumption is incorrect. Every code refactor follows the same standard process described in the following diagram:

[Diagram: Code Refactoring Standard Process]

The general tendency is to jump straight into the third step: refactoring. However, the first two steps must not be skipped. Understanding the current behavior in depth and validating it, regardless of its flaws, are essential to ensure the refactoring does not change the code’s external behavior.

When no tests exist, the test matrix must be used to identify the type of tests that brings the highest confidence, ideally at the lowest effort, in validating the current behavior. For instance, writing a UI test will often provide higher confidence than a non-UI test.
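
As a minimal sketch of validating the current behavior before refactoring, the example below pins down what a piece of untested code does today. The function `legacy_shipping_fee`, its fee rules, and the country codes are hypothetical and exist only for illustration. The tests record the current outputs, quirks included, so that a later refactor can be checked against them.

```python
import unittest


def legacy_shipping_fee(weight_kg, country):
    """Hypothetical legacy function: hard to read, but its behavior is the contract."""
    fee = 0
    if country == "TH":
        if weight_kg <= 1:
            fee = 50
        else:
            fee = 50 + (weight_kg - 1) * 20
    else:
        if weight_kg <= 1:
            fee = 200
        else:
            fee = 200 + (weight_kg - 1) * 80
    return fee


class ShippingFeeCharacterizationTest(unittest.TestCase):
    """Records the current outputs: what the code does, not what it should do."""

    def test_domestic_parcels(self):
        self.assertEqual(legacy_shipping_fee(1, "TH"), 50)
        self.assertEqual(legacy_shipping_fee(3, "TH"), 90)

    def test_international_parcels(self):
        self.assertEqual(legacy_shipping_fee(1, "SG"), 200)
        self.assertEqual(legacy_shipping_fee(3, "SG"), 360)

    def test_zero_weight_is_still_charged(self):
        # Possibly a flaw, but it is the current behavior, so it is pinned as-is.
        self.assertEqual(legacy_shipping_fee(0, "TH"), 50)
```

Saved in a test module, these tests can be run with `python -m unittest` before and after every refactoring step. A failing test then signals that the external behavior has changed.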

Small Steps

“Refactoring in small steps helps prevent the introduction of defects”
Joshua Kerievsky, Refactoring to Patterns

Code refactoring must be done in small steps. Not only does this follow the Agile principle of iterative development, it also brings the following benefits:

  • It avoids working on long-running refactoring efforts, often leading to difficult-to-merge branches and unexpected breakages.
  • It helps to keep the codebase in a working state at all times.
  • It increases the confidence that the refactoring does not change the code’s external behavior.

Each step can correspond to a single commit or a separate story. Separating the refactoring into distinct commits makes it easier to review the changes in individual pull requests and to revert them if necessary. Separating the work into multiple stories, in contrast, makes it easier to estimate the effort and prioritize the refactoring work. For instance, the latter approach is often preferred when taking preparatory steps before implementing a major new feature.

Categorization

Not every code refactor is created equal. The following categorization helps identify and communicate the type of refactoring and the expected outcome.

Scale

Minor Refactoring

Minor refactoring refers to making small improvements to code during a single working session. Some examples include:

  • Adding missing tests
  • Reducing duplicate code by creating a simple abstraction
  • Adjusting code to adhere to the codebase conventions

The most effective way to tackle minor refactoring is to make these changes whenever working on that particular code through an existing story. It avoids context switching and follows Robert C. Martin’s boy scout rule, which states that every code contributor should always leave the code better than they found it. Creating a separate story for such refactoring would be less efficient and costlier to plan.

Major Refactoring

Major refactoring refers to any effort to address more complex issues over several working sessions or even a few days. Some examples include:

  • Migrating to a newer or different library
  • Refactoring a large piece of legacy code to add new features
  • Automating a tedious manual process

The most effective way to tackle major refactoring is to have a dedicated story for it. It allows the team to plan the effort and prioritize it accordingly. It also allows the team to communicate the refactoring effort to the stakeholders and manage their expectations.

Metrics

The key to prioritizing the right major refactoring effort is to use the following metrics:

  • Business Impact (BI)

    This answers the question: “How much does the refactoring effort impact the business?” If the refactoring concerns a critical feature, it should be prioritized higher than a refactoring that concerns a non-critical feature.

  • Engineering Productivity (EP)

    This answers the question: “How much does the issue affect the development team negatively?” Some technical debt issues can reduce the productivity of the whole team, e.g., a slow build process or a flaky test suite. The larger the team, the higher the cost.

  • Contagion Risk (CR)

    This answers the question: “If this issue is left unresolved, how much will it proliferate?” Technical debt must be kept under control, so this metric helps determine the urgency of the refactoring effort.

Used together with a scoring system (e.g., a score out of 10 for each metric), these metrics help prioritize the refactoring efforts. Below is a practical example:

Description                             | BI   | EP   | CR   | Total
Update the payment SDK in checkout flow | 8/10 | 2/10 | 4/10 | 14
Update the analytics library            | 6/10 | 2/10 | 8/10 | 16
Prevent flaky automated tests           | 2/10 | 8/10 | 5/10 | 15

The metrics are inspired by and adapted from Riot’s technical debt taxonomy.
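
The scoring can also be expressed as a short script. The sketch below assumes an unweighted sum of the three scores, as in the example table. The `RefactoringCandidate` class and the `prioritize` function are hypothetical names introduced for illustration, and a team may prefer to weight the metrics differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefactoringCandidate:
    description: str
    business_impact: int           # BI, scored out of 10
    engineering_productivity: int  # EP, scored out of 10
    contagion_risk: int            # CR, scored out of 10

    @property
    def total(self) -> int:
        # Unweighted sum of the three scores (an assumption of this sketch).
        return (
            self.business_impact
            + self.engineering_productivity
            + self.contagion_risk
        )


def prioritize(candidates):
    """Return the candidates ordered from highest to lowest total score."""
    return sorted(candidates, key=lambda c: c.total, reverse=True)


candidates = [
    RefactoringCandidate("Update the payment SDK in checkout flow", 8, 2, 4),
    RefactoringCandidate("Update the analytics library", 6, 2, 8),
    RefactoringCandidate("Prevent flaky automated tests", 2, 8, 5),
]

for candidate in prioritize(candidates):
    print(f"{candidate.total:>2}  {candidate.description}")
```

With these scores, the analytics library update (16) ranks first, followed by the flaky tests (15) and the payment SDK update (14).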

Techniques

Regardless of the stack, the refactoring techniques can be grouped into three main categories:

  • (Re-)composition

    These techniques focus on restructuring existing methods, classes, or modules to make them more transparent and easier to understand.

  • Simplification

    The techniques aim to simplify code blocks (e.g., conditional expressions, code branching, etc.) and method calls.

  • Abstraction

    Abstraction techniques refer to creating new classes or modules and defining new data structures to encapsulate the complexity of the code.

The techniques are not mutually exclusive. For instance, a refactoring effort can involve both re-composition and abstraction.
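
To illustrate how the three categories can combine, here is a small before-and-after sketch. All names (`order_total_before`, `order_total_after`, `LineItem`, `DISCOUNT_RATES`) and the discount rules are hypothetical. The refactored version is intended to return the same totals as the original for the same orders.

```python
from dataclasses import dataclass


# Before: one function mixes input checks, nested branching, and pricing rules.
def order_total_before(items, customer_type):
    total = 0
    if items is not None:
        if len(items) > 0:
            for item in items:
                total += item["price"] * item["quantity"]
            if customer_type == "member":
                total = total * 0.9
            else:
                if customer_type == "staff":
                    total = total * 0.8
    return total


# Abstraction: a small data structure replaces loosely shaped dictionaries.
@dataclass(frozen=True)
class LineItem:
    price: float
    quantity: int

    @property
    def subtotal(self) -> float:
        return self.price * self.quantity


# Simplification: a lookup table replaces the nested conditionals.
DISCOUNT_RATES = {"member": 0.9, "staff": 0.8}


# (Re-)composition: the summing logic is extracted into its own function.
def subtotal(items):
    return sum(item.subtotal for item in items)


def order_total_after(items, customer_type):
    if not items:  # guard clause replaces two levels of nesting
        return 0
    return subtotal(items) * DISCOUNT_RATES.get(customer_type, 1)
```

Because the input shape changes from dictionaries to `LineItem` objects, callers would need to be migrated as well. In practice, each of the three changes could be made as its own small step, with the tests run after each one.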

All these techniques are widely accepted as standard practices. To learn them efficiently, completing the following course, which is based on an online resource, is a must:

View the course on Code Refactoring Techniques