Clean code: refactoring
Techniques for rewriting code
It's time for effective refactoring! Extract from long functions ("extract till you drop"), inline code, and extract methods or new classes. In TDD, once a test passes, refactoring is the third and probably the most important step of the short red / green / refactor cycle. Let's explore the refactoring techniques, each with a short description. Many descriptions are taken from Martin Fowler's website and book, while the links are from the amazing SourceMaking website, where you can find further information and examples for different programming languages.
Long methods need to be rewritten because they are hard to read, understand and change. We can use unit tests to ensure the result stays correct.
- Extract method: turn a code fragment into a method whose name explains its purpose. See the "extract till you drop" practice.
- Inline method: put the method's body into the body of its callers and remove the method.
- Extract variable: put the result of the expression, or parts of the expression, in a temporary variable with a name that explains the purpose.
- Inline temp: replace the references to a temporary variable with the expression itself.
- Replace temp with query: extract the expression into a method and replace all references to the temporary variable with calls to that method. The new method can then be used in other methods.
- Split temporary variable: a temporary variable is assigned more than once, but it is neither a loop variable nor a collecting temporary variable. Make a separate temporary variable for each assignment.
- Remove assignments to parameters: a value is assigned to a parameter inside the method's body. Use a local variable instead of the parameter.
- Replace method with method object: you have a long method in which the local variables are so intertwined that you cannot apply extract method. Turn the method into a separate class so that the local variables become fields of that class; then you can split the method into several methods within the same class.
- Substitute algorithm: you want to replace an algorithm with one that is clearer. Replace the body of the method with the new algorithm.
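Here is a minimal sketch, in Python, of extract method combined with replace temp with query. The Order class, its fields and the discount thresholds are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical domain object, used only to illustrate the technique.
    quantity: int
    item_price: float

    # Before: one method does everything through temporary variables.
    def price_before(self) -> float:
        base_price = self.quantity * self.item_price
        if base_price > 1000:
            discount_factor = 0.95
        else:
            discount_factor = 0.98
        return base_price * discount_factor

    # After: each temp became a small query method with a telling name.
    def base_price(self) -> float:
        return self.quantity * self.item_price

    def discount_factor(self) -> float:
        return 0.95 if self.base_price() > 1000 else 0.98

    def price(self) -> float:
        return self.base_price() * self.discount_factor()
```

Both versions return the same value; the second one simply exposes base_price() and discount_factor() so that other methods can reuse them.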
Moving features between objects
The perfect description of these techniques comes from the SourceMaking website: these refactoring techniques show how to safely move functionality between classes, create new classes, and hide implementation details from public access.
- Move method: a method is used more in another class than in its own class. Solution: create a new method with a similar body in the class that uses it most, then either turn the old method into a simple delegation or remove it altogether.
- Move field: a field is used more in another class than in its own class. Solution: create the field in that other class and redirect all users of the old field to it.
- Extract class: when one class does the work of two, create a new class and place the fields and methods responsible for the relevant functionality in it.
- Inline class: a class does almost nothing and isn't responsible for anything. Solution: move all its features into another class and delete it.
- Hide delegate: a client gets a delegate object from a server object and then calls the delegate's methods. Solution: create methods on the server to hide the delegate.
- Remove middle man: a class has too many methods that simply delegate to other objects. Solution: delete these methods and force the client to call the end methods directly.
- Introduce foreign method: a utility class does not contain the method that you need and you cannot add the method to the class. Solution: add the method to a client class and pass an object of the utility class to it as an argument.
- Introduce local extension: a utility class does not contain some methods that you need, and you cannot add these methods to the class. Solution: create a new class containing the methods and make it either a child or a wrapper of the utility class.
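As a small sketch of hide delegate in Python: the Person and Department classes and their attributes are hypothetical names chosen for the example.

```python
class Department:
    def __init__(self, manager: str) -> None:
        self.manager = manager


class Person:
    def __init__(self, name: str, department: Department) -> None:
        self.name = name
        self._department = department  # the delegate, now kept private

    # Delegating method: clients ask the person, not the department.
    def manager(self) -> str:
        return self._department.manager


# Before, client code reached through the delegate:
#     boss = person.department.manager
# After, the client only knows about Person:
alice = Person("Alice", Department(manager="Bob"))
boss = alice.manager()
```

If Person ends up with too many of these one-line delegations, that is the signal for the opposite technique, remove middle man.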
From refactoring.guru: these refactoring techniques help with data handling, replacing primitives with rich class functionality. Another important result is untangling of class associations, which makes classes more portable and reusable.
- Self encapsulate field: you are accessing a field directly, but the coupling to the field is becoming awkward. Create getting and setting methods for the field and use only those to access the field.
- Replace data value with object: a class (or group of classes) contains a data field. The field has its own behaviour and associated data. Create a new class, place the old field and its behaviour in the class, and store the object of the class in the original class.
- Change value to reference: you have many identical instances of a single class that you need to replace with a single object. Convert the identical objects to a single reference object.
- Change reference to value: you have a reference object that is too small and infrequently changed to justify managing its life cycle. Turn it into a value object.
- Replace array with object: you have an array that contains various types of data. Replace the array with an object that will have separate fields for each element.
- Duplicate observed data: Is domain data stored in classes responsible for the GUI? Then it is a good idea to separate the data into separate classes, ensuring connection and synchronization between the domain class and the GUI.
- Change unidirectional association to bidirectional: You have two classes that each need to use the features of the other, but the association between them is only unidirectional. Add the missing association to the class that needs it.
- Change bidirectional association to unidirectional: you have a bidirectional association between classes, but one of the classes does not use the other's features. Remove the unused association.
- Replace magic number with symbolic constant: your code uses a number that has a certain meaning to it. Solution: Replace this number with a constant that has a human-readable name explaining the meaning of the number.
- Encapsulate field: you have a public field. Solution: make the field private and create access methods for it.
- Encapsulate collection: a class contains a collection field and a simple getter and setter for working with the collection. Solution: Make the getter-returned value read-only and create methods for adding/deleting elements of the collection.
- Replace type code with class: a class has a field that contains a type code. The values of this type are not used in operator conditions and do not affect the behavior of the program. Solution: Create a new class and use its objects instead of the type code values.
- Replace type code with subclasses: you have a coded type that directly affects program behavior (values of this field trigger various code in conditionals). Solution: Create subclasses for each value of the coded type. Then extract the relevant behaviors from the original class to these subclasses. Replace the control flow code with polymorphism.
- Replace type code with state / strategy: you have a coded type that affects behavior but you cannot use subclasses to get rid of it. Solution: Replace the type code with a state object. When the value of the type code field has to change, another state object is "plugged in".
- Replace subclass with fields: you have subclasses differing only in their (constant-returning) methods. Solution: Replace the methods with fields in the parent class and delete the subclasses.
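Two of the techniques above, replace magic number with symbolic constant and encapsulate collection, can be sketched in Python as follows. The function, the Student class and the course names are assumptions made up for the example.

```python
# Replace magic number with symbolic constant.
# Before: return mass_kg * 9.81 * height_m
GRAVITATIONAL_ACCELERATION = 9.81  # m/s^2, rounded value used for illustration


def potential_energy(mass_kg: float, height_m: float) -> float:
    return mass_kg * GRAVITATIONAL_ACCELERATION * height_m


# Encapsulate collection.
class Student:
    def __init__(self) -> None:
        self._courses: list[str] = []

    @property
    def courses(self) -> tuple[str, ...]:
        # Read-only snapshot: callers cannot mutate the internal list.
        return tuple(self._courses)

    def add_course(self, course: str) -> None:
        self._courses.append(course)

    def remove_course(self, course: str) -> None:
        self._courses.remove(course)
```

Returning a tuple is only one way to make the getter read-only; the point is that every change to the collection goes through add_course and remove_course.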
Simplifying conditional expressions
Conditionals tend to get more and more complicated in their logic over time, and there are yet more techniques to combat this as well.
- Decompose conditional: you have a complex conditional (if-then/else or switch). Solution: decompose the complicated parts of the conditional into separate methods: the condition, then and else.
- Consolidate conditional expression: You have multiple conditionals that lead to the same result or action. Solution: consolidate all these conditionals in a single expression.
- Consolidate duplicate conditional fragments: Identical code can be found in all branches of a conditional. Solution: Move the code outside of the conditional.
- Remove control flag: you have a boolean variable that acts as a control flag for multiple boolean expressions. Solution: Instead of the variable, use break, continue and return.
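- Replace nested conditional with guard clauses: you have a group of nested conditionals and it is hard to determine the normal flow of code execution. Solution: isolate all special checks and edge cases into separate clauses and place them before the main checks.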
- Replace conditional with polymorphism: you have a conditional that performs various actions depending on object type or properties. Solution: create subclasses matching the branches of the conditional. In them, create a shared method and move code from the corresponding branch of the conditional to it. Then replace the conditional with the relevant method call. The result is that the proper implementation will be attained via polymorphism depending on the object class.
- Introduce null object: since some methods return null instead of real objects, you have many checks for null in your code. Solution: instead of null, return a null object that exhibits the default behaviour.
- Introduce assertion: for a portion of code to work correctly, certain conditions or values must be true. Solution: replace these assumptions with specific assertion checks.
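To make one of these concrete, here is a sketch of replacing a nested conditional with guard clauses in Python. The Employee flags and the pay amounts are invented for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    # Hypothetical flags, used only for this example.
    is_dead: bool = False
    is_separated: bool = False
    is_retired: bool = False


# Before: the normal case is buried at the deepest nesting level.
def pay_amount_nested(employee: Employee) -> float:
    if employee.is_dead:
        result = 0.0
    else:
        if employee.is_separated:
            result = 100.0
        else:
            if employee.is_retired:
                result = 200.0
            else:
                result = 1000.0
    return result


# After: each special case exits early, the normal case comes last.
def pay_amount(employee: Employee) -> float:
    if employee.is_dead:
        return 0.0
    if employee.is_separated:
        return 100.0
    if employee.is_retired:
        return 200.0
    return 1000.0
```

The two functions behave identically for every combination of flags; only the shape of the code changes.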
Simplifying method calls
These techniques make method calls simpler and easier to understand. This, in turn, simplifies the interfaces for interaction between classes.
- Rename method: the name of a method does not explain what the method does. Solution: Rename the method.
- Add parameter: a method does not have enough data to perform certain actions. Solution: Create a new parameter to pass the necessary data.
- Remove parameter: a parameter is not used in the body of a method. Remove the unused parameter.
- Separate query from modifier: do you have a method that returns a value but also changes something inside an object? Solution: Split the method into two separate methods. As you would expect, one of them should return the value and the other one modifies the object.
- Parametrize method: multiple methods perform similar actions that are different only in their internal values, numbers or operations. Solution: Combine these methods by using a parameter that will pass the necessary special value.
- Replace parameter with explicit methods: a method is split into parts, each of which is run depending on the value of a parameter. Solution: Extract the individual parts of the method into their own methods and call them instead of the original method.
- Preserve whole object: you get several values from an object and then pass them as parameters to a method. Solution: Instead, try passing the whole object.
- Replace parameter with method call: before a method call, a second method is run and its result is sent back to the first method as an argument. But the parameter value could have been obtained inside the method being called. Solution: Instead of passing the value through a parameter, place the value-getting code inside the method.
- Introduce parameter object: your methods contain a repeating group of parameters. Solution: Replace these parameters with an object.
- Remove setting method: the value of a field should be set only when it is created, and not change at any time after that. Solution: remove the methods that set the field's value.
- Hide method: a method is not used by other classes or is used only inside its own class hierarchy. Solution: Make the method private or protected.
- Replace constructor with factory method: you have a complex constructor that does something more than just setting parameter values in object fields. Solution: Create a factory method and use it to replace constructor calls.
- Replace error code with exception: a method returns a special value that indicates an error. Solution: Throw an exception instead.
- Replace exception with test: you throw an exception in a place where a simple test would do the job. Solution: Replace the exception with a condition test.
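As an illustration of replace error code with exception, here is a short Python sketch. The Account class, the -1 error code and the InsufficientFunds exception are hypothetical names, not taken from any real library.

```python
class InsufficientFunds(Exception):
    """Raised when a withdrawal exceeds the available balance."""


class Account:
    def __init__(self, balance: int) -> None:
        self.balance = balance

    # Before: the caller has to remember to check for -1.
    def withdraw_with_code(self, amount: int) -> int:
        if amount > self.balance:
            return -1
        self.balance -= amount
        return 0

    # After: the error cannot be silently ignored.
    def withdraw(self, amount: int) -> None:
        if amount > self.balance:
            raise InsufficientFunds(f"cannot withdraw {amount}, balance is {self.balance}")
        self.balance -= amount
```

A caller then wraps the call in try / except InsufficientFunds only where it can actually handle the problem.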
Dealing with generalization
Abstraction has its own group of refactoring techniques, primarily associated with moving functionality along the class inheritance hierarchy, creating new classes and interfaces, and replacing inheritance with delegation and vice versa.
- Pull up field: two classes have the same field. Solution: Remove the field from subclasses and move it to the superclass.
- Pull up method: your subclasses have methods that perform similar work. Solution: Make the methods identical and then move them to the relevant superclass.
- Pull up constructor body: your subclasses have constructors with code that is mostly identical. Solution: Create a superclass constructor and move the code that is the same in the subclasses to it. Call the superclass constructor in the subclass constructors.
- Push down method: is behavior implemented in a superclass used by only one (or a few) subclasses? Solution: Move this behavior to the subclasses.
- Push down field: is a field used only in a few subclasses? Solution: Move the field to these subclasses.
- Extract subclass: a class has features that are used only in certain cases. Solution: create a subclass and use it in these cases.
- Extract superclass: you have two classes with common fields and methods. Solution: Create a shared superclass for them and move all the identical fields and methods to it.
- Extract interface: Multiple clients are using the same part of a class interface. Another case: part of the interface in two classes is the same. Solution: Move this identical portion to its own interface.
- Collapse hierarchy: you have a class hierarchy in which a subclass is practically the same as its superclass. Solution: Merge the subclass and superclass.
- Form template method: your subclasses implement algorithms that contain similar steps in the same order. Solution: Move the algorithm structure and identical steps to a superclass, and leave implementation of the different steps in the subclasses.
- Replace inheritance with delegation: you have a subclass that uses only a portion of the methods of its superclass (or it's not possible to inherit superclass data). Solution: Create a field and put a superclass object in it, delegate methods to the superclass object, and get rid of inheritance.
- Replace delegation with inheritance: a class contains many simple methods that delegate to all methods of another class. Solution: make the class a delegate inheritor, which makes the delegating methods unnecessary.
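Form template method can be sketched in Python like this. The Report hierarchy and the strings it returns are hypothetical; they only show where the shared steps and the varying step end up.

```python
from abc import ABC, abstractmethod


class Report(ABC):
    # Template method: the algorithm structure lives in the superclass.
    def render(self) -> str:
        return "\n".join([self.header(), self.body(), self.footer()])

    # Identical steps were pulled up from the subclasses.
    def header(self) -> str:
        return "Report"

    def footer(self) -> str:
        return "End"

    # The step that differs stays in the subclasses.
    @abstractmethod
    def body(self) -> str:
        ...


class SalesReport(Report):
    def body(self) -> str:
        return "sales: 42"


class StockReport(Report):
    def body(self) -> str:
        return "stock: 7"
```

Adding a new kind of report now means writing one body() method, while the order of the steps stays in a single place.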
Refactoring a database is more difficult than refactoring code. We change the database schema to improve its design while preserving both behavioral and informational semantics. Database normalization is an example of database refactoring.
Refactoring without breaking anything: use unit tests!
We can be afraid to change code because we don't want to break existing, working software. If we are working on legacy code, we can use refactoring techniques for new features and write unit tests while applying the Open/Closed principle: we don't necessarily modify existing code, but we write new, tested code. I know this seems complex and hard, but you can find examples, and you can only learn to apply this concept in the real world by practicing. You can try again and again, confident that your passing tests are telling you that you are doing a good job!
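A minimal sketch of that safety net, using Python's standard unittest module: the legacy function is left untouched, the new code is written next to it, and a test checks that both agree. The function names and the sample inputs are assumptions made up for the example.

```python
import unittest


# Existing legacy code: we leave it as it is.
def legacy_total(prices):
    total = 0
    for i in range(len(prices)):
        total = total + prices[i]
    return total


# New code, written alongside the old one and covered by a test.
def total(prices):
    return sum(prices)


class TotalTest(unittest.TestCase):
    CASES = [[], [5], [1, 2, 3], [10, -4]]

    def test_new_code_matches_legacy_behaviour(self):
        for prices in self.CASES:
            with self.subTest(prices=prices):
                self.assertEqual(total(prices), legacy_total(prices))
```

Saved in a file, the test can be run with python -m unittest; as long as it stays green, we can keep reshaping total() without fear.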
About code samples
Having more and more examples for each refactoring technique can be useful, but it can also be boring. I assure you it's more important to understand the concept and see a single example than to see many code snippets. If you understand the concept, you will memorize the technique faster and you will soon try to practice it on your own working code. It could be very interesting to explain more about every group of techniques, and maybe I will write new posts about it :)
- Technical debt
- Code smell
- Coupling, Duplicated code, D.R.Y
- Design patterns and Anti-patterns
- S.O.L.I.D principles
- Code refactoring on Wikipedia
- Refactoring.com - Martin Fowler
- Refactoring guru: same content as SourceMaking, with code samples and UML
Here are the most influential and fundamental books you can read:
- Refactoring: improving the design of existing code - Martin Fowler
- Clean Code - Robert "Uncle Bob" Martin
- Working effectively with legacy code - Michael Feathers
- Refactoring to patterns interactive - Joshua Kerievsky
- Refactoring for Software Design Smells: Managing Technical Debt