Code Smells: Multi-Responsibility Methods
In the last article, I introduced a large and complex method, validateQuery
. The focus of that article was how to simplify the method looking only at the “smell” of mutability, but there’s actually one very simple change that can chop the method nearly in half.
I’m afraid the time has come to post the method in its entirety, so I apologize for taking up so much screen real-estate. I’m going to mitigate this slightly by using the end result of our last refactoring, which saves us 20 lines of code or so:
static ValidatedField validateQuery(Class clazz, Mapper mapper, String propertyName, FilterOperator op,
Object val, boolean validateNames, boolean validateTypes) {
final ValidatedField validatedField = new ValidatedField(propertyName);
if (!propertyName.startsWith("$")) {
final String[] pathElements = propertyName.split("\\.");
final List<String> databasePathElements = new ArrayList<>(asList(pathElements));
if (clazz == null) {
return validatedField;
}
MappedClass mc = mapper.getMappedClass(clazz);
for (int i = 0; ; ) {
final String fieldName = pathElements[i];
boolean fieldIsArrayOperator = fieldName.equals("$");
final Optional<MappedField> mf = getMappedField(fieldName, mc, databasePathElements,
i, propertyName, validateNames,
fieldIsArrayOperator);
validatedField.mappedField = mf;
i++;
if (mf.isPresent() && mf.get().isMap()) {
//skip the map key validation, and move to the next fieldName
i++;
}
if (i >= pathElements.length) {
break;
}
if (!fieldIsArrayOperator) {
//catch people trying to search/update into @Reference/@Serialized fields
if (validateNames && !canQueryPast(mf.get())) {
throw new ValidationException(format("Cannot use dot-notation past '%s' in '%s'; found while validating - %s", fieldName, mc.getClazz().getName(), propertyName));
}
if (!mf.isPresent() && mc.isInterface()) {
break;
} else if (!mf.isPresent()) {
throw new ValidationException(format("The field '%s' could not be found in '%s'", propertyName, mc.getClazz().getName()));
}
//get the next MappedClass for the next field validation
MappedField mappedField = mf.get();
mc = mapper.getMappedClass((mappedField.isSingleValue()) ? mappedField.getType() : mappedField.getSubClass());
}
}
//record new property string
validatedField.databasePath = databasePathElements.stream().collect(joining("."));
if (validateTypes && validatedField.mappedField.isPresent()) {
MappedField mappedField = validatedField.mappedField.get();
List<ValidationFailure> typeValidationFailures = new ArrayList<>();
boolean compatibleForType = isCompatibleForOperator(mc, mappedField, mappedField.getType(), op, val, typeValidationFailures);
List<ValidationFailure> subclassValidationFailures = new ArrayList<>();
boolean compatibleForSubclass = isCompatibleForOperator(mc, mappedField, mappedField.getSubClass(), op, val, subclassValidationFailures);
if ((mappedField.isSingleValue() && !compatibleForType)
|| mappedField.isMultipleValues() && !(compatibleForSubclass || compatibleForType)) {
if (LOG.isWarningEnabled()) {
LOG.warning(format("The type(s) for the query/update may be inconsistent; using an instance of type '%s' for the field '%s.%s' which is declared as '%s'", val.getClass().getName(), mappedField.getDeclaringClass().getName(), mappedField.getJavaFieldName(), mappedField.getType().getName()));
typeValidationFailures.addAll(subclassValidationFailures);
LOG.warning("Validation warnings: \n" + typeValidationFailures);
}
}
}
}
return validatedField;
}
The smell that led me to this code in the first place was the multiple isPresent()
checks on my Optional mf
value, but I’m going to bypass that issue for now and see if there’s a quick win to simplify the code.
The Smell: Multi-Responsibility Methods
A method this size is almost definitely doing more than One Thing. The key is trying to work out which things it’s doing and how to classify those things.
One of my pet peeves is passing in boolean
parameters. These primitives are sometimes used to switch the behavior of a method, which suggests that if the method is doing two different things, following two paths, under two circumstances, well, maybe it’s two methods. If we investigate how the boolean parameters validateTypes
and validateNames
are used in this method, we may be able to find at least one code path which can be extracted into a separate method.
The validateNames
flag is used twice in the first half of the method, in lines 63 and 79, for a simple check. I suspect there’s a prettier way of achieving the same thing, but for now I’m going to ignore this value and move on to the other boolean.
Looking at validateTypes
, we see this is used to turn on (or off) the entire second half of the method. This looks like a good candidate to extract into its own method. Next I need to work out which values are needed for the first half of the method and which are needed for the second. I’ll start by looking at the parameters.
I already know validateNames
is only needed by the first half so I can ignore that. Next I’m simply going to highlight usages to see at a glance what I’m using. To fit it all on the screen for the screenshots, I’m using code folding to hide the code I’m not so interested in.
The first three parameters are only used in the first half of the method
The first three parameters aren’t used in the second part of the method that validates types, they’re only used in the first half.
Parameters “op” and “val” are only used in the second half of the method
The fourth and fifth parameters aren’t used at all in the first half, they’re only used in the second half.
The fact that three parameters are used for one part of the method and two parameters are used for a different part makes it pretty clear that what we’re dealing with here is really two different bits of functionality that need two different sets of input. This almost definitely deserves to be two distinct methods.
Step 1: Extract a validateTypes
Method
We want to see how easy it is to separate the second part into its own method without impacting any of the callers for the time being. There’s a good chance this code needs values from earlier in the method. Let’s do extract method and see what it looks like
See original source for gif
We end up with a validateTypes
method that looks like:
private static void validateTypes(FilterOperator op, Object val, boolean validateTypes,
ValidatedField validatedField, MappedClass mc) {
if (validateTypes && validatedField.mappedField.isPresent()) {
MappedField mappedField = validatedField.mappedField.get();
List<ValidationFailure> typeValidationFailures = new ArrayList<>();
boolean compatibleForType = isCompatibleForOperator(mc, mappedField, mappedField.getType(), op, val, typeValidationFailures);
List<ValidationFailure> subclassValidationFailures = new ArrayList<>();
boolean compatibleForSubclass = isCompatibleForOperator(mc, mappedField, mappedField.getSubClass(), op, val, subclassValidationFailures);
if ((mappedField.isSingleValue() && !compatibleForType)
|| mappedField.isMultipleValues() && !(compatibleForSubclass || compatibleForType)) {
if (LOG.isWarningEnabled()) {
LOG.warning(format("The type(s) for the query/update may be inconsistent; using an instance of type '%s' "
+ "for the field '%s.%s' which is declared as '%s'", val.getClass().getName(),
mappedField.getDeclaringClass().getName(), mappedField.getJavaFieldName(), mappedField.getType().getName()
));
typeValidationFailures.addAll(subclassValidationFailures);
LOG.warning("Validation warnings: \n" + typeValidationFailures);
}
}
}
}
Not surprisingly, our new validateTypes()
method needs op
and val
, the parameters we identified earlier, and the validateTypes
parameter. In addition, it also needs validatedField
, which is effectively the output from the first half of the method, and a MappedClass mc
. It looks like mc
is also an output of the earlier half of the method.
Step 2: Put All Outputs From validateQuery
Into ValidatedField
The reason we introduced ValidatedField
in the last article was to encapsulate multiple return values from validateQuery
. It seems logical, therefore, to add the MappedClass mc
to this class. Again, we have the option of adding it to the MappedField
instead (and I would certainly question why the MappedField
doesn’t already have it), but we know from our earlier investigations that MappedField
is sometimes null, and we expect MappedClass
to always be present.
See original source for gif
I’ve added a mappedClass
field to the ValidatedQuery
, and I used inline variable to use this new field everywhere I was using mc
(which has automatically been removed). Next I can remove the MappedClass
parameter from the validatedTypes
method.
See original source for gif
This now looks like:
private static void validateTypes(FilterOperator op, Object val, boolean validateTypes, ValidatedField validatedField) {
if (validateTypes && validatedField.mappedField.isPresent()) {
MappedField mappedField = validatedField.mappedField.get();
List<ValidationFailure> typeValidationFailures = new ArrayList<>();
boolean compatibleForType = isCompatibleForOperator(validatedField.mappedClass, mappedField,
mappedField.getType(), op, val, typeValidationFailures);
List<ValidationFailure> subclassValidationFailures = new ArrayList<>();
boolean compatibleForSubclass = isCompatibleForOperator(validatedField.mappedClass, mappedField,
mappedField.getSubClass(), op, val, subclassValidationFailures);
if ((mappedField.isSingleValue() && !compatibleForType)
|| mappedField.isMultipleValues() && !(compatibleForSubclass || compatibleForType)) {
logWarningForMismatchedTypes(val, mappedField, typeValidationFailures, subclassValidationFailures);
}
}
}
(Sorry, I know it’s not the most readable code).
Step 3: Get the Callers of validateQuery
to Call validateTypes
as Well
All the values that are passed in to this method are available to anything that called validateQuery
, therefore we can push the call of validateTypes
to the callers of validateQuery
. Firstly, we need to remove the private keyword from validateTypes
so that it can be called from other classes in the package.
Then, we find all the callers of validateQuery
using Alt + F7.
From here we can get the callers to call validateTypes
directly
Now all the call sites make this call themselves, we no longer need validateQuery
to call it, we can just remove the call to validateTypes
from the end of the method.
See original source for gif
When we changed the callers, we saw half of the places that call validateQuery
don’t need to call validateTypes
. The other two cases we don’t know, so we wrapped the call inside an if statement, for example:
final ValidatedField validatedField = validateQuery(query.getEntityClass(),
query.getDatastore().getMapper(),
fieldName,
op,
value,
validateNames,
validateTypes);
if (validateTypes) {
validateTypes(op, value, validateTypes, validatedField);
}
Now we’ve done that, IntellIJ IDEA is hinting that we may be able to make further simplifications:
We’ve moved the check for validateType
to the caller of the method (since this place often knows whether it needs to call the method or not). Therefore we can remove the check from inside the validateTypes
method, and remove the flag:
By moving the call of validateTypes
(and the decision of whether to call it) into the code that was calling validateQuery
, this has simplified the validateTypes
method a little, and put the responsibility of deciding whether to validate the types or not into the correct place. Running the tests shows everything still works as before.
Step 4: Simplify validateQuery
We saw at the beginning that half the parameters to validateQuery
weren’t needed for anything other than the section that validates the types. Now we’ve moved that code into its own method, and we’re not even calling that method any more from inside validateQuery
, we should be able to clean up some of the parameters.
See original source for gif
In Summary
Our validateQuery
method is now down to 57 lines of code, from 75, which is a noticeable improvement (I can nearly fit it on the screen without scrolling). More importantly, pushing validateTypes
into its own method means we no longer need to understand an additional boolean flag, and some of our calling code is much simpler because it doesn’t need to call this method at all.
When we stumble across very long methods we normally want to break them into smaller ones. In some cases, this process is easier than others. If the method takes groups of parameters that don’t interact with each other, for example in our case where the first half of the parameters are used in the first half of the method, and the second half are used in the other section of code, there’s a pretty clear indication of where the method should be split.
In our example, this is made even clearer by the use of flags turning functionality on or off. The use of a flag parameter may indicate that a method is doing two different things, and therefore should be split into two different methods. If the caller of a method can pass in a flag turning functionality on or off, that same caller can make the decision themselves (or avoid the call altogether).
In our case, the functionality was grouped together because the second part of the method relied on the output of the first. We solved this issue by adding another field to the new ValidatedField
class we introduced in the last refactoring, precisely to deal with multiple return values. The fact that the method needs to return multiple return values at all might be yet another smell, but let’s tackle one problem at a time.
Symptoms:
- Groups of parameters that are used in non-overlapping sections of code. E.g. three parameters used in the first half of a method, two in the second half.
- Boolean parameters used to turn whole blocks of code on or off.
- One part of the method is used to calculate a value (or values) that are only used as “input” to a later block of code.
Possible solutions:
- Extract distinct blocks of code (e.g. code from within an if statement that’s checking a boolean parameter) into its own method, and push the call of this new method to the caller of the original method (see: Replace Parameter with Explicit Methods).
- If the complex method with non-overlapping parameters is a constructor (or if you have two constructors with very different parameters), create two separate classes for the two separate contexts/sets of values (see: Extract Class).