Leaky Libraries: How To Assess The Risk Of Dependencies

Ben Voss

May 26, 2014

A library is a package of code. Usually a library is small. It is focused on solving a single problem. A library is created when a problem is generic enough to be abstracted and reused.

The popular opinion around libraries today seems to be: the more it decides for the developer, the better. A library that forces implementations on a codebase is beginner-friendly. New developers, or dabblers, can get a feeling of instant productivity and gratification. There is a lot to learn when a developer first starts out, and black box tools build momentum and sustain enthusiasm.

Eventually, while using all of these libraries to build out solutions, something will have to change. It always does. If a library forces the consuming application to adopt its designs in order to work -- if it’s a leaky library -- the risk of higher costs skyrockets. The system will be harder to read and maintain. New developers will have a more difficult time ramping up. Choosing a library is always a cost-benefit analysis, but by judging libraries on how many of their implementation details leak into the consuming application’s designs, we can make smarter decisions about which ones to use.

The Costs

There is a difference between the code you write and the library’s code you consume. If a library is consumed and a modification in behavior is required, there are two costs:

  1. There is a cost to learn how the library was designed and how it operates. Depending on the quality and complexity of the design, this cost can be impossible to estimate up front, and the code may be too advanced for less experienced developers. And there is always the risk that the effort becomes an unrewarded sunk cost once it turns out the third-party library isn’t flexible enough.

  2. The code to customize it will, in varying degrees, feel and look like a hack. The risk of introducing confusing designs into the codebase increases. I’ve seen customizations of third-party libraries that are easy to read and work with, but they are hard to get right.

There is a direct correlation between these costs and how strongly a library forces its implementation details on you. Lately I’ve been doing a lot of work with JavaScript, form validation and Backbone, and some of the more popular libraries may be long-term liabilities on a project.

Popular Libraries, and Risk Points

Backbone.Validation

Backbone.Validation is a plugin that offers dual form and model data validation. After installing, I add a validation configuration onto my model, not that different from where Rails decides to place validations. But having the library dictate where to place your validations introduces some potential problems.

Oftentimes, the data in my model will be different than the data I have in my view. Coupling these validations could introduce problems. Backbone.Validation also suffers from how it appends its messages and validations in a custom way:

_.extend(Backbone.Validation.messages, {
  required: 'This field is required',
  min: '{0} should be at least {1} characters'
});

And custom patterns:

_.extend(Backbone.Validation.patterns, {
  myPattern: /my-pattern/,
  email: /my-much-better-email-regex/
});

We need to extend a different object if it’s just a regular validator:

_.extend(Backbone.Validation.validators, {
  // Custom validator: fails unless the value matches the configured one
  myValidator: function(value, attr, customValue, model) {
    if (value !== customValue) {
      return 'error';
    }
  },
  // Overrides the built-in required validator
  required: function(value, attr, customValue, model) {
    if (!value) {
      return 'My version of the required validator';
    }
  }
});

All in all, it is flexible enough to make changes. But there are potential problems: changes need to be made on a global object with no protection. Most likely all messages and validations need to be kept in a single file to protect against overriding. The application code experiences design leaks and this makes it harder to own and control the application.
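The overriding risk can be sketched with a stand-in object. Everything named here (`sharedMessages`, `configureSignupForm`, `configureProfileForm`) is hypothetical and only imitates a library-owned global; it is not Backbone.Validation’s own code. `Object.assign` is used in place of `_.extend` so the sketch runs without Underscore; for plain objects like these, both copy properties onto the target.

```javascript
// Hypothetical stand-in for a library-owned global such as
// Backbone.Validation.messages; not the library's actual code.
var sharedMessages = {
  required: 'This field is required'
};

// Two unrelated parts of an application customize the same key.
function configureSignupForm(messages) {
  Object.assign(messages, { required: 'Please fill this in' });
}

function configureProfileForm(messages) {
  Object.assign(messages, { required: 'Required' });
}

configureSignupForm(sharedMessages);
configureProfileForm(sharedMessages);

// The last writer wins, and nothing warns about the collision.
console.log(sharedMessages.required); // "Required"
```

Whichever configuration runs last decides the message for the whole application, which is why such overrides tend to end up in a single file.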

The developer needs to learn the library instead of the application. Pattern validations go in one place and every other validation goes in another. Messages live apart from the validations they belong to. This may be ok, but it means more time spent learning and customizing the library, and less spent on the application.

jQuery.validate

After calling .validate() on a form, jQuery.validate handles data validation, error display timing, error display style and the bindings that trigger all of it. It does a lot.

jQuery.validate is fairly customizable. It has ways to change where the error is displayed, what it looks like, and when it disappears. I can’t change my HTML structure, though, because the library has to assume a certain markup contract for its out-of-the-box magic to work. But for everything else, there is documentation to cover it.

If a library’s documentation is hefty, be scared. It’s probably doing too much, and that means that when the hood gets popped, someone is going to get hurt. Libraries that control the way they are customized don’t give the developer any real control. The customization happens on the library’s terms, and this raises the risk of higher costs when change comes.

jQuery.validate was designed with side-effects as a feature. After a form is bound and submitted, it will automatically insert errors into the DOM and decide when they will be reevaluated. Developers supporting libraries that do things like this may be saying: “Everybody has to handle the errors, so we do it for them.” The problem is: you don’t know that. You can’t know that. There are too many use cases. If a library decides side-effects for you, it raises the risk of using it.
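For contrast, here is a minimal sketch of validation without side-effects. The function `validateSignup`, its field names and its rules are all hypothetical, not part of jQuery.validate or any other library. It returns the errors as plain data and never touches the DOM, so the calling code decides where and when to display them.

```javascript
// Hypothetical validator: a plain function from form data to a list of errors.
// It has no side-effects; rendering is left entirely to the caller.
function validateSignup(data) {
  var errors = [];
  if (!data.email) {
    errors.push({ field: 'email', message: 'This field is required' });
  }
  if (!data.password || data.password.length < 8) {
    errors.push({ field: 'password', message: 'Password should be at least 8 characters' });
  }
  return errors;
}

var result = validateSignup({ email: '', password: 'short' });
console.log(result.length); // 2
```

The trade-off is that the application has to write its own display code, but that code then belongs to the application and can change with it.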

Paperclip

Paperclip is a file attachment library for Active Record. I get it, it is built for the Active Record ecosystem, but because it assumes so much about the environment and its designs, the cost of change would be enormous. Almost none of the code written in a Rails system with Paperclip could be reused or repackaged into a library to make file uploading easier in a company’s other applications. The developers would have to start from scratch.

Choose Libraries Based on Extensibility

Knowing that change always happens, these are heuristics to identify libraries that have a higher risk:

  1. Avoid libraries that offer several ways to customize or that encourage monkey patching. If a library offers numerous ways to customize, that is an indication that it does too much out of the box.
  2. Avoid libraries that have side effects as a feature.
  3. Avoid libraries with lots of documentation. It’s probably doing too much.

Applications built with libraries chosen with these characteristics in mind will have a better chance at consistent, clean development, and a lower risk when change comes. It can be tempting to choose a library because it delivers everything needed, right now. Just install and go. But beware of instant gratification. Don’t be afraid to say no to some dangerous libraries -- it may be the insurance needed to protect our applications from the future.