Bioinformatics vs Computational Biology

The world of quantitative biology is large, diffuse and sometimes overwhelming. It can be hard to work out what someone means when they say “bioinformatics”, and therefore what part of the field they actually work in.

One way to break it down is to describe bioinformatics as the building of tools and methods for the processing and management of biological data, and computational biology as the pursuit of biological sciences using computational methods. Therefore, bioinformatics is more of an engineering discipline and computational biology more a scientific discipline.

It’s helpful to think about these distinctions, subtle as they seem. It takes a certain mindset and skillset to build a robust sequencing analysis pipeline that will serve the needs of a large group of scientists for years. That mindset and skillset may be very different from those required to do a deep investigation of the variants that impact risk of heart disease.

We can argue about the naming conventions all we want, but the labels we apply to these two types of specialist don’t really matter. What matters is what they do. The person I would call a computational biologist writes code, yes, but does so in pursuit of a particular biological problem, and would love to write less code and more manuscripts. The bioinformatician, on the other hand, wants to spend their time writing robust, high-quality code that does interesting and powerful computations. Papers are more of a nice side effect.

The truth of the matter is that most programming biologists are a mix of the two disciplines.

When hiring for a small department or a startup, the distinction between these two caricatures becomes very important. Some people will be in the field for the biology specifically, and will choke when pressed to develop a tool for use by a team. Others will jump at the chance to write such a thing. Every group needs both kinds of people. Consider the current needs: will this person be building a pipeline that will be re-used again and again? Or will they investigate particular variants, or particular compound response profiles? Fitting the right person to the job will ensure a happy employee and high productivity.

Figuring out what kind of background and preferences someone has can be as simple as asking them. Their resume or LinkedIn profile can also give clues. A software-focused person will tend to have one or more large, open-source bioinformatics software tools prominently listed. Their reference list may include a few papers describing the tool itself, plus others (potentially many others) that use it. A manuscript-focused person is less likely to have a major tool-building segment on their resume. Instead, they will list a series of biology or dataset-focused projects, with manuscripts describing each.

Data Science

But where does data science fit into all this? That, at least, is simple: bioinformatics/computational biology is data science with a biology application, just as computational chemistry is data science for chemistry. Physicists have figured out that they’re all data scientists already, so they need no name beyond “physicist”. I hope in the future we’ll do the same and just call ourselves “biologists”.