Last year I was lucky enough to attend a talk by Dr Ulrich Keyser of the University of Cambridge, a world leader in “DNA origami” and synthetic nanopores. He gave an excellent talk about his recent groundbreaking research, which I’ll attempt to do justice to in this post.

You may have heard of nanopores in the context of DNA sequencing, whereby strands of DNA are sucked through a tiny pore (in a membrane) which is only slightly larger than the width of the DNA strand itself. As the strand passes through the membrane via the pore, the electrical resistance of the membrane is changed depending on the individual base pair in the sequence. Here’s a short video which explains the concept:

Traditionally, these nanopores have been made of proteins (as seen in the video above), but this limits our ability to customise the nanopore structure for different applications because we don’t fully understand the process of protein folding, and it’s also a fairly arduous process to create complex custom proteins in the first place.

This is where DNA origami comes in, and as the name implies, it’s all about folding DNA sequences into desired 2D or 3D structures. This short video explains and demonstrates the concept brilliantly:

What Keyser and colleagues did a few years ago was use this DNA origami approach to build DNA-based nanopores that are structured in a similar way to the protein nanopores, with implications for a lot of future nanopore sequencing or detection methods. That alone is amazing, but it’s got nothing on the main topic of this post – a 2016 paper by Bell and Keyser.

Remember the nanopore sequencing I mentioned earlier, and how it works by altering the electrical resistance across the synthetic membrane in a measurable way? Well, when a protein is attached to the DNA sequence and it passes through the nanopore, there is a similar characteristic spike in the resistance of the membrane at the precise moment when the protein passes through. This spike is much more noticeable than the spikes caused by the different DNA bases because the protein is so much larger than the DNA strand. Imagine a bead tied in the middle of a piece of string being pulled through a drinking straw.

So already we have a system that can tell you whether the DNA strand coming through the nanopore has a protein attached or not. Great. But what if you wanted to know which specific protein it was? To make a long story short, it’s not possible to distinguish between different proteins based on the disruption to the electrical current they cause when they pass through the nanopore, because they all essentially pass through the pore as a big lump – the amino acid sequence can’t be ascertained to identify the protein. So how else can it be done?

Since we ideally want to be able to detect a wide range of different proteins, not just ones that naturally bind to DNA, they attached specific antibodies (which bind specific proteins) to double-stranded DNA (dsDNA) strands. They have to use dsDNA for things like this because it’s more stable, and because most antibodies can’t be bound to single-stranded DNA (ssDNA). Now we have a huge library of dsDNA sequences that are capable of binding a range of different proteins. In order to identify the protein attached to a specific dsDNA sequence, the specific protein-binding antibody needs to be coupled to it, such that dsDNA sequence 1 is linked to antibody 1, dsDNA sequence 2 is linked to antibody 2, and so on. The problem is that we can’t just sequence these DNA strands and see which one of them carries a big resistance spike with it, for a couple of reasons: first, dsDNA is being used, which can’t be sequenced (it would need to be single-stranded), and second, the nanopore required to fit large proteins through would be too wide to accurately read the signature of the DNA bases, even if the DNA was single-stranded.

This is where the real innovation by Keyser and his lab comes in.

The dsDNA sequences were specially designed using the principles of DNA origami to contain little bulging “dumbbells” of DNA in blocks along their length. This is illustrated in Figure 1.

Figure 1 | DNA origami “dumbbells”. a) A diagram illustrating a length of dsDNA containing N DNA origami “dumbbells”, as well as a close-up view of the structure of each dumbbell. b) A 3D diagram of the arrangement of the “dumbbells” along a dsDNA sequence with a twist of 34.3 degrees. c) A representation of the dsDNA strand with attached “dumbbells” moving through a nanopore. Taken from Figure 1 of Bell & Keyser (2016).

These “dumbbells” cause spikes in resistance similar to those caused by a protein. After a series of experiments, they settled upon using 11 consecutive “dumbbells” to give a strong enough individual spike. Bell and Keyser used this 11-dumbbell block to represent a single digital bit. By spacing out several of these blocks along the dsDNA sequence they were able to create a digital barcode signature.
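To make the "spikes as bits" idea concrete, here is a minimal Python sketch of how a trace might be turned into a barcode. It is my own illustration, not the analysis used in the paper: the function name `decode_barcode`, the threshold value and the toy trace are all made up, and it assumes (unrealistically) that the strand moves through the pore at a constant speed, so that each bit occupies an equal slice of the trace.

```python
def decode_barcode(trace, n_bits, threshold):
    """Turn a list of spike heights (arbitrary units) into a bit string.

    Assumption: the strand moves at constant speed, so the trace can be
    cut into n_bits equal windows, one per bit position.
    """
    window = len(trace) // n_bits
    if window == 0:
        raise ValueError("trace is too short for the requested number of bits")

    bits = []
    for i in range(n_bits):
        segment = trace[i * window:(i + 1) * window]
        # A window counts as a "1" if it contains a spike above the threshold.
        bits.append("1" if max(segment) >= threshold else "0")
    return "".join(bits)


# Toy trace: no spike, spike, spike.
trace = [0.0, 0.1, 0.0, 0.1, 0.9, 0.1, 0.0, 1.0, 0.1]
print(decode_barcode(trace, n_bits=3, threshold=0.5))  # 011
```

Real traces are noisier and the strand speed varies, so a real analysis would need something more robust than fixed windows.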

This means that when one of these specific dsDNA sequences is “read” by the nanopore, there is a characteristic pattern of spikes in the electrical resistance. You can design a dsDNA strand to contain the barcode “011” (no spike, spike, spike), and to carry the antibody for a particular protein. In other words, it’s possible to create a library of antibody-coupled dsDNA sequences where a particular barcode is only linked with a specific antibody, and therefore will only bind a specific protein. When this whole strand is “read” by the nanopore, you’ll see the characteristic barcode of that dsDNA sequence (e.g. 011) followed by the separate large spike of a protein. Therefore, you know that the protein that just passed through the nanopore was the one whose antibody you attached to the DNA sequence with the barcode 011. If you received a signal of that barcode without the additional protein peak, then you know that protein wasn’t bound by the antibody. Of course, as you expand the dsDNA library to bind a larger set of proteins, these barcodes can be made more complex to allow each individual protein its own specific signature.
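The lookup logic described above is simple enough to sketch in a few lines of Python. Everything here is hypothetical: the `LIBRARY` contents, the protein names and the function `interpret_read` are placeholders I've invented to illustrate the idea, not anything taken from the paper.

```python
# Hypothetical library: each barcode is paired with exactly one antibody,
# and therefore with exactly one target protein.
LIBRARY = {
    "011": "protein A",
    "101": "protein B",
    "110": "protein C",
}


def interpret_read(barcode, protein_spike):
    """Interpret one strand passing through the pore.

    barcode:       the bit pattern read from the dumbbell blocks
    protein_spike: True if the separate large protein peak was seen
    Returns (protein name, bound?) or None for an unknown barcode.
    """
    protein = LIBRARY.get(barcode)
    if protein is None:
        return None  # barcode is not in our library
    return protein, protein_spike


print(interpret_read("011", True))   # ('protein A', True)
print(interpret_read("011", False))  # ('protein A', False)
```

With 3 bits there are only 2**3 = 8 possible barcodes, which is why a bigger library of proteins needs longer barcodes.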

Figure 2 | DNA barcoding and antigen presentation. a) A schematic of the dsDNA sequence with an attached antigen, which binds a particular antibody protein. Throughout the post I discuss antibodies being attached directly to the DNA that would in turn bind a specific protein antigen, but the principle is the same. b) The electric current graph generated when the dsDNA strand with “dumbbells” and bound protein travels through the nanopore. This reading is divided into two sections: the barcode reading, which tells us which specific protein is likely to be bound, and the protein reading, where a peak would indicate that the protein in question is actually bound and therefore present. Taken from Figure 4 of Bell & Keyser (2016).

The future applications of this work are massive. Imagine that you have something like a blood sample, and you want to know which proteins are present. You could simply add a library of these dsDNA sequences, all with different barcodes and corresponding antibodies for different proteins. By running this solution through the nanopores (e.g. on a nanopore sequencing chip), within minutes you would know which of the proteins you were testing for were present and which were absent. This method even has the potential to be sensitive enough to give reliable measurements of the specific concentrations of each protein in the original sample, according to subsequent research by the same lab.
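As a rough illustration of what such a screen might produce, here is a small Python sketch that tallies many reads per barcode. It is a toy under stated assumptions: the reads are invented, `bound_fractions` is a name I've made up, and treating the fraction of strands carrying a protein as an indicator of how much protein is present is my simplification, not necessarily the method used in the follow-up research.

```python
from collections import Counter


def bound_fractions(reads):
    """For each barcode, return the fraction of strands that carried a protein.

    reads: iterable of (barcode, protein_spike) pairs, one per strand.
    """
    total = Counter()
    bound = Counter()
    for barcode, protein_spike in reads:
        total[barcode] += 1
        if protein_spike:
            bound[barcode] += 1
    return {barcode: bound[barcode] / total[barcode] for barcode in total}


# Invented example: barcode 011 mostly arrives with a protein, 101 never does.
reads = [
    ("011", True), ("011", True), ("011", False),
    ("101", False), ("101", False),
]
fractions = bound_fractions(reads)
for barcode, fraction in sorted(fractions.items()):
    print(barcode, round(fraction, 2))
```

A barcode whose fraction stays at zero over many reads suggests that its target protein is absent from the sample (or below whatever the detection limit turns out to be).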

All that being said, there is still a great deal of work to be done before the technology is refined enough for these kinds of commercial applications. In this “proof-of-concept” research Bell and Keyser attached antigens to the dsDNA sequences in order to bind 4 different variants of a well-understood protein (antibody): IgG. A lot of the research will therefore be focused on customising the binding sites to be able to bind a range of relevant proteins.

To end on an optimistic note, this work (and the research leading up to it) could represent a quantum leap forward in protein detection assays, comparable to the jump to Next-Generation (DNA) Sequencing that occurred in the mid-2000s and led to the genomics revolution.


Comments and queries are welcome.