Taking Extracellular Vesicles From Discovery to the Clinic: Requirements, Challenges and General Considerations

Extracellular vesicle (EV)-based biomarker discovery efforts are being directed across almost every disease, in the hope of making EV diagnostics mainstream. With so much progress and growth being seen at the discovery level, the question must be asked: What does it take to make an EV biomarker successful? Here, we highlight fundamental requirements for the development of EV-based diagnostics.

Clinical application calls for more rigorous evaluation

Identifying differences between healthy and patient groups is a critical step towards biomarker development. However, there is a large gap between initial biomarker discovery studies and using an EV-based biomarker test to make clinical decisions about an individual. Early on, the goal is to identify differences between healthy and diseased groups, or between groups at different stages of disease progression – and studies are often retrospective. In the clinic, however, results are used to make decisions that affect an individual’s health outcome, and therefore carry more weight. Measured parameters used to make clinical decisions must be reliable enough to establish and recognise healthy reference ranges, and/or to detect changes in an individual over time.

The decision to implement a new test will be based not only on performance measures, but also on practical considerations such as cost, time, analytical capacity, clinical usefulness, and ethics. Any new test will also be assessed in the context of existing risk stratification methods,<super-script>1,2<super-script> so its performance must hold up under a high level of scrutiny.

In early discovery stages, a certain level of crudeness might be acceptable. In development phases, however, everything needs to be highly streamlined. For an EV-based biomarker to be successful in the clinic, it must be rigorously evaluated, and a higher level of reproducibility is expected across the workflow. The whole process should be easy to reproduce by others who do not have the same level of understanding as those who developed the method.

Performance characteristics important for clinical application

Those on the path towards clinical application will know there are many bridges to be crossed: scalability, assay validation, and regulatory hurdles are challenges central to all types of biomarker development – not just EVs. With regard to assay validation, Ayers et al. (2019) provide a review of the clinical requirements for extracellular vesicle assays.<super-script>3<super-script> In that review, performance characteristics are described as follows:

Trueness: Qualitative assessment of the closeness of the measured value to the true value
Precision: Measurement of how close a group of measurements are to each other
Clinical sensitivity: Ability to identify all individuals with a condition
Clinical specificity: Ability to identify all individuals without a condition
Linearity: Ability to detect an analyte in a linear fashion across the reportable range
Analytical sensitivity: Lowest level that can be reliably detected
Analytical specificity: Effect of interfering substances on the assay
Sample stability: Acceptable conditions and time that a sample can be stored prior to analysis
Sample diversity: Range of sample types that can be used on the assay
Uncertainty of measurement: Range of values that could reasonably be attributed to the measurement quality
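
As an illustration of how two of these characteristics are often put into numbers, the sketch below estimates bias (a common quantitative stand-in for trueness) and the coefficient of variation (a common summary of precision) from replicate measurements of a reference material. The function names and the replicate values are hypothetical and are not taken from the cited review; a real validation would follow a defined protocol with many more replicates.

```python
from statistics import mean, stdev


def bias_percent(measurements, reference_value):
    """Mean deviation from a known reference value, as a percentage of it."""
    return 100.0 * (mean(measurements) - reference_value) / reference_value


def cv_percent(measurements):
    """Coefficient of variation: sample standard deviation / mean, in percent."""
    return 100.0 * stdev(measurements) / mean(measurements)


# Hypothetical replicate readings of a reference material assigned a value of 100
replicates = [98.0, 102.0, 101.0, 99.0, 100.0]

print(round(bias_percent(replicates, 100.0), 2))  # 0.0
print(round(cv_percent(replicates), 2))           # 1.58
```

A low bias with a high coefficient of variation would suggest a method that is right on average but noisy between runs, whereas the reverse would suggest a consistent but systematically offset measurement.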

Predictive power required for EV biomarker success

The predictive power of a classification system, such as an EV-based diagnostic test, can be assessed using receiver-operating characteristic (ROC) analysis. The system was originally developed during World War II to determine the accuracy of classification systems used to differentiate signal from noise in radar detection, and has since been adapted for the clinic.<super-script>4,5<super-script>

ROC curves depict the relationship between clinical sensitivity (TP/(TP+FN)) and specificity (TN/(TN+FP)) at different cut-off values (TP: true positive, FN: false negative, TN: true negative, FP: false positive). Sensitivity is plotted against the false positive rate (which can also be calculated as 1 – specificity) in a square plot where the maximum area under the curve (AUC) is 1.
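
The two formulas above can be sketched in a few lines of Python. This is a minimal illustration that assumes a higher test score indicates disease and that a score at or above the cut-off counts as a positive result; the scores, labels, cut-off and function names are all made up for the example.

```python
def confusion_counts(scores, has_condition, cutoff):
    """Count TP, FN, TN and FP when scores >= cutoff are called positive."""
    tp = fn = tn = fp = 0
    for score, positive in zip(scores, has_condition):
        called_positive = score >= cutoff
        if positive and called_positive:
            tp += 1
        elif positive:
            fn += 1
        elif called_positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """TP / (TP + FN): fraction of individuals with the condition detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """TN / (TN + FP): fraction of individuals without the condition cleared."""
    return tn / (tn + fp)


# Hypothetical test scores and true disease status (True = has the condition)
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
status = [True, True, True, False, True, True, False, False, False, False]

tp, fn, tn, fp = confusion_counts(scores, status, cutoff=0.65)
print(sensitivity(tp, fn))  # 0.6
print(specificity(tn, fp))  # 0.8
```

Moving the cut-off changes both values at once, which is why a single pair of numbers never fully describes a test.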

While there are several ways in which an ROC curve can be used, the AUC is one highly useful measure. The AUC provides a combined measure of sensitivity and specificity that describes the inherent validity or predictive power of different diagnostic tests.

ROC curves and AUC scores provide a way to select the best cut-off value and allow comparisons of different diagnostic tests.

The maximum AUC of an ROC curve is 1, indicating 100% sensitivity and specificity, whereas a test based on random guessing would have an AUC of 0.5. In reality, most tests fall somewhere in between. When comparing diagnostic tests on sensitivity and specificity alone, a test with an AUC of 0.85 would be superior to one with an AUC of 0.81. ROC curves can also be used to identify the cut-off value that provides the best balance of sensitivity and specificity. This will always be a trade-off, and the optimal cut-off value will be context-dependent.
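
A minimal sketch of these ideas follows, using only the Python standard library. It builds the ROC points by sweeping the cut-off over every observed score, estimates the AUC with the trapezoidal rule, and picks a cut-off by maximising Youden's J (sensitivity + specificity – 1). Youden's J is only one of several possible criteria and is used here as an assumption, not as the method recommended by the article; the data and function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (false positive rate, sensitivity, cutoff) for each cut-off,
    from the strictest cut-off to the most lenient."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0, float("inf"))]  # nothing is called positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
        fp = sum(1 for s, y in zip(scores, labels) if not y and s >= cutoff)
        points.append((fp / n_neg, tp / n_pos, cutoff))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def youden_cutoff(points):
    """Cut-off maximising sensitivity - false positive rate (Youden's J)."""
    best = max(points[1:], key=lambda p: p[1] - p[0])
    return best[2]


# Hypothetical test scores and true disease status (True = has the condition)
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [True, True, True, False, True, True, False, False, False, False]

points = roc_points(scores, labels)
print(round(auc(points), 2))   # 0.92
print(youden_cutoff(points))   # 0.5
```

In a screening context where missed cases are costly, a developer might deliberately choose a more lenient cut-off than the one Youden's J suggests, accepting more false positives in exchange for higher sensitivity.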

Addressing challenges of normalisation

EVs are dynamic populations influenced by disease and physiological state, as well as many pre-analytical factors. Diagnostic developers must therefore navigate how to process data in a way that provides a sensible and meaningful EV-based measurement. Part of this includes the use of controls to show that measured changes reflect true differences, rather than differing amounts of starting material, for example. Potential avenues include the use of housekeeping controls, spiked-in material, and normalising EV-related parameters to a measure of the starting material.
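
Two of these avenues can be sketched as simple arithmetic. The example below normalises a marker signal to the volume of starting material, and separately corrects it for the recovery of a spiked-in control. The function names, units and values are hypothetical, and the sketch assumes that signal scales linearly with input and that the spike-in is lost at the same rate as the EVs of interest, neither of which should be taken for granted in practice.

```python
def normalise_to_input(marker_signal, input_volume_ml):
    """Express the marker signal per mL of starting material."""
    return marker_signal / input_volume_ml


def normalise_to_spike_in(marker_signal, spike_in_measured, spike_in_added):
    """Correct the marker signal for losses, using spike-in recovery."""
    recovery = spike_in_measured / spike_in_added
    return marker_signal / recovery


# Hypothetical samples: B has the larger raw signal only because more
# starting material was processed.
signal_a = normalise_to_input(500.0, input_volume_ml=1.0)
signal_b = normalise_to_input(900.0, input_volume_ml=2.0)
print(signal_a, signal_b)  # 500.0 450.0

# Hypothetical spike-in: 100 units added, 80 units measured after separation
corrected = normalise_to_spike_in(500.0, spike_in_measured=80.0, spike_in_added=100.0)
print(round(corrected, 1))  # 625.0
```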

Use of scalable technologies: start with success in mind

In the early stages of biomarker discovery, method selection is commonly based on ease of access, convenience, and cost. Scalability is therefore usually not a top priority, particularly when funding is limited. During biomarker development and clinical use, however, methods used in earlier stages are often no longer suitable. Ultracentrifugation, for example, is too labour-intensive (and consequently expensive) and not reproducible enough for clinical use. Therefore, if potential biomarkers are identified using ultracentrifugation as the separation method, a new separation protocol will need to be developed if the test is to be scaled. This is suboptimal, as it creates a delay while earlier findings are validated with the new method and new controls are established.

An example of this bottleneck can be found in a recent study in which candidate tumour-specific EV proteins were targeted using proteomics approaches, after separation by multiple rounds of ultracentrifugation coupled with sucrose density gradient centrifugation.<super-script>6<super-script> The study reported a final protocol that warrants further exploration as a potential rapid and non-invasive screening tool to identify early-stage colorectal cancer. However, the separation method is a major limitation and will need to be replaced by more rapid and cost-effective approaches if the biomarker candidates are to be taken further.

Rather than change isolation methods at the end of a biomarker discovery phase, we encourage researchers to start as they intend to continue: with a rapid, reproducible separation method that can be used in the clinic. Selecting a separation method with these characteristics has many benefits:

Reduce the measurement error observed in downstream analysis. Minimising measurement error is an important factor relevant to all stages of discovery and development: doing so is not only critical in the clinic where inter-laboratory reproducibility is key, but it also improves the likelihood of being able to detect a significant difference between groups in the first place.
Avoid the time-consuming and costly bottleneck associated with changing methods.
Save time in the discovery phase: spend less time on tedious separation, so more time can be spent on data analysis. Faster separation = more iterations = better chance of success.
Have a clear vision and start your biomarker research with a clear, scalable way forward. Biomarker implementation nearly always requires a collaborative effort; others may be more willing to invest and collaborate if they can see the path to success.

Clinical translation dependent on reproducibility

Reproducible and rapid EV separation methods are critical to enabling in-depth analysis of EVs and their cargo. In recognition of this, there has been a major shift away from ultracentrifugation towards size-exclusion chromatography – as documented by an ISEV Rigor and Standardization Subcommittee survey.<super-script>7<super-script> Given the critical importance of reproducibility and scalability for EV isolation and analysis, Izon Science has focused on building these characteristics into its qEV isolation platform. In our next article, we discuss a concept that is fundamental to driving the development of both EV biomarkers and the qEV isolation platform: maximising the signal-to-noise ratio.