Abstract
Cannabis is used to treat various medical conditions, and its cultivars are commonly classified according to their total concentrations of Δ9-tetrahydrocannabinol (THC) and cannabidiol (CBD). Based on the ratio of total THC to total CBD, cultivars are grouped into high-THC, high-CBD, and hybrid classes. While cultivars from the same class have similar compositions of major cannabinoids, their levels of other cannabinoids and their terpene compositions may differ substantially. Therefore, a more comprehensive and accurate classification of medicinal cannabis cultivars, based on a large number of cannabinoids and terpenes, is needed. For this purpose, three chemometric classification models were constructed from three sets of chemical profiles: cannabinoids, terpenes, and the combination of the two. Each model was built with partial least squares-discriminant analysis (PLS-DA), a multivariate technique, and the models were compared to determine which provides the most accurate "chemovar" classification. The chemical profiles were selected from the three major classes of medicinal cannabis that are most commonly prescribed to patients in Israel: high-THC, high-CBD, and hybrid. We also studied the correlations between cannabinoids and terpenes to identify major bio-indicators representing the plant's terpene and cannabinoid content. All three PLS-DA models provided highly accurate classifications, utilizing six to nine latent variables, with overall cross-validation (CV) misclassification errors ranging from 2 to 11%. The PLS-DA model applied to the combined cannabinoid-and-terpene profile differentiated best between the chemovars in terms of misclassification error, sensitivity, specificity, and accuracy, with cross-validation and prediction misclassification errors of 4% and 0%, respectively.
This is the first study to demonstrate the highly accurate classification of samples of medicinal cannabis based on their cannabinoid and terpene profiles, as compared to cannabinoid profiles alone. Furthermore, our correlation analysis indicated that 11 cannabinoids and terpenes might serve as bio-indicators for 32 different active compounds. These findings suggest that the use of multivariate statistics could assist in breeding studies and serve as a tool for minimizing the mislabeling of cannabis inflorescences.
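The performance measures named above (sensitivity, specificity, accuracy, and misclassification error) can be illustrated with a minimal sketch. It assumes a one-vs-rest reading in which one chemovar class is treated as "positive" and all others as "negative", and it takes misclassification error to be 1 minus accuracy; the study's software may define these quantities differently. The function name `one_vs_rest_metrics` and the sample labels are hypothetical and are not data from the study.

```python
from collections import Counter


def one_vs_rest_metrics(true_labels, predicted_labels, positive_class):
    """Compute sensitivity, specificity, accuracy, and misclassification error
    for one class treated as 'positive' against all other classes.

    Assumption: misclassification error is taken as 1 - accuracy.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true and predicted label lists must have equal length")
    if not true_labels:
        raise ValueError("at least one sample is required")

    counts = Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == positive_class:
            counts["tp" if guess == positive_class else "fn"] += 1
        else:
            counts["fp" if guess == positive_class else "tn"] += 1

    tp, fn, fp, tn = counts["tp"], counts["fn"], counts["fp"], counts["tn"]
    total = tp + fn + fp + tn

    # Guard against classes that never occur in the true labels.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / total

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "misclassification_error": 1.0 - accuracy,
    }


# Hypothetical labels for ten samples (illustration only, not study data).
example_true = ["high-THC"] * 4 + ["high-CBD"] * 3 + ["hybrid"] * 3
example_pred = [
    "high-THC", "high-THC", "high-THC", "hybrid",
    "high-CBD", "high-CBD", "high-CBD",
    "hybrid", "hybrid", "high-THC",
]

for chemovar in ("high-THC", "high-CBD", "hybrid"):
    print(chemovar, one_vs_rest_metrics(example_true, example_pred, chemovar))
```

Reporting the measures per class, rather than as a single pooled figure, shows which chemovar classes a model confuses with one another.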