Abstract
Deep Raman spectroscopic techniques have been highlighted as a potential avenue for diagnosing disease and characterising disease states. Transmission Raman spectroscopy has been explored in the literature for applications such as breast cancer detection, as a non-invasive, non-ionising and chemically specific method for determining the composition of breast microcalcifications. However, its sensitivity has not yet reached the level required for clinical uptake. The low-wavenumber analogue of transmission Raman spectroscopy (transmission low frequency Raman) provides additional information on the structural order of solid microcalcifications, which is proposed to increase the sensitivity of the approach. In addition, the multivariate classification methods used for diagnosis (e.g. convolutional neural networks, CNNs, and support vector machines, SVMs) need to be further optimised. The stability of these two machine learning techniques was probed by intentionally introducing spectral artefacts into transmission low frequency Raman spectroscopic data collected from calcifications (calcium oxalate and crystalline, intermediate and amorphous hydroxyapatite) buried in chicken breast. The SVM yielded a slightly better model, with an area under the curve (AUC) of 0.989 compared with 0.979 for the CNN. However, SVMs were in general found to be more susceptible to spectral artefacts than CNNs. Additionally, the performance of the CNNs and SVMs did not depend on the magnitude of the shifts and stretches in the augmented data. For example, under linear stretching of the data the AUC remained at 0.977 for the CNN and 0.969 for the SVM for both 2 cm⁻¹ and 5 cm⁻¹ shifts.
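The abstract does not state how the shift and stretch artefacts were generated. As a purely illustrative sketch, one plausible way to introduce them is to resample each spectrum by linear interpolation. The function names, the synthetic Gaussian band, the 10 to 300 cm⁻¹ axis, and the reading of "linear stretch" as a displacement that grows linearly from zero at the start of the axis to a maximum at its end are all assumptions made for this example, not details taken from the study.

```python
import numpy as np


def shift_spectrum(wavenumbers, intensities, shift_cm1):
    """Return the spectrum displaced by shift_cm1, resampled on the original axis.

    Hypothetical helper: a feature at w moves to w + shift_cm1, so the new
    intensity at w is the old intensity at w - shift_cm1.
    """
    return np.interp(wavenumbers - shift_cm1, wavenumbers, intensities)


def stretch_spectrum(wavenumbers, intensities, max_shift_cm1):
    """Return the spectrum stretched linearly along the wavenumber axis.

    Hypothetical helper (assumed definition of a linear stretch): the first
    point stays fixed and the displacement grows linearly, reaching
    max_shift_cm1 at the last point of the axis.
    """
    start = wavenumbers[0]
    span = wavenumbers[-1] - start
    scale = 1.0 + max_shift_cm1 / span
    # A feature at w moves to start + (w - start) * scale, so sample the
    # original spectrum at the inverse-mapped positions.
    source = start + (wavenumbers - start) / scale
    return np.interp(source, wavenumbers, intensities)


# Synthetic example (assumed values): one Gaussian band at 100 cm^-1.
wavenumbers = np.arange(10.0, 301.0, 1.0)
band = np.exp(-((wavenumbers - 100.0) ** 2) / (2 * 5.0 ** 2))

shifted_2 = shift_spectrum(wavenumbers, band, 2.0)
stretched_5 = stretch_spectrum(wavenumbers, band, 5.0)

print("shifted peak at", wavenumbers[np.argmax(shifted_2)], "cm^-1")
print("stretched peak at", wavenumbers[np.argmax(stretched_5)], "cm^-1")
```

Augmented spectra produced this way could then be passed to the trained CNN and SVM to compare AUC values against those for the unmodified data. Interpolation clamps to the edge intensities where the resampled axis runs past the measured range, which is adequate for small displacements such as 2 or 5 cm⁻¹.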