Abstract
eXplainable Artificial Intelligence (XAI) offers a powerful framework for enhancing the transparency and trustworthiness of machine learning models in highly regulated fields such as food traceability. In this study, we applied two XAI techniques, SHAP (SHapley Additive exPlanations) and Partial Dependence Plots (PDPs), to interpret a Random Forest (RF) classification model developed to determine the geographical origin of Brazilian soybean samples. A total of 60 samples, representing two biomes and six states, were analysed using stable isotope ratios (δ13C, δ15N, δ2H, δ18O, δ34S) and elemental composition (41 elements). The RF model achieved high classification accuracy at both the biome and state levels using the fused stable isotope and elemental composition datasets. XAI tools revealed δ18O, δ2H, Rb, Cs, and Ca as the most influential features, with δ18O consistently emerging as the dominant predictor. SHAP beeswarm and waterfall plots provided global and local explanations of feature importance, respectively, while one-way and two-way PDPs captured non-linear relationships and synergistic effects between isotopic and elemental variables. These findings confirm the discriminative power of geochemical markers and show the practical value of interpretable models for agroecological traceability and regulatory compliance. This approach advances XAI in food provenance, providing a transparent, region-specific framework that supports sustainability initiatives.
• An RF model classified Brazilian soybeans by biome and state using geochemical data.
• First XAI study using stable isotope and elemental composition data for soybean traceability in Brazil.
• XAI revealed non-linear interactions and enhanced model interpretability.
• SHAP and PDPs identified δ18O, δ2H, Rb, Cs, and Ca as key origin markers.
• δ18O was the most consistent and influential feature in all classification tasks.
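
To make the two explanation tools named above concrete, the following minimal sketch computes exact Shapley values (the quantity that SHAP approximates) and a one-way partial dependence curve for a toy scoring function. Everything in it is an assumption for illustration: `toy_model` stands in for a fitted classifier and is not the RF model of this study, the feature values are invented placeholders rather than measured data, and the brute-force enumeration over feature subsets is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature order: [d18O, d2H, Rb]. All numbers are invented.
def toy_model(x):
    """Toy score with one interaction term; NOT the paper's RF model."""
    d18o, d2h, rb = x
    return 0.5 * d18o + 0.1 * d2h + 0.2 * rb + 0.05 * d18o * rb

def coalition_value(model, x, background, subset):
    """Mean prediction when features in `subset` are fixed to the values
    in x and the remaining features are taken from each background row."""
    total = 0.0
    for row in background:
        mixed = [x[j] if j in subset else row[j] for j in range(len(x))]
        total += model(mixed)
    return total / len(background)

def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every subset of the other features."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                gain = (coalition_value(model, x, background, set(s) | {i})
                        - coalition_value(model, x, background, set(s)))
                phi[i] += weight * gain
    return phi

def partial_dependence(model, background, feature, grid):
    """One-way PDP: average prediction with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        preds = [model([value if j == feature else row[j]
                        for j in range(len(row))]) for row in background]
        curve.append(sum(preds) / len(preds))
    return curve

# Invented background rows and one invented sample to explain.
background = [
    [24.0, -60.0, 3.0],
    [26.0, -50.0, 5.0],
    [28.0, -40.0, 8.0],
    [30.0, -30.0, 12.0],
]
sample = [29.0, -35.0, 10.0]

phi = shapley_values(toy_model, sample, background)
baseline = coalition_value(toy_model, sample, background, set())
pdp_curve = partial_dependence(toy_model, background, 0, [24.0, 30.0])

print("baseline score:", baseline)
print("Shapley values:", phi)
print("PDP for d18O at 24 and 30:", pdp_curve)
```

The Shapley values sum to the difference between the sample's score and the baseline, which is the additive decomposition that a waterfall plot displays for a single sample. A two-way PDP follows the same averaging idea with two features fixed at once.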