Climate change is causing rapid and significant changes to environments around the world, affecting plant health, reducing biodiversity, and threatening food production. To tackle these challenges, it is crucial to understand how plants and the microorganisms living in and around them (such as bacteria and fungi) respond to these shifting conditions (Thuiller et al., 2008; Delcour, Spanoghe and Uyttendaele, 2015; Bernatchez et al., 2023). Metagenomics has emerged as a powerful tool for analyzing microbial communities, offering insights into microbial diversity, population dynamics, and interactions with environmental factors (Klindworth et al., 2013; Goodwin, McPherson and McCombie, 2016). This approach has been successfully applied in various fields, including agriculture and environmental protection, to monitor microbial responses to climate change and assess the efficacy of plant treatments (Eichmeier et al., 2018; Zhang et al., 2021; Špetík et al., 2022; Sieber et al., 2023). However, while metagenomics provides vast amounts of data, the challenge remains to integrate these data effectively with other environmental factors, such as climate variables, to produce meaningful insights. Machine learning (ML) has the potential to address this challenge by uncovering complex patterns in large datasets. Supervised learning algorithms such as random forests (RF), support vector machines (SVM), and artificial neural networks (ANN) have been used to predict species’ responses to environmental changes and to detect biomarkers indicative of ecosystem health (Thompson et al., 2019; García-Jiménez et al., 2021; Goodswen et al., 2021; Dutta et al., 2022; Yan et al., 2022; Bernatchez et al., 2023; Zhang et al., 2023). Studies applying machine learning to metagenomic data (e.g., using RF or SVM)
to predict environmental changes or to identify biomarkers often focus on either environmental factors or metagenomic data in isolation, rather than in combination (Thompson et al., 2019; Yan et al., 2022). Thus, current ML models do not capture the intricate relationships between microbial diversity, environmental factors, and, crucially, the effects of agricultural treatments. This gap is problematic because, without understanding these interactions, it is difficult to make accurate predictions about ecosystem health and the effects of agricultural interventions.