Identification of Plasma Proteins Associated with Alzheimer's Disease Using Feature Selection Techniques and Machine Learning Algorithms
DOI:
https://doi.org/10.54327/set2025/v5.i1.189
Keywords:
Alzheimer’s disease, ANOVA, blood biomarker, Feature selection, Machine learning, Plasma proteins
Abstract
Alzheimer’s disease (AD) is a chronic, progressive neurodegenerative disorder that typically affects elderly individuals. Detecting AD using plasma proteins is a critical step toward improving treatment outcomes for this disease. This study aims to use computational algorithms to explore the relationship between plasma proteins and AD progression by identifying a panel of plasma proteins that can serve as biomarkers for tracking and diagnosing AD. We applied two feature selection methods, Sequential Backward Feature Selection (SBFS) and Analysis of Variance (ANOVA), to extract significant proteins from a dataset of 146 proteins. The data was collected from the plasma of 566 individuals, comprising both Alzheimer’s patients and healthy controls. The SBFS technique generated all possible combinations of protein groups from the 146 proteins, which were then trained and tested using five machine learning models: Decision Tree, Random Forest, Extremely Randomized Trees, Extreme Gradient Boosting (XGBoost), and Adaptive Boosting (AdaBoost). Subsequently, ANOVA was applied to refine and reduce the selected panel size. Finally, we used the XGBoost and AdaBoost models to validate the final panel. The findings introduce a plasma protein panel consisting of the A2Macro, BNP, BTC, PPP, and PYY proteins for diagnosing AD. This panel achieved a sensitivity of 88.88%, a specificity of 66.66%, and an AUC of 0.85. These results demonstrate that plasma protein biomarkers can facilitate timely interventions, potentially slowing disease progression and improving patient outcomes. This non-invasive and affordable diagnostic method has the potential to make Alzheimer’s screening accessible to a broader population.
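To make two of the quantities named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It computes a one-way ANOVA F statistic per feature (the score ANOVA-based selection typically ranks by) and the sensitivity and specificity of a set of predictions. The feature names `protein_a` and `protein_b`, all measurement values, and the label vectors are hypothetical toy data invented for illustration; none of them come from the ADNI dataset or from the study's results.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of numeric values."""
    k = len(groups)                       # number of groups (e.g. AD, control)
    n = sum(len(g) for g in groups)       # total number of samples
    grand = mean(v for g in groups for v in g)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy measurements (assumption: NOT real plasma protein data).
ad_group = {"protein_a": [1.0, 2.0, 3.0], "protein_b": [5.0, 5.5, 6.0]}
control_group = {"protein_a": [1.5, 2.5, 3.5], "protein_b": [1.0, 1.5, 2.0]}

# Score each feature; a higher F means the group means differ more
# relative to the spread within each group.
f_scores = {
    name: anova_f([ad_group[name], control_group[name]])
    for name in ad_group
}
ranked = sorted(f_scores, key=f_scores.get, reverse=True)

# Hypothetical labels and predictions (1 = AD, 0 = control).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)

print(ranked, f_scores)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy setup `protein_b` separates the two groups far better than `protein_a`, so it ranks first; a real pipeline would apply the same scoring to every candidate protein and evaluate the classifier on held-out samples.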
Data Availability Statement
The data used in this study was obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu) and is available with permission to all researchers.
License
Copyright (c) 2025 Zakaria Mokadem, Mohamed Djerioui, Bilal Attallah, Youcef Brik

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories, on their websites, or on social networking sites).