Izvestiya of Saratov University.

Physics

ISSN 1817-3020 (Print)
ISSN 2542-193X (Online)


For citation:

Shevtsova S. A., Saveleva M. S., Mayorova O. A., Prikhozhdenko E. S. Effect of low concentrations of hyaluronic acid on the structure of whey protein isolate during conjugation: Development and optimization of machine learning models based on adaptive boosting for spectroscopic data analysis. Izvestiya of Saratov University. Physics, 2025, vol. 25, iss. 3, pp. 305-315. DOI: 10.18500/1817-3020-2025-25-3-305-315, EDN: KWEXHY

This is an open access article distributed under the terms of Creative Commons Attribution 4.0 International License (CC-BY 4.0).
Published online: 
29.08.2025
Language: 
Russian
Article type: 
Article
UDC: 
519.688:543.424.2
EDN: 
KWEXHY

Effect of low concentrations of hyaluronic acid on the structure of whey protein isolate during conjugation: Development and optimization of machine learning models based on adaptive boosting for spectroscopic data analysis

Authors: 
Shevtsova Svetlana Alexandrovna, Saratov State University
Saveleva Mariia Sergeevna, Saratov State University
Mayorova Oksana Aleksandrovna, Saratov State University
Prikhozhdenko Ekaterina Sergeevna, Saratov State University
Abstract: 

Background and Objectives: Multicomponent mixtures containing bioactive compounds, such as hyaluronic acid (HA) in protein matrices, are important in pharmaceuticals, nutraceuticals, and cosmetics. However, detecting low-concentration additives (e.g., 0.1–0.5 wt.% HA in whey protein isolate, WPI) remains challenging because of signal interference and matrix complexity. Raman spectroscopy (RS) is a powerful tool for such analyses, but interpreting the spectral data requires advanced computational methods. This study uses adaptive boosting (AdaBoost), an ensemble machine learning (ML) algorithm, to (1) classify WPI-HA mixtures by HA concentration, (2) quantify HA content via regression, and (3) determine the minimal training dataset size needed for robust predictions.

Materials and Methods: WPI (5 wt.%) was mixed with HA (0.1, 0.25, or 0.5 wt.%) in saline, dialyzed, and dried into thin films. A Renishaw inVia spectrometer equipped with a 532 nm laser was used to collect 600 spectra per sample (20 × 30-point maps). Preprocessing included cosmic-ray removal, baseline correction, and L2 normalization. AdaBoost models (scikit-learn) were optimized with GridSearchCV (hyperparameters: DecisionTree max_depth, 1–3; n_estimators, 50–350). Performance was tested across training set sizes of 50–500 spectra per sample. The metrics were accuracy (classification) and R²/RMSE (regression).

Results: Optimization: the best AdaBoost hyperparameters were 325 decision trees with max_depth = 3. Classification: 50 spectra per sample gave 94.5% accuracy; 200 and 300 spectra per sample raised this to 97.9% and 98.3%, respectively. The models reliably distinguished WPI + 0.1% HA from WPI (>96% accuracy). Regression: 300 spectra per sample gave the best results (R² = 0.910, RMSE = 0.061%). Larger datasets (400–500 spectra) reduced performance (R² = 0.894), suggesting overfitting. The key bands for the analysis were 763 cm⁻¹ (tryptophan), 1003 cm⁻¹ (phenylalanine), and 1240 cm⁻¹ (amide III). Bands at 1450–1667 cm⁻¹ (C–H/amide I/II) showed negligible importance, indicating minimal HA-induced changes.

Conclusion: AdaBoost models efficiently analyze trace HA in WPI with small training datasets (200 spectra for classification, 300 for regression). The method's precision and speed make it well suited to industrial applications, and the identified spectral markers deepen the understanding of HA–protein interactions. Future work could extend this framework to other multicomponent systems with low analyte concentrations.

Acknowledgments: 
This work was supported by the Russian Science Foundation (project No. 22-79-10270, https://rscf.ru/project/22-79-10270/).
Received: 
02.05.2025
Accepted: 
12.06.2025
Published: 
29.08.2025