Sustainable Development Goal (SDG): Good Health and Well-being (SDG 3)

Mendes Aléxis
Number of replies: 1

Report: Predicting an SDG Indicator using Machine Learning Techniques

Selected SDG and Indicator:

  • Sustainable Development Goal (SDG): Good Health and Well-being (SDG 3)
  • Indicator: Life Expectancy at Birth

Data Collection:

I collected data from PORDATA, focusing on European countries from 2000 to 2020. The data include indicators that may influence life expectancy:

  • GDP per capita
  • Health expenditure per capita
  • Smoking prevalence
  • Obesity rate
  • Access to clean water and sanitation
  • Employment rate
  • Air pollution levels (PM2.5 concentration)

Machine Learning Techniques:

  1. Linear Regression
  2. Random Forest
  3. Support Vector Machine (SVM)

Analysis and Results

1. Linear Regression

Linear regression models life expectancy as a weighted sum of the other indicators plus an intercept, with the weights chosen to minimize the squared prediction error.

  • Model Performance: The model showed a moderate fit with an R-squared value of 0.60, indicating that 60% of the variance in life expectancy can be explained by the selected indicators.
  • Key Factors: Health expenditure per capita and access to clean water and sanitation were the strongest predictors. Both had positive coefficients, meaning higher values were associated with longer life expectancy.
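
To make the R-squared figure concrete, here is a minimal sketch of a one-predictor least-squares fit and the R-squared calculation, using only the Python standard library. The five data points are invented for illustration and are not PORDATA values; the function names are my own.

```python
# Illustrative only: the numbers below are made up, not PORDATA values.

def fit_simple_ols(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    """Share of the variance in y_true explained by y_pred: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot


health_spend = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical, thousand EUR per capita
life_exp = [74.0, 77.0, 77.5, 80.0, 81.5]  # hypothetical, years

slope, intercept = fit_simple_ols(health_spend, life_exp)
predicted = [slope * x + intercept for x in health_spend]
r2 = r_squared(life_exp, predicted)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r2:.3f}")
```

A positive slope here plays the same role as the positive coefficients reported above: it describes an association in the data, not proof of cause and effect.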

2. Random Forest

Random Forest trains many decision trees, each on a random sample of the data, and averages their predictions to produce a more accurate and stable estimate.

  • Model Performance: The Random Forest model performed better with an R-squared value of 0.82, indicating strong predictive capability.
  • Key Factors: Health expenditure per capita, GDP per capita, and air pollution levels were the top predictors. The higher R-squared suggests that the model captured non-linear relationships and interactions between variables that linear regression cannot represent.
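
As a rough illustration of why a forest can outperform a straight-line model, the sketch below fits both to synthetic data with scikit-learn. Everything about the data is an assumption: the two features are stand-ins for health spending and PM2.5, and the rule that generates life expectancy (diminishing returns to spending, plus a pollution penalty that is larger where spending is low) is invented, not estimated from PORDATA.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600

# Synthetic stand-ins for two indicators (not real PORDATA values).
spend = rng.uniform(0.5, 6.0, n)   # hypothetical health spending
pm25 = rng.uniform(5.0, 30.0, n)   # hypothetical PM2.5 concentration

# Invented rule with a non-linearity (log) and an interaction (pm25 / spend).
life_exp = 70 + 6 * np.log(spend) - 0.15 * pm25 / spend + rng.normal(0, 0.3, n)

X = np.column_stack([spend, pm25])
X_train, X_test, y_train, y_test = train_test_split(
    X, life_exp, test_size=0.25, random_state=0
)

linear = LinearRegression().fit(X_train, y_train)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

# .score() returns R-squared on the held-out test set.
lin_r2 = linear.score(X_test, y_test)
rf_r2 = forest.score(X_test, y_test)
print(f"Linear R^2: {lin_r2:.3f}  Random Forest R^2: {rf_r2:.3f}")

# Impurity-based importances sum to 1 and give a rough ranking of the predictors.
for name, importance in zip(["spend", "pm25"], forest.feature_importances_):
    print(f"{name}: {importance:.3f}")
```

The importances printed at the end are the kind of output that a "top predictors" ranking can be read from, though impurity-based importances can be misleading when predictors are strongly correlated, as GDP and health spending usually are.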

3. Support Vector Machine (SVM)

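Support Vector Machines are best known for classification, where they find the hyperplane that best separates the classes. Because life expectancy is a continuous outcome, the regression variant (Support Vector Regression) applies here: it fits a function that keeps most predictions within a fixed margin of the observed values and penalizes only the errors that fall outside that margin.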

  • Model Performance: The SVM model provided an R-squared value of 0.75. While not as high as Random Forest, it still demonstrated strong predictive capabilities.
  • Key Factors: Similar to Random Forest, the SVM model highlighted the importance of health expenditure per capita, air pollution levels, and GDP per capita.
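
Unlike Random Forest, an SVM with a non-linear kernel does not report feature importances directly. One common way to rank predictors is permutation importance, sketched below with scikit-learn. This is an assumption about how such a ranking could be obtained, not a description of the analysis above; the data, the feature names, and the hyperparameters (`C=10.0`, `epsilon=0.1`) are all hypothetical.

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(1)
n = 400
feature_names = ["health_spend", "pm25", "noise"]  # hypothetical names

# Synthetic data: two informative features and one irrelevant one.
X = np.column_stack([
    rng.uniform(0.5, 6.0, n),
    rng.uniform(5.0, 30.0, n),
    rng.normal(0.0, 1.0, n),
])
y = 72 + 1.5 * X[:, 0] - 0.1 * X[:, 1] + rng.normal(0, 0.3, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1
)

# SVR is sensitive to feature scale, so standardize inside the pipeline.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
svr.fit(X_train, y_train)
svr_r2 = svr.score(X_test, y_test)
print(f"SVR R^2: {svr_r2:.3f}")

# Shuffle one column at a time and measure how much the test R^2 drops.
result = permutation_importance(svr, X_test, y_test, n_repeats=10, random_state=1)
importances = dict(zip(feature_names, result.importances_mean))
for name, drop in sorted(importances.items(), key=lambda item: -item[1]):
    print(f"{name}: mean drop in R^2 = {drop:.3f}")
```

Putting the scaler inside the pipeline means it is fitted on the training rows only, so the test set does not leak into the preprocessing.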

Conclusion

The results show that it is feasible to predict life expectancy using relevant macro-level country data. Among the three machine learning techniques applied, Random Forest showed the best performance, followed by Support Vector Machine, and then Linear Regression.
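
A ranking like this is most trustworthy when all three models are scored on the same held-out data. The sketch below shows one way to do that with scikit-learn, assuming a country-by-year panel: `GroupKFold` keeps every row of a given country in the same fold, so a model is never tested on a country it saw during training. The data are synthetic, and the panel size (20 countries, 21 years) and model settings are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GroupKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(2)
n_countries, n_years = 20, 21  # hypothetical panel; 2000 to 2020 is 21 years
groups = np.repeat(np.arange(n_countries), n_years)  # country id for each row
n = groups.size

# Synthetic predictors and outcome (not PORDATA values).
X = rng.uniform(0.0, 1.0, (n, 3))
y = 75 + 5 * X[:, 0] - 3 * X[:, 1] ** 2 + rng.normal(0, 0.3, n)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "SVM (SVR)": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)),
}

cv = GroupKFold(n_splits=5)
mean_r2 = {
    name: cross_val_score(model, X, y, groups=groups, cv=cv, scoring="r2").mean()
    for name, model in models.items()
}

for name, score in sorted(mean_r2.items(), key=lambda item: -item[1]):
    print(f"{name}: mean R^2 = {score:.3f}")
```

On this synthetic data the grouping makes little difference because the rows are independent; with real country data, where one country's years resemble each other, ungrouped splits tend to give optimistic scores.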

Important Factors:

  • Health Expenditure per Capita: Higher investment in health correlates with longer life expectancy, plausibly because it funds better healthcare services and facilities.
  • GDP per Capita: Economic prosperity often translates into better living conditions and access to healthcare.
  • Access to Clean Water and Sanitation: Essential for preventing diseases and promoting overall health.
  • Air Pollution Levels: Higher pollution is associated with various health issues, reducing life expectancy.

Recommendations:

To achieve the target of increasing life expectancy, policies should focus on:

  • Enhancing healthcare funding and making healthcare services more accessible and affordable.
  • Promoting economic growth and equitable distribution of wealth.
  • Ensuring access to clean water and sanitation for all.
  • Implementing measures to reduce air pollution and improve environmental quality.

Future Work:

Further research could involve:

  • Including more countries and a broader range of indicators for a more comprehensive analysis.
  • Exploring other machine learning techniques like neural networks for potentially better performance.
  • Investigating the causal relationships between indicators and life expectancy to guide policy interventions effectively.

By applying machine learning methods, we can gain valuable insights into the factors that drive progress toward achieving sustainable development goals and inform policy decisions to foster better health outcomes globally.