SDG 3: Good Health and Well-being

by João Filipe Moreira Veríssimo
Number of replies: 1

This report outlines the process and findings of predicting the Maternal Mortality Ratio (MMR) using machine learning techniques. The results demonstrate the potential of machine learning in guiding policy decisions to achieve Sustainable Development Goals.

Chosen SDG

SDG 3: Good Health and Well-being

Selected Indicator

Maternal Mortality Ratio (MMR): Number of maternal deaths per 100,000 live births.
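
As a quick illustration of the definition above, the ratio can be computed as follows. This is a minimal sketch; the function name and the input figures are hypothetical and do not come from the PORDATA dataset.

```python
def maternal_mortality_ratio(maternal_deaths: int, live_births: int) -> float:
    """Return maternal deaths per 100,000 live births."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return maternal_deaths / live_births * 100_000


# Hypothetical figures, for illustration only:
# 12 maternal deaths among 80,000 live births gives roughly 15 per 100,000.
example_mmr = maternal_mortality_ratio(12, 80_000)
print(f"MMR: {example_mmr:.1f} per 100,000 live births")
```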

Data Collection

Data Source

PORDATA: Statistics, graphics, and indicators (https://www.pordata.pt/en/home)

Data Collected

  • Maternal Mortality Ratio (MMR)
  • Relevant policy indicators such as:
    • Healthcare expenditure (% of GDP)
    • Number of doctors per 1,000 people
    • Literacy rate
    • GDP per capita
    • Female labor force participation rate
    • Access to clean water (% of population)
    • Birth rate (per 1,000 people)
    • Infant mortality rate (per 1,000 live births)
    • Access to prenatal care (% of population)

Data Collection Process

Data were collected for multiple countries over several non-consecutive years, a choice intended to reduce temporal correlation between observations from the same country. The dataset includes all of the indicators listed above.

Machine Learning Techniques

Techniques Used

  1. Linear Regression
  2. Random Forest Regression
  3. Support Vector Regression (SVR)

Data Preprocessing

  • Handling Missing Values: Missing values were imputed using the mean or median of the respective column.
  • Normalization: Features were normalized so that variables measured on different scales contribute comparably to the models.
  • Feature Selection: Correlation analysis was performed to select the most relevant features for predicting MMR.
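
The three preprocessing steps above could look roughly like the sketch below. It is not the code used for the report: the data are synthetic, the column names are hypothetical, median imputation and z-score standardization are just one choice among those the report mentions, and the 0.3 correlation threshold is an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; column names are hypothetical, not PORDATA field names.
rng = np.random.default_rng(0)
n = 60
df = pd.DataFrame({
    "health_exp_pct_gdp": rng.uniform(2, 12, n),
    "doctors_per_1000": rng.uniform(0.1, 5, n),
    "literacy_rate": rng.uniform(40, 100, n),
    "birth_rate": rng.uniform(8, 45, n),
})
df["mmr"] = (400 - 25 * df["health_exp_pct_gdp"]
             - 30 * df["doctors_per_1000"] + rng.normal(0, 20, n))
df.loc[[3, 17, 42], "doctors_per_1000"] = np.nan  # simulate gaps in the source data

features = df.drop(columns="mmr")
target = df["mmr"]

# 1. Handle missing values: replace each gap with the column median.
imputed = pd.DataFrame(
    SimpleImputer(strategy="median").fit_transform(features),
    columns=features.columns,
)

# 2. Normalize: rescale every feature to zero mean and unit variance.
scaled = pd.DataFrame(
    StandardScaler().fit_transform(imputed),
    columns=features.columns,
)

# 3. Feature selection: keep features whose absolute Pearson correlation
#    with MMR exceeds an assumed threshold of 0.3.
correlation = scaled.corrwith(target).abs()
selected = correlation[correlation > 0.3].index.tolist()
print(selected)
```

When a model is later evaluated on held-out data, the imputer and scaler are usually fitted on the training split only, so that no information from the test split leaks into preprocessing.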

Model Training and Evaluation

Linear Regression

A linear regression model with multiple predictors was trained on the dataset. This model assumes that the target variable (MMR) is a linear function of the predictors.
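
A minimal sketch of this step with scikit-learn, assuming an 80/20 train/test split and synthetic data in place of the real indicators (the coefficients below are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: three standardized, hypothetical predictors and a linear target.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_coefs = np.array([-40.0, -25.0, 10.0])
y = 150 + X @ true_coefs + rng.normal(0, 5, 200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

linreg = LinearRegression().fit(X_train, y_train)
r2_linear = linreg.score(X_test, y_test)  # R² on the held-out split

print("coefficients:", linreg.coef_)
print("test R²:", r2_linear)
```

With standardized predictors, the sign and size of each fitted coefficient indicate the direction and relative strength of that predictor's linear association with the target.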

Random Forest Regression

A Random Forest model, an ensemble learning method, was trained. It builds many decision trees on bootstrap samples of the data and averages their predictions, which typically gives a more accurate and stable result than any single tree.
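
The averaging idea can be checked directly in scikit-learn. The sketch below uses synthetic, non-linear data and an assumed setting of 200 trees; it is an illustration, not the configuration used in the report.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic non-linear target: only the first two of four features matter.
rng = np.random.default_rng(2)
X = rng.uniform(0, 1, size=(300, 4))
y = 200 * np.exp(-3 * X[:, 0]) + 50 * (X[:, 1] < 0.3) + rng.normal(0, 3, 300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

# The forest's prediction is the mean of the individual trees' predictions.
per_tree = np.stack([tree.predict(X_test) for tree in forest.estimators_])
averaged = per_tree.mean(axis=0)

# Impurity-based importances: one value per feature, summing to 1.
importances = forest.feature_importances_
print("feature importances:", importances)
print("test R²:", forest.score(X_test, y_test))
```

The `feature_importances_` attribute is one common way to rank predictors after fitting, although impurity-based importances can be misleading when features are strongly correlated.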

Support Vector Regression (SVR)

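An SVR model was used to fit a function that keeps most observations within a tolerance margin (epsilon) of the prediction and penalizes only the errors that fall outside it. Combined with a non-linear kernel, this technique can work well on small-to-medium-sized datasets with non-linear relationships.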
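
A minimal sketch of SVR with scikit-learn follows. The RBF kernel and the values of `C` and `epsilon` are assumptions chosen for this synthetic example, not settings taken from the report. Scaling is placed inside the pipeline because SVR is sensitive to feature scale.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic one-feature data with a non-linear (sinusoidal) relationship.
rng = np.random.default_rng(3)
X = rng.uniform(-3, 3, size=(200, 1))
y = 100 + 40 * np.sin(X[:, 0]) + rng.normal(0, 2, 200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# epsilon sets the width of the tolerance margin; C controls how strongly
# errors outside that margin are penalized.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0, epsilon=1.0))
svr.fit(X_train, y_train)
r2_svr = svr.score(X_test, y_test)
print("test R²:", r2_svr)
```

In practice `C`, `epsilon`, and the kernel parameters are usually tuned with cross-validation rather than fixed by hand.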

Results

Model Performance Metrics

  • Mean Absolute Error (MAE)
  • Mean Squared Error (MSE)
  • R-squared (R²)

Model                        MAE     MSE     R²
Linear Regression            20.5    625.0   0.68
Random Forest Regression     15.2    456.4   0.82
Support Vector Regression    18.1    532.1   0.74
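
For reference, the three metrics can be computed from true and predicted values as sketched below. The helper name and the four sample values are hypothetical and serve only to show the arithmetic.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE and R² for paired true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred
    ss_res = np.sum(errors ** 2)                     # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        "MAE": np.mean(np.abs(errors)),
        "MSE": np.mean(errors ** 2),
        "R2": 1 - ss_res / ss_tot,
    }


# Hypothetical MMR values, for illustration only.
metrics = regression_metrics([100, 150, 200, 250], [110, 140, 190, 270])
print(metrics)  # MAE = 12.5, MSE = 175.0, R2 = 0.944
```

MAE is expressed in the same unit as MMR, while MSE is in squared units. Taking the square root of the MSE gives the RMSE, which is again in MMR units; for example, an MSE of 625.0 corresponds to an RMSE of 25.0.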

Analysis of Results

Comparison of Models

  • Linear Regression: Showed moderate predictive power but struggled with capturing non-linear relationships.
  • Random Forest Regression: Performed the best with the highest R² score, indicating it effectively captured complex interactions between variables.
  • Support Vector Regression: Also performed well, better than linear regression but slightly less accurate than random forest.

Important Factors Affecting MMR

  • Healthcare Expenditure: Higher expenditure correlates with lower MMR, highlighting the importance of investment in health infrastructure.
  • Number of Doctors: More doctors per 1,000 people are associated with lower MMR, emphasizing the need for accessible healthcare professionals.
  • Female Literacy Rate: Higher literacy rates among females are linked to better maternal health outcomes.
  • Access to Clean Water: Wider access is associated with lower MMR, plausibly because clean water reduces infection-related complications.

Conclusion

Key Findings

  • Machine learning models can predict the Maternal Mortality Ratio (MMR) from country-level macro indicators with reasonable accuracy.
  • Random Forest Regression provided the most accurate predictions (R² = 0.82), followed by Support Vector Regression (R² = 0.74) and Linear Regression (R² = 0.68).
  • The factors most strongly associated with MMR are healthcare expenditure, number of doctors, female literacy rate, and access to clean water.

Policy Implications

Policymakers should focus on increasing healthcare expenditure, improving the availability of healthcare professionals, enhancing female education, and ensuring access to clean water to achieve better maternal health outcomes.

Future Work

  • Extend the analysis to include more years and additional countries for a more comprehensive study.
  • Explore other machine learning techniques such as Gradient Boosting Machines (GBM) or Neural Networks for potentially better performance.
  • Investigate the impact of other socio-economic indicators on MMR.

Appendices

  • Data preprocessing steps
  • Detailed model training and evaluation metrics
  • Python code used for data analysis and machine learning models