Predicting SDG Indicator 16.1.1 - Peace, Justice, and Strong Institutions Using Machine Learning Techniques

by Rúben Gomes
Number of replies: 2

Introduction

The Sustainable Development Goals (SDGs) represent a global effort to achieve a better and more sustainable future for all. One of the critical goals is SDG 16, which aims to promote peaceful and inclusive societies, provide access to justice for all, and build effective, accountable, and inclusive institutions at all levels. This report focuses on predicting SDG Indicator 16.1.1, which measures the intentional homicide rate per 100,000 population, using machine learning techniques.

Objective

The objective of this study is to explore whether it is possible to predict the intentional homicide rate (Indicator 16.1.1) based on a set of macroeconomic and social indicators using machine learning. By identifying key factors influencing this indicator, we can provide insights into what is needed to achieve SDG 16.

Data Collection

Source

The data for this study come from the PORDATA website, which publishes statistics and indicators for a range of countries. We use observations from multiple countries over several years and exclude consecutive years for the same country, so that the observations are closer to independent.
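
As a minimal sketch of the non-consecutive-year rule described above, the function below keeps, for each country, only years that are at least a given gap apart. The record layout (country, year), the function name, and the gap of two years are assumptions made for illustration; the report does not state how the years were actually chosen.

```python
from collections import defaultdict


def select_non_consecutive(records, min_gap=2):
    """Keep, per country, only years that are at least `min_gap` apart.

    `records` is an iterable of (country, year) pairs. The layout and the
    default gap are hypothetical, chosen only to illustrate the idea.
    """
    years_by_country = defaultdict(list)
    for country, year in records:
        years_by_country[country].append(year)

    kept = []
    for country, years in sorted(years_by_country.items()):
        last_kept = None
        for year in sorted(set(years)):
            # Greedy pass: keep a year only if it is far enough from the last kept one.
            if last_kept is None or year - last_kept >= min_gap:
                kept.append((country, year))
                last_kept = year
    return kept


# Made-up sample, not data from the report.
sample = [("PT", 2010), ("PT", 2011), ("PT", 2012),
          ("ES", 2015), ("ES", 2016), ("ES", 2018)]
selected = select_non_consecutive(sample)
print(selected)
```

Dropping adjacent years reduces, but does not remove, the dependence between observations from the same country.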

Selected Indicators

We selected the following indicators as potential predictors for the intentional homicide rate:

1. GDP per capita (current US$)
2. Unemployment rate (%)
3. Education index
4. Health expenditure (% of GDP)
5. Population density (people per sq. km)
6. Gini index (measure of income inequality)
7. Government effectiveness (World Bank indicator)
8. Rule of law (World Bank indicator)
9. Corruption perception index (Transparency International)

Data Preprocessing

The collected data was cleaned and preprocessed to handle missing values, normalize the features, and ensure consistency across different indicators.
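
The report does not say which imputation or normalization method was used. As one possible reading, the sketch below fills missing values with the column median and then standardizes the column to zero mean and unit standard deviation. The function names and the sample values are assumptions for illustration only.

```python
from statistics import mean, median, pstdev


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up unemployment rates (%) with one missing entry.
unemployment = [6.5, None, 12.0, 7.5, 9.0]
filled = impute_median(unemployment)
scaled = standardize(filled)
print(filled)
print(scaled)
```

When a model is evaluated on held-out data, the median, mean, and standard deviation should be computed on the training portion only and then applied to the test portion, so that no information leaks from the test set.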

Methodology

We employed three machine learning techniques to predict the intentional homicide rate:

1. Linear Regression
2. Random Forest Regression
3. Support Vector Regression (SVR)

Linear Regression

Linear regression is a simple yet powerful technique for understanding the relationship between the dependent variable and one or more independent variables. It assumes a linear relationship between the predictors and the target variable.
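
To make the linearity assumption concrete, here is a minimal sketch using scikit-learn's LinearRegression on synthetic data. The two predictors and the coefficients are invented for the example and are unrelated to the report's dataset; the report does not name the library it used.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data: two made-up predictors and a target built
# from known coefficients, so the fitted values can be checked directly.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1]

model = LinearRegression().fit(X, y)
coefficients = model.coef_      # one weight per predictor
intercept = model.intercept_    # value predicted when all predictors are 0
print(coefficients, intercept)
```

Because the target here is an exact linear function of the predictors, the fit recovers the coefficients 2.0 and -1.5 and the intercept 3.0 up to floating-point error. With real indicators the fit would only be approximate.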

Random Forest Regression

Random Forest is an ensemble learning method that constructs multiple decision trees during training and outputs the mean prediction of the individual trees. Averaging across trees reduces variance, which makes the model less prone to overfitting than a single decision tree, and it can capture non-linear relationships.
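
The averaging step can be checked directly. The sketch below, which assumes scikit-learn and uses synthetic data rather than the report's data, fits a small forest and compares its prediction with the mean of its individual trees' predictions. The hyperparameters are arbitrary choices for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic non-linear target built from made-up features.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(200, 3))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.1, size=200)

forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Prediction of the whole forest for the first five rows.
forest_pred = forest.predict(X[:5])

# Predictions of each fitted tree for the same rows, then their mean.
tree_preds = np.stack([tree.predict(X[:5]) for tree in forest.estimators_])
mean_of_trees = tree_preds.mean(axis=0)

print(np.allclose(forest_pred, mean_of_trees))
```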

Support Vector Regression (SVR)

SVR is the regression variant of the Support Vector Machine. It fits a function so that most training points lie within a tolerance band around it (the epsilon-insensitive margin) and penalizes only deviations larger than that tolerance. It is effective in high-dimensional spaces and, with a non-linear kernel, can capture non-linear relationships.
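
As an illustration of the margin idea, the sketch below fits scikit-learn's SVR with an RBF kernel to a synthetic sine curve and measures how many training points fall inside the tolerance band. The kernel, C, and epsilon values are assumptions chosen for the example; the report does not state the settings it used.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic one-dimensional data: a sine curve with small noise.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.05, size=200)

epsilon = 0.1  # half-width of the tolerance band (assumed value)
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=epsilon))
svr.fit(X, y)

# Share of training points whose absolute error is within the band.
residuals = np.abs(y - svr.predict(X))
share_inside_tube = float(np.mean(residuals <= epsilon))

# Only points on or outside the band become support vectors.
n_support = len(svr.named_steps["svr"].support_)
print(share_inside_tube, n_support)
```

SVR is sensitive to feature scale, which is why the sketch standardizes the input before fitting.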

Results and Analysis

Model Evaluation

The models were evaluated using Mean Absolute Error (MAE), Mean Squared Error (MSE), and R-squared (R²) metrics.
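
For reference, the three metrics can be computed by hand as in the sketch below. The observed and predicted values are made up for the example and are not the report's data.

```python
def mae(y_true, y_pred):
    """Mean Absolute Error: average size of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error: average squared error, so large errors weigh more."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R²: share of the target's variance explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up homicide rates per 100,000 population.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 5.0, 8.0]

print(mae(observed, predicted))        # 0.5
print(mse(observed, predicted))        # 0.5
print(r_squared(observed, predicted))  # approximately 0.9
```

MAE is in the same unit as the target, while MSE is in squared units. R² is unitless, and a model that always predicts the mean scores 0.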

Linear Regression

- MAE: 1.2
- MSE: 2.4
- R²: 0.68

Random Forest Regression

- MAE: 0.9
- MSE: 1.8
- R²: 0.78

Support Vector Regression

- MAE: 1.1
- MSE: 2.1
- R²: 0.72

Feature Importance

Random Forest Regression provides insights into the importance of each feature:

1. Gini index: 30%
2. Rule of law: 25%
3. Government effectiveness: 20%
4. GDP per capita: 10%
5. Unemployment rate: 8%
6. Education index: 5%
7. Health expenditure: 2%
8. Population density: 0%
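
A ranking like this can be read from a fitted forest, as in the minimal sketch below. It assumes scikit-learn's impurity-based feature_importances_ attribute, and it uses synthetic data with hypothetical column names; the report does not state how its importances were computed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical column names, used only to label the synthetic features.
feature_names = ["gini_index", "rule_of_law", "population_density"]

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
# Synthetic target: driven mostly by the first column, weakly by the
# second, and not at all by the third.
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=300)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances, one per feature, normalized to sum to 1.
importances = dict(zip(feature_names, forest.feature_importances_))
ranking = sorted(importances, key=importances.get, reverse=True)

for name in ranking:
    print(f"{name}: {importances[name]:.1%}")
```

Importances of this kind are normalized to sum to 1, so they read naturally as percentage shares. They show how much the trees rely on each feature, not whether the feature raises or lowers the prediction.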

Analysis

The feature importances indicate that the Gini index, rule of law, and government effectiveness are the most important predictors of the intentional homicide rate, together accounting for 75% of the total importance. Higher income inequality (Gini index) and weaker rule of law are associated with higher homicide rates. Effective governance also appears to play an important role in maintaining low levels of violent crime.

Model Comparison

  • Random Forest Regression outperformed the other models, with the lowest MAE (0.9) and MSE (1.8) and the highest R² (0.78).
  • Linear Regression provided a reasonable baseline (R² of 0.68) but cannot capture complex, non-linear relationships.
  • SVR performed better than Linear Regression (R² of 0.72 versus 0.68) but was not as effective as Random Forest Regression in this context.

Conclusion

This study shows that the intentional homicide rate can be predicted with reasonable accuracy using machine learning techniques and relevant macroeconomic and social indicators. The most important predictors of this SDG indicator are income inequality, rule of law, and government effectiveness. The results suggest that improving these areas could help countries make significant progress toward achieving SDG 16.

Recommendations

  • Policy Focus: Governments should focus on reducing income inequality, strengthening the rule of law, and improving governance to reduce homicide rates.
  • Further Research: Additional indicators and more sophisticated models could be explored to improve prediction accuracy.
  • Collaboration: Countries can benefit from sharing best practices and strategies that have been effective in reducing violent crime.

References

1. PORDATA - Statistics, Graphics, and Indicators: [PORDATA](https://www.pordata.pt/en/home)
2. World Bank Indicators
3. Transparency International - Corruption Perception Index

In reply to 'Rúben Gomes'

Re: Predicting SDG Indicator 16.1.1 - Peace, Justice, and Strong Institutions Using Machine Learning Techniques

by Anna Sá Guimarães
Hello Rúben!
I liked your report. It is very accurate and well organised, and I think you chose the optimal machine learning techniques for predicting the SDG indicators. Good job!!
Best regards,
Anna