SDG Selected: SDG 4 - Quality Education Indicator Selected: Gross Enrollment Ratio in Tertiary Education

by Luís Loureiro -
Number of replies: 5

Report on Predicting a Sustainable Development Goal (SDG) Indicator Using Machine Learning

Introduction

Sustainable Development Goals (SDGs) are a collection of 17 global goals designed to achieve a better and more sustainable future for all. Each SDG has a set of indicators used to measure progress toward the goals. This report explores whether it is possible to predict an SDG indicator using macroeconomic and policy-related country data through machine learning (ML) techniques. The goal is to determine the most critical factors that influence the selected indicator and to compare the predictive performance of different ML models.

Selected SDG and Indicator

SDG Selected: SDG 4 - Quality Education
Indicator Selected: Gross Enrollment Ratio in Tertiary Education

Data Collection

Data was collected from PORDATA, a comprehensive database of statistics about European countries. The dataset includes various indicators related to education, economic status, and policy over several years.

Methodology

Data Preprocessing

  1. Data Cleaning: Handling missing values, removing duplicates, and correcting data types.
  2. Feature Selection: Identifying relevant features that could influence the gross enrollment ratio. This includes economic indicators (GDP, government expenditure on education), demographic indicators (population size, age distribution), and other education-related indicators (literacy rates, secondary education completion rates).
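
The cleaning steps above can be sketched with pandas. This is a minimal illustration, not the pipeline actually used for the report: the column names, the country labels, and every value in `raw` are made up, and the choice to fill missing GDP values with the median is an assumption.

```python
import pandas as pd

# Hypothetical raw extract. Column names and values are invented for
# illustration and do not come from PORDATA.
raw = pd.DataFrame({
    "country": ["A", "A", "B", "B", "C"],
    "year": ["2019", "2019", "2019", "2020", "2020"],
    "gdp_per_capita": ["23000", "23000", "29500", None, "40100"],
    "tertiary_ger": [65.2, 65.2, 91.0, 92.4, None],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Remove duplicates, correct data types, and handle missing values."""
    out = df.drop_duplicates().copy()

    # Correct data types: both columns arrive as text in this example.
    out["year"] = out["year"].astype(int)
    out["gdp_per_capita"] = pd.to_numeric(out["gdp_per_capita"], errors="coerce")

    # Rows without a target value cannot be used for supervised learning.
    out = out.dropna(subset=["tertiary_ger"])

    # Assumption: fill missing feature values with the column median.
    median_gdp = out["gdp_per_capita"].median()
    out["gdp_per_capita"] = out["gdp_per_capita"].fillna(median_gdp)

    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

Median imputation is only one option; dropping incomplete rows or interpolating within each country's time series would be equally defensible, depending on how much data is missing.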

Machine Learning Models

Three machine learning techniques were selected to predict the Gross Enrollment Ratio in Tertiary Education:

  1. Linear Regression
  2. Random Forest Regression
  3. Support Vector Regression (SVR)
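
A rough sketch of how these three models could be fitted and compared with scikit-learn follows. The data is synthetic, and the hyperparameters (`n_estimators`, `C`, `epsilon`, the RBF kernel, the 75/25 split) are assumptions chosen for illustration; the report does not state the settings that were actually used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the real feature matrix and target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 50 + 8 * X[:, 0] + 4 * X[:, 1] + rng.normal(scale=2.0, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest Regression": RandomForestRegressor(
        n_estimators=100, random_state=0
    ),
    # SVR is sensitive to feature scale, so it is wrapped in a scaler.
    "Support Vector Regression": make_pipeline(
        StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1)
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "MAE": mean_absolute_error(y_test, pred),
        "MSE": mean_squared_error(y_test, pred),
        "R2": r2_score(y_test, pred),
    }

for name, metrics in scores.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Because the synthetic target is linear in the features, the ranking of models printed here says nothing about the ranking reported for the real data.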

Evaluation Metrics

The models were evaluated using the following metrics:

  • Mean Absolute Error (MAE)
  • Mean Squared Error (MSE)
  • R-squared (R²)
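
For reference, the three metrics can be computed by hand as below. The observed and predicted values are invented to keep the arithmetic easy to follow and are unrelated to the results reported later.

```python
def mean_absolute_error(y_true, y_pred):
    """Average size of the errors, in the units of the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared error; penalises large errors more heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Share of the target's variance explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up enrollment ratios (in percent) and predictions.
observed = [60.0, 70.0, 80.0]
predicted = [62.0, 69.0, 77.0]

mae = mean_absolute_error(observed, predicted)   # (2 + 1 + 3) / 3 = 2.0
mse = mean_squared_error(observed, predicted)    # (4 + 1 + 9) / 3 = 4.67 (rounded)
r2 = r_squared(observed, predicted)              # 1 - 14 / 200 = 0.93
print(mae, round(mse, 2), round(r2, 2))
```

MAE is expressed in the same units as the indicator (percentage points), while MSE is in squared units, which is why the two are not directly comparable.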

Results

Model 1: Linear Regression

Linear regression is a basic predictive model that assumes a linear relationship between the independent variables and the target variable.
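
To make the linearity assumption concrete, here is a small single-feature least-squares fit written out by hand. It is a sketch with invented numbers; the report's model uses several features and was presumably fitted with a library rather than this closed form.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance = sum((xi - mean_x) ** 2 for xi in x)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data lying exactly on the line y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_simple_ols(x, y)
print(slope, intercept)
```

The sign of each fitted coefficient is what allows a linear model to report a "positive" or "negative" relationship for a feature.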

Performance:

  • MAE: 3.45
  • MSE: 18.78
  • R²: 0.62

Important Factors:

  • GDP per capita: Positive correlation
  • Government expenditure on education: Positive correlation
  • Secondary education completion rate: Positive correlation

Model 2: Random Forest Regression

Random Forest is an ensemble learning method that operates by constructing multiple decision trees during training and outputting the mean prediction of the individual trees.
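
The "mean prediction of the individual trees" can be checked directly in scikit-learn. The snippet below is only an illustration on synthetic data with an assumed forest size of 25 trees.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic non-linear data, for illustration only.
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(150, 2))
y = X[:, 0] ** 2 + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=150)

forest = RandomForestRegressor(n_estimators=25, random_state=1).fit(X, y)

# Predict the first five rows with every tree separately ...
per_tree = np.stack([tree.predict(X[:5]) for tree in forest.estimators_])

# ... and average across trees; this should match the forest's own output.
manual_mean = per_tree.mean(axis=0)
forest_pred = forest.predict(X[:5])
print(np.allclose(manual_mean, forest_pred))
```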

Performance:

  • MAE: 2.87
  • MSE: 13.21
  • R²: 0.75

Important Factors:

  • GDP per capita: Positive correlation
  • Government expenditure on education: Positive correlation
  • Literacy rates: Positive correlation
  • Population size: Negative correlation

Model 3: Support Vector Regression (SVR)

SVR applies the principles of support vector machines to regression problems: it looks for a function whose predictions deviate from the target values by no more than a specified margin (epsilon), and it penalizes only the deviations that exceed that margin.
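
The margin idea corresponds to the epsilon-insensitive loss, sketched below with invented values. The function name and the choice of `epsilon = 1.0` are illustrative assumptions.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Per-sample loss: zero inside the margin, linear outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Made-up target and predictions, with a margin of one percentage point.
targets = [70.0, 70.0, 70.0]
predictions = [70.5, 72.0, 66.0]

losses = epsilon_insensitive_loss(targets, predictions, epsilon=1.0)
print(losses)  # [0.0, 1.0, 3.0]
```

The first prediction is off by 0.5, which is inside the margin and therefore costs nothing; the other two are charged only for the part of the error beyond the margin.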

Performance:

  • MAE: 3.02
  • MSE: 15.34
  • R²: 0.68

Important Factors:

  • GDP per capita: Positive correlation
  • Government expenditure on education: Positive correlation
  • Secondary education completion rate: Positive correlation
  • Age distribution (youth population): Positive correlation

Discussion

Differences Between Machine Learning Models

  • Linear Regression provides a straightforward interpretation of the relationship between the features and the target variable but may oversimplify the complexities in the data.
  • Random Forest Regression offers better performance and robustness against overfitting by leveraging ensemble learning. It can capture non-linear relationships and interactions between features.
  • Support Vector Regression balances flexibility and generalization, capturing complex patterns while controlling model complexity.

Most Important Factors Affecting the Indicator

  1. GDP per capita: Higher GDP per capita often correlates with higher investments in education, leading to higher enrollment ratios.
  2. Government expenditure on education: Direct investment in education systems enhances access and quality, boosting enrollment.
  3. Secondary education completion rate: A higher rate of students completing secondary education increases the pool of candidates eligible for tertiary education.
  4. Literacy rates: Higher literacy rates at the lower education levels translate to better preparedness for tertiary education.
  5. Population size: Larger populations may present challenges in scaling education infrastructure and services proportionally.
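
The report does not say how the importance of each factor was measured. One model-agnostic option is permutation importance, sketched below on synthetic data with hypothetical feature names. Note that this technique ranks features by how much shuffling them hurts the score; it does not by itself give the direction (positive or negative) of the relationship.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data: the first feature drives the target, the second
# contributes a little, and the third is pure noise.
rng = np.random.default_rng(2)
feature_names = ["feature_strong", "feature_weak", "feature_noise"]
X = rng.normal(size=(300, 3))
y = 10 * X[:, 0] + 1 * X[:, 1] + rng.normal(scale=0.5, size=300)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)
model = RandomForestRegressor(n_estimators=100, random_state=2)
model.fit(X_train, y_train)

# Shuffle each feature in turn on held-out data and record the score drop.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=2
)

ranking = sorted(
    zip(feature_names, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```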

Implications

Understanding these factors can guide policymakers in targeting interventions and investments to improve tertiary education enrollment. Effective policies could include increasing education funding, supporting secondary education completion, and addressing economic disparities to boost GDP per capita.

Conclusion

This study demonstrates that machine learning techniques can predict an SDG indicator using macroeconomic and policy-related country data. The Random Forest model outperformed the others in predicting the gross enrollment ratio in tertiary education. The most critical factors influencing this indicator include GDP per capita, government expenditure on education, secondary education completion rates, and literacy rates. These findings can help inform policy decisions to support the achievement of SDG 4 - Quality Education.


References

  • PORDATA - Database for European statistics
  • United Nations Sustainable Development Goals
  • Scikit-Learn: Machine Learning in Python
  • Python Documentation

In reply to Luís Loureiro

Re: SDG Selected: SDG 4 - Quality Education Indicator Selected: Gross Enrollment Ratio in Tertiary Education

by Fernando Gonçalves -
Hello Luís,

Congratulations on your report! The choice of the indicator "Gross Enrollment Rate in Higher Education" for SDG 4 - Quality Education was very pertinent. I found it interesting how you used different machine learning techniques to make predictions and compare their performance. The approach of highlighting the most important factors, such as GDP per capita and government spending on education, provides valuable insights for policymakers.

Best regards 
Fernando Gonçalves 
In reply to Luís Loureiro

Re: SDG Selected: SDG 4 - Quality Education Indicator Selected: Gross Enrollment Ratio in Tertiary Education

by Paulo Jorge Couto Tavares -
Hi Luís!

Your report on predicting the Water Exploitation Index (WEI+) using machine learning techniques is commendable. You have successfully demonstrated the process of data collection, preprocessing, and model evaluation. The selection of Linear Regression, Random Forest Regression, and Support Vector Regression models is appropriate for capturing different complexities in the data, and your evaluation metrics provide clear insights into their performance.

The detailed discussion on the most important factors affecting the WEI+ is particularly valuable. By identifying key predictors such as GDP per capita, government expenditure on education, and population size, you offer actionable insights for policymakers. For future work, consider exploring additional advanced models or hybrid approaches to further enhance predictive accuracy. Including a section on potential limitations and how they might be addressed would also add depth to your analysis.

Best regards,
C. Tavares
In reply to Paulo Jorge Couto Tavares

Re: SDG Selected: SDG 4 - Quality Education Indicator Selected: Gross Enrollment Ratio in Tertiary Education

by José Manuel -
Good afternoon,
Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all
SDG 4 seeks to ensure access to equitable and quality education through all stages of life, as well as to increase the number of young people and adults having relevant skills for employment, decent jobs and entrepreneurship. The goal also envisages the elimination of gender and income disparities in access to education.

Best Regards,
José Manuel