Enhancing Open Access to Data Science Education: Analyzing Skill Patterns Using LDA and K-Means Clustering in the Learning Path Index Dataset
DOI:
https://doi.org/10.63913/ail.v1i2.9
Keywords:
Data Science Education, Topic Modeling, Latent Dirichlet Allocation (LDA), K-Means Clustering, Open Educational Resources (OER)
Abstract
This study applies Latent Dirichlet Allocation (LDA) and K-Means clustering to the Learning Path Index Dataset in order to identify and categorize data science education skills. These machine learning models reveal distinct skill patterns and clusters in the dataset, highlighting prevalent skills and potential gaps in the data science education available through open educational resources (OER). The findings show specific clusters of topics ranging from beginner to advanced level, offering insight into how accessible educational content is and how it is distributed. These results can guide educators and platform developers in improving the structure and delivery of data science education, and with it learner outcomes and resource allocation. The study also discusses the broader implications for educational strategy and policy, emphasizing the role of targeted analytics in optimizing educational offerings in an increasingly digital landscape. Future research directions include expanding the dataset and applying similar analytical frameworks to other fields within open education to further validate and refine these findings.