Data Mining

In this course, we started with data preprocessing, a crucial step that ensures the data is clean and structured for analysis. We learned techniques for handling outliers and missing data, including k-Nearest Neighbors and Dynamic Time Warping.
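To make the two techniques concrete, here is a minimal sketch in plain Python. It is not the course's code: the function names `knn_impute` and `dtw_distance`, the choice of `k`, and the use of the mean over neighbors are all assumptions made for illustration. The first function fills a missing value with the average of that column over the k most similar rows; the second computes the classic Dynamic Time Warping distance between two numeric sequences.

```python
import math


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] also aligned with b[j-1]'s predecessor path
                cost[i][j - 1],      # b[j-1] repeated against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def knn_impute(rows, k=2):
    """Fill None entries with the column mean over the k nearest rows.

    Assumption: distance is the root mean squared difference over the
    columns that both rows have observed.
    """
    def distance(r, s):
        shared = [(x, y) for x, y in zip(r, s) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for col, value in enumerate(row):
            if value is not None:
                continue
            # Only rows that actually observe this column can donate a value.
            candidates = sorted(
                (distance(row, other), other[col])
                for j, other in enumerate(rows)
                if j != i and other[col] is not None
            )
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                new_row[col] = sum(nearest) / len(nearest)
        filled.append(new_row)
    return filled
```

For example, `knn_impute([[1.0, 2.0], [1.1, None], [5.0, 10.0], [0.9, 1.8]])` fills the missing entry from the two rows closest to `1.1` in the first column, and `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because warping lets the repeated `2` align with a single `2`.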


Once the data was prepared, we moved on to various machine learning models. The course covered regression for predicting numerical values, classification for assigning data to categories, and clustering for discovering patterns in the data.


We also spent time on time series data, learning how to forecast future trends from historical patterns. The course concluded with lessons on integrating our data mining models into web applications, with a focus on preparing models for real-world use and user interaction. This gave us a comprehensive look at the data lifecycle, from collection and cleaning to practical application.


A significant part of the course was dedicated to research. We learned how to conduct research thoroughly and how to present our findings in research papers, which added depth to our technical knowledge.

For the final project, my group worked on predicting student performance from activity in educational games. We used a type of recurrent neural network called Bidirectional Long Short-Term Memory (BLSTM) to process the gameplay data. We compared two approaches: one that modeled patterns over sequences of 100 timesteps, and a simpler one without timesteps for comparison.
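The difference between the two input setups can be sketched in plain Python. This is an illustration only, not the project's preprocessing: the helper names `to_fixed_length` and `flatten_without_timesteps`, the zero padding, and the use of per-feature means for the non-sequential input are all assumptions. The first helper shapes a variable-length gameplay log into exactly 100 timesteps, which is the kind of fixed-length input a recurrent network expects; the second collapses the same log into a single feature vector with no time axis.

```python
def to_fixed_length(sequence, timesteps=100, pad_value=0.0):
    """Return exactly `timesteps` steps: keep the most recent ones,
    and left-pad shorter sequences with constant vectors (assumed scheme)."""
    n_features = len(sequence[0]) if sequence else 0
    recent = sequence[-timesteps:]
    padding = [[pad_value] * n_features for _ in range(timesteps - len(recent))]
    return padding + [list(step) for step in recent]


def flatten_without_timesteps(sequence):
    """Collapse a sequence into per-feature means, discarding event order
    (a hypothetical stand-in for the simpler, non-sequential input)."""
    if not sequence:
        return []
    n = len(sequence)
    return [sum(step[f] for step in sequence) / n for f in range(len(sequence[0]))]


# Hypothetical gameplay log: each step is [score_delta, seconds_elapsed].
log = [[1.0, 2.0], [3.0, 4.0]]
window = to_fixed_length(log, timesteps=4)  # 2 padding steps + 2 real steps
summary = flatten_without_timesteps(log)    # one vector, no time axis
```

The windowed form preserves the order of events, which is what a bidirectional recurrent layer can exploit; the flattened form keeps only aggregate behavior.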

Long_Short_Term_Memory_Networks_for_Predicting_Student_Performance_from_Game_Play.pdf