Abstract

To predict the final grade and explore the importance of attribute selection and classification algorithm on a dataset of student lifestyle information, we ran an analysis using the Weka datamining toolkit. The results shows that the dataset is insufficient to accurately predict the letter or number grade a student will receive, but sufficient to predict a binary pass-fail, using J48 (C4.5) or Logistic Regression and a minor subset of attributes including the student's number of past failures (failures), intent to pursue higher education (higher) and the presence of additional school support (schoolsup) as the most relevant features.

Objective

Choose a practical dataset (as opposed to the example ones we used in class) with a reasonable size. Download the data, read the description, and try various approaches to solve the problem as best as you can.

The objective of the chosen dataset is to predict the final grade (G3) of a student

The Student Performance Dataset

The Student Performance Data Set [1] consists of 1044 records recorded from the 2005-2006 school year of two secondary schools in Portugal. The records contain 33 attributes consisting of academic information for 662 unique students in two courses – math and portuguese language (port) – as well as self-reported survey data regarding the students home life and school-related efforts (see Table 1 for a breakdown of the attributes). There are both nominal and numeric attributes, the latter of which are associated with pre-defined ranges. Not all values in the range of an attribute appear in the dataset. 382 students appear in both datasets, though they are not duplicate records because they correspond to grades and survey answers for different courses. Therefore, they are union-compatable sets. The G1, G2 and G3 attributes are tightly correlated, G3 is directly the results of the other two, and thus it was decided to ignore the G1 and G2 attribute to prevent prediction bias.

Outcome of this project

100 % Overall Feedback: Very well done, very detailed analyses and discussion

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
originals		originals
CPS844_FinalProject.doc		CPS844_FinalProject.doc
DataToARFF.java		DataToARFF.java
GradeToNominal.java		GradeToNominal.java
README.md		README.md
cps844_w21_assignment.pdf		cps844_w21_assignment.pdf
student-math.arff		student-math.arff
student-mixed.arff		student-mixed.arff
student-port.arff		student-port.arff

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Abstract

Objective

The Student Performance Dataset

Outcome of this project

About

Releases

Packages

Languages

BShennette/CPS844_FinalProject

Folders and files

Latest commit

History

Repository files navigation

Abstract

Objective

The Student Performance Dataset

Outcome of this project

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages