Data Mining in the Humanities
Feb 1, 2022 • 1 min read

First Post: Dataset Analysis

The dataset I chose is U.S. News and World Report's "Best Colleges Ranking Criteria and Weights." I chose it because it lays out the factors and criteria used to rank the best colleges. I think analyzing it will give me a better view of how to choose a college: for example, which college fits me and what I should focus on in the future, which will be a great help for my own development. The dataset was created by U.S. News to rank colleges in the United States by academic quality and to give students a clear table for comparing them.

Although the dataset lists its ranking criteria in detail, it does not take the difficulty of courses into account. Also, because it is based on academic quality, it does not include environmental or safety factors.

Several questions came up while I was reading the dataset, mostly about COVID-19 and other exceptional cases. For example, does COVID-19 affect graduation and retention rates? Because of COVID-19, colleges had to switch from traditional teaching to online classes. Some students may adapt to an online learning environment, but others may not. The next question is how many samples the dataset is based on. Sample size plays an important role in making a dataset effective and convincing.

To present my argument, I will explain each ranking factor, because introducing these factors clearly may convince the audience to take this dataset into account when they choose a college. The main audience would be students.

[Figure: ranking factors]

Guest post by: Jesse X.