Data Mining in the Humanities
Feb 1, 2022 • 2 min read

Leading Causes of Death in New York City


    The dataset I chose to analyze covers the leading causes of death in New York City. In my opinion, this dataset is an important resource for many people. NYC is one of the most famous cities in the world, and it has a very large population. Being able to identify causes of death allows people to work on preventing them, not only in NYC but also in similarly densely populated areas around the world.

    The data in the set was collected by the Bureau of Vital Statistics and the New York City Department of Health and Mental Hygiene. The Department of Health did not state why the data was collected, but the reason is close to common sense, since human deaths are important to track.

    Unfortunately, the data in the set is very limited: the only details we know about the deceased are their cause of death, sex, and ethnicity. This means that other important details that could be a factor in a death, such as age, are unknown. Despite this, the provided information lets us answer questions like: Is there a correlation between ethnicity and cause of death? Is there a correlation between sex and cause of death? These questions can be answered with little to no modification to the dataset.
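As a rough illustration of how such a question could be approached, here is a minimal sketch that computes each group's share of deaths by cause. The field names (`cause`, `sex`, `deaths`), the helper `cause_shares_by_group`, and the sample rows are all assumptions made up for this example; they are not the dataset's actual column names or figures. Comparing shares this way is only a first look, not a formal test of correlation.

```python
from collections import defaultdict


def cause_shares_by_group(rows, group_key, cause_key="cause", count_key="deaths"):
    """Return {group: {cause: fraction of that group's deaths}}.

    The key names are hypothetical; adjust them to the real column names.
    """
    totals = defaultdict(int)
    counts = defaultdict(lambda: defaultdict(int))
    for row in rows:
        group = row[group_key]
        n = int(row[count_key])
        totals[group] += n
        counts[group][row[cause_key]] += n
    # Skip groups with no recorded deaths to avoid dividing by zero.
    return {
        group: {cause: n / totals[group] for cause, n in causes.items()}
        for group, causes in counts.items()
        if totals[group] > 0
    }


# Made-up rows for illustration only; these are NOT figures from the dataset.
sample_rows = [
    {"cause": "Heart disease", "sex": "F", "deaths": 60},
    {"cause": "Cancer", "sex": "F", "deaths": 40},
    {"cause": "Heart disease", "sex": "M", "deaths": 50},
    {"cause": "Cancer", "sex": "M", "deaths": 50},
]

shares = cause_shares_by_group(sample_rows, "sex")
for group, causes in sorted(shares.items()):
    for cause, share in sorted(causes.items()):
        print(f"{group} {cause}: {share:.0%}")
```

Passing a different `group_key` (for example, a hypothetical `ethnicity` field) would give the same breakdown by ethnicity. If one group's share for a cause is much higher than another's, that is a hint worth investigating further.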

    But what can we do with the information gathered from the dataset? If the data revealed that a certain sex or ethnicity was disproportionately affected by a certain cause of death, that information could be used to target that specific demographic and prevent more deaths from occurring. It could also be applied to other cities with large populations similar to New York City's. For example, the dataset shows that a large percentage of deaths were caused by heart disease, so this information can be used to advise entire populations to pay particular attention to their heart health. In the end, this dataset provides vital information that can potentially be used to save lives.

The dataset can be found here, and below is a table of the most common causes of death in NYC.

Leading Causes of Death in NYC

Guest post by: Andy M.