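3 NYPD
3.1 Chapter Introduction
In this report, I look at the NYC shooting dataset from NYC Open Data. I clean the data, explore patterns in shootings by time of day and borough, and then use plots to highlight some key trends.
3.2 Loading dataset:
3.3 Omitting missing rows and lowercasing:
# Requires dplyr and stringr (for example via library(tidyverse))
shooting_data <- shooting_data %>%
  # Drop rows where the perpetrator's race is missing or empty
  filter(!is.na(perp_race), perp_race != "") %>%
  # Lowercase the category labels so they group consistently
  mutate(
    boro = str_to_lower(boro),
    perp_race = str_to_lower(perp_race)
  )
Standardizing text to lowercase (here, the boro and perp_race values) simplifies data cleaning and avoids case-sensitivity errors. Because the filter drops every row without a recorded perp_race, the counts in the rest of this chapter cover only incidents where the perpetrator's race was recorded.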
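To make the cleaning step concrete, here is a minimal sketch on a made-up four-row table. The uppercase column names BORO and PERP_RACE are an assumption about how the raw file is laid out, and the rows are invented for illustration; the sketch lowercases the column names first and then applies the same filter and value lowercasing used in this section.

```r
library(dplyr)
library(stringr)

# Toy stand-in for the raw data. The uppercase column names are an
# assumption about the raw file; the rows are invented.
raw <- tibble::tibble(
  BORO      = c("BROOKLYN", "BRONX", "QUEENS", "BRONX"),
  PERP_RACE = c("BLACK", NA, "", "WHITE HISPANIC")
)

cleaned <- raw %>%
  rename_with(str_to_lower) %>%                     # BORO -> boro, PERP_RACE -> perp_race
  filter(!is.na(perp_race), perp_race != "") %>%    # drop missing or empty perp_race
  mutate(across(c(boro, perp_race), str_to_lower))  # lowercase the values as well

# Number of rows removed by the filter
n_dropped <- nrow(raw) - nrow(cleaned)
```

Comparing nrow() before and after the filter, as n_dropped does here, is a quick way to see how much of the data the perp_race filter removes.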
3.4 Comparing Shootings per Borough:
boro_counts <- shooting_data %>% count(boro)
boro_counts %>%
  arrange(desc(n)) %>%
  knitr::kable() %>%
  kableExtra::kable_styling(full_width = FALSE)

| boro | n |
|---|---|
| brooklyn | 7404 |
| bronx | 6328 |
| queens | 3069 |
| manhattan | 2953 |
| staten island | 680 |
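Raw counts are easier to compare as shares of the total. The sketch below re-enters the five borough counts from the table by hand (in the report itself, boro_counts would come from count(boro)) and adds a share column; the names boro_share and top_two_share are introduced here for illustration.

```r
library(dplyr)

# Borough counts copied from the table in this section
boro_counts <- tibble::tibble(
  boro = c("brooklyn", "bronx", "queens", "manhattan", "staten island"),
  n    = c(7404, 6328, 3069, 2953, 680)
)

boro_share <- boro_counts %>%
  mutate(share = n / sum(n)) %>%  # fraction of all counted shootings
  arrange(desc(share))

# Combined share of the two boroughs with the most shootings
top_two_share <- sum(boro_share$share[1:2])
```

With these counts, Brooklyn and the Bronx together come to 13,732 of 20,434 incidents.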
shooting_data <- shooting_data %>%
  mutate(
    occur_time = hms::as_hms(occur_time),
    hour = hour(occur_time),  # hour() is from lubridate
    time_of_day = case_when(
      hour >= 6 & hour < 12 ~ "Morning",     # 06:00 to 11:59
      hour >= 12 & hour < 18 ~ "Afternoon",  # 12:00 to 17:59
      TRUE ~ "Night"  # everything else: 18:00 to 05:59, plus any missing hours
    )
  )
ggplot(shooting_data, aes(x = time_of_day, fill = time_of_day)) +
  geom_bar() +
  labs(
    title = "Shootings by Time of Day",
    x = "Time of Day",
    y = "Number of Shootings"
  ) +
  theme_minimal(base_size = 14)
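One detail the plot code leaves implicit: time_of_day is a character column, so ggplot2 orders the bars alphabetically (Afternoon, Morning, Night). A minimal sketch of putting the bins in chronological order with a factor follows. It uses made-up hours rather than the real data, and the helper name bin_time_of_day is hypothetical.

```r
library(dplyr)

# Hypothetical helper: same binning rule as in this section, returned as a
# factor whose levels are listed in chronological order
bin_time_of_day <- function(hour) {
  labels <- case_when(
    hour >= 6 & hour < 12 ~ "Morning",
    hour >= 12 & hour < 18 ~ "Afternoon",
    TRUE ~ "Night"
  )
  factor(labels, levels = c("Morning", "Afternoon", "Night"))
}

# Made-up hours that sit on both sides of each bin boundary
example_hours <- c(0, 5, 6, 11, 12, 17, 18, 23)
binned <- bin_time_of_day(example_hours)
```

Using such a factor for time_of_day should make both the bars and the facet panels appear in Morning, Afternoon, Night order.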
3.5 Interpretation:
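Night has dramatically more shootings than any other time period, afternoon is second, and morning has the fewest incidents. Night as defined here spans 12 hours (18:00 to 05:59), twice the length of the Morning and Afternoon bins, so part of the gap comes from bin width.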
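Because the Night bin covers 12 clock hours while Morning and Afternoon cover 6 each, one way to compare the bins fairly is to divide each count by its bin width. The sketch below does this on invented hours of occurrence (not taken from the dataset); bin_hours and per_clock_hour are names introduced here for illustration.

```r
library(dplyr)

# Made-up hours of occurrence, for illustration only
toy <- tibble::tibble(hour = c(0, 1, 2, 3, 22, 23, 7, 9, 13, 14, 15, 20))

# Width of each bin in clock hours, following the case_when() rule above
bin_hours <- c(Morning = 6, Afternoon = 6, Night = 12)

per_hour <- toy %>%
  mutate(
    time_of_day = case_when(
      hour >= 6 & hour < 12 ~ "Morning",
      hour >= 12 & hour < 18 ~ "Afternoon",
      TRUE ~ "Night"
    )
  ) %>%
  count(time_of_day) %>%
  # Shootings per clock hour within each bin
  mutate(per_clock_hour = n / unname(bin_hours[time_of_day]))
```

If Night still leads after this adjustment on the real data, the night-time concentration is not just a product of the wider bin.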
ggplot(shooting_data, aes(x = boro, fill = boro)) +
  geom_bar() +
  facet_wrap(~time_of_day) +
  labs(
    title = "Shootings by Borough and Time of Day",
    x = "Borough",
    y = "Number of Shootings"
  ) +
  theme_minimal(base_size = 14)
3.6 Interpretation:
Across all boroughs, most shootings occur at night.
- Brooklyn has the highest count in every time category, and it is especially high at night (5,200+).
- The Bronx has the second highest count overall and also spikes at night (4,700).
- Queens and Manhattan have moderate levels, with night still clearly the highest.
- Staten Island has very low counts overall, yet it follows the same pattern: more shootings happen in the afternoon than in the morning, and many more happen at night than in the afternoon.
The concentration of shootings at night suggests that temporal risk patterns may be as important as geographic ones when designing intervention strategies.
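The night-time figures quoted in the list above are read off the faceted plot. A table of counts by borough and time of day gives exact numbers, and a within-borough share shows how consistent the night-time pattern is. The sketch below runs on a few invented rows; on the real data, shooting_data would take the place of toy.

```r
library(dplyr)

# Invented rows, for illustration only
toy <- tibble::tibble(
  boro        = c("brooklyn", "brooklyn", "brooklyn", "bronx", "bronx", "queens"),
  time_of_day = c("Night", "Night", "Morning", "Night", "Afternoon", "Night")
)

boro_time <- toy %>%
  count(boro, time_of_day) %>%    # one row per borough and time bin
  group_by(boro) %>%
  mutate(share = n / sum(n)) %>%  # share of that borough's shootings
  ungroup()

# Night-time share of shootings in each borough
night_share <- boro_time %>%
  filter(time_of_day == "Night") %>%
  select(boro, share)
```

A similar night_share across boroughs would support the claim that the night-time spike is citywide rather than driven by one or two boroughs.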
3.7 Conclusion:
The analysis reveals two major patterns: shootings are heavily concentrated at night, and Brooklyn and the Bronx together account for about two thirds of incidents (13,732 of the 20,434 in the borough table). The consistency of nighttime spikes across all boroughs suggests that time of day is a critical factor in understanding shooting patterns in New York City. These findings indicate that prevention strategies may be most effective if they focus on high-risk time windows and borough-specific resource allocation.