Introducing MadSafety, a Django-based scraper of incident reports released by the Madison Police Department
Published 2012-07-02
tl;dr - I can finally say I made a Django application, and it’s live on the web and available on GitHub.
MadSafety uses Python to scrape a table of Madison Police Department incident reports and display the content to the user.
From the list of reports, a user can click through to see the location of the incident and read the details. Incident reports from the Madison Fire Department will be added soon.
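As a rough illustration of the table-scraping step, here is a minimal sketch rather than MadSafety's actual code. It uses the standard library's html.parser so it runs without extra installs; the project itself is credited to Beautiful Soup. The sample markup, column names and the helper names `TableRowParser` and `parse_incident_table` are all invented for the example.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell text.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_incident_table(html):
    """Return one dict per data row, keyed by the header row's cell text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Invented sample markup, not the department's real page.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>Incident</th><th>Address</th></tr>
  <tr><td>07/01/2012</td><td>Burglary</td><td>100 Block Example St</td></tr>
  <tr><td>07/02/2012</td><td>Theft</td><td>200 Block Sample Ave</td></tr>
</table>
"""

incidents = parse_incident_table(SAMPLE_HTML)
```

Each dict in `incidents` could then be saved as one row in a Django model, which is the part this sketch leaves out.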
It’s been more than a year in the making, but I can finally say I built and deployed a Django project.
The project itself isn’t overly complicated, but it has plenty of room to grow, and most of all it has given me some confidence in my practice habits and my ability to retain concepts.
A basic JSON output – perfect to store locally – via TastyPie is available.
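For anyone who wants to keep a local copy, here is a small sketch of how that might look. It assumes the usual TastyPie list response, with a "meta" block and an "objects" list. The field names in the sample and the `save_objects` helper are hypothetical, and the sample string stands in for whatever the live API returns.

```python
import json
import os
import tempfile


def save_objects(payload_text, path):
    """Write the "objects" list of a TastyPie-style list response to a JSON file."""
    payload = json.loads(payload_text)
    objects = payload.get("objects", [])
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(objects, handle, indent=2)
    return len(objects)


# Invented sample response; the incident field names are hypothetical.
SAMPLE_RESPONSE = json.dumps({
    "meta": {"limit": 20, "next": None, "offset": 0, "previous": None, "total_count": 1},
    "objects": [{"id": 1, "incident": "Burglary", "address": "100 Block Example St"}],
})

# Save to a temporary directory so the example leaves the working directory alone.
local_path = os.path.join(tempfile.mkdtemp(), "incidents.json")
saved_count = save_objects(SAMPLE_RESPONSE, local_path)

# Read the file back to confirm what was stored.
with open(local_path, encoding="utf-8") as handle:
    stored = json.load(handle)
```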
The incident reports displayed are among those released by the Madison Police Department.
From the department's website:
Incidents listed are selected by the Officer In Charge of each shift that may have significant public interest. Incidents listed are not inclusive of all incidents.
Also, suspects and individuals charged in connection with incidents are innocent until proven guilty. Addresses are those provided on the public incident reports, and are only edited for location reasons.
There are limitations: some come from the data, more from my abilities. For instance, suspect, arrest and victim details are entered into one field, and I would need to either separate them manually or figure out how to parse them into separate fields. More importantly, this table is not a comprehensive list of Madison Police Department incident reports, but rather a snapshot that is released to the public.
But that might change if the open data ordinance is passed and implemented, so the life span of MadSafety as it is right now is likely limited. That is why it was important for me to “get this out the door” somewhat quickly and call it version 1, rather than sit on it while learning how to implement search or make the main display sortable.
The code is on GitHub, and I have a nice long list of potential improvements, including adding search functionality and separating out details, like suspect information, that are contained in a single field in the incident report.
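One possible way to approach the single-field problem is sketched below. It is only a starting point and rests on a big assumption: that the combined text marks each detail with a label such as "Suspect:" or "Victim:". Those labels, the sample string and the `split_details` helper are invented for illustration, and the real reports may look quite different.

```python
import re

# Hypothetical labels; the real reports may mark these details differently.
LABELS = ("Suspect", "Arrested", "Victim")
LABEL_PATTERN = re.compile(r"\b(%s)s?:\s*" % "|".join(LABELS), re.IGNORECASE)


def split_details(text):
    """Split one combined text field into a dict keyed by lowercase label."""
    # re.split with one capture group yields [prefix, label, value, label, value, ...]
    parts = LABEL_PATTERN.split(text)
    fields = {}
    for label, value in zip(parts[1::2], parts[2::2]):
        fields[label.lower()] = value.strip(" ;.\n")
    return fields


# Invented sample text, not taken from a real incident report.
sample = "Suspect: Unknown male, 20s; Victim: Female, 34; Arrested: None"
details = split_details(sample)
```

If the real field turns out to be free-form prose with no labels, a pattern like this will simply return an empty dict, and manual separation may still be needed.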
The idea to scrape police incident reports and map them came from CityCampMadison last month, but the follow-through and execution have been informed and inspired by so many different projects and thought-provoking discussions, many of which I’ve mentioned in previous posts:
- HackingMadison, which is dedicated to “highlighting the civic projects, datasets, resources and people that make Madison better.”
- Christopher Groskopf’s work, especially the Tyler Sirens project, and his encouragement to just build something.
- Ben Welsh’s story-writing algorithm proposals in his talk at the 13th International Symposium on Online Journalism, and his web scraping walkthrough, not to mention his Crime L.A. project.
- Andy Boyle’s Firetracker walkthroughs.
- Heather Billings' words of wisdom a couple weeks back on the ViewSource podcast.
- Ryan Pitts, for turning me on to Beautiful Soup.
- Kevin Schaul’s Web scraping with Django tutorial.
- A whole host of folks at #nicar12 -- Jeremy Bowers, Jason Bartz, Adam Playford -- and everyone who taught the Django workshop.
- Jonathan Stray’s discussion last month around bringing a new model for crime reporting.
- My time covering crime and courts in Chicago’s South Suburbs while working as Illinois Editor for The Times.
This little program won’t write the story itself, but had the data been available I could have used a tool like this to create shells for briefs that could have been edited down so I could move on to focus on larger stories.
And with some simple search capabilities, I might even have found some similarities between crimes and suspects.