The Newton Police website has a public information page that can give you some top-level data, but if you really want to dig deeper, it’s surprisingly difficult. For example, you can see the locations of bike/pedestrian incidents, but you can’t do any data mining on the reports themselves. As was pointed out during a recent Public Safety and Transportation committee meeting, we cannot find out, for example, how many of those incidents happened in or near a crosswalk. We cannot learn what caused those accidents or mine for repeated terms. It’s not that the information isn’t there; it is. It’s written into reports filed away and sometimes located elsewhere on the website. But those reports are almost impossible to use.
And that’s just the police. Want to understand how your elected officials voted? Good luck to you. All those votes are buried in PDFs scattered around the city website. You would need to dig in, pull out each vote in each committee, type it (manually) into some kind of spreadsheet, then do the same for every vote in city council. Of course, you would also need to tag the purpose of each vote. You may want to do this if, for example, you want to better understand how a particular official voted on zoning issues over time. I’m sure if you asked that official they’d tell you how they voted on individual items, but understanding their votes at a macro level could also provide data and information that would not only hold them accountable but would enable you, the voter, to make a more informed decision at election time. You could also use the data to identify trends, such as which councilors vote together more often than not.
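To make the "who votes together" idea concrete, here is a minimal sketch of what becomes possible once votes are in rows instead of PDFs. Everything in it is an assumption for illustration: the record layout, the item IDs, and the councilor names are invented, not taken from any city system.

```python
from collections import Counter
from itertools import combinations

# Hypothetical vote records: (item_id, councilor, vote).
# All names and items here are invented for illustration.
votes = [
    ("item-1", "Councilor A", "yes"),
    ("item-1", "Councilor B", "yes"),
    ("item-1", "Councilor C", "no"),
    ("item-2", "Councilor A", "no"),
    ("item-2", "Councilor B", "no"),
    ("item-2", "Councilor C", "no"),
    ("item-3", "Councilor A", "yes"),
    ("item-3", "Councilor B", "no"),
]


def agreement_rates(records):
    """Return {(councilor_1, councilor_2): share of shared items voted alike}."""
    # Group the ballots cast on each item.
    by_item = {}
    for item, councilor, vote in records:
        by_item.setdefault(item, {})[councilor] = vote

    shared = Counter()  # items on which both members of a pair voted
    agreed = Counter()  # items on which the pair cast the same vote
    for ballots in by_item.values():
        for a, b in combinations(sorted(ballots), 2):
            shared[(a, b)] += 1
            if ballots[a] == ballots[b]:
                agreed[(a, b)] += 1
    return {pair: agreed[pair] / shared[pair] for pair in shared}


rates = agreement_rates(votes)
for pair, rate in sorted(rates.items()):
    print(pair, round(rate, 2))
```

The analysis itself is a dozen lines. The hard part, as described above, is getting the votes out of the PDFs and into a table like `votes` in the first place.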
If you want an example of a city that does it well you only need to look east. Boston has a wonderful data-driven website that helps people understand what is happening around them. In fact, they have a whole office called New Urban Mechanics that looks for new ways to use data and information to help serve the public. This can be as simple as identifying potholes or a better understanding of crime statistics. A report of the team’s activity in 2019 can be found here. The office grew out of Mayor Menino’s idea that those in city government are effectively mechanics and their job is to fix what’s broken. This office regularly attracts great minds who would otherwise take jobs in major consulting firms, thereby bringing into city government new, fresh ideas.
But it’s Boston’s open datasets that really fascinate me. People use them to create all sorts of maps and analyze information on their own. For comparison, take a look at Boston’s building permit data, which is available as a downloadable file, already put in a sortable format. Now compare that to Newton’s building permit data, which requires a click for each month and is then presented only as a PDF. If you want to use that data, you need to grab each file, convert it, and then make sure it’s all in the right cells BEFORE you can do anything with it. That’s a lot of work.
This is true for everything Newton produces. Any data we have is wrapped up in PDF or text documents and buried on websites. The city is developing a new site, but that doesn’t solve the data availability issue.
I recently had a chance to talk (very briefly) with David Olsen about what makes tracking our votes so complicated. The short story is that off-the-shelf systems designed to record city council votes don’t work for our governmental structure. We’re just way too complex, meaning if we wanted a system for the city it would be expensive as it needs to be massively customized. There was hope for change a couple of years ago, however, as the new structure offered up in the proposed new city charter would have better matched what was out there. Which means, we gave up our chance at increased transparency by voting down the new charter. We can now either authorize a very expensive software package or change our city government. I’m not sure either will happen.
When you can examine the data and then map that across departments, you can do amazing things. I’ve spoken with companies that do just that for municipalities. If, for example, you could combine road quality data with data about gas leaks, you could work with the gas company to create a construction schedule that focuses on the most at-risk areas. If you also included information about speeding and pedestrian safety, you could look at construction in a whole new way.
If you had a better understanding of where home sales are turning over and then could examine building permits at a granular level, you would have a map that shows you where houses are getting torn down and where they’re being renovated (and where they’re being purchased as-is). This could help as part of the zoning discussion when we talk about the long-term effects of a zoning change. We could also gain an understanding of the pace of construction at a city-wide level to get a handle on the pace of change.
Today, if you want to know that a house in your neighborhood is applying for a special permit to be torn down, you need to watch the PDF agendas from the city committees. It’s all public, it’s all there, but it’s not easy to find or use. What if you just had a map of things that are happening? What if the data could be moved around and used in different forms?
In order for our government to serve us it needs to communicate better. This isn’t just about a periodic email from the Mayor, but about the information that is part of our everyday lives.
OK, what would it take to organize a citizen’s group to do this? I’d volunteer to help set up the infrastructure for vote-tracking. Is this something the Newton LWV would support?
@Chuck-
You have detailed the problem.
Please detail the solution….
I don’t think the solution is all that difficult to understand. We need a way to gain access to the data that our departments produce. There are various tried and true systems out there that do this. From a very high level, it’s rather simple.
The question becomes, how much of a commitment does the city have to this? We can take a number of paths, but each one is going to cost money and require a long timeline. I believe it’s worth the investment, but I can see this devolving into a long debate that ultimately results in a watered-down version that doesn’t really give us what we want. Are the people going to demand this? Are they willing to pay for it? Can they see the benefits?
We also need to ask ourselves: where do we start? Given that we’re undergoing a deep examination of our police, maybe that’s a place to consider. Or does it make sense to start in places where the data exists but can’t leave in a usable form (building, planning, etc.)?
Step one, however, is a discussion.
@Chuck. Thanks for highlighting this very important issue. Good public policy requires good data; without the latter we won’t have the former. In my Opinion piece in the Tab earlier this summer, I made many of the same arguments about the shortcomings of budget and finance information on the City website that you have outlined more broadly in your post. https://newton.wickedlocal.com/news/20200727/column-budgeting-in-newton
I pointed out, as you do, that other municipalities are already much more successful than Newton at presenting data in a more transparent and accessible manner. (Arlington, as I mentioned, has two terrific budget and finance tools on its website.) In addition to Boston, which you cite, cities like Cincinnati also use interactive features and data visualization to share complex information. https://data.cincinnati-oh.gov/ In many cases, there are off-the-shelf platforms that can be customized, so municipalities don’t have to start building tools from scratch. Sadly, this is the type of issue that usually falls to the bottom of the list as elected officials focus on putting out fires (and, sadly, we have plenty of those lately). But this is a critical issue related to good governance, and I hope that residents of Newton and our elected officials will give it the sustained attention it deserves.
Let’s all thank Mr. Tanowitz for identifying the two needs to make this happen: time and money. Both in short supply. Add to that the rather long list of priorities for each that are either mandated (OPEB) or far more pressing (High School Zoom for the foreseeable future) than a fancy new information system and we are once again reminded why the broad consensus as to the utility of this forum is as it is.
@Chuck,
The first step I imagine is determining what data is and is not considered a public record subject to disclosure. Anyone may submit a public records request to the City and the designated records officer is required by law to respond to the request. Upon receipt a determination is made whether the data sought is subject to disclosure or should or must be withheld under one of the exceptions to the public records law. As I read this, it seems to me that you are essentially asking the city to make all public records available via a searchable online library.
And while that would be ideal, I think that would require creation of an entire public records bureau to review the documents to ensure that they aren’t exempt from disclosure, redact where necessary and upload documents as they are created to some designated portion of the online library. Add in prior records for however many years, and I think that would be a rather significant undertaking not only to establish the database but to maintain it.
As I noted above, much of this data is already public in one form or another, it’s just not organized and distributed in a usable way. Also, the concept isn’t new. Several cities, as noted by Meryl above, are already doing this. The processes and technologies are all well-established.
This isn’t just about a “fancy new system”, it’s about giving citizens the information we need to make better decisions about our own government. Seems like a sound investment to me.
Thanks Chuck. This isn’t a frill. Easy to access information means more informed residents, businesspeople, and lawmakers. It makes it easier to learn from the past and anticipate the future. Better decisions made on sound advice. Debate supported by facts. Less re-invention. Longer institutional memory. Fewer oversights and wasted money. More transparency and accountability.
To Newton’s credit, it has fabulous open GIS information. That takes a lot of work too, but it is used very effectively across departments. Unfortunately, the police aren’t directly tied into it at this point.
This doesn’t have to start with anything fancy. It starts with a commitment to transparency, in part through open data, well-organized data, and useful data.
Chuck wrote:
“For example, you can see the locations of bike/pedestrian incidents, but you can’t do any data mining on the reports themselves.”
Actually, I just checked the NPD website and since January 1st they have been providing online access to police reports where they responded to a collision scene. Go here: http://www.newtonpolice.com/ACCIDENTS/index.asp
The reports are organized by date and include the location(s). It is quite easy to identify a report pertaining to an intersection because two streets are referenced. There is also additional data that earmarks those relating to a utility, and reports referenced “other” include motor vehicle vs. bike.
This is a far cry from the old days of sending a written request for a collision report to the local police department and waiting for a response by snail mail.
In addition, NPD provides online access to the police log which as of today is 1,147 pages long. The data can be downloaded and searched for specific terms. The latest police log is here: http://www.newtonpolice.com/POLICE_LOG/CURRENT/2020PoliceLog.pdf
@LisaP I think we’re talking past one another here.
Yes, the reports are there, and each one can be downloaded as a PDF. But doing any data or text analysis is much more cumbersome than it should be given that even the raw text is wrapped up in the PDF and difficult to extract.
The same goes for the police log: it’s not in an easy-to-use format.
All of the information I’m referencing is public. It’s just not presented in a way that makes it usable and therefore effectively reduces transparency.
@Chuck,
As a lawyer who deals with the unique facts of each individual case, I humbly and respectfully disagree. The data is accumulated first and foremost for the Commonwealth to prosecute criminal conduct, and secondarily for individual parties involved who have suffered harm that may or may not rise to the level of criminal prosecution. The data is there to be analyzed to determine whether traffic improvements would mitigate collisions. But having worked with civil engineers and collision reconstructionists, that in my experience is not within the expertise of law enforcement.
Frankly, I’ve defended first degree homicide cases at trial and on appeal as well as complex civil matters. I have dealt with the police as an adversary in both civil and criminal litigation in both state and federal court and I honestly and candidly do not know what you want. This is not a critique but a genuine ask. Because I’m speaking as someone who has fought successfully to obtain information and open doors. So I’d really, truly like to understand what it is that you seek.
Lisa
I’m looking at it from the perspective of data journalism, which tries to find stories and patterns in the data. While a legal case will dive deep into the information on one particular case, I’m asking for the data to be shared in a format that allows those with the expertise to analyze and use it in new ways.
This isn’t about getting the information from the police on one particular incident, but about taking a whole year’s worth of traffic incidents and looking for patterns across all of them. That’s when you need a tool like Tableau or someone who is proficient in Python or R to take this data, ask questions, and look for answers.
What I’m asking for is what other cities have. In that case the police log that you referenced would be released in a series of cells so that it can be sorted and searched by a machine. As an example, here are the incident reports for Boston in a useable and searchable format: https://data.boston.gov/dataset/crime-incident-reports-august-2015-to-date-source-new-system/resource/12cb3883-56f5-47de-afa5-3b1cf61b257b
Let me offer an anecdote that came up tonight. I was talking with someone about city council committee meetings and that person mentioned that they felt a city councilor hasn’t been at any meetings of a particular committee of which they are a member. To find that you need to open each PDF and look at the names. And for that one piece of information, you can do that.
But what if you wanted to know the average attendance rate for all of the city councilors? Or if you wanted to know how many meetings your three city councilors attended through the year? Or which committees they attended most and which they ignored? That’s where it gets complicated and where you need to have the data in a more open and usable form.
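As a rough illustration of why the format matters, here is a minimal sketch of the attendance question, assuming the roll calls were published as rows rather than PDFs. The row layout, committee names, and councilor names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical roll-call rows: (committee, meeting_date, councilor, present).
# Committee and councilor names are invented for illustration.
attendance = [
    ("Committee X", "2020-09-14", "Councilor A", True),
    ("Committee X", "2020-09-14", "Councilor B", False),
    ("Committee X", "2020-09-28", "Councilor A", True),
    ("Committee X", "2020-09-28", "Councilor B", True),
    ("Committee Y", "2020-09-21", "Councilor A", False),
    ("Committee Y", "2020-09-21", "Councilor B", False),
]


def attendance_rates(rows):
    """Return {(councilor, committee): fraction of meetings attended}."""
    totals = defaultdict(lambda: [0, 0])  # key -> [attended, scheduled]
    for committee, _date, councilor, present in rows:
        counts = totals[(councilor, committee)]
        counts[0] += int(present)
        counts[1] += 1
    return {key: attended / scheduled
            for key, (attended, scheduled) in totals.items()}


att_rates = attendance_rates(attendance)
for (councilor, committee), rate in sorted(att_rates.items()):
    print(f"{councilor} / {committee}: {rate:.0%}")
```

With data in this shape, each of the questions above (the city-wide average, your own councilors, which committees get skipped) is a small change to the grouping key, not another pass through a stack of PDFs.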
Data is great. The hard part is collecting it.
Idea: perhaps we can do for data what Gail did for journalism: find a data-driven, Gail-like object to drive the initiative, and use our (high school?) students to mine the various data sources, enter the data, then use it in an educational capacity (as well as benefit the community per Chuck’s OP). I’m sure IT can either find some server space for this or set up a cloud-based server on AWS, Rackspace, GCP or Azure.
Where can we find a Gail + Data Science love child to lead this? :-)
I’d like to clarify this topic with a couple examples. First, police crash reports. As you can see from the crash reports on the police web site, there’s extensive data about each crash captured in the report. It has three main pieces: demographic information about the people involved, additional structured information about the crash itself, and a narrative about the crash captured at a particular time.
The reports the public sees are PDF renditions of the underlying data. They are suitable for reading. They are not suitable for automatically searching for trends that could identify places, say, where solar glare affects drivers’ view. Or if lowering a speed limit has affected crashes over time. Or how many motor vehicles end up in people’s yards or the Cheesecake Brook (yes, it happens). Or if No Turn On Red improves pedestrian safety, or if it is ever enforced.
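To show what "automatically searching for trends" could look like, here is a minimal sketch that counts crashes by location where the narrative mentions a given term. It assumes the reports were released as a CSV. The column names (`date`, `location`, `narrative`) and the sample rows are invented for illustration and do not reflect any actual NPD export.

```python
import csv
import io
from collections import Counter

# Hypothetical export of crash reports; columns and rows are invented.
SAMPLE_CSV = """date,location,narrative
2020-01-05,Street A & Street B,Driver reported sun glare while turning left
2020-02-11,Street A & Street B,Vehicle slid on ice into a parked car
2020-03-02,Street C,Operator stated glare from the sun blocked view of crosswalk
2020-03-09,Street A & Street B,Glare cited by driver; pedestrian not injured
"""


def count_by_location(csv_text, term):
    """Count reports per location whose narrative mentions `term` (case-insensitive)."""
    term = term.lower()
    hits = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if term in row["narrative"].lower():
            hits[row["location"]] += 1
    return hits


glare_hits = count_by_location(SAMPLE_CSV, "glare")
for location, count in glare_hits.most_common():
    print(location, count)
```

A plain substring match like this is crude (it would miss "blinded by the sun," for example), but even this much is out of reach while the narratives are locked inside individual PDFs.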
The public police reports contain both too little and too much information. In particular, they don’t always have follow-up information about later crash assessments, interviews, or charges. They do sometimes contain sensitive personal information.
All this data is collected and sent to the state. However, that data is always two years old. That doesn’t help in making city planning decisions. And that’s what this is about. It isn’t just the public being able to get the information. The city itself can’t get it.
I’m using the police as an example only. They have budget and privacy limitations on what they can make available. The first step, though, is understanding the value of this information to making our city better. I posit that NPD would be more efficient and more effective if they had the resources to better use this information. Transportation, Public Works, and Planning would as well. And citizens would benefit from access to whatever data can be made available.
This example brings me to the larger point that should be clear as the result of the pandemic: information about government’s workings should be easy to find, follow, and understand. This point goes back to Richard Rasala’s post a few weeks ago. Public outreach and public information have value to everyone. People should be able to follow ongoing issues and discussions that affect them without digging through pages and pages of other stuff they don’t especially care about.
Boston’s New Urban Mechanics is this kind of approach.
https://www.boston.gov/departments/new-urban-mechanics
Their web site says, “We work across departments and communities to explore, experiment, and evaluate new approaches to government and civic life.” They have done a great job simplifying government web forms, improving web pages, experimenting with new community outreach, and releasing open data.
I was originally skeptical they could pull it off in Boston, but this group has really changed, among other things, the digital face of the city.
If anyone wants to see a completely dysfunctional municipal website when it comes to presenting even the most basic information (forget about allowing for any type of data parsing), then take a look at Needham’s – absolutely laughable. It’s a dream product for the three or four autocratic bozos on the Needham Select Board because it allows them to get away with anything and everything and have zero accountability. But of course it’s par for the course when dealing with any of the towns outside of 128.
BTW There have been car crashes into Cheesecake Brook? When I search that term, I mostly get stories about a woman crashing her ex-boyfriend’s date at the Cheesecake Factory.
@Mike Halle,
A couple of thoughts continuing with the example of crash reports. First a minor point – from my small random sampling and experience, the police reports do reflect when a driver has been cited for a traffic violation. This information is contained in the officer’s narrative. They do not, as you noted, reference any follow up concerning any such citation.
The greater point which I think one should consider in evaluating the utility of this information is that these reports are limited to those crashes where the police have responded to the scene. Many collisions occur which do not require police presence. In some instances, vehicle operators file the required operator’s report with the Mass. RMV and local police. In others, no paperwork is generated. Thus, I would think that anyone seeking to analyze this data would need to weigh that it reflects only a subset of crash data (when police responded) and does not reflect all crash data.
I can certainly support making data more useable and accessible. I think that some forms of information are more amenable to synthesis in a spreadsheet type format (date, time, location, GPS coordinates) whilst other data (the narrative) are less so.
And I think this thread title contains a bit of a misnomer because it suggests a lack of transparency when in actuality it seems from the comments here to be a matter not of availability (substantive) but of the form in which the data is available. But I think there is a saying in architecture that form follows function, and I can embrace functional improvements.