Using Twitter to Document Incidents of Police Brutality

Micah Swain
8 min read · Mar 5, 2021

Background of project

With the gruesome death of George Floyd and the protests that swept across the United States in 2020, the actions of police officers across the country came under heavy scrutiny. You could not turn on the news or open a social media app without seeing scenes of unnecessary violent force by police officers. To be clear, police officers have a very difficult job, and most officers are not using unnecessary violent force. However, those officers who do use unnecessary violent and brutal force need to be held responsible for their actions, and these incidents need to be recorded and documented so that change can happen.

My Labs team at Lambda School was assigned to work with the Human Rights First organization to develop a website displaying incidents of police using unnecessary force. Our team was tasked with sourcing data on these incidents and displaying it to the user. The hope is that journalists and everyday people can come to this website, see statistics about incidents of police using excessive force in different areas of the country, and be inspired to create change.

My Job

My portion of this project, and what I’ll be talking about in this post, was to create a data pipeline from Twitter, using NLP techniques to search for tweets containing reports of police use of force. Once the tweets were identified, I needed to make them available to the front-end web team to be displayed to the website admin, who, once the website is live, will approve or reject each tweet as reporting police use of force. All approved tweets will be shown on the website.

Initial Considerations

The Data — Most tweets don’t carry much data. They consist of little more than text: no location data, no context. NLP will need to be used to pull out location and other important information to populate those fields.

The Model — Because the data is not labeled, any modeling would need to use unsupervised learning, which can be messy, especially when working with lots of text from tweets.

Time — The timeline for this project is tight: 2.5 weeks of coding to build a data pipeline and create an NLP model to filter tweets. Because of this short time constraint, I wouldn’t be able to spend a lot of time experimenting with different models before delivering MVP.

Taking all this into consideration, I needed to hit MVP. My initial plan was therefore to incorporate a working model, possibly a low-performing one, and get tweets to the dashboard for review. Then, once the “plumbing” of the data was dialed in, I could improve the model.

The Build

Task 1: Pulling Data from Twitter

Using the Python library tweepy, I was able to establish a connection with Twitter. When importing tweets I had two options: 1) set up a streaming listener that would send live tweets into my model as they were tweeted, or 2) grab a batch of tweets each day at a specific time and run them through my model to determine whether they report police use of force.

I chose the second option. While I was testing option 1, I was getting waaaaaay too many tweets. Grabbing a batch of tweets once a day instead allowed me to filter for only the popular tweets, which significantly cut down the number of tweets coming from Twitter.

To make sure my model wasn’t grabbing the same tweets more than once, I made use of the since_id parameter in tweepy, which only imports tweets newer than a given tweet id. My code quickly queries the most recent tweet id in my database and uses that as my since_id.
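
Sketching that flow (the credentials, the search query, and the get_latest_tweet_id helper below are hypothetical placeholders; api.search is the tweepy 3.x call, renamed api.search_tweets in 4.x):

import tweepy

# Hypothetical credentials from your Twitter developer app.
CONSUMER_KEY, CONSUMER_SECRET = "xxx", "xxx"
ACCESS_TOKEN, ACCESS_TOKEN_SECRET = "xxx", "xxx"

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth, wait_on_rate_limit=True)

# Hypothetical helper: the newest tweet id already in the database.
newest_id = get_latest_tweet_id()

# One daily batch: popular tweets only, newer than anything already stored.
for tweet in tweepy.Cursor(
    api.search,                 # api.search_tweets in tweepy 4.x
    q="police",                 # assumed search term
    result_type="popular",      # popular tweets only, to cut down volume
    since_id=newest_id,         # skip tweets we've already seen
    tweet_mode="extended",
).items(100):
    process(tweet)              # hypothetical downstream handler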

Task 2: Building the model

It’s important to remember that the term data science includes the word science. Science is the practice of proving or disproving hypotheses through experiment and observation. When I was tasked with using a model to determine whether tweets contained language about police brutality, I knew experimentation was needed, but I also had to have a backup plan in case my experiments failed. So:

  • Plan A: Develop a model using topic modeling
  • Plan B: Use a modified version of the KNN model developed by the previous group.

Plan A

So my first plan was to use topic modeling. I had two datasets: one from a public data source containing tweets that had already been flagged as containing police use of force, and one I pulled from Twitter containing the word police.

My hope was to combine the two datasets and use topic modeling to create a few topics: some with words like “hit”, “shoot”, and “tear gas” for topics about police use of force, and others with unrelated words that did not apply to police using force. However, it wasn’t as easy as I had hoped.
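
The setup looked roughly like this (a sketch using scikit-learn’s LatentDirichletAllocation; the library choice and the tweets variable are stand-ins, not my exact code):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Assumed: `tweets` is the combined list of tweet texts from both datasets.
vectorizer = CountVectorizer(stop_words="english", max_features=5000)
doc_term_matrix = vectorizer.fit_transform(tweets)

# Fit five topics, matching the five topics printed below.
lda = LatentDirichletAllocation(n_components=5, random_state=42)
lda.fit(doc_term_matrix)

# Show the top five words for each topic.
words = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [words[j] for j in topic.argsort()[-5:][::-1]]
    print(f"----- Topic {i} ------")
    print(" ".join(top))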

After a day of implementing the model and another day of fine-tuning it, the best I could get was this:

----- Topic 0 ------
protester people department crowd watch

----- Topic 1 ------
kill people worker defund social

----- Topic 2 ------
think arrest really protest involve

----- Topic 3 ------
death black report would video

----- Topic 4 ------
people arrest right charge attack

As you can see, there’s not much of a difference between the topics. At this point I ran out of time for experimentation and needed to move forward with a model. Luckily, a previous Labs group came up with a very simple model that actually performed quite well.

Plan B

Plan B makes use of a KNN model, using training data that was created by hand (see the training data below).

ranked_reports = {
    "Rank 1 - Police Presence": [
        "policeman", "policewoman", "law enforcement",
        "police officer", "cop", "five-o", "fuzz", "DHS",
        "protester", "FPS", "officer",
        "Federal Protective Services",
    ],
    "Rank 2 - Empty-hand": [
        "policeman", "policewoman", "law enforcement",
        "police officer", "cop", "five-o", "fuzz", "DHS",
        "pushed and shoved with shields", "officer",
        "grabs, holds and joint locks",
        "punch and kick", "thrown to the ground", "hit",
        "charge a protester", "tackle to the ground",
        "kneel on", "arrest", "protester",
        "FPS", "Federal Protective Services", "zip-ties",
        "police chase and attack", "kicking him",
        "threw him to the ground", "handcuff him",
        "kneeling on a protester", "pinning down",
        "tackle", "shoved to the ground", "violent",
        "officer shove",
    ],
    "Rank 3 - Blunt Force": [
        "policeman", "policewoman", "law enforcement",
        "police officer", "cop", "five-o", "fuzz", "DHS",
        "rubber bullets", "officer",
        "riot rounds",
        "batons", "blood", "hit", "arrest",
        "protester", "FPS",
        "Federal Protective Services",
        "strike with baton", "violent",
    ],
    "Rank 4 - Chemical & Electric": [
        "policeman", "policewoman", "law enforcement",
        "police officer", "cop", "five-o", "fuzz", "DHS",
        "tear gas", "officer",
        "pepper spray",
        "flashbangs", "stun grenade",
        "chemical sprays",
        "conducted energy devices, CED or taser",
        "blood", "arrest", "protester", "FPS",
        "Federal Protective Services", "pepper balls",
        "using munitions on protesters", "struck by a round",
        "fire pepper balls and tear gas",
        "struck in chest by projectile", "violent",
        "munition", "firing a riot gun", "paintball gun",
        "shots are fired", "fire explosives",
        "fire impact munitions",
    ],
    "Rank 5 - Lethal Force": [
        "policeman", "policewoman", "law enforcement",
        "police officer", "cop", "five-o", "fuzz", "DHS",
        "shoot and kill", "protester",
        "open fire", "FPS", "officer",
        "Federal Protective Services",
        "deadly force", "fatal",
        "dies", "kill", "arrest", "violent",
        "shot and killed",
    ],
}

Essentially, they created five clusters by hand with these keywords. When making a prediction, the trained KNN model returns a Rank 0–5 depending on which Rank’s keywords most closely match the input text.

The positives: this model is simple, it doesn’t take much to train, and it gets fairly good results. However, it is a long way from being able to detect tweets that contain police use of force with high enough precision to be used without human intervention. Still, it is a great first step.

See my TextMatcher class below, which tokenizes, vectorizes, and trains a model based on the above training data, then makes a prediction when an instance is called.

import en_core_web_sm
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors


class TextMatcher:
    """ Generic NLP Text Matching Model """

    class Tokenizer:
        """ Standard SpaCy Tokenizer """
        nlp = en_core_web_sm.load()

        def __call__(self, text: str) -> list:
            # Lemmatize the text, dropping stop words and punctuation.
            return [
                token.lemma_ for token in self.nlp(text)
                if not token.is_stop and not token.is_punct
            ]

    def __init__(self, train_data: dict, ngram_range=(1, 3), max_features=8000):
        """ Model training on live data at init """
        # Join each rank's keyword list into a single training document.
        self.lookup = {k: ' '.join(v) for k, v in train_data.items()}
        self.name_index = list(self.lookup.keys())
        self.tfidf = TfidfVectorizer(
            ngram_range=ngram_range,
            tokenizer=self.Tokenizer(),
            max_features=max_features,
        )
        # One neighbor per query: the single closest rank document.
        self.knn = NearestNeighbors(
            n_neighbors=1,
            n_jobs=-1,
        ).fit(self.tfidf.fit_transform(self.lookup.values()).toarray())
        # Distance returned for an empty string, used to detect "no match".
        self.baseline, _ = self._worker('')

    def _worker(self, user_input: str):
        """ Prediction worker method - internal only """
        vec = self.tfidf.transform([user_input]).toarray()
        # kneighbors returns (distances, indices); yield the scalar from each.
        return (itm[0][0] for itm in self.knn.kneighbors(vec))

    def __call__(self, user_input: str) -> str:
        """ Callable object for making predictions """
        dist, idx = self._worker(user_input)
        if dist != self.baseline:
            return self.name_index[int(idx)]
        return 'Rank 0 - No Police Presence'
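
A quick, hypothetical example of calling it (the tweets are made up; the ranks in the comments are what the keyword lists above should map them to):

matcher = TextMatcher(ranked_reports)

print(matcher("Officers fired tear gas and pepper balls at protesters downtown"))
# expected: Rank 4 - Chemical & Electric

print(matcher("Beautiful sunset over the lake tonight"))
# expected: Rank 0 - No Police Presence (no keywords match)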

Task 3: Populating the Database

Populating the database didn’t take much. The only thing I’ll mention is the Python library dataset.

I’ve used psycopg2 and SQLAlchemy in the past, and they work great, but dataset is the simplest and most intuitive database library I’ve used with Python. Creating tables, populating tables, and running simple queries could not be easier. I highly recommend it if you aren’t doing complex queries.
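
To give you a taste (the connection string and columns here are made up):

import dataset

# dataset wraps SQLAlchemy; the connection string here is hypothetical.
db = dataset.connect("postgresql://user:password@localhost:5432/hrf")

# Tables and columns are created automatically on the first insert.
table = db["tweets"]
table.insert({
    "tweet_id": 1367890123456789,
    "text": "Example tweet text",
    "rank": "Rank 2 - Empty-hand",
})

# Simple queries are one-liners, e.g. the newest stored tweet
# (this is what feeds since_id in Task 1).
newest = table.find_one(order_by="-tweet_id")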

Where does the project stand now?

Please take a look at the public view of the website here.

Right now the website displays a map and statistics of police brutality around the country. The data being used comes from a Reddit source that vets all the data to make sure it truly does show evidence of police use of excessive force. The data pipeline from Twitter is up and running, supplying tweets to the admin of the website for review. Any tweets the admin approves will be added to the website.

The main shortcoming right now is that the model is not very strong. It’s a good first layer of filtering, but it needs to be improved upon. The good thing is that I’ve created a lot of the plumbing, so to speak: data is coming in from Twitter, being filtered through the model, and populating the database. With all this in place, another Labs team can come in and focus all their effort on improving the model.

I also realized that the data coming out of the tweets, like location, is lacking. A future team should make use of NLP to pull location data from the tweets.
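
Since the pipeline already loads spaCy’s en_core_web_sm model, its named entity recognizer would be a natural starting point. A minimal sketch (the tweet is made up):

import en_core_web_sm

nlp = en_core_web_sm.load()

tweet = "Police fired rubber bullets at protesters in Portland, Oregon last night."
doc = nlp(tweet)

# GPE entities are geopolitical locations: cities, states, countries.
locations = [ent.text for ent in doc.ents if ent.label_ == "GPE"]
print(locations)  # expected: ['Portland', 'Oregon']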

What I’ve Learned

Although I’m a little disappointed that my model didn’t perform as well as I would’ve hoped, I’m glad I delivered MVP to the client. It was very important that I had a backup plan. Very early on I understood the huge task before me, and I knew that with the time I had and the amount of work I needed to accomplish, there was a good chance I wouldn’t be able to implement a high-performing model. So instead of failing at every other aspect of the project because I wanted an awesome model, I sacrificed the performance of the model to make sure the overall product was good enough to hit MVP. Which I consider a win.
