New Software to Access the World Prison Brief Dataset: The prisonbrief Statistical Package
Danilo Freire & Robert Myles McDonnell
The World Prison Brief Dataset (WPB) provides a unique source of information about the prison systems of 223 jurisdictions worldwide. The dataset was compiled by the Institute for Criminal Policy Research at Birkbeck, University of London. Scholars and policymakers are increasingly interested in cross-country comparisons of prison indicators. Comparative data not only allow researchers to understand macro-level trends in prison systems but also help them investigate the causes and effects of different incarceration policies.
The WPB features a wealth of prison information. For instance, it presents data on the prison population, the ratio of female to male inmates, the number of pre-trial detainees, the number of correctional facilities, the occupancy level of the prison system, and, where data are available for several years, overall prison population trends. It also lists contact information for the government body in charge of corrections and for the head of the prison administration.
Although the WPB dataset is of notable interest to the public at large, it does not lend itself easily to statistical analysis. To the best of our knowledge, the data are only available as HTML pages on the project’s website. Each country has its own web page, yet there is no quick way to convert the information on the website into a spreadsheet.
We address this problem with the prisonbrief package for the R statistical language. R is free to use, it is open-source, and it has become the de facto standard software in data analysis. Users can also extend R’s core functionality by adding packages to the software, and this is what we do here. Like R itself, our prisonbrief package is completely free and its code can be independently verified by any interested party.
Our package downloads and cleans all the information available in the WPB and converts it into a format that is convenient for users. Those who have no previous experience with R will find our package easy to use. All built-in functions are accessible with only two lines of code, and the dataset can also be saved as a .csv file and loaded into Excel, Stata, or SPSS with no compatibility issues. Moreover, users familiar with R can plot maps, draw graphs, and present the data in tables with their packages of choice.
The prisonbrief package has only three functions and all of them start with wpb_, a mnemonic for World Prison Brief. The first is a convenience function named wpb_list(). It prints a list of available countries to the R console. The second function is wpb_table(). This function returns a series of variables about the prison systems of the world, of a particular region, or of a specific country. However, please note that country-level data are sometimes not ready for quantitative analysis without further cleaning (removing parentheses, etc.). Since some of this information may be relevant, we have chosen to leave it in. Data for whole regions, as opposed to a single country, are fully prepared for automated analysis.
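The kind of cleaning mentioned above can be sketched in base R. This is a minimal illustration rather than part of the package: the raw values are invented, the helper name clean_count() is hypothetical, and the columns actually returned for a single country may be formatted differently.

```r
# Hypothetical raw values, invented for illustration only; they are not
# taken from the World Prison Brief.
raw <- c("10,500 (2016)", "1,234", "45.6% (31.12.2015)", NA)

# Hypothetical helper: strip parenthetical notes, then keep only digits
# and decimal points so the result can be converted to a number.
clean_count <- function(x) {
  no_notes <- gsub("\\([^)]*\\)", "", x)     # drop "(...)" annotations
  digits   <- gsub("[^0-9.]", "", no_notes)  # drop commas, symbols, spaces
  as.numeric(digits)                         # missing values stay NA
}

cleaned <- clean_count(raw)
print(cleaned)
```

Because the parenthetical notes may carry useful information such as reference dates, it is probably safer to store the cleaned numbers in a new column than to overwrite the original text.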
Finally, we have added the wpb_series() function to the package. The function downloads and tidies the tables describing the trends in the prison population total and the prison population rate for every jurisdiction included in the project. This is useful for scholars who want to quickly plot time trends and compare how prison policies have developed over time.
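Putting the three functions together, a session might look roughly like the sketch below. It rests on assumptions that should be checked against the package help pages: that wpb_table() accepts a region argument, that wpb_series() accepts a country argument, and that the values "Africa" and "brazil" are spelled the way the package expects. The code requires an internet connection and runs only if prisonbrief is installed.

```r
# Sketch only: the argument names and values below are assumptions; check
# ?wpb_table and ?wpb_series and copy country names from wpb_list().
if (requireNamespace("prisonbrief", quietly = TRUE)) {
  library(prisonbrief)

  # List the jurisdictions covered by the World Prison Brief.
  countries <- wpb_list()
  print(head(countries))

  # Prison indicators for every jurisdiction in one region.
  africa <- wpb_table(region = "Africa")

  # Prison population trends over time for a single country.
  brazil <- wpb_series(country = "brazil")
  print(head(brazil))
}
```

Region-level tables are the safer starting point for analysis, since the text above notes that country-level output may still contain annotations that need cleaning.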
We hope the package is useful to policymakers and scholars interested in prisons. More information about the package, including example code and graphs, can be found on the project website. Users can also check the source code and send suggestions for new functions on GitHub.
This content is published with permission from the authors.
Danilo Freire is a Postdoctoral Research Associate in the Political Theory Project at Brown University, Rhode Island, United States of America. His research interests are political violence, prison gangs, and computational social science, with a special focus on Latin America. He holds a PhD in Political Economy from King’s College London and an MA in International Relations from the Graduate Institute Geneva, Switzerland.
Robert Myles McDonnell is a Senior Data Scientist at First Data Corporation, with an interest in political economy. All views expressed are his own.