Bay Area Craigslist Posts, 2000-2018
Like many cities, San Francisco doesn’t track rents. I created a panel of historic Craigslist rents by scraping posts archived by the Wayback Machine. Please feel free to use the data and, of course, please cite it.
Here’s an overview of the methodology and of the choices made when creating variables in the clean data, along with some sample Python code.
Raw Bay Area Craigslist Posts, 2000-2012
This is the raw data from scraping Craigslist posts from 2000-2012, archived by the Wayback Machine. For every post, I’ve extracted the posting date, title, and neighborhood.
Citation: Pennington, Kate (2018). Raw Bay Area Craigslist Rental Housing Posts, 2000-2012. Retrieved from https://github.com/katepennington/historic_bay_area_craigslist_housing_posts/blob/master/raw_2000_2012.csv.zip.
Variables: date, title, neighborhood
Observations: 167,090
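Because the raw 2000-2012 file carries only date, title, and neighborhood, any rent figure has to be recovered from the title text. The sketch below shows one way that might be done. It assumes titles look roughly like "$1850 / 2br - Sunny flat", which is a guess about the format rather than something documented here, and the names `parse_title`, `PRICE_RE`, and `BEDROOM_RE` are hypothetical. It is not the cleaning procedure used to build the clean data; see the methodology for that.

```python
import re

# Assumed title shape, e.g. "$1850 / 2br - Sunny flat near park".
# Real titles in the raw file may differ; adjust the patterns as needed.
PRICE_RE = re.compile(r"\$\s*(\d[\d,]*)")
BEDROOM_RE = re.compile(r"(\d+)\s*br\b", re.IGNORECASE)


def parse_title(title):
    """Return (price, bedrooms) parsed from a title; None for any missing part."""
    price_match = PRICE_RE.search(title)
    bed_match = BEDROOM_RE.search(title)
    # Drop thousands separators before converting the price to an integer.
    price = int(price_match.group(1).replace(",", "")) if price_match else None
    bedrooms = int(bed_match.group(1)) if bed_match else None
    return price, bedrooms
```

Titles with no dollar amount or no bedroom count return None in that position, so those rows can be flagged instead of silently dropped.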
Raw Bay Area Craigslist Posts, 2013-2018
For 2013-2018, it was often possible to open the individual archived listings and extract more detailed data.
Citation: Pennington, Kate (2018). Raw Bay Area Craigslist Rental Housing Posts, 2013-2018. Retrieved from https://github.com/katepennington/historic_bay_area_craigslist_housing_posts/blob/master/raw_2013_2018.csv.zip.
Variables: post_id, date, neighborhood, price, square footage, number of bedrooms, address, lat, lon, description, title, details, year
Observations: 58,551
Clean Bay Area Craigslist Posts, 2000-2018
Please read the methodology for important information about how the data was cleaned and how variables were defined.
Citation: Pennington, Kate (2018). Bay Area Craigslist Rental Housing Posts, 2000-2018. Retrieved from https://github.com/katepennington/historic_bay_area_craigslist_housing_posts/blob/master/clean_2000_2018.csv.zip.
Variables: post_id, date, year, neighborhood, city, county, price, number of bedrooms, number of bathrooms, square footage, dummy for being a room in an apartment/house, address, lat, lon, title, description, details
Observations: 200,796
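As a starting point for working with the clean file, here is a minimal sketch that reads a downloaded zip and computes the median posted rent per year using only the standard library. It assumes the CSV columns are literally named `year` and `price` (taken from the variable list above, but not verified against the file header), that the file is UTF-8 encoded, and that the zip has been saved locally. The function names are hypothetical.

```python
import csv
import io
import math
import statistics
import zipfile
from collections import defaultdict


def read_zipped_csv(zip_path):
    """Yield rows as dicts from the first CSV found inside a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        csv_name = next(
            name for name in archive.namelist()
            if name.endswith(".csv") and not name.startswith("__MACOSX")
        )
        with archive.open(csv_name) as raw:
            # Assumption: the file is UTF-8 encoded.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text)


def median_price_by_year(rows):
    """Return {year: median price}, skipping rows with missing or bad values."""
    prices = defaultdict(list)
    for row in rows:
        try:
            # Assumption: columns are named "year" and "price".
            year = int(float(row["year"]))
            price = float(row["price"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric year/price
        if math.isnan(price):
            continue
        prices[year].append(price)
    return {year: statistics.median(vals) for year, vals in sorted(prices.items())}
```

With the archive saved as `clean_2000_2018.csv.zip` in the working directory, `median_price_by_year(read_zipped_csv("clean_2000_2018.csv.zip"))` would return a dictionary keyed by year. Rows with an empty or unparseable year or price are skipped, not imputed.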