Computational Demography

I think of computational demography as encapsulating the measurement and analysis of demographic processes via non-traditional data sources and computationally intensive methods. Professor Matthew Hall and I have developed a reading list for the field, available here.

My current work in this field is a collaboration with professors Matthew Hall and Arthur Acolin. We are evaluating potential contribution of consumer reference data for the measurement of small-area populations. Consumer reference data is collected by combining and cross-referencing people’s interactions with various private and public institutions, like utility payments, voter registrations, real estate tax assessments, and credit card billing statements. Private companies assemble these datasets for sale to marketers. We have purchased data from one such company and aim to determine how useful it can be for researchers. We are testing the following potential contributions of the data for measuring population levels:

  • Annual small-area estimates: Researchers often need measures of year-over-year change in populations. The Census Bureau does not provide such measures for small geographies like tracts and block-groups on an annual basis, but consumer reference data does. This might make it uniquely useful, for example, for computing accurate rates of a virus like COVID-19, or measuring how population structures change after a natural disaster like a hurricane.

  • Up-to-date estimates: Policy decisions should be based on the most up-to-date information possible. The Census releases updated population information on a two-year schedule, so estimates of population levels in 2020 will be released towards the end of 2021. In contrast, we received consumer reference data for the entire United States in January 2021, almost a full year earlier than the Census.

  • Estimates for any geographic unit: There are lots of ways to divide up America. Demographers divide it into Census blocks and tracts. Education researchers look at school districts. Political scientists care about election precincts. Increasingly, researchers are interested in creating their own geographic buffers around particular neighborhoods and locations. The consumer reference data contains latitude-longitude coordinates for every household. By the magic of spatial data analysis, we can use these coordinates to count the number of people within any geographic area in which a researcher might be interested.