This repository was archived by the owner on Feb 1, 2022. It is now read-only.

query sumlevels, tables, geo by variable #4

@lecy

This is more of a feature request than a report of a bug.

I have been hugely impressed with your API. I think it could be very useful for social scientists like me who often use census and economic data and are frustrated with the Census site. I especially like that you can specify a variable and a geographic sumlevel and get the results back at the desired unit of analysis. This is a big step forward!

I have created a short guide for constructing queries in R to make your API accessible to scholars:

https://gist.github.com/lecy/0aa782a873cd174573f32d243233ca5b

The hardest part of using this data is that the validity of arguments varies with the underlying data source. In one instance I can use "show=geo&sumlevel=msa" and in another I cannot. This is confusing for users and makes it frustrating to figure out whether data are available at the desired level.

To address this problem, I have included a couple of helper functions in the guide that let the user peek at attributes and print the set of valid cases based on the logic view at "http://api.datausa.io/api/logic/?".

For example, I have formatted the query "http://api.datausa.io/api/logic/?show=geo&sumlevel=all" to print a bunch of tables that look like this:

getUsage( "geo" )
# TABLE: ygi_num_emp 
# DATA SOURCE: ACS 3-year Estimate 
# DEPARTMENT: Census Bureau 
# LINK: http://www.census.gov/programs-surveys/acs/ 
# 
# SUPPORTED SUMLEVELS: 
# 
# acs_ind:  0, 1, 2, all
# geo:  nation, state, msa, all
#
# ...67 more tables printed
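
A minimal sketch of how the logic URL behind that output could be assembled in R. The helper name buildLogicQuery is hypothetical (it is not one of the functions in the guide), and the sketch only builds the URL string; it does not contact the API.

```r
# Hypothetical helper: assemble a logic-view URL from named query arguments.
# The base URL is the one quoted in this issue; argument values are assumed
# to be URL-safe already (no encoding is applied).
buildLogicQuery <- function(..., base = "http://api.datausa.io/api/logic/?") {
  args <- list(...)
  # Turn each argument into a name=value pair, e.g. show=geo
  pairs <- paste0(names(args), "=", unlist(args))
  # Join the pairs with "&" and append them to the base URL
  paste0(base, paste(pairs, collapse = "&"))
}

logicUrl <- buildLogicQuery(show = "geo", sumlevel = "all")
print(logicUrl)
```

Passing the arguments by name keeps the call close to the query string itself, so swapping "geo" for another attribute only changes one argument.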

The problem is that this use case searches by data attribute (show=geo) and references a data table (ygi_num_emp), whereas in the typical case the user wants to find the valid sumlevels associated with a specific variable. For example, can I find the number of philosophy majors by county?

Ideally there would be an API query that takes a variable name and returns all valid attribute-sumlevel pairs (for example, show=geo with sumlevel=nation, state, or msa).

It would be possible to write functions that do this if the following queries were available:

  • Return all data tables and their valid sumlevels
  • Return all variables associated with a data table (even better if they include definitions)
  • Return all valid sumlevels associated with an attribute
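
As a rough sketch of what such a function might look like, assume the first two queries return something that can be reshaped into the two R lists below. The sumlevels for ygi_num_emp are copied from the getUsage output above; the variable name "num_emp" and the function name validPairs are hypothetical, used only for illustration.

```r
# Mock result of "all data tables and their valid sumlevels".
# Values for ygi_num_emp are taken from the getUsage("geo") output above.
tableSumlevels <- list(
  ygi_num_emp = list(
    acs_ind = c("0", "1", "2", "all"),
    geo     = c("nation", "state", "msa", "all")
  )
)

# Mock result of "all variables associated with a data table".
# "num_emp" is a hypothetical variable name, not confirmed by the API.
tableVariables <- list(
  ygi_num_emp = c("num_emp")
)

# Return every table / attribute / sumlevel combination for one variable.
validPairs <- function(variable, tableVariables, tableSumlevels) {
  # Names of the tables whose variable list contains the requested variable
  found <- vapply(tableVariables, function(v) variable %in% v, logical(1))
  hits <- names(tableVariables)[found]
  rows <- list()
  for (tbl in hits) {
    lv <- tableSumlevels[[tbl]]
    for (a in names(lv)) {
      # One row per sumlevel; table and attribute are recycled
      rows[[length(rows) + 1]] <- data.frame(
        table = tbl, attribute = a, sumlevel = lv[[a]],
        stringsAsFactors = FALSE
      )
    }
  }
  # No table contains the variable
  if (length(rows) == 0) return(NULL)
  do.call(rbind, rows)
}

result <- validPairs("num_emp", tableVariables, tableSumlevels)
print(result)
```

Filtering the result on attribute == "geo" would then answer questions like the philosophy-majors-by-county one directly: if "county" is not among the sumlevels, the data are not available at that level.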

Please let me know whether these queries already exist or could be added.
