Batch Geocode Parsed Addresses — cxy

Provides access to the US Census Bureau batch endpoints for locations and geographies. The function implements iteration and optional parallelization in order to geocode datasets larger than the API limit of 10,000 addresses, and to do so more efficiently than sending 10,000 per request. It also supports multiple output types, including sf class objects if the sf package is installed.

cxy_geocode(
  .data,
  id = NULL,
  street,
  city = NULL,
  state = NULL,
  zip = NULL,
  return = "locations",
  benchmark = "Public_AR_Current",
  vintage = NULL,
  timeout = 30,
  parallel = 1,
  class = "dataframe",
  output = "simple"
)

Arguments

.data: data.frame containing columns with structured address data
id: Optional String - Name of column containing unique ID
street: String - Name of column containing street address
city: Optional String - Name of column containing city
state: Optional String - Name of column containing state
zip: Optional String - Name of column containing zip code
return: One of 'locations' or 'geographies' denoting returned information from the API. If you would like Census geography data, you must specify a valid vintage for your benchmark.
benchmark: Optional Census benchmark to geocode against. To obtain current valid benchmarks, use the cxy_benchmarks() function.
vintage: Optional Census vintage to geocode against. You may use the cxy_vintages() function to obtain valid vintages.
timeout: Numeric - Number of minutes before a request times out
parallel: Integer - Number of cores to use; set a value greater than one if parallel requests are desired. All operating systems now use a SOCK cluster, and the dependencies are no longer suggested packages. Instead, they are installed by default. If the value is larger than the number of cores the system reports as available, the maximum number of available cores will be used.
class: One of 'dataframe' or 'sf' denoting the output class. 'sf' will only return matched addresses.
output: One of 'simple' or 'full' denoting the returned columns. Simple returns just coordinates.
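
The return, benchmark, and vintage arguments interact: geographies cannot be requested without a vintage. The sketch below shows one way to guard against that before sending anything. The check_request() helper and the run_live flag are hypothetical and not part of censusxy. The live branch assumes that censusxy is installed, that network access is available, that cxy_vintages() accepts a benchmark name, and that "Current_Current" is a valid vintage for "Public_AR_Current"; confirm these with cxy_benchmarks() and cxy_vintages() before relying on them.

```r
# Hypothetical pre-flight check (not part of censusxy): requesting
# geographies without a vintage is rejected before any request is sent.
check_request <- function(return_type = c("locations", "geographies"),
                          vintage = NULL) {
  return_type <- match.arg(return_type)
  if (return_type == "geographies" && is.null(vintage)) {
    stop("return = 'geographies' requires a vintage", call. = FALSE)
  }
  invisible(TRUE)
}

check_request("locations")                       # no vintage needed
check_request("geographies", "Current_Current")  # vintage supplied

# Set to TRUE to run the live requests (needs censusxy and network access).
run_live <- FALSE

if (run_live && requireNamespace("censusxy", quietly = TRUE)) {
  # Look up the currently valid benchmarks and vintages first.
  benchmarks <- censusxy::cxy_benchmarks()
  vintages <- censusxy::cxy_vintages("Public_AR_Current")  # assumed call form

  x <- censusxy::stl_homicides[1:10, ]
  geo <- censusxy::cxy_geocode(
    x,
    street = "street_address", city = "city",
    state = "state", zip = "postal_code",
    return = "geographies",
    benchmark = "Public_AR_Current",
    vintage = "Current_Current",  # assumed valid; check `vintages`
    class = "dataframe", output = "simple"
  )
}
```

Keeping the lookup calls next to the geocoding call makes it easier to notice when a benchmark or vintage name has changed on the Census side.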

Value

A data.frame or sf object containing geocoded results

Details

Parallel requests are supported across platforms. As described for the parallel argument, a SOCK cluster is used on all operating systems. If you request more cores than the system reports as available, the maximum number of available cores is used.
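
The iteration and core handling described on this page can be pictured with a small base R sketch. This is an illustration only, not the package's internal code: clamp_cores(), split_batches(), and run_batches() are hypothetical helpers, the batch size is an arbitrary assumption, and the worker is a stand-in that never contacts the Census API.

```r
# Hypothetical helpers for illustration; none of these are part of censusxy.

# Cap the requested number of cores at what the system reports.
clamp_cores <- function(requested) {
  available <- parallel::detectCores()
  if (is.na(available)) available <- 1L  # detectCores() may return NA
  max(1L, min(as.integer(requested), available))
}

# Split a data.frame into consecutive batches of at most `size` rows.
split_batches <- function(df, size = 1000L) {
  groups <- ceiling(seq_len(nrow(df)) / size)
  split(df, groups)
}

# Apply `worker` to every batch, serially or on a SOCK cluster,
# then bind the per-batch results back into one data.frame.
run_batches <- function(df, worker, size = 1000L, cores = 1L) {
  batches <- split_batches(df, size)
  cores <- clamp_cores(cores)
  if (cores > 1L) {
    cl <- parallel::makeCluster(cores, type = "PSOCK")
    on.exit(parallel::stopCluster(cl), add = TRUE)
    results <- parallel::parLapply(cl, batches, worker)
  } else {
    results <- lapply(batches, worker)
  }
  do.call(rbind, unname(results))
}

# Stand-in worker: a real one would send the batch to the geocoder.
count_batch <- function(batch) {
  batch$n_in_batch <- nrow(batch)
  batch
}

demo <- data.frame(id = 1:25, street = paste(1:25, "Main St"))
out <- run_batches(demo, count_batch, size = 10, cores = 1)
```

With 25 rows and a batch size of 10, the data are processed as three batches (10, 10, and 5 rows) and returned as a single data.frame. Passing a larger cores value would route the same batches through the SOCK cluster branch.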

Examples

# load data
x <- stl_homicides[1:10,]

# geocode
cxy_geocode(x, street = 'street_address', city = 'city', state = 'state', zip = 'postal_code',
   return = 'locations', class = 'dataframe', output = 'simple')
#>              street_address year             date state postal_code      city
#> 9            5738 Terry Ave 2008 01/12/2008 12:37    MO          NA St. Louis
#> 7            5356 Page Blvd 2008 01/17/2008 04:00    MO          NA St. Louis
#> 10        5826 Roosevelt Pl 2008 01/20/2008 21:19    MO          NA St. Louis
#> 4             3859 Ohio Ave 2008 01/21/2008 17:38    MO          NA St. Louis
#> 5      4100 Saint Louis Ave 2008 01/30/2008 15:34    MO          NA St. Louis
#> 3         2418 N Euclid Ave 2008 01/30/2008 19:19    MO          NA St. Louis
#> 2            1646 S 39th St 2008 02/04/2008 17:45    MO          NA St. Louis
#> 8          5617 Enright Ave 2008 02/09/2008 17:30    MO          NA St. Louis
#> 6  5001 N Kingshighway Blvd 2008 02/09/2008 22:59    MO          NA St. Louis
#> 1             1500 Cass Ave 2008 02/11/2008 21:50    MO          NA St. Louis
#>      cxy_lon  cxy_lat
#> 9  -90.27428 38.67787
#> 7  -90.27337 38.66153
#> 10 -90.27588 38.67951
#> 4  -90.22947 38.58545
#> 5  -90.23143 38.66019
#> 3  -90.25558 38.66611
#> 2  -90.24478 38.61837
#> 8  -90.28232 38.65497
#> 6  -90.24478 38.68690
#> 1  -90.19743 38.64179