Enrich Using Point Geometries¶
Starting off, we import a few required Python resources. Although there are quite a few imports, the one of note is from arcgis.geoenrichment import Country. We are going to use this object to discover what is available and perform our analysis.
[1]:
import os
from pathlib import Path
from arcgis.features import GeoAccessor
from arcgis.geoenrichment import Country
from arcgis.geometry import Geometry
from arcgis.gis import GIS
from dotenv import load_dotenv, find_dotenv
import pandas as pd
Next, we need some test data to work with. For simplicity, we create two point geometries on the fly and build a Spatially Enabled Data Frame from them. We are going to investigate how to get very similar results by inputting either the list of point Geometry objects or the same geometries as part of a Spatially Enabled Data Frame.
[2]:
nm_lst = ['Bayview', "Ralph's Thriftway"]
coord_lst = [
(-122.9074835, 47.0450249), # Bayview Grocery Store
(-122.8749600, 47.0464031) # Ralph's Thriftway Grocery Store
]
# create a list of point geometry objects from the coordinate tuples - what we are going to use first
geom_lst = [Geometry({'x': pt[0], 'y': pt[1], 'spatialReference': {'wkid': 4326}}) for pt in coord_lst]
# create a spatially enabled dataframe - what we are going to use later
grocery_df = pd.DataFrame(zip(nm_lst, geom_lst), columns=['store_name', 'SHAPE'])
grocery_df.spatial.set_geometry('SHAPE')
grocery_df
[2]:
| | store_name | SHAPE |
---|---|---|
0 | Bayview | {"x": -122.9074835, "y": 47.0450249, "spatialR... |
1 | Ralph's Thriftway | {"x": -122.87496, "y": 47.0464031, "spatialRef... |
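The coordinate tuples above are in (longitude, latitude) order, because x holds the longitude and y the latitude for wkid 4326. As a minimal sketch, a small check can catch swapped coordinates before they are wrapped in Geometry. The helper name point_dict is hypothetical and not part of the arcgis package.

```python
def point_dict(lon, lat, wkid=4326):
    """Build the plain dict passed to Geometry above, validating WGS84 ranges.

    Hypothetical helper for illustration; not part of the arcgis package.
    """
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    return {'x': lon, 'y': lat, 'spatialReference': {'wkid': wkid}}


coord_lst = [
    (-122.9074835, 47.0450249),  # Bayview Grocery Store
    (-122.8749600, 47.0464031),  # Ralph's Thriftway Grocery Store
]
pt_dicts = [point_dict(lon, lat) for lon, lat in coord_lst]
```

For these two stores, swapping the values would put -122.9 in the latitude slot, which is outside the valid range and raises a ValueError.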
Now we need a connection to ArcGIS Online to demonstrate the ability to use ArcGIS Online for geoenrichment. This is accomplished by instantiating a GIS object with valid credentials read from environment variables.
[3]:
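# load credentials from a .env file, if one is found, into the environment
load_dotenv(find_dotenv())
gis_agol = GIS(
    url=os.getenv('ESRI_GIS_URL'),
    username=os.getenv('ESRI_GIS_USERNAME'),
    password=os.getenv('ESRI_GIS_PASSWORD')
)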
gis_agol
[3]:
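Because os.getenv returns None for a variable that is not set, a missing credential would be passed to GIS as None. A minimal sketch of a check to run before connecting follows. The helper missing_env_vars is hypothetical, and the variable names are the three used in the cell above.

```python
import os

REQUIRED_VARS = ('ESRI_GIS_URL', 'ESRI_GIS_USERNAME', 'ESRI_GIS_PASSWORD')


def missing_env_vars(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or empty in the environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print('Set these before connecting:', ', '.join(missing))
```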
Point List Using Defaults¶
To enrich, we start by creating a Country object instance. As part of the constructor, we need to tell the object which Business Analyst source to use through the gis parameter. In this case, we are telling the object to use a local instance of ArcGIS Pro with Business Analyst and the United States data pack.
[4]:
usa_local = Country('usa', gis=GIS('pro'))
usa_local
[4]:
<Country - United States 2021 ('local')>
Next, we need some enrich variables to use. We can discover what is available using the enrich_variables property of the country object, which returns a Pandas Data Frame of the variables available for the country.
[5]:
ev = usa_local.enrich_variables
ev
[5]:
| | name | alias | data_collection | enrich_name | enrich_field_name |
---|---|---|---|---|---|
0 | CHILD_CY | 2021 Child Population | AgeDependency | AgeDependency.CHILD_CY | AgeDependency_CHILD_CY |
1 | WORKAGE_CY | 2021 Working-Age Population | AgeDependency | AgeDependency.WORKAGE_CY | AgeDependency_WORKAGE_CY |
2 | SENIOR_CY | 2021 Senior Population | AgeDependency | AgeDependency.SENIOR_CY | AgeDependency_SENIOR_CY |
3 | CHLDDEP_CY | 2021 Child Dependency Ratio | AgeDependency | AgeDependency.CHLDDEP_CY | AgeDependency_CHLDDEP_CY |
4 | AGEDEP_CY | 2021 Age Dependency Ratio | AgeDependency | AgeDependency.AGEDEP_CY | AgeDependency_AGEDEP_CY |
... | ... | ... | ... | ... | ... |
17958 | MOEMEDYRMV | 2019 Median Year Householder Moved In MOE (ACS... | yearmovedin | yearmovedin.MOEMEDYRMV | yearmovedin_MOEMEDYRMV |
17959 | RELMEDYRMV | 2019 Median Year Householder Moved In REL (ACS... | yearmovedin | yearmovedin.RELMEDYRMV | yearmovedin_RELMEDYRMV |
17960 | ACSOWNER | 2019 Owner Households (ACS 5-Yr) | yearmovedin | yearmovedin.ACSOWNER | yearmovedin_ACSOWNER |
17961 | MOEOWNER | 2019 Owner Households MOE (ACS 5-Yr) | yearmovedin | yearmovedin.MOEOWNER | yearmovedin_MOEOWNER |
17962 | RELOWNER | 2019 Owner Households REL (ACS 5-Yr) | yearmovedin | yearmovedin.RELOWNER | yearmovedin_RELOWNER |
17963 rows × 5 columns
Nearly 18,000 variables is far too many to deal with, so we can pare this down using some Pandas Data Frame filtering to get just the key United States variables for the current year.
[6]:
kv = ev[
(ev.data_collection.str.lower().str.contains('key'))
& (ev.name.str.lower().str.endswith('cy'))
].reset_index(drop=True)
kv
[6]:
| | name | alias | data_collection | enrich_name | enrich_field_name |
---|---|---|---|---|---|
0 | TOTPOP_CY | 2021 Total Population | KeyUSFacts | KeyUSFacts.TOTPOP_CY | KeyUSFacts_TOTPOP_CY |
1 | GQPOP_CY | 2021 Group Quarters Population | KeyUSFacts | KeyUSFacts.GQPOP_CY | KeyUSFacts_GQPOP_CY |
2 | DIVINDX_CY | 2021 Diversity Index | KeyUSFacts | KeyUSFacts.DIVINDX_CY | KeyUSFacts_DIVINDX_CY |
3 | TOTHH_CY | 2021 Total Households | KeyUSFacts | KeyUSFacts.TOTHH_CY | KeyUSFacts_TOTHH_CY |
4 | AVGHHSZ_CY | 2021 Average Household Size | KeyUSFacts | KeyUSFacts.AVGHHSZ_CY | KeyUSFacts_AVGHHSZ_CY |
5 | MEDHINC_CY | 2021 Median Household Income | KeyUSFacts | KeyUSFacts.MEDHINC_CY | KeyUSFacts_MEDHINC_CY |
6 | AVGHINC_CY | 2021 Average Household Income | KeyUSFacts | KeyUSFacts.AVGHINC_CY | KeyUSFacts_AVGHINC_CY |
7 | PCI_CY | 2021 Per Capita Income | KeyUSFacts | KeyUSFacts.PCI_CY | KeyUSFacts_PCI_CY |
8 | TOTHU_CY | 2021 Total Housing Units | KeyUSFacts | KeyUSFacts.TOTHU_CY | KeyUSFacts_TOTHU_CY |
9 | OWNER_CY | 2021 Owner Occupied HUs | KeyUSFacts | KeyUSFacts.OWNER_CY | KeyUSFacts_OWNER_CY |
10 | RENTER_CY | 2021 Renter Occupied HUs | KeyUSFacts | KeyUSFacts.RENTER_CY | KeyUSFacts_RENTER_CY |
11 | VACANT_CY | 2021 Vacant Housing Units | KeyUSFacts | KeyUSFacts.VACANT_CY | KeyUSFacts_VACANT_CY |
12 | MEDVAL_CY | 2021 Median Home Value | KeyUSFacts | KeyUSFacts.MEDVAL_CY | KeyUSFacts_MEDVAL_CY |
13 | AVGVAL_CY | 2021 Average Home Value | KeyUSFacts | KeyUSFacts.AVGVAL_CY | KeyUSFacts_AVGVAL_CY |
14 | POPGRW10CY | 2010-2021 Growth Rate: Population | KeyUSFacts | KeyUSFacts.POPGRW10CY | KeyUSFacts_POPGRW10CY |
15 | HHGRW10CY | 2010-2021 Growth Rate: Households | KeyUSFacts | KeyUSFacts.HHGRW10CY | KeyUSFacts_HHGRW10CY |
16 | FAMGRW10CY | 2010-2021 Growth Rate: Families | KeyUSFacts | KeyUSFacts.FAMGRW10CY | KeyUSFacts_FAMGRW10CY |
17 | DPOP_CY | 2021 Total Daytime Population | KeyUSFacts | KeyUSFacts.DPOP_CY | KeyUSFacts_DPOP_CY |
18 | DPOPWRK_CY | 2021 Daytime Pop: Workers | KeyUSFacts | KeyUSFacts.DPOPWRK_CY | KeyUSFacts_DPOPWRK_CY |
19 | DPOPRES_CY | 2021 Daytime Pop: Residents | KeyUSFacts | KeyUSFacts.DPOPRES_CY | KeyUSFacts_DPOPRES_CY |
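The filter above combines two case-insensitive conditions. The sketch below restates them on plain dictionaries, using a few rows copied from the tables above, purely to show which rows pass. It is an illustration of the logic, not a replacement for the pandas filter, and the function name is hypothetical.

```python
# A handful of rows copied from the enrich_variables table above
rows = [
    {'name': 'CHILD_CY', 'data_collection': 'AgeDependency'},
    {'name': 'TOTPOP_CY', 'data_collection': 'KeyUSFacts'},
    {'name': 'POPGRW10CY', 'data_collection': 'KeyUSFacts'},
    {'name': 'ACSOWNER', 'data_collection': 'yearmovedin'},
]


def is_key_current_year(row):
    """Mirror the two conditions combined with & in the pandas filter."""
    in_key_collection = 'key' in row['data_collection'].lower()
    is_current_year = row['name'].lower().endswith('cy')
    return in_key_collection and is_current_year


selected = [row['name'] for row in rows if is_key_current_year(row)]
```

Note that the suffix test is on cy rather than _cy, which is why the growth rate variables such as POPGRW10CY are included in the result.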
Finally, we can enrich using the points and variables collected above. Notice that we are not specifying the area around the input points, so the proximity defaults are used: a straight-line distance of one kilometer around each point. This circular area is then used to apportion data to the locations specified by the point geometries.
[7]:
pt1_enrich_df = usa_local.enrich(
geographies=geom_lst,
enrich_variables=kv
)
pt1_enrich_df.info()
pt1_enrich_df.head()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 27 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 has_data 2 non-null int64
1 area_type 2 non-null object
2 buffer_units 2 non-null object
3 buffer_unit 2 non-null object
4 buffer_radii 2 non-null float64
5 aggregation_method 2 non-null object
6 keyusfacts_totpop_cy 2 non-null float64
7 keyusfacts_gqpop_cy 2 non-null float64
8 keyusfacts_divindx_cy 2 non-null float64
9 keyusfacts_tothh_cy 2 non-null float64
10 keyusfacts_avghhsz_cy 2 non-null float64
11 keyusfacts_medhinc_cy 2 non-null float64
12 keyusfacts_avghinc_cy 2 non-null float64
13 keyusfacts_pci_cy 2 non-null float64
14 keyusfacts_tothu_cy 2 non-null float64
15 keyusfacts_owner_cy 2 non-null float64
16 keyusfacts_renter_cy 2 non-null float64
17 keyusfacts_vacant_cy 2 non-null float64
18 keyusfacts_medval_cy 2 non-null float64
19 keyusfacts_avgval_cy 2 non-null float64
20 keyusfacts_popgrw10cy 2 non-null float64
21 keyusfacts_hhgrw10cy 2 non-null float64
22 keyusfacts_famgrw10cy 2 non-null float64
23 keyusfacts_dpop_cy 2 non-null float64
24 keyusfacts_dpopwrk_cy 2 non-null float64
25 keyusfacts_dpopres_cy 2 non-null float64
26 SHAPE 2 non-null geometry
dtypes: float64(21), geometry(1), int64(1), object(4)
memory usage: 560.0+ bytes
[7]:
| | has_data | area_type | buffer_units | buffer_unit | buffer_radii | aggregation_method | keyusfacts_totpop_cy | keyusfacts_gqpop_cy | keyusfacts_divindx_cy | keyusfacts_tothh_cy | ... | keyusfacts_vacant_cy | keyusfacts_medval_cy | keyusfacts_avgval_cy | keyusfacts_popgrw10cy | keyusfacts_hhgrw10cy | keyusfacts_famgrw10cy | keyusfacts_dpop_cy | keyusfacts_dpopwrk_cy | keyusfacts_dpopres_cy | SHAPE |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | RingBuffer | esriKilometers | esriKilometers | 1.0 | BlockApportionment:US.BlockGroups;PointsLayer:... | 2706.0 | 66.0 | 38.6 | 1727.0 | ... | 218.0 | 381908.0 | 419728.0 | 1.46 | 1.74 | 1.08 | 7604.0 | 6222.0 | 1382.0 | {"x": -122.90748349999996, "y": 47.04502490000... |
1 | 1 | RingBuffer | esriKilometers | esriKilometers | 1.0 | BlockApportionment:US.BlockGroups;PointsLayer:... | 4672.0 | 75.0 | 41.8 | 2145.0 | ... | 116.0 | 323665.0 | 360023.0 | 0.73 | 0.71 | 0.54 | 4623.0 | 2558.0 | 2065.0 | {"x": -122.87495999999999, "y": 47.04640310000... |
2 rows × 27 columns
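The output columns shown above appear to be lowercased versions of the enrich_field_name values in the variables table (for example, KeyUSFacts_TOTPOP_CY becomes keyusfacts_totpop_cy). Assuming that relationship holds, a lookup from column name to alias makes the results easier to read. The sketch below uses a few values copied from the tables above.

```python
# (enrich_field_name, alias) pairs copied from the kv table above
kv_rows = [
    ('KeyUSFacts_TOTPOP_CY', '2021 Total Population'),
    ('KeyUSFacts_MEDHINC_CY', '2021 Median Household Income'),
    ('KeyUSFacts_DPOP_CY', '2021 Total Daytime Population'),
]

# Assumption: output columns are the lowercased enrich_field_name values
alias_by_column = {field.lower(): alias for field, alias in kv_rows}

# Columns without a matching variable (has_data, SHAPE) keep their own name
output_columns = ['has_data', 'keyusfacts_totpop_cy', 'keyusfacts_medhinc_cy', 'SHAPE']
labels = [alias_by_column.get(col, col) for col in output_columns]
```

With the real data frames, the same dictionary could likely be built with dict(zip(kv.enrich_field_name.str.lower(), kv.alias)) and applied with pt1_enrich_df.rename(columns=alias_by_column).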
Specify Proximity Value¶
To use a value different from the default of one kilometer (highly recommended), specify it using the proximity_value parameter.
[8]:
pt2_enrich_df = usa_local.enrich(
geographies=geom_lst,
enrich_variables=kv,
proximity_value=5
)
pt2_enrich_df.info()
pt2_enrich_df.head()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 27 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 has_data 2 non-null int64
1 area_type 2 non-null object
2 buffer_units 2 non-null object
3 buffer_unit 2 non-null object
4 buffer_radii 2 non-null float64
5 aggregation_method 2 non-null object
6 keyusfacts_totpop_cy 2 non-null float64
7 keyusfacts_gqpop_cy 2 non-null float64
8 keyusfacts_divindx_cy 2 non-null float64
9 keyusfacts_tothh_cy 2 non-null float64
10 keyusfacts_avghhsz_cy 2 non-null float64
11 keyusfacts_medhinc_cy 2 non-null float64
12 keyusfacts_avghinc_cy 2 non-null float64
13 keyusfacts_pci_cy 2 non-null float64
14 keyusfacts_tothu_cy 2 non-null float64
15 keyusfacts_owner_cy 2 non-null float64
16 keyusfacts_renter_cy 2 non-null float64
17 keyusfacts_vacant_cy 2 non-null float64
18 keyusfacts_medval_cy 2 non-null float64
19 keyusfacts_avgval_cy 2 non-null float64
20 keyusfacts_popgrw10cy 2 non-null float64
21 keyusfacts_hhgrw10cy 2 non-null float64
22 keyusfacts_famgrw10cy 2 non-null float64
23 keyusfacts_dpop_cy 2 non-null float64
24 keyusfacts_dpopwrk_cy 2 non-null float64
25 keyusfacts_dpopres_cy 2 non-null float64
26 SHAPE 2 non-null geometry
dtypes: float64(21), geometry(1), int64(1), object(4)
memory usage: 560.0+ bytes
[8]:
| | has_data | area_type | buffer_units | buffer_unit | buffer_radii | aggregation_method | keyusfacts_totpop_cy | keyusfacts_gqpop_cy | keyusfacts_divindx_cy | keyusfacts_tothh_cy | ... | keyusfacts_vacant_cy | keyusfacts_medval_cy | keyusfacts_avgval_cy | keyusfacts_popgrw10cy | keyusfacts_hhgrw10cy | keyusfacts_famgrw10cy | keyusfacts_dpop_cy | keyusfacts_dpopwrk_cy | keyusfacts_dpopres_cy | SHAPE |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | RingBuffer | esriKilometers | esriKilometers | 5.0 | BlockApportionment:US.BlockGroups;PointsLayer:... | 67415.0 | 1265.0 | 45.4 | 29537.0 | ... | 1431.0 | 371293.0 | 415153.0 | 1.59 | 1.57 | 1.44 | 82942.0 | 50340.0 | 32602.0 | {"x": -122.90748349999996, "y": 47.04502490000... |
1 | 1 | RingBuffer | esriKilometers | esriKilometers | 5.0 | BlockApportionment:US.BlockGroups;PointsLayer:... | 70224.0 | 1350.0 | 48.5 | 30852.0 | ... | 1636.0 | 369053.0 | 415710.0 | 0.99 | 1.01 | 0.87 | 89116.0 | 54806.0 | 34310.0 | {"x": -122.87495999999999, "y": 47.04640310000... |
2 rows × 27 columns
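Going from a one kilometer to a five kilometer radius multiplies the area of the circular buffer by 25, but the apportioned population does not have to grow by the same factor, since people are not spread evenly. The following sketch compares the two ratios using the keyusfacts_totpop_cy values copied from the two outputs above. The function name is a hypothetical helper.

```python
import math


def ring_area_km2(radius_km):
    """Area of a circular (straight-line) buffer in square kilometers."""
    return math.pi * radius_km ** 2


area_ratio = ring_area_km2(5) / ring_area_km2(1)  # 25 times the area

# keyusfacts_totpop_cy values copied from the 1 km and 5 km outputs above
totpop_1km = {'Bayview': 2706.0, "Ralph's Thriftway": 4672.0}
totpop_5km = {'Bayview': 67415.0, "Ralph's Thriftway": 70224.0}

pop_ratio = {name: totpop_5km[name] / totpop_1km[name] for name in totpop_1km}
for name, ratio in pop_ratio.items():
    print(f'{name}: area x{area_ratio:.0f}, population x{ratio:.1f}')
```

For these two stores the population grows by roughly 25 and 15 times respectively, which illustrates why the proximity value should be chosen deliberately rather than left at the default.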
Specify Proximity Value and Metric¶
To use a different unit of distance, such as miles, specify it using the proximity_metric parameter.
[11]:
pt3_enrich_df = usa_local.enrich(
geographies=geom_lst,
enrich_variables=kv,
proximity_value=5,
proximity_metric='miles'
)
pt3_enrich_df.info()
pt3_enrich_df.head()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 27 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 has_data 2 non-null int64
1 area_type 2 non-null object
2 buffer_units 2 non-null object
3 buffer_unit 2 non-null object
4 buffer_radii 2 non-null float64
5 aggregation_method 2 non-null object
6 keyusfacts_totpop_cy 2 non-null float64
7 keyusfacts_gqpop_cy 2 non-null float64
8 keyusfacts_divindx_cy 2 non-null float64
9 keyusfacts_tothh_cy 2 non-null float64
10 keyusfacts_avghhsz_cy 2 non-null float64
11 keyusfacts_medhinc_cy 2 non-null float64
12 keyusfacts_avghinc_cy 2 non-null float64
13 keyusfacts_pci_cy 2 non-null float64
14 keyusfacts_tothu_cy 2 non-null float64
15 keyusfacts_owner_cy 2 non-null float64
16 keyusfacts_renter_cy 2 non-null float64
17 keyusfacts_vacant_cy 2 non-null float64
18 keyusfacts_medval_cy 2 non-null float64
19 keyusfacts_avgval_cy 2 non-null float64
20 keyusfacts_popgrw10cy 2 non-null float64
21 keyusfacts_hhgrw10cy 2 non-null float64
22 keyusfacts_famgrw10cy 2 non-null float64
23 keyusfacts_dpop_cy 2 non-null float64
24 keyusfacts_dpopwrk_cy 2 non-null float64
25 keyusfacts_dpopres_cy 2 non-null float64
26 SHAPE 2 non-null geometry
dtypes: float64(21), geometry(1), int64(1), object(4)
memory usage: 560.0+ bytes
[11]:
| | has_data | area_type | buffer_units | buffer_unit | buffer_radii | aggregation_method | keyusfacts_totpop_cy | keyusfacts_gqpop_cy | keyusfacts_divindx_cy | keyusfacts_tothh_cy | ... | keyusfacts_vacant_cy | keyusfacts_medval_cy | keyusfacts_avgval_cy | keyusfacts_popgrw10cy | keyusfacts_hhgrw10cy | keyusfacts_famgrw10cy | keyusfacts_dpop_cy | keyusfacts_dpopwrk_cy | keyusfacts_dpopres_cy | SHAPE |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | RingBuffer | esriMiles | esriMiles | 5.0 | BlockApportionment:US.BlockGroups;PointsLayer:... | 121938.0 | 2969.0 | 47.4 | 51980.0 | ... | 2798.0 | 370690.0 | 428025.0 | 1.31 | 1.29 | 1.14 | 140743.0 | 79614.0 | 61129.0 | {"x": -122.90748349999996, "y": 47.04502490000... |
1 | 1 | RingBuffer | esriMiles | esriMiles | 5.0 | BlockApportionment:US.BlockGroups;PointsLayer:... | 148032.0 | 2696.0 | 51.4 | 61965.0 | ... | 3451.0 | 359612.0 | 412573.0 | 1.42 | 1.37 | 1.26 | 162638.0 | 87230.0 | 75408.0 | {"x": -122.87495999999999, "y": 47.04640310000... |
2 rows × 27 columns
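To compare the five mile results with the five kilometer results, it helps to put both radii in the same unit. The sketch below is only a unit conversion. The helper to_kilometers and its metric keys are hypothetical and are not the values accepted by proximity_metric.

```python
KM_PER_MILE = 1.609344  # exact, by definition of the international mile


def to_kilometers(value, metric):
    """Convert a distance given in 'kilometers' or 'miles' to kilometers."""
    factors = {'kilometers': 1.0, 'miles': KM_PER_MILE}
    return value * factors[metric]


radius_km = to_kilometers(5, 'miles')  # about 8.05 km
# Ratio of the circular buffer areas: 5 miles versus 5 kilometers
area_ratio = (radius_km / to_kilometers(5, 'kilometers')) ** 2  # about 2.59
```

A five mile buffer therefore covers roughly 2.6 times the area of a five kilometer buffer, which is consistent with the larger totals in the output above.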
Specify Proximity Type¶
The above examples all use the default proximity_type of straight_line. However, depending on the transportation network available in the GIS source you are using, other methods are also available. These can be discovered using the travel_modes property of the Country. Any of the values in the name column are valid values for proximity_type, in addition to the default straight_line.
[13]:
usa_local.travel_modes
[13]:
| | name | alias | description | type | impedance | impedance_category | time_attribute_name | distance_attribute_name |
---|---|---|---|---|---|---|---|---|
0 | driving_time | Driving Time | Models the movement of cars and other similar ... | AUTOMOBILE | TravelTime | temporal | TravelTime | Kilometers |
1 | driving_distance | Driving Distance | Models the movement of cars and other similar ... | AUTOMOBILE | Kilometers | distance | TravelTime | Kilometers |
2 | trucking_time | Trucking Time | Models basic truck travel by preferring design... | TRUCK | TruckTravelTime | temporal | TruckTravelTime | Kilometers |
3 | trucking_distance | Trucking Distance | Models basic truck travel by preferring design... | TRUCK | Kilometers | distance | TruckTravelTime | Kilometers |
4 | walking_time | Walking Time | Follows paths and roads that allow pedestrian ... | WALK | WalkTime | temporal | WalkTime | Kilometers |
5 | walking_distance | Walking Distance | Follows paths and roads that allow pedestrian ... | WALK | Kilometers | distance | WalkTime | Kilometers |
6 | rural_driving_time | Rural Driving Time | Models the movement of cars and other similar ... | AUTOMOBILE | TravelTime | temporal | TravelTime | Kilometers |
7 | rural_driving_distance | Rural Driving Distance | Models the movement of cars and other similar ... | AUTOMOBILE | Kilometers | distance | TravelTime | Kilometers |
Hence, if we want to use both paved and gravel roads (because gravel roads are fun), we can use rural_driving_time. Before selecting, we can investigate the details of a travel mode by looking at its description; the cell below retrieves it for the closely related rural_driving_distance mode.
[18]:
usa_local.travel_modes[usa_local.travel_modes.name == 'rural_driving_distance'].iloc[0]['description']
[18]:
'Models the movement of cars and other similar small automobiles, such as pickup trucks, and finds solutions that optimize travel distance. Travel obeys one-way roads, avoids illegal turns, and follows other rules that are specific to cars, but does not discourage travel on unpaved roads.'
Most people aren’t going to be driving as fast on a gravel road as they are on an interstate. A time-based travel mode lets us take these differences in speed by road type into consideration. Using drive time to define proximity around a location is a much better representation of how people actually move around and interact with their surrounding environment, such as finding food at a grocery store.
[20]:
pt4_enrich_df = usa_local.enrich(
geographies=geom_lst,
enrich_variables=kv,
proximity_type='rural_driving_time',
proximity_value=12,
proximity_metric='minutes'
)
pt4_enrich_df.info()
pt4_enrich_df.head()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 27 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 has_data 2 non-null int64
1 area_type 2 non-null object
2 buffer_units 2 non-null object
3 buffer_unit 2 non-null object
4 buffer_radii 2 non-null float64
5 aggregation_method 2 non-null object
6 keyusfacts_totpop_cy 2 non-null float64
7 keyusfacts_gqpop_cy 2 non-null float64
8 keyusfacts_divindx_cy 2 non-null float64
9 keyusfacts_tothh_cy 2 non-null float64
10 keyusfacts_avghhsz_cy 2 non-null float64
11 keyusfacts_medhinc_cy 2 non-null float64
12 keyusfacts_avghinc_cy 2 non-null float64
13 keyusfacts_pci_cy 2 non-null float64
14 keyusfacts_tothu_cy 2 non-null float64
15 keyusfacts_owner_cy 2 non-null float64
16 keyusfacts_renter_cy 2 non-null float64
17 keyusfacts_vacant_cy 2 non-null float64
18 keyusfacts_medval_cy 2 non-null float64
19 keyusfacts_avgval_cy 2 non-null float64
20 keyusfacts_popgrw10cy 2 non-null float64
21 keyusfacts_hhgrw10cy 2 non-null float64
22 keyusfacts_famgrw10cy 2 non-null float64
23 keyusfacts_dpop_cy 2 non-null float64
24 keyusfacts_dpopwrk_cy 2 non-null float64
25 keyusfacts_dpopres_cy 2 non-null float64
26 SHAPE 2 non-null geometry
dtypes: float64(21), geometry(1), int64(1), object(4)
memory usage: 560.0+ bytes
[20]:
| | has_data | area_type | buffer_units | buffer_unit | buffer_radii | aggregation_method | keyusfacts_totpop_cy | keyusfacts_gqpop_cy | keyusfacts_divindx_cy | keyusfacts_tothh_cy | ... | keyusfacts_vacant_cy | keyusfacts_medval_cy | keyusfacts_avgval_cy | keyusfacts_popgrw10cy | keyusfacts_hhgrw10cy | keyusfacts_famgrw10cy | keyusfacts_dpop_cy | keyusfacts_dpopwrk_cy | keyusfacts_dpopres_cy | SHAPE |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Rural Driving Time | Minutes | Minutes | 12.0 | BlockApportionment:US.BlockGroups;PointsLayer:... | 89167.0 | 2085.0 | 47.4 | 39012.0 | ... | 2045.0 | 370989.0 | 423914.0 | 1.38 | 1.36 | 1.2 | 113049.0 | 68827.0 | 44222.0 | {"x": -122.90748349999996, "y": 47.04502490000... |
1 | 1 | Rural Driving Time | Minutes | Minutes | 12.0 | BlockApportionment:US.BlockGroups;PointsLayer:... | 125186.0 | 2604.0 | 52.1 | 53293.0 | ... | 3074.0 | 359599.0 | 403034.0 | 1.43 | 1.41 | 1.3 | 144557.0 | 81408.0 | 63149.0 | {"x": -122.87495999999999, "y": 47.04640310000... |
2 rows × 27 columns
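Since proximity_type accepts straight_line or any value from the name column of travel_modes, a typo in the mode name is an easy mistake to make. A minimal sketch of a check to run before calling enrich follows, using the names listed in the travel_modes table above. The helper check_proximity_type is hypothetical and not part of the arcgis package.

```python
# Values from the name column of usa_local.travel_modes shown above
TRAVEL_MODE_NAMES = [
    'driving_time', 'driving_distance',
    'trucking_time', 'trucking_distance',
    'walking_time', 'walking_distance',
    'rural_driving_time', 'rural_driving_distance',
]


def check_proximity_type(proximity_type, travel_mode_names=TRAVEL_MODE_NAMES):
    """Return proximity_type if it is straight_line or a listed travel mode."""
    valid = {'straight_line', *travel_mode_names}
    if proximity_type not in valid:
        raise ValueError(f'{proximity_type!r} is not one of {sorted(valid)}')
    return proximity_type


proximity_type = check_proximity_type('rural_driving_time')
```

With a live Country instance, the list of names could be read from usa_local.travel_modes.name instead of being hard-coded, since the available modes depend on the GIS source.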
Use a Spatially Enabled Data Frame¶
Finally, although using a list of Geometry objects may be useful, a much more common paradigm is likely the Spatially Enabled Data Frame. Above, we created a Spatially Enabled Data Frame. We can use this as input into the geographies parameter.
Also, just to demonstrate the interchangeability of GIS sources, we are going to create a Country instance using the connection to ArcGIS Online, and perform the same workflow.
[23]:
# create country
usa_gis = Country('usa', gis=gis_agol)
# select variables
ev = usa_gis.enrich_variables
kv = ev[
(ev.data_collection.str.lower().str.contains('key'))
& (ev.name.str.lower().str.endswith('cy'))
].reset_index(drop=True)
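# perform enrichment; the geometry list is passed here, and the Spatially
# Enabled Data Frame, grocery_df, can be passed to geographies instead
pt5_enrich_df = usa_gis.enrich(
    geographies=geom_lst,
    enrich_variables=kv,
    proximity_type='rural_driving_time',
    proximity_value=12,
    proximity_metric='minutes'
)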
pt5_enrich_df.info()
pt5_enrich_df.head()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 27 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 has_data 2 non-null int64
1 area_type 2 non-null object
2 buffer_units 2 non-null object
3 buffer_unit 2 non-null object
4 buffer_radii 2 non-null float64
5 aggregation_method 2 non-null object
6 keyusfacts_totpop_cy 2 non-null float64
7 keyusfacts_gqpop_cy 2 non-null float64
8 keyusfacts_divindx_cy 2 non-null float64
9 keyusfacts_tothh_cy 2 non-null float64
10 keyusfacts_avghhsz_cy 2 non-null float64
11 keyusfacts_medhinc_cy 2 non-null float64
12 keyusfacts_avghinc_cy 2 non-null float64
13 keyusfacts_pci_cy 2 non-null float64
14 keyusfacts_tothu_cy 2 non-null float64
15 keyusfacts_owner_cy 2 non-null float64
16 keyusfacts_renter_cy 2 non-null float64
17 keyusfacts_vacant_cy 2 non-null float64
18 keyusfacts_medval_cy 2 non-null float64
19 keyusfacts_avgval_cy 2 non-null float64
20 keyusfacts_popgrw10cy 2 non-null float64
21 keyusfacts_hhgrw10cy 2 non-null float64
22 keyusfacts_famgrw10cy 2 non-null float64
23 keyusfacts_dpop_cy 2 non-null float64
24 keyusfacts_dpopwrk_cy 2 non-null float64
25 keyusfacts_dpopres_cy 2 non-null float64
26 SHAPE 2 non-null geometry
dtypes: float64(21), geometry(1), int64(1), object(4)
memory usage: 560.0+ bytes
[23]:
| | has_data | area_type | buffer_units | buffer_unit | buffer_radii | aggregation_method | keyusfacts_totpop_cy | keyusfacts_gqpop_cy | keyusfacts_divindx_cy | keyusfacts_tothh_cy | ... | keyusfacts_vacant_cy | keyusfacts_medval_cy | keyusfacts_avgval_cy | keyusfacts_popgrw10cy | keyusfacts_hhgrw10cy | keyusfacts_famgrw10cy | keyusfacts_dpop_cy | keyusfacts_dpopwrk_cy | keyusfacts_dpopres_cy | SHAPE |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Rural Driving Time | Minutes | Minutes | 12.0 | BlockApportionment:US.BlockGroups;PointsLayer:... | 89167.0 | 2085.0 | 47.4 | 39012.0 | ... | 2045.0 | 370989.0 | 423914.0 | 1.38 | 1.36 | 1.2 | 113049.0 | 68827.0 | 44222.0 | {"x": -122.90748349999996, "y": 47.04502490000... |
1 | 1 | Rural Driving Time | Minutes | Minutes | 12.0 | BlockApportionment:US.BlockGroups;PointsLayer:... | 125186.0 | 2604.0 | 52.1 | 53293.0 | ... | 3074.0 | 359599.0 | 403034.0 | 1.43 | 1.41 | 1.3 | 144557.0 | 81408.0 | 63149.0 | {"x": -122.87495999999999, "y": 47.04640310000... |
2 rows × 27 columns
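In the outputs shown here, the ArcGIS Online results match the local ArcGIS Pro results. Two sources may not always agree exactly, so a quick comparison is a reasonable sanity check when switching between them. The sketch below compares a couple of values copied from the two outputs above. The helper rows_match is hypothetical.

```python
import math

# keyusfacts_totpop_cy and keyusfacts_medval_cy copied from the outputs above
local_rows = [(89167.0, 370989.0), (125186.0, 359599.0)]   # pt4_enrich_df
online_rows = [(89167.0, 370989.0), (125186.0, 359599.0)]  # pt5_enrich_df


def rows_match(left, right, rel_tol=1e-9):
    """True when both sources return the same values, row by row."""
    return len(left) == len(right) and all(
        math.isclose(a, b, rel_tol=rel_tol)
        for left_row, right_row in zip(left, right)
        for a, b in zip(left_row, right_row)
    )


same = rows_match(local_rows, online_rows)
```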