{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Geoenrichment - Introduction to Enrich\n", "\n", "Discovering what is available and distilling it down to something usable is the first step in analysis. Consequently, _introspection_ was the first piece of functionality we added support for. It lets you discover which countries are available and, within a country, which enrichment variables and which travel modes are available for dynamically creating trade areas." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "from arcgis.gis import GIS\n", "from arcgis.geoenrichment import get_countries, Country\n", "\n", "from demo_data import demo_data" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## GIS _Source_\n", "\n", "The source GIS determines which countries are available. Business Analyst can be accessed either locally (ArcGIS Pro with Business Analyst and Data) or through a connection to a Web GIS (ArcGIS Enterprise or ArcGIS Online). In this case, we are connecting to an instance of ArcGIS Online." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "GIS @ https://bateam.maps.arcgis.com" ], "text/plain": [ "GIS @ https://bateam.maps.arcgis.com version:10.2" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gis_agol = GIS(profile='ba')\n", "\n", "gis_agol" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Discovering Countries\n", "\n", "Since the data is organized by country, discovering the available countries is the first introspection step.\n", "\n", "### Country Source - local\n", "\n", "A local `gis` source can be used by passing in an instance of the `GIS` object created using the `'pro'` keyword. As the output shows, three datasets are installed on my machine: one for Canada and two vintages for the United States." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
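|  | iso2 | iso3 | name | vintage | country_id | data_source_id |
|---|---|---|---|---|---|---|
| 0 | CA | CAN | Canada | 2021 | CAN_ESRI_2021 | LOCAL;;CAN_ESRI_2021 |
| 1 | US | USA | United States | 2020 | USA_ESRI_2020 | LOCAL;;USA_ESRI_2020 |
| 2 | US | USA | United States | 2022 | USA_ESRI_2022 | LOCAL;;USA_ESRI_2022 |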
\n", "
" ], "text/plain": [ " iso2 iso3 name vintage country_id data_source_id\n", "0 CA CAN Canada 2021 CAN_ESRI_2021 LOCAL;;CAN_ESRI_2021\n", "1 US USA United States 2020 USA_ESRI_2020 LOCAL;;USA_ESRI_2020\n", "2 US USA United States 2022 USA_ESRI_2022 LOCAL;;USA_ESRI_2022" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "get_countries(GIS('pro'))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Country Source - Web GIS\n", "\n", "Similarly, we can access the countries available on the Web GIS through the `arcgis.gis.GIS` object. Because this is ArcGIS Online, the list is long: 177 countries." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
|  | iso2 | iso3 | name | alt_name | datasets | default_dataset | continent |
|---|---|---|---|---|---|---|---|
| 0 | AL | ALB | Albania | ALBANIA | [ALB_MBR_2021] | ALB_MBR_2021 | Europe |
| 1 | DZ | DZA | Algeria | ALGERIA | [DZA_MBR_2021] | DZA_MBR_2021 | Africa |
| 2 | AD | AND | Andorra | ANDORRA | [AND_MBR_2021] | AND_MBR_2021 | Europe |
| 3 | AO | AGO | Angola | ANGOLA | [AGO_MBR_2021] | AGO_MBR_2021 | Africa |
| 4 | AI | AIA | Anguilla | ANGUILLA | [AIA_MBR_2020] | AIA_MBR_2020 | North America |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 172 | VE | VEN | Venezuela | VENEZUELA, BOLIVARIAN REPUBLIC OF | [VEN_MBR_2020] | VEN_MBR_2020 | South America |
| 173 | VN | VNM | Vietnam | VIET NAM | [VNM_MBR_2020] | VNM_MBR_2020 | Asia |
| 174 | VI | VIR | Virgin Islands | UNITED STATES VIRGIN ISLANDS | [VIR_MBR_2020] | VIR_MBR_2020 | North America |
| 175 | ZM | ZMB | Zambia | ZAMBIA | [ZMB_MBR_2021] | ZMB_MBR_2021 | Africa |
| 176 | ZW | ZWE | Zimbabwe | ZIMBABWE | [ZWE_MBR_2021] | ZWE_MBR_2021 | Africa |
\n", "

177 rows × 7 columns

\n", "
" ], "text/plain": [ " iso2 iso3 name alt_name \\\n", "0 AL ALB Albania ALBANIA \n", "1 DZ DZA Algeria ALGERIA \n", "2 AD AND Andorra ANDORRA \n", "3 AO AGO Angola ANGOLA \n", "4 AI AIA Anguilla ANGUILLA \n", ".. ... ... ... ... \n", "172 VE VEN Venezuela VENEZUELA, BOLIVARIAN REPUBLIC OF \n", "173 VN VNM Vietnam VIET NAM \n", "174 VI VIR Virgin Islands UNITED STATES VIRGIN ISLANDS \n", "175 ZM ZMB Zambia ZAMBIA \n", "176 ZW ZWE Zimbabwe ZIMBABWE \n", "\n", " datasets default_dataset continent \n", "0 [ALB_MBR_2021] ALB_MBR_2021 Europe \n", "1 [DZA_MBR_2021] DZA_MBR_2021 Africa \n", "2 [AND_MBR_2021] AND_MBR_2021 Europe \n", "3 [AGO_MBR_2021] AGO_MBR_2021 Africa \n", "4 [AIA_MBR_2020] AIA_MBR_2020 North America \n", ".. ... ... ... \n", "172 [VEN_MBR_2020] VEN_MBR_2020 South America \n", "173 [VNM_MBR_2020] VNM_MBR_2020 Asia \n", "174 [VIR_MBR_2020] VIR_MBR_2020 North America \n", "175 [ZMB_MBR_2021] ZMB_MBR_2021 Africa \n", "176 [ZWE_MBR_2021] ZWE_MBR_2021 Africa \n", "\n", "[177 rows x 7 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "get_countries(gis_agol)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating a `Country`\n", "\n", "Before digging into enrichment variables, we need to create a `Country` object instance. A `Country` is created using the ISO3 code displayed in the data frame above along with the corresponding `gis` source." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "usa = Country('USA', gis=GIS('pro'))\n", "\n", "usa" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### Country - Explicit `year`\n", "\n", "Recall from the earlier introspection that two vintages of data are available on my machine for the USA: 2020 and 2022. If a model was developed against a specific country and data vintage, you can reference that vintage explicitly using the `year` parameter." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "usa2020 = Country('USA', gis=GIS('pro'), year=2020)\n", "\n", "usa2020" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Enrichment Variables\n", "\n", "The available enrichment variables can be discovered through the `Country` object's `enrich_variables` property." ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 27975 entries, 0 to 27974\n", "Data columns (total 5 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 name 27975 non-null object\n", " 1 alias 27975 non-null object\n", " 2 data_collection 27975 non-null object\n", " 3 enrich_name 27975 non-null object\n", " 4 enrich_field_name 27975 non-null object\n", "dtypes: object(5)\n", "memory usage: 1.1+ MB\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
|  | name | alias | data_collection | enrich_name | enrich_field_name |
|---|---|---|---|---|---|
| 10444 | MP14143a_B_I | 2022 Used Insomnia Prescription Drug : Index | HealthPersonalCare | HealthPersonalCare.MP14143a_B_I | HealthPersonalCare_MP14143a_B_I |
| 25221 | RACE3BPO20 | 2020 Pop 3 Races: BL-PI-OTH | raceandhispanicorigin | raceandhispanicorigin.RACE3BPO20 | raceandhispanicorigin_RACE3BPO20 |
| 14056 | X8030_I | 2022 Index: Prescription Drugs | Health | Health.X8030_I | Health_X8030_I |
| 2885 | AIF60C10 | 2010 American Indian Females 60-64 | agebyracebysex | agebyracebysex.AIF60C10 | agebyracebysex_AIF60C10 |
| 18019 | X3042FY_X_A | 2027 Tools/Equipment-Paint/Paper (Renter) : Av... | HousingHousehold | HousingHousehold.X3042FY_X_A | HousingHousehold_X3042FY_X_A |
\n", "
" ], "text/plain": [ " name alias \\\n", "10444 MP14143a_B_I 2022 Used Insomnia Prescription Drug : Index \n", "25221 RACE3BPO20 2020 Pop 3 Races: BL-PI-OTH \n", "14056 X8030_I 2022 Index: Prescription Drugs \n", "2885 AIF60C10 2010 American Indian Females 60-64 \n", "18019 X3042FY_X_A 2027 Tools/Equipment-Paint/Paper (Renter) : Av... \n", "\n", " data_collection enrich_name \\\n", "10444 HealthPersonalCare HealthPersonalCare.MP14143a_B_I \n", "25221 raceandhispanicorigin raceandhispanicorigin.RACE3BPO20 \n", "14056 Health Health.X8030_I \n", "2885 agebyracebysex agebyracebysex.AIF60C10 \n", "18019 HousingHousehold HousingHousehold.X3042FY_X_A \n", "\n", " enrich_field_name \n", "10444 HealthPersonalCare_MP14143a_B_I \n", "25221 raceandhispanicorigin_RACE3BPO20 \n", "14056 Health_X8030_I \n", "2885 agebyracebysex_AIF60C10 \n", "18019 HousingHousehold_X3042FY_X_A " ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ev = usa.enrich_variables\n", "\n", "ev.info()\n", "ev.sample(5)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Filtering Variables\n", "\n", "The usefulness of relevant metadata, especially categorical data, cannot be overstated. Using criteria such as the alias, the name suffix, and the data collection, we can quickly identify variables to use for enrichment." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Get Current Income Metrics\n", "\n", "Since the variables are returned as a Pandas DataFrame, finding income metrics for use in analysis is relatively straightforward." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
|  | name | alias | data_collection | enrich_name | enrich_field_name |
|---|---|---|---|---|---|
| 0 | AVGHINC_CY | 2022 Average Household Income | AtRisk | AtRisk.AVGHINC_CY | AtRisk_AVGHINC_CY |
| 1 | HINC0_CY | 2022 HH Income <$15000 | Policy | Policy.HINC0_CY | Policy_HINC0_CY |
| 2 | HINC15_CY | 2022 HH Income $15000-24999 | Policy | Policy.HINC15_CY | Policy_HINC15_CY |
| 3 | HINC25_CY | 2022 HH Income $25000-34999 | Policy | Policy.HINC25_CY | Policy_HINC25_CY |
| 4 | HINC35_CY | 2022 HH Income $35000-49999 | Policy | Policy.HINC35_CY | Policy_HINC35_CY |
| 5 | HINC50_CY | 2022 HH Income $50000-74999 | Policy | Policy.HINC50_CY | Policy_HINC50_CY |
| 6 | HINC75_CY | 2022 HH Income $75000-99999 | Policy | Policy.HINC75_CY | Policy_HINC75_CY |
| 7 | HINC100_CY | 2022 HH Income $100000-149999 | Policy | Policy.HINC100_CY | Policy_HINC100_CY |
| 8 | HINC150_CY | 2022 HH Income $150000-199999 | Policy | Policy.HINC150_CY | Policy_HINC150_CY |
| 9 | HINC200_CY | 2022 HH Income $200000+ | Policy | Policy.HINC200_CY | Policy_HINC200_CY |
| 10 | MEDHINC_CY | 2022 Median Household Income | Policy | Policy.MEDHINC_CY | Policy_MEDHINC_CY |
| 11 | PCI_CY | 2022 Per Capita Income | Policy | Policy.PCI_CY | Policy_PCI_CY |
| 12 | AGGINC_CY | 2022 Aggregate Income | Policy | Policy.AGGINC_CY | Policy_AGGINC_CY |
| 13 | AGGHINC_CY | 2022 Aggregate HH Income | Policy | Policy.AGGHINC_CY | Policy_AGGHINC_CY |
| 14 | HINCBASECY | 2022 Households by Income Base | Policy | Policy.HINCBASECY | Policy_HINCBASECY |
| 15 | MEDDI_CY | 2022 Median Disposable Income | disposableincome | disposableincome.MEDDI_CY | disposableincome_MEDDI_CY |
| 16 | AVGDI_CY | 2022 Average Disposable Income | disposableincome | disposableincome.AVGDI_CY | disposableincome_AVGDI_CY |
| 17 | AGGDI_CY | 2022 Aggregate Disposable Income | disposableincome | disposableincome.AGGDI_CY | disposableincome_AGGDI_CY |
| 18 | DIBASE_CY | 2022 Disposable Income Base | disposableincome | disposableincome.DIBASE_CY | disposableincome_DIBASE_CY |
| 19 | INCMORT_CY | 2022 Pct of Income for Mortgage | householdtotals | householdtotals.INCMORT_CY | householdtotals_INCMORT_CY |
| 20 | AVGIA15_CY | 2022 Avg HH Income: HHr 15-24 | incomebyage | incomebyage.AVGIA15_CY | incomebyage_AVGIA15_CY |
| 21 | IA15BASECY | 2022 HH Income Base: HHr 15-24 | incomebyage | incomebyage.IA15BASECY | incomebyage_IA15BASECY |
| 22 | AVGIA25_CY | 2022 Avg HH Income: HHr 25-34 | incomebyage | incomebyage.AVGIA25_CY | incomebyage_AVGIA25_CY |
| 23 | IA25BASECY | 2022 HH Income Base: HHr 25-34 | incomebyage | incomebyage.IA25BASECY | incomebyage_IA25BASECY |
| 24 | AVGIA35_CY | 2022 Avg HH Income: HHr 35-44 | incomebyage | incomebyage.AVGIA35_CY | incomebyage_AVGIA35_CY |
| 25 | IA35BASECY | 2022 HH Income Base: HHr 35-44 | incomebyage | incomebyage.IA35BASECY | incomebyage_IA35BASECY |
| 26 | AVGIA45_CY | 2022 Avg HH Income: HHr 45-54 | incomebyage | incomebyage.AVGIA45_CY | incomebyage_AVGIA45_CY |
| 27 | IA45BASECY | 2022 HH Income Base: HHr 45-54 | incomebyage | incomebyage.IA45BASECY | incomebyage_IA45BASECY |
| 28 | AVGIA55_CY | 2022 Avg HH Income: HHr 55-64 | incomebyage | incomebyage.AVGIA55_CY | incomebyage_AVGIA55_CY |
| 29 | IA55BASECY | 2022 HH Income Base: HHr 55-64 | incomebyage | incomebyage.IA55BASECY | incomebyage_IA55BASECY |
| 30 | AVGIA65_CY | 2022 Avg HH Income: HHr 65-74 | incomebyage | incomebyage.AVGIA65_CY | incomebyage_AVGIA65_CY |
| 31 | IA65BASECY | 2022 HH Income Base: HHr 65-74 | incomebyage | incomebyage.IA65BASECY | incomebyage_IA65BASECY |
| 32 | AVGIA75_CY | 2022 Avg HH Income: HHr 75+ | incomebyage | incomebyage.AVGIA75_CY | incomebyage_AVGIA75_CY |
| 33 | IA75BASECY | 2022 HH Income Base: HHr 75+ | incomebyage | incomebyage.IA75BASECY | incomebyage_IA75BASECY |
| 34 | IA55UBASCY | 2022 HH Income Base: HHr 55+ | incomebyage | incomebyage.IA55UBASCY | incomebyage_IA55UBASCY |
| 35 | AVGIA55UCY | 2022 Avg HH Income: HHr 55+ | incomebyage | incomebyage.AVGIA55UCY | incomebyage_AVGIA55UCY |
| 36 | IA65UBASCY | 2022 HH Income Base: HHr 65+ | incomebyage | incomebyage.IA65UBASCY | incomebyage_IA65UBASCY |
| 37 | AVGIA65UCY | 2022 Avg HH Income: HHr 65+ | incomebyage | incomebyage.AVGIA65UCY | incomebyage_AVGIA65UCY |
\n", "
" ], "text/plain": [ " name alias data_collection \\\n", "0 AVGHINC_CY 2022 Average Household Income AtRisk \n", "1 HINC0_CY 2022 HH Income <$15000 Policy \n", "2 HINC15_CY 2022 HH Income $15000-24999 Policy \n", "3 HINC25_CY 2022 HH Income $25000-34999 Policy \n", "4 HINC35_CY 2022 HH Income $35000-49999 Policy \n", "5 HINC50_CY 2022 HH Income $50000-74999 Policy \n", "6 HINC75_CY 2022 HH Income $75000-99999 Policy \n", "7 HINC100_CY 2022 HH Income $100000-149999 Policy \n", "8 HINC150_CY 2022 HH Income $150000-199999 Policy \n", "9 HINC200_CY 2022 HH Income $200000+ Policy \n", "10 MEDHINC_CY 2022 Median Household Income Policy \n", "11 PCI_CY 2022 Per Capita Income Policy \n", "12 AGGINC_CY 2022 Aggregate Income Policy \n", "13 AGGHINC_CY 2022 Aggregate HH Income Policy \n", "14 HINCBASECY 2022 Households by Income Base Policy \n", "15 MEDDI_CY 2022 Median Disposable Income disposableincome \n", "16 AVGDI_CY 2022 Average Disposable Income disposableincome \n", "17 AGGDI_CY 2022 Aggregate Disposable Income disposableincome \n", "18 DIBASE_CY 2022 Disposable Income Base disposableincome \n", "19 INCMORT_CY 2022 Pct of Income for Mortgage householdtotals \n", "20 AVGIA15_CY 2022 Avg HH Income: HHr 15-24 incomebyage \n", "21 IA15BASECY 2022 HH Income Base: HHr 15-24 incomebyage \n", "22 AVGIA25_CY 2022 Avg HH Income: HHr 25-34 incomebyage \n", "23 IA25BASECY 2022 HH Income Base: HHr 25-34 incomebyage \n", "24 AVGIA35_CY 2022 Avg HH Income: HHr 35-44 incomebyage \n", "25 IA35BASECY 2022 HH Income Base: HHr 35-44 incomebyage \n", "26 AVGIA45_CY 2022 Avg HH Income: HHr 45-54 incomebyage \n", "27 IA45BASECY 2022 HH Income Base: HHr 45-54 incomebyage \n", "28 AVGIA55_CY 2022 Avg HH Income: HHr 55-64 incomebyage \n", "29 IA55BASECY 2022 HH Income Base: HHr 55-64 incomebyage \n", "30 AVGIA65_CY 2022 Avg HH Income: HHr 65-74 incomebyage \n", "31 IA65BASECY 2022 HH Income Base: HHr 65-74 incomebyage \n", "32 AVGIA75_CY 2022 Avg HH Income: HHr 75+ incomebyage \n", "33 
IA75BASECY 2022 HH Income Base: HHr 75+ incomebyage \n", "34 IA55UBASCY 2022 HH Income Base: HHr 55+ incomebyage \n", "35 AVGIA55UCY 2022 Avg HH Income: HHr 55+ incomebyage \n", "36 IA65UBASCY 2022 HH Income Base: HHr 65+ incomebyage \n", "37 AVGIA65UCY 2022 Avg HH Income: HHr 65+ incomebyage \n", "\n", " enrich_name enrich_field_name \n", "0 AtRisk.AVGHINC_CY AtRisk_AVGHINC_CY \n", "1 Policy.HINC0_CY Policy_HINC0_CY \n", "2 Policy.HINC15_CY Policy_HINC15_CY \n", "3 Policy.HINC25_CY Policy_HINC25_CY \n", "4 Policy.HINC35_CY Policy_HINC35_CY \n", "5 Policy.HINC50_CY Policy_HINC50_CY \n", "6 Policy.HINC75_CY Policy_HINC75_CY \n", "7 Policy.HINC100_CY Policy_HINC100_CY \n", "8 Policy.HINC150_CY Policy_HINC150_CY \n", "9 Policy.HINC200_CY Policy_HINC200_CY \n", "10 Policy.MEDHINC_CY Policy_MEDHINC_CY \n", "11 Policy.PCI_CY Policy_PCI_CY \n", "12 Policy.AGGINC_CY Policy_AGGINC_CY \n", "13 Policy.AGGHINC_CY Policy_AGGHINC_CY \n", "14 Policy.HINCBASECY Policy_HINCBASECY \n", "15 disposableincome.MEDDI_CY disposableincome_MEDDI_CY \n", "16 disposableincome.AVGDI_CY disposableincome_AVGDI_CY \n", "17 disposableincome.AGGDI_CY disposableincome_AGGDI_CY \n", "18 disposableincome.DIBASE_CY disposableincome_DIBASE_CY \n", "19 householdtotals.INCMORT_CY householdtotals_INCMORT_CY \n", "20 incomebyage.AVGIA15_CY incomebyage_AVGIA15_CY \n", "21 incomebyage.IA15BASECY incomebyage_IA15BASECY \n", "22 incomebyage.AVGIA25_CY incomebyage_AVGIA25_CY \n", "23 incomebyage.IA25BASECY incomebyage_IA25BASECY \n", "24 incomebyage.AVGIA35_CY incomebyage_AVGIA35_CY \n", "25 incomebyage.IA35BASECY incomebyage_IA35BASECY \n", "26 incomebyage.AVGIA45_CY incomebyage_AVGIA45_CY \n", "27 incomebyage.IA45BASECY incomebyage_IA45BASECY \n", "28 incomebyage.AVGIA55_CY incomebyage_AVGIA55_CY \n", "29 incomebyage.IA55BASECY incomebyage_IA55BASECY \n", "30 incomebyage.AVGIA65_CY incomebyage_AVGIA65_CY \n", "31 incomebyage.IA65BASECY incomebyage_IA65BASECY \n", "32 incomebyage.AVGIA75_CY 
incomebyage_AVGIA75_CY \n", "33 incomebyage.IA75BASECY incomebyage_IA75BASECY \n", "34 incomebyage.IA55UBASCY incomebyage_IA55UBASCY \n", "35 incomebyage.AVGIA55UCY incomebyage_AVGIA55UCY \n", "36 incomebyage.IA65UBASCY incomebyage_IA65UBASCY \n", "37 incomebyage.AVGIA65UCY incomebyage_AVGIA65UCY " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inc_vars = ev[\n", " (ev.alias.str.lower().str.contains('income'))\n", " & (ev.name.str.endswith('CY'))\n", "].drop_duplicates('name').reset_index(drop=True)\n", "\n", "inc_vars" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Get Current Key Metrics\n", "\n", "One of the example datasets I use most often when exploring whether an idea is viable is the current-year key metrics." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
|  | name | alias | data_collection | enrich_name | enrich_field_name |
|---|---|---|---|---|---|
| 0 | TOTPOP_CY | 2022 Total Population | KeyUSFacts | KeyUSFacts.TOTPOP_CY | KeyUSFacts_TOTPOP_CY |
| 1 | GQPOP_CY | 2022 Group Quarters Population | KeyUSFacts | KeyUSFacts.GQPOP_CY | KeyUSFacts_GQPOP_CY |
| 2 | DIVINDX_CY | 2022 Diversity Index | KeyUSFacts | KeyUSFacts.DIVINDX_CY | KeyUSFacts_DIVINDX_CY |
| 3 | TOTHH_CY | 2022 Total Households | KeyUSFacts | KeyUSFacts.TOTHH_CY | KeyUSFacts_TOTHH_CY |
| 4 | AVGHHSZ_CY | 2022 Average Household Size | KeyUSFacts | KeyUSFacts.AVGHHSZ_CY | KeyUSFacts_AVGHHSZ_CY |
| 5 | MEDHINC_CY | 2022 Median Household Income | KeyUSFacts | KeyUSFacts.MEDHINC_CY | KeyUSFacts_MEDHINC_CY |
| 6 | AVGHINC_CY | 2022 Average Household Income | KeyUSFacts | KeyUSFacts.AVGHINC_CY | KeyUSFacts_AVGHINC_CY |
| 7 | PCI_CY | 2022 Per Capita Income | KeyUSFacts | KeyUSFacts.PCI_CY | KeyUSFacts_PCI_CY |
| 8 | TOTHU_CY | 2022 Total Housing Units | KeyUSFacts | KeyUSFacts.TOTHU_CY | KeyUSFacts_TOTHU_CY |
| 9 | OWNER_CY | 2022 Owner Occupied HUs | KeyUSFacts | KeyUSFacts.OWNER_CY | KeyUSFacts_OWNER_CY |
| 10 | RENTER_CY | 2022 Renter Occupied HUs | KeyUSFacts | KeyUSFacts.RENTER_CY | KeyUSFacts_RENTER_CY |
| 11 | VACANT_CY | 2022 Vacant Housing Units | KeyUSFacts | KeyUSFacts.VACANT_CY | KeyUSFacts_VACANT_CY |
| 12 | MEDVAL_CY | 2022 Median Home Value | KeyUSFacts | KeyUSFacts.MEDVAL_CY | KeyUSFacts_MEDVAL_CY |
| 13 | AVGVAL_CY | 2022 Average Home Value | KeyUSFacts | KeyUSFacts.AVGVAL_CY | KeyUSFacts_AVGVAL_CY |
| 14 | POPGRWCYFY | 2022-2027 Growth Rate: Population | KeyUSFacts | KeyUSFacts.POPGRWCYFY | KeyUSFacts_POPGRWCYFY |
| 15 | HHGRWCYFY | 2022-2027 Growth Rate: Households | KeyUSFacts | KeyUSFacts.HHGRWCYFY | KeyUSFacts_HHGRWCYFY |
| 16 | FAMGRWCYFY | 2022-2027 Growth Rate: Families | KeyUSFacts | KeyUSFacts.FAMGRWCYFY | KeyUSFacts_FAMGRWCYFY |
| 17 | MHIGRWCYFY | 2022-2027 Growth Rate: Median HH Inc | KeyUSFacts | KeyUSFacts.MHIGRWCYFY | KeyUSFacts_MHIGRWCYFY |
| 18 | PCIGRWCYFY | 2022-2027 Growth Rate: Per Capita Inc | KeyUSFacts | KeyUSFacts.PCIGRWCYFY | KeyUSFacts_PCIGRWCYFY |
| 19 | DPOP_CY | 2022 Total Daytime Population | KeyUSFacts | KeyUSFacts.DPOP_CY | KeyUSFacts_DPOP_CY |
| 20 | DPOPWRK_CY | 2022 Daytime Pop: Workers | KeyUSFacts | KeyUSFacts.DPOPWRK_CY | KeyUSFacts_DPOPWRK_CY |
| 21 | DPOPRES_CY | 2022 Daytime Pop: Residents | KeyUSFacts | KeyUSFacts.DPOPRES_CY | KeyUSFacts_DPOPRES_CY |
| 22 | GQPOP_CY_P | 2022 Group Quarters Population : Percent | KeyUSFacts | KeyUSFacts.GQPOP_CY_P | KeyUSFacts_GQPOP_CY_P |
| 23 | AVGHHSZ_CY_I | 2022 Average Household Size : Index | KeyUSFacts | KeyUSFacts.AVGHHSZ_CY_I | KeyUSFacts_AVGHHSZ_CY_I |
| 24 | MEDHINC_CY_I | 2022 Median Household Income : Index | KeyUSFacts | KeyUSFacts.MEDHINC_CY_I | KeyUSFacts_MEDHINC_CY_I |
| 25 | AVGHINC_CY_I | 2022 Average Household Income : Index | KeyUSFacts | KeyUSFacts.AVGHINC_CY_I | KeyUSFacts_AVGHINC_CY_I |
| 26 | PCI_CY_I | 2022 Per Capita Income : Index | KeyUSFacts | KeyUSFacts.PCI_CY_I | KeyUSFacts_PCI_CY_I |
| 27 | OWNER_CY_P | 2022 Owner Occupied HUs : Percent | KeyUSFacts | KeyUSFacts.OWNER_CY_P | KeyUSFacts_OWNER_CY_P |
| 28 | RENTER_CY_P | 2022 Renter Occupied HUs : Percent | KeyUSFacts | KeyUSFacts.RENTER_CY_P | KeyUSFacts_RENTER_CY_P |
| 29 | VACANT_CY_P | 2022 Vacant Housing Units : Percent | KeyUSFacts | KeyUSFacts.VACANT_CY_P | KeyUSFacts_VACANT_CY_P |
| 30 | MEDVAL_CY_I | 2022 Median Home Value : Index | KeyUSFacts | KeyUSFacts.MEDVAL_CY_I | KeyUSFacts_MEDVAL_CY_I |
| 31 | AVGVAL_CY_I | 2022 Average Home Value : Index | KeyUSFacts | KeyUSFacts.AVGVAL_CY_I | KeyUSFacts_AVGVAL_CY_I |
| 32 | POPGRWCYFY_I | 2022-2027 Growth Rate: Population : Index | KeyUSFacts | KeyUSFacts.POPGRWCYFY_I | KeyUSFacts_POPGRWCYFY_I |
| 33 | HHGRWCYFY_I | 2022-2027 Growth Rate: Households : Index | KeyUSFacts | KeyUSFacts.HHGRWCYFY_I | KeyUSFacts_HHGRWCYFY_I |
| 34 | FAMGRWCYFY_I | 2022-2027 Growth Rate: Families : Index | KeyUSFacts | KeyUSFacts.FAMGRWCYFY_I | KeyUSFacts_FAMGRWCYFY_I |
| 35 | MHIGRWCYFY_I | 2022-2027 Growth Rate: Median HH Inc : Index | KeyUSFacts | KeyUSFacts.MHIGRWCYFY_I | KeyUSFacts_MHIGRWCYFY_I |
| 36 | PCIGRWCYFY_I | 2022-2027 Growth Rate: Per Capita Inc : Index | KeyUSFacts | KeyUSFacts.PCIGRWCYFY_I | KeyUSFacts_PCIGRWCYFY_I |
| 37 | DPOPWRK_CY_P | 2022 Daytime Pop: Workers : Percent | KeyUSFacts | KeyUSFacts.DPOPWRK_CY_P | KeyUSFacts_DPOPWRK_CY_P |
| 38 | DPOPRES_CY_P | 2022 Daytime Pop: Residents : Percent | KeyUSFacts | KeyUSFacts.DPOPRES_CY_P | KeyUSFacts_DPOPRES_CY_P |
\n", "
" ], "text/plain": [ " name alias \\\n", "0 TOTPOP_CY 2022 Total Population \n", "1 GQPOP_CY 2022 Group Quarters Population \n", "2 DIVINDX_CY 2022 Diversity Index \n", "3 TOTHH_CY 2022 Total Households \n", "4 AVGHHSZ_CY 2022 Average Household Size \n", "5 MEDHINC_CY 2022 Median Household Income \n", "6 AVGHINC_CY 2022 Average Household Income \n", "7 PCI_CY 2022 Per Capita Income \n", "8 TOTHU_CY 2022 Total Housing Units \n", "9 OWNER_CY 2022 Owner Occupied HUs \n", "10 RENTER_CY 2022 Renter Occupied HUs \n", "11 VACANT_CY 2022 Vacant Housing Units \n", "12 MEDVAL_CY 2022 Median Home Value \n", "13 AVGVAL_CY 2022 Average Home Value \n", "14 POPGRWCYFY 2022-2027 Growth Rate: Population \n", "15 HHGRWCYFY 2022-2027 Growth Rate: Households \n", "16 FAMGRWCYFY 2022-2027 Growth Rate: Families \n", "17 MHIGRWCYFY 2022-2027 Growth Rate: Median HH Inc \n", "18 PCIGRWCYFY 2022-2027 Growth Rate: Per Capita Inc \n", "19 DPOP_CY 2022 Total Daytime Population \n", "20 DPOPWRK_CY 2022 Daytime Pop: Workers \n", "21 DPOPRES_CY 2022 Daytime Pop: Residents \n", "22 GQPOP_CY_P 2022 Group Quarters Population : Percent \n", "23 AVGHHSZ_CY_I 2022 Average Household Size : Index \n", "24 MEDHINC_CY_I 2022 Median Household Income : Index \n", "25 AVGHINC_CY_I 2022 Average Household Income : Index \n", "26 PCI_CY_I 2022 Per Capita Income : Index \n", "27 OWNER_CY_P 2022 Owner Occupied HUs : Percent \n", "28 RENTER_CY_P 2022 Renter Occupied HUs : Percent \n", "29 VACANT_CY_P 2022 Vacant Housing Units : Percent \n", "30 MEDVAL_CY_I 2022 Median Home Value : Index \n", "31 AVGVAL_CY_I 2022 Average Home Value : Index \n", "32 POPGRWCYFY_I 2022-2027 Growth Rate: Population : Index \n", "33 HHGRWCYFY_I 2022-2027 Growth Rate: Households : Index \n", "34 FAMGRWCYFY_I 2022-2027 Growth Rate: Families : Index \n", "35 MHIGRWCYFY_I 2022-2027 Growth Rate: Median HH Inc : Index \n", "36 PCIGRWCYFY_I 2022-2027 Growth Rate: Per Capita Inc : Index \n", "37 DPOPWRK_CY_P 2022 Daytime Pop: Workers : Percent 
\n", "38 DPOPRES_CY_P 2022 Daytime Pop: Residents : Percent \n", "\n", " data_collection enrich_name enrich_field_name \n", "0 KeyUSFacts KeyUSFacts.TOTPOP_CY KeyUSFacts_TOTPOP_CY \n", "1 KeyUSFacts KeyUSFacts.GQPOP_CY KeyUSFacts_GQPOP_CY \n", "2 KeyUSFacts KeyUSFacts.DIVINDX_CY KeyUSFacts_DIVINDX_CY \n", "3 KeyUSFacts KeyUSFacts.TOTHH_CY KeyUSFacts_TOTHH_CY \n", "4 KeyUSFacts KeyUSFacts.AVGHHSZ_CY KeyUSFacts_AVGHHSZ_CY \n", "5 KeyUSFacts KeyUSFacts.MEDHINC_CY KeyUSFacts_MEDHINC_CY \n", "6 KeyUSFacts KeyUSFacts.AVGHINC_CY KeyUSFacts_AVGHINC_CY \n", "7 KeyUSFacts KeyUSFacts.PCI_CY KeyUSFacts_PCI_CY \n", "8 KeyUSFacts KeyUSFacts.TOTHU_CY KeyUSFacts_TOTHU_CY \n", "9 KeyUSFacts KeyUSFacts.OWNER_CY KeyUSFacts_OWNER_CY \n", "10 KeyUSFacts KeyUSFacts.RENTER_CY KeyUSFacts_RENTER_CY \n", "11 KeyUSFacts KeyUSFacts.VACANT_CY KeyUSFacts_VACANT_CY \n", "12 KeyUSFacts KeyUSFacts.MEDVAL_CY KeyUSFacts_MEDVAL_CY \n", "13 KeyUSFacts KeyUSFacts.AVGVAL_CY KeyUSFacts_AVGVAL_CY \n", "14 KeyUSFacts KeyUSFacts.POPGRWCYFY KeyUSFacts_POPGRWCYFY \n", "15 KeyUSFacts KeyUSFacts.HHGRWCYFY KeyUSFacts_HHGRWCYFY \n", "16 KeyUSFacts KeyUSFacts.FAMGRWCYFY KeyUSFacts_FAMGRWCYFY \n", "17 KeyUSFacts KeyUSFacts.MHIGRWCYFY KeyUSFacts_MHIGRWCYFY \n", "18 KeyUSFacts KeyUSFacts.PCIGRWCYFY KeyUSFacts_PCIGRWCYFY \n", "19 KeyUSFacts KeyUSFacts.DPOP_CY KeyUSFacts_DPOP_CY \n", "20 KeyUSFacts KeyUSFacts.DPOPWRK_CY KeyUSFacts_DPOPWRK_CY \n", "21 KeyUSFacts KeyUSFacts.DPOPRES_CY KeyUSFacts_DPOPRES_CY \n", "22 KeyUSFacts KeyUSFacts.GQPOP_CY_P KeyUSFacts_GQPOP_CY_P \n", "23 KeyUSFacts KeyUSFacts.AVGHHSZ_CY_I KeyUSFacts_AVGHHSZ_CY_I \n", "24 KeyUSFacts KeyUSFacts.MEDHINC_CY_I KeyUSFacts_MEDHINC_CY_I \n", "25 KeyUSFacts KeyUSFacts.AVGHINC_CY_I KeyUSFacts_AVGHINC_CY_I \n", "26 KeyUSFacts KeyUSFacts.PCI_CY_I KeyUSFacts_PCI_CY_I \n", "27 KeyUSFacts KeyUSFacts.OWNER_CY_P KeyUSFacts_OWNER_CY_P \n", "28 KeyUSFacts KeyUSFacts.RENTER_CY_P KeyUSFacts_RENTER_CY_P \n", "29 KeyUSFacts KeyUSFacts.VACANT_CY_P KeyUSFacts_VACANT_CY_P 
\n", "30 KeyUSFacts KeyUSFacts.MEDVAL_CY_I KeyUSFacts_MEDVAL_CY_I \n", "31 KeyUSFacts KeyUSFacts.AVGVAL_CY_I KeyUSFacts_AVGVAL_CY_I \n", "32 KeyUSFacts KeyUSFacts.POPGRWCYFY_I KeyUSFacts_POPGRWCYFY_I \n", "33 KeyUSFacts KeyUSFacts.HHGRWCYFY_I KeyUSFacts_HHGRWCYFY_I \n", "34 KeyUSFacts KeyUSFacts.FAMGRWCYFY_I KeyUSFacts_FAMGRWCYFY_I \n", "35 KeyUSFacts KeyUSFacts.MHIGRWCYFY_I KeyUSFacts_MHIGRWCYFY_I \n", "36 KeyUSFacts KeyUSFacts.PCIGRWCYFY_I KeyUSFacts_PCIGRWCYFY_I \n", "37 KeyUSFacts KeyUSFacts.DPOPWRK_CY_P KeyUSFacts_DPOPWRK_CY_P \n", "38 KeyUSFacts KeyUSFacts.DPOPRES_CY_P KeyUSFacts_DPOPRES_CY_P " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Keep the current-year (CY) variables from the Key Facts data collection\n", "kv = ev[\n", " (ev.name.str.contains('CY'))\n", " & (ev.data_collection.str.lower().str.contains('key'))\n", "].reset_index(drop=True)\n", "\n", "kv" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Onto Enrich\n", "\n", "From here, the next step is to enrich data using the retrieved variables. As of the 2.0.1 release of the ArcGIS API for Python, you can pass this data frame directly into the `enrich` function." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Enrich Study Area Polygons" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 650 entries, 0 to 649\n", "Data columns (total 2 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 LOCNUM 650 non-null object \n", " 1 SHAPE 650 non-null geometry\n", "dtypes: geometry(1), object(1)\n", "memory usage: 10.3+ KB\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
LOCNUMSHAPE
0105550909{\"rings\": [[[-13659712.491841972, 5689050.6789...
1105633002{\"rings\": [[[-13683812.381583042, 5712092.0429...
2105759815{\"rings\": [[[-13655812.509662066, 5706201.2159...
3177692910{\"rings\": [[[-13651962.527299367, 5690511.2941...
4180308389{\"rings\": [[[-13640787.578395057, 5708949.7851...
\n", "
" ], "text/plain": [ " LOCNUM SHAPE\n", "0 105550909 {\"rings\": [[[-13659712.491841972, 5689050.6789...\n", "1 105633002 {\"rings\": [[[-13683812.381583042, 5712092.0429...\n", "2 105759815 {\"rings\": [[[-13655812.509662066, 5706201.2159...\n", "3 177692910 {\"rings\": [[[-13651962.527299367, 5690511.2941...\n", "4 180308389 {\"rings\": [[[-13640787.578395057, 5708949.7851..." ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "study_area_df = demo_data.pdx_coffee_study_areas_3min.df\n", "\n", "study_area_df.info()\n", "study_area_df.head()" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 650 entries, 0 to 649\n", "Data columns (total 43 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 locnum 650 non-null object \n", " 1 has_data 650 non-null int32 \n", " 2 aggregation_method 650 non-null object \n", " 3 totpop_cy 650 non-null float64 \n", " 4 gqpop_cy 650 non-null float64 \n", " 5 divindx_cy 650 non-null float64 \n", " 6 tothh_cy 650 non-null float64 \n", " 7 avghhsz_cy 650 non-null float64 \n", " 8 medhinc_cy 650 non-null float64 \n", " 9 avghinc_cy 650 non-null float64 \n", " 10 pci_cy 650 non-null float64 \n", " 11 tothu_cy 650 non-null float64 \n", " 12 owner_cy 650 non-null float64 \n", " 13 renter_cy 650 non-null float64 \n", " 14 vacant_cy 650 non-null float64 \n", " 15 medval_cy 650 non-null float64 \n", " 16 avgval_cy 650 non-null float64 \n", " 17 popgrwcyfy 650 non-null float64 \n", " 18 hhgrwcyfy 650 non-null float64 \n", " 19 famgrwcyfy 650 non-null float64 \n", " 20 mhigrwcyfy 650 non-null float64 \n", " 21 pcigrwcyfy 650 non-null float64 \n", " 22 dpop_cy 650 non-null float64 \n", " 23 dpopwrk_cy 650 non-null float64 \n", " 24 dpopres_cy 650 non-null float64 \n", " 25 gqpop_cy_p 650 non-null float64 \n", " 26 avghhsz_cy_i 650 non-null int32 \n", " 27 
medhinc_cy_i 650 non-null int32 \n", " 28 avghinc_cy_i 650 non-null int32 \n", " 29 pci_cy_i 650 non-null int32 \n", " 30 owner_cy_p 650 non-null float64 \n", " 31 renter_cy_p 650 non-null float64 \n", " 32 vacant_cy_p 650 non-null float64 \n", " 33 medval_cy_i 650 non-null int32 \n", " 34 avgval_cy_i 650 non-null int32 \n", " 35 popgrwcyfy_i 650 non-null int32 \n", " 36 hhgrwcyfy_i 650 non-null int32 \n", " 37 famgrwcyfy_i 650 non-null int32 \n", " 38 mhigrwcyfy_i 650 non-null int32 \n", " 39 pcigrwcyfy_i 650 non-null int32 \n", " 40 dpopwrk_cy_p 650 non-null float64 \n", " 41 dpopres_cy_p 650 non-null float64 \n", " 42 SHAPE 650 non-null geometry\n", "dtypes: float64(28), geometry(1), int32(12), object(2)\n", "memory usage: 188.0+ KB\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
locnumhas_dataaggregation_methodtotpop_cygqpop_cydivindx_cytothh_cyavghhsz_cymedhinc_cyavghinc_cy...medval_cy_iavgval_cy_ipopgrwcyfy_ihhgrwcyfy_ifamgrwcyfy_imhigrwcyfy_ipcigrwcyfy_idpopwrk_cy_pdpopres_cy_pSHAPE
01055509091BlockApportionment:US.BlockGroups;PointsLayer:...6138.0143.041.92575.02.33119103.0176581.0...2311914082942751097973.5426.46{\"rings\": [[[-122.70728508062291, 45.427343047...
11056330021BlockApportionment:US.BlockGroups;PointsLayer:...39.00.061.013.03.00111816.0135177.0...163130-208004612199.090.91{\"rings\": [[[-122.92377807415231, 45.572420302...
21057598151BlockApportionment:US.BlockGroups;PointsLayer:...11643.01335.055.87136.01.4466266.0107837.0...22218130424828930213485.7914.21{\"rings\": [[[-122.6722509453407, 45.5353649659...
31776929101BlockApportionment:US.BlockGroups;PointsLayer:...5949.062.053.52612.02.2555293.075609.0...1471162816-188811850.2349.77{\"rings\": [[[-122.6376659650056, 45.4365507371...
41803083891BlockApportionment:US.BlockGroups;PointsLayer:...12853.0677.068.04924.02.4763848.081311.0...132108-100-119-14613612043.2656.74{\"rings\": [[[-122.53727969104438, 45.552657456...
\n", "

5 rows × 43 columns

\n", "
" ], "text/plain": [ " locnum has_data aggregation_method \\\n", "0 105550909 1 BlockApportionment:US.BlockGroups;PointsLayer:... \n", "1 105633002 1 BlockApportionment:US.BlockGroups;PointsLayer:... \n", "2 105759815 1 BlockApportionment:US.BlockGroups;PointsLayer:... \n", "3 177692910 1 BlockApportionment:US.BlockGroups;PointsLayer:... \n", "4 180308389 1 BlockApportionment:US.BlockGroups;PointsLayer:... \n", "\n", " totpop_cy gqpop_cy divindx_cy tothh_cy avghhsz_cy medhinc_cy \\\n", "0 6138.0 143.0 41.9 2575.0 2.33 119103.0 \n", "1 39.0 0.0 61.0 13.0 3.00 111816.0 \n", "2 11643.0 1335.0 55.8 7136.0 1.44 66266.0 \n", "3 5949.0 62.0 53.5 2612.0 2.25 55293.0 \n", "4 12853.0 677.0 68.0 4924.0 2.47 63848.0 \n", "\n", " avghinc_cy ... medval_cy_i avgval_cy_i popgrwcyfy_i hhgrwcyfy_i \\\n", "0 176581.0 ... 231 191 408 294 \n", "1 135177.0 ... 163 130 -208 0 \n", "2 107837.0 ... 222 181 304 248 \n", "3 75609.0 ... 147 116 28 16 \n", "4 81311.0 ... 132 108 -100 -119 \n", "\n", " famgrwcyfy_i mhigrwcyfy_i pcigrwcyfy_i dpopwrk_cy_p dpopres_cy_p \\\n", "0 275 109 79 73.54 26.46 \n", "1 0 46 121 99.09 0.91 \n", "2 289 302 134 85.79 14.21 \n", "3 -18 88 118 50.23 49.77 \n", "4 -146 136 120 43.26 56.74 \n", "\n", " SHAPE \n", "0 {\"rings\": [[[-122.70728508062291, 45.427343047... \n", "1 {\"rings\": [[[-122.92377807415231, 45.572420302... \n", "2 {\"rings\": [[[-122.6722509453407, 45.5353649659... \n", "3 {\"rings\": [[[-122.6376659650056, 45.4365507371... \n", "4 {\"rings\": [[[-122.53727969104438, 45.552657456... 
\n", "\n", "[5 rows x 43 columns]" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "study_area_enrich_df = usa.enrich(study_area_df, enrich_variables=kv)\n", "\n", "study_area_enrich_df.info()\n", "study_area_enrich_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Enrich Point Locations" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 650 entries, 0 to 649\n", "Data columns (total 2 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 LOCNUM 650 non-null object \n", " 1 SHAPE 650 non-null geometry\n", "dtypes: geometry(1), object(1)\n", "memory usage: 10.3+ KB\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
LOCNUMSHAPE
0105550909{\"x\": -13661187.466078103, \"y\": 5686302.261006...
1105633002{\"x\": -13683283.32746541, \"y\": 5709438.5596519...
2105759815{\"x\": -13656422.54659419, \"y\": 5704599.4372188...
3177692910{\"x\": -13651479.293286022, \"y\": 5687492.553844...
4180308389{\"x\": -13640908.50575978, \"y\": 5706009.5641631...
\n", "
" ], "text/plain": [ " LOCNUM SHAPE\n", "0 105550909 {\"x\": -13661187.466078103, \"y\": 5686302.261006...\n", "1 105633002 {\"x\": -13683283.32746541, \"y\": 5709438.5596519...\n", "2 105759815 {\"x\": -13656422.54659419, \"y\": 5704599.4372188...\n", "3 177692910 {\"x\": -13651479.293286022, \"y\": 5687492.553844...\n", "4 180308389 {\"x\": -13640908.50575978, \"y\": 5706009.5641631..." ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pt_df = demo_data.pdx_coffee_locations.df\n", "\n", "pt_df.info()\n", "pt_df.head()" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
namealiasdescriptiontypeimpedanceimpedance_categorytime_attribute_namedistance_attribute_name
0driving_timeDriving TimeModels the movement of cars and other similar ...AUTOMOBILETravelTimetemporalTravelTimeKilometers
1driving_distanceDriving DistanceModels the movement of cars and other similar ...AUTOMOBILEKilometersdistanceTravelTimeKilometers
2trucking_timeTrucking TimeModels basic truck travel by preferring design...TRUCKTruckTravelTimetemporalTruckTravelTimeKilometers
3trucking_distanceTrucking DistanceModels basic truck travel by preferring design...TRUCKKilometersdistanceTruckTravelTimeKilometers
4walking_timeWalking TimeFollows paths and roads that allow pedestrian ...WALKWalkTimetemporalWalkTimeKilometers
5walking_distanceWalking DistanceFollows paths and roads that allow pedestrian ...WALKKilometersdistanceWalkTimeKilometers
6rural_driving_timeRural Driving TimeModels the movement of cars and other similar ...AUTOMOBILETravelTimetemporalTravelTimeKilometers
7rural_driving_distanceRural Driving DistanceModels the movement of cars and other similar ...AUTOMOBILEKilometersdistanceTravelTimeKilometers
\n", "
" ], "text/plain": [ " name alias \\\n", "0 driving_time Driving Time \n", "1 driving_distance Driving Distance \n", "2 trucking_time Trucking Time \n", "3 trucking_distance Trucking Distance \n", "4 walking_time Walking Time \n", "5 walking_distance Walking Distance \n", "6 rural_driving_time Rural Driving Time \n", "7 rural_driving_distance Rural Driving Distance \n", "\n", " description type \\\n", "0 Models the movement of cars and other similar ... AUTOMOBILE \n", "1 Models the movement of cars and other similar ... AUTOMOBILE \n", "2 Models basic truck travel by preferring design... TRUCK \n", "3 Models basic truck travel by preferring design... TRUCK \n", "4 Follows paths and roads that allow pedestrian ... WALK \n", "5 Follows paths and roads that allow pedestrian ... WALK \n", "6 Models the movement of cars and other similar ... AUTOMOBILE \n", "7 Models the movement of cars and other similar ... AUTOMOBILE \n", "\n", " impedance impedance_category time_attribute_name \\\n", "0 TravelTime temporal TravelTime \n", "1 Kilometers distance TravelTime \n", "2 TruckTravelTime temporal TruckTravelTime \n", "3 Kilometers distance TruckTravelTime \n", "4 WalkTime temporal WalkTime \n", "5 Kilometers distance WalkTime \n", "6 TravelTime temporal TravelTime \n", "7 Kilometers distance TravelTime \n", "\n", " distance_attribute_name \n", "0 Kilometers \n", "1 Kilometers \n", "2 Kilometers \n", "3 Kilometers \n", "4 Kilometers \n", "5 Kilometers \n", "6 Kilometers \n", "7 Kilometers " ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "usa.travel_modes" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 650 entries, 0 to 649\n", "Data columns (total 47 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 locnum 650 non-null object \n", " 1 has_data 650 non-null int32 \n", " 2 area_type 
650 non-null object \n", " 3 buffer_units 650 non-null object \n", " 4 buffer_unit 650 non-null object \n", " 5 buffer_radii 650 non-null float64 \n", " 6 aggregation_method 650 non-null object \n", " 7 totpop_cy 650 non-null float64 \n", " 8 gqpop_cy 650 non-null float64 \n", " 9 divindx_cy 650 non-null float64 \n", " 10 tothh_cy 650 non-null float64 \n", " 11 avghhsz_cy 650 non-null float64 \n", " 12 medhinc_cy 650 non-null float64 \n", " 13 avghinc_cy 650 non-null float64 \n", " 14 pci_cy 650 non-null float64 \n", " 15 tothu_cy 650 non-null float64 \n", " 16 owner_cy 650 non-null float64 \n", " 17 renter_cy 650 non-null float64 \n", " 18 vacant_cy 650 non-null float64 \n", " 19 medval_cy 650 non-null float64 \n", " 20 avgval_cy 650 non-null float64 \n", " 21 popgrwcyfy 650 non-null float64 \n", " 22 hhgrwcyfy 650 non-null float64 \n", " 23 famgrwcyfy 650 non-null float64 \n", " 24 mhigrwcyfy 650 non-null float64 \n", " 25 pcigrwcyfy 650 non-null float64 \n", " 26 dpop_cy 650 non-null float64 \n", " 27 dpopwrk_cy 650 non-null float64 \n", " 28 dpopres_cy 650 non-null float64 \n", " 29 gqpop_cy_p 650 non-null float64 \n", " 30 avghhsz_cy_i 650 non-null int32 \n", " 31 medhinc_cy_i 650 non-null int32 \n", " 32 avghinc_cy_i 650 non-null int32 \n", " 33 pci_cy_i 650 non-null int32 \n", " 34 owner_cy_p 650 non-null float64 \n", " 35 renter_cy_p 650 non-null float64 \n", " 36 vacant_cy_p 650 non-null float64 \n", " 37 medval_cy_i 650 non-null int32 \n", " 38 avgval_cy_i 650 non-null int32 \n", " 39 popgrwcyfy_i 650 non-null int32 \n", " 40 hhgrwcyfy_i 650 non-null int32 \n", " 41 famgrwcyfy_i 650 non-null int32 \n", " 42 mhigrwcyfy_i 650 non-null int32 \n", " 43 pcigrwcyfy_i 650 non-null int32 \n", " 44 dpopwrk_cy_p 650 non-null float64 \n", " 45 dpopres_cy_p 650 non-null float64 \n", " 46 SHAPE 650 non-null geometry\n", "dtypes: float64(29), geometry(1), int32(12), object(5)\n", "memory usage: 208.3+ KB\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
locnumhas_dataarea_typebuffer_unitsbuffer_unitbuffer_radiiaggregation_methodtotpop_cygqpop_cydivindx_cy...medval_cy_iavgval_cy_ipopgrwcyfy_ihhgrwcyfy_ifamgrwcyfy_imhigrwcyfy_ipcigrwcyfy_idpopwrk_cy_pdpopres_cy_pSHAPE
01055509091Driving TimeMinutesMinutes3.0BlockApportionment:US.BlockGroups;PointsLayer:...4132.040.040.5...2301984443102861178275.6724.33{\"x\": -122.72053500019666, \"y\": 45.41001299996...
11056330021Driving TimeMinutesMinutes3.0BlockApportionment:US.BlockGroups;PointsLayer:...223.00.078.6...230190-36-77-11102795.984.02{\"x\": -122.91902550031074, \"y\": 45.55573200030...
21057598151Driving TimeMinutesMinutes3.0BlockApportionment:US.BlockGroups;PointsLayer:...13632.01818.056.0...21317441237734637414286.6813.32{\"x\": -122.67773100005218, \"y\": 45.52528499988...
31776929101Driving TimeMinutesMinutes3.0BlockApportionment:US.BlockGroups;PointsLayer:...11053.0145.048.6...1511271201005715911444.0255.98{\"x\": -122.63332500012554, \"y\": 45.41751899971...
41803083891Driving TimeMinutesMinutes3.0BlockApportionment:US.BlockGroups;PointsLayer:...11636.0669.067.6...133108-104-126-15015012143.1556.85{\"x\": -122.53836600036125, \"y\": 45.53415900023...
\n", "

5 rows × 47 columns

\n", "
" ], "text/plain": [ " locnum has_data area_type buffer_units buffer_unit buffer_radii \\\n", "0 105550909 1 Driving Time Minutes Minutes 3.0 \n", "1 105633002 1 Driving Time Minutes Minutes 3.0 \n", "2 105759815 1 Driving Time Minutes Minutes 3.0 \n", "3 177692910 1 Driving Time Minutes Minutes 3.0 \n", "4 180308389 1 Driving Time Minutes Minutes 3.0 \n", "\n", " aggregation_method totpop_cy gqpop_cy \\\n", "0 BlockApportionment:US.BlockGroups;PointsLayer:... 4132.0 40.0 \n", "1 BlockApportionment:US.BlockGroups;PointsLayer:... 223.0 0.0 \n", "2 BlockApportionment:US.BlockGroups;PointsLayer:... 13632.0 1818.0 \n", "3 BlockApportionment:US.BlockGroups;PointsLayer:... 11053.0 145.0 \n", "4 BlockApportionment:US.BlockGroups;PointsLayer:... 11636.0 669.0 \n", "\n", " divindx_cy ... medval_cy_i avgval_cy_i popgrwcyfy_i hhgrwcyfy_i \\\n", "0 40.5 ... 230 198 444 310 \n", "1 78.6 ... 230 190 -36 -77 \n", "2 56.0 ... 213 174 412 377 \n", "3 48.6 ... 151 127 120 100 \n", "4 67.6 ... 133 108 -104 -126 \n", "\n", " famgrwcyfy_i mhigrwcyfy_i pcigrwcyfy_i dpopwrk_cy_p dpopres_cy_p \\\n", "0 286 117 82 75.67 24.33 \n", "1 -111 0 27 95.98 4.02 \n", "2 346 374 142 86.68 13.32 \n", "3 57 159 114 44.02 55.98 \n", "4 -150 150 121 43.15 56.85 \n", "\n", " SHAPE \n", "0 {\"x\": -122.72053500019666, \"y\": 45.41001299996... \n", "1 {\"x\": -122.91902550031074, \"y\": 45.55573200030... \n", "2 {\"x\": -122.67773100005218, \"y\": 45.52528499988... \n", "3 {\"x\": -122.63332500012554, \"y\": 45.41751899971... \n", "4 {\"x\": -122.53836600036125, \"y\": 45.53415900023... 
\n", "\n", "[5 rows x 47 columns]" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Enrich each point using a 3-minute drive-time area generated around it\n", "enrich_pt_df = usa.enrich(pt_df, kv, proximity_type='driving_time', proximity_metric='minutes', proximity_value=3)\n", "\n", "enrich_pt_df.info()\n", "enrich_pt_df.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.11" } }, "nbformat": 4, "nbformat_minor": 4 }