ModelingAccessor (mdl)

The ModelingAccessor object (df.mdl), a Pandas DataFrame accessor, is likely to be one of the most frequently used objects in this package. It is rarely, if ever, created directly; instead, it is accessed as a property of a Spatially Enabled DataFrame.

from modeling import Country

brand_name = 'ace hardware'

# start by creating a country object instance
usa = Country('USA')

# get a geography to work with from locally installed data
aoi_df = usa.cbsas.get('seattle')

# use the Modeling accessor to get block groups in the AOI
bg_df = aoi_df.mdl.block_groups.get()

# get the brand business locations
biz_df = aoi_df.mdl.business.get_by_name(brand_name)

# get the competition locations
comp_df = aoi_df.mdl.business.get_competition(biz_df, local_threshold=3)

# get current year key variables for enrichment
e_vars = usa.enrich_variables
key_vars = e_vars[
    (e_vars.data_collection.str.startswith('Key'))
    & (e_vars.name.str.endswith('CY'))
]

# use the Modeling accessor to now enrich the block groups
enrich_df = bg_df.mdl.enrich(key_vars)

# get the drive distance and drive time to nearest three brand store locations for each block group
bg_near_biz_df = enrich_df.mdl.proximity.get_nearest(
    biz_df,
    origin_id_column='ID',
    destination_count=3,
    near_prefix='brand'
)

# now, do the same for the nearest six competitor locations
bg_near_biz_comp_df = bg_near_biz_df.mdl.proximity.get_nearest(
    comp_df,
    origin_id_column='ID',
    near_prefix='comp',
    destination_count=6,
    destination_columns_to_keep=['brand_name', 'brand_name_category']
)
class modeling.ModelingAccessor(obj)

The ModelingAccessor is a Pandas DataFrame accessor, a standalone namespace for accessing geographic modeling functionality. If the DataFrame was created using a Country object, the Modeling (mdl) namespace is available automatically. If the DataFrame was not created using a Country object, you must import arcgis.modeling.Modeling to make this functionality available.
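
Pandas attaches an accessor namespace when the class that defines it is imported, which is why the import matters for DataFrames created some other way. The following is a minimal sketch of that mechanism using pandas' public extension API. The accessor name `demo`, the class `DemoAccessor` and its method are hypothetical and are not part of this package.

```python
import pandas as pd


# The decorator runs when this module is imported and attaches the
# namespace to every DataFrame. The name 'demo' and this class are
# hypothetical; they only illustrate how an accessor such as 'mdl'
# could become available after an import.
@pd.api.extensions.register_dataframe_accessor("demo")
class DemoAccessor:
    def __init__(self, pandas_obj: pd.DataFrame) -> None:
        # pandas passes in the DataFrame the accessor was reached from
        self._obj = pandas_obj

    def row_count(self) -> int:
        """Return the number of rows in the parent DataFrame."""
        return len(self._obj)


df = pd.DataFrame({"ID": ["a", "b", "c"]})
n_rows = df.demo.row_count()
```

Until the module defining the accessor has been imported, `df.demo` raises an AttributeError, which is the behavior described above for DataFrames not created through a Country object.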

enrich(enrich_variables=None, country=None)

Enrich the DataFrame using the provided enrich variable list.

Parameters
  • enrich_variables (Union[list, array, Series, DataFrame, None]) – List of data variables for enrichment. This can optionally be a filtered subset of the enrich_variables dataframe property of a Country object instance.

  • country (Optional[Country]) – Optional Country object instance. This must be included if the parent dataframe was not created using this package’s standard geography methods, or if the enrichment variables are not defined by passing in an enrich variables dataframe created using this package’s introspection methods.

Return type

DataFrame

Returns

pd.DataFrame with enriched data.

from pathlib import Path

from arcgis import GeoAccessor
from modeling import Country, ModelingAccessor
import pandas as pd

# get a path to the trade area data
# (assumes data.gdb sits alongside this script)
prj_pth = Path(__file__).parent
gdb_pth = prj_pth/'data.gdb'
fc_pth = gdb_pth/'trade_areas'

# load the trade areas into a Spatially Enabled DataFrame
ta_df = pd.DataFrame.spatial.from_featureclass(fc_pth)

# create a country object instance
usa = Country('USA', source='local')

# get all the available enrichment variables
e_vars = usa.enrich_variables

# filter to just the current year key variables
key_vars = e_vars[(e_vars.data_collection.str.startswith('Key')) &
                  (e_vars.name.str.endswith('CY'))]

# enrich the Spatially Enabled DataFrame
ta_df = ta_df.mdl.enrich(key_vars)
get_nearest(destination_dataframe, source=None, single_row_per_origin=True, origin_id_column='LOCNUM', destination_id_column='LOCNUM', destination_count=4, near_prefix=None, destination_columns_to_keep=None)

Create a closest destination dataframe from a destination Spatially Enabled DataFrame relative to the parent Spatially Enabled DataFrame, keeping each origin and destination pair in a discrete row instead of collapsing to a single row per origin. This is mainly useful when the geometry is needed for visualization.

Parameters
  • destination_dataframe (DataFrame) – Destination points in one of the supported input formats.

  • source (Union[str, Path, Country, GIS, None]) – Optional - Either the path to the network dataset, the Country object associated with the Business Analyst source being used, or a GIS object instance. If invoked from a dataframe created for a country’s standard geography levels using the mdl accessor, get_nearest will use the parent country properties to determine how to perform the network solve.

  • single_row_per_origin (bool) – Optional - Whether or not to pivot the results to return only one row for each origin location. Default is True.

  • origin_id_column (str) – Optional - Column in the origin points Spatially Enabled Dataframe uniquely identifying each feature. Default is ‘LOCNUM’.

  • destination_id_column (str) – Optional - Column in the destination points Spatially Enabled Dataframe uniquely identifying each feature. Default is ‘LOCNUM’.

  • destination_count (int) – Optional - Integer number of destinations to search for from every origin point. Default is 4.

  • near_prefix (Optional[str]) – String prefix to prepend onto near column names in the output.

  • destination_columns_to_keep (Union[str, list, None]) – List of columns to keep in the output. Commonly, if businesses, this includes the column with the business names.

Return type

DataFrame

Returns

Spatially Enabled DataFrame with a row for each origin id, and metrics for each of the n nearest destinations.

level(geographic_level)

Retrieve a Spatially Enabled DataFrame of geometries corresponding to the index returned by the Country.geography_levels property. This is most useful when retrieving the lowest, most granular, level of geography within a country.

Parameters

geographic_level (int) – Integer referencing the index of the geographic level desired.

Return type

GeographyLevel

Returns

GeographyLevel object instance

from modeling import Country

# create an instance of the country object
cntry = Country('USA')

# the get function returns a dataframe with the 'mdl' property
metro_df = cntry.cbsas.get('seattle')

# level returns a GeographyLevel object enabling getting all geography_levels
# falling within the parent dataframe
lvl_df = metro_df.mdl.level(0).get()
project(output_spatial_reference=4326)

Project to a new spatial reference, applying an applicable transformation if necessary.

Parameters

output_spatial_reference (Union[SpatialReference, int]) – Optional - The output spatial reference. Default is 4326 (WGS84).

Returns

Spatially Enabled DataFrame projected to the new spatial reference.

Business

class modeling.Business(mdl)

The Business object provides a way to search for businesses of your own brand for analysis and, more importantly, to find competitor locations so the effects of competition can be modeled as well. The Business object is accessed as a property of the ModelingAccessor (df.mdl.business).

from modeling import Country

# start by creating a country object instance
usa = Country('USA')

# get a geography to work with from locally installed data
aoi_df = usa.cbsas.get('Seattle')

# get all Ace Hardware locations
brnd_df = aoi_df.mdl.business.get_by_name('Ace Hardware')

# get all competitors for Ace Hardware in Seattle using the
# template of the brand dataframe
comp_df = aoi_df.mdl.business.get_competition(brnd_df)

# ...or get competitors by using the same search term
comp_df = aoi_df.mdl.business.get_competition('Ace Hardware')
calculate_brand_name_category(local_threshold=0, inplace=False)

For the output of any Business.get* function, calculate a column named ‘brand_name_category’. This function is frequently used to recalculate the category identifying unique local retailers and group them collectively into a ‘local_brand’. This is useful in markets where there is a distinct preference for local retailers, as is often the case for specialty coffee shops in many urban markets. While this is performed automatically for the ‘get_by_code’ and ‘get_competition’ methods, this function enables you to recalculate it if you need to massage some of the brand name outputs.

Parameters
  • local_threshold (int) – Integer count below which a brand name will be considered a local brand.

  • inplace (bool) – Boolean indicating if the dataframe should be modified in place, or a new one created and returned. The default is False, so the original dataframe is not inadvertently modified.

Return type

Optional[DataFrame]

Returns

Pandas Spatially Enabled DataFrame of store locations with the updated column if inplace is False. Otherwise, returns None.
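
The effect of local_threshold can be sketched in plain Python. This is an illustration of the rule as described in the parameter list (a brand with fewer locations than the threshold is treated as local), not the package's implementation. The function name `categorize_brands` is hypothetical.

```python
from collections import Counter


def categorize_brands(brand_names, local_threshold=0):
    """Label brands with fewer than local_threshold locations as 'local_brand'.

    Hypothetical helper; it mirrors the documented rule, not the real code.
    """
    counts = Counter(brand_names)
    return [
        name if counts[name] >= local_threshold else "local_brand"
        for name in brand_names
    ]


names = ["TRUE VALUE", "TRUE VALUE", "TRUE VALUE", "TWEEDY & POPP HARDWARE"]
categories = categorize_brands(names, local_threshold=3)
```

With the default threshold of 0, no count can fall below the threshold, so every brand keeps its own name.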

from arcgis.gis import GIS
from modeling import Country

# connect to a Web GIS
gis = GIS('https://path.to.arcgis.enterprise.com/portal',
          username='batman', password='P3nnyw0rth!')

# create a country object instance
cntry = Country('USA', source=gis)

# create an area of interest
aoi_df = cntry.cbsas.get('Seattle')

# use this area of interest to get brand locations
brand_df = aoi_df.mdl.business.get_by_name('Ace Hardware')

# get competitors and categorize all brands with less than
# three locations as a local brand
comp_df = aoi_df.mdl.business.get_competition(brand_df, local_threshold=3)

# with hardware stores, each True Value has a unique name,
# so it helps to rename these to be correctly recognized
# as a brand of stores
replace_lst = [
    ('TRUE VALUE|TRUE VL', 'TRUE VALUE'),
    ('MC LENDON|MCLENDON', 'MCLENDON HARDWARE')
]
for repl in replace_lst:
    brand_filter = comp_df.brand_name.str.contains(repl[0], regex=True)
    comp_df.loc[brand_filter, 'brand_name'] = repl[1]

# now, with the True Values renamed, we need to recalculate which
# locations are actually local brands
comp_df.mdl.business.calculate_brand_name_category(3, inplace=True)

The output of comp_df.head() from the above sample looks similar to the following.

LOCNUM

CONAME

NAICSDESC

NAICS

SIC

SOURCE

PUBPRV

FRNCOD

ISCODE

CITY

ZIP

STATE

SHAPE

location_id

brand_name

brand_name_category

0

002890986

MC LENDON HARDWARE

HARDWARE-RETAIL

44413005

525104

INFOGROUP

SUMNER

98390

WA

{‘x’: -122.242365, ‘y’: 47.2046040000001, ‘spatialReference’: {‘wkid’: 4326}}

002890986

MCLENDON HARDWARE

MCLENDON HARDWARE

1

006128854

MCLENDON HARDWARE INC

HARDWARE-RETAIL

44413005

525104

INFOGROUP

RENTON

98057

WA

{‘x’: -122.2140195, ‘y’: 47.477943, ‘spatialReference’: {‘wkid’: 4326}}

006128854

MCLENDON HARDWARE

MCLENDON HARDWARE

2

174245191

DUVALL TRUE VALUE HARDWARE

HARDWARE-RETAIL

44413005

525104

INFOGROUP

2

DUVALL

98019

WA

{‘x’: -121.9853835, ‘y’: 47.738907, ‘spatialReference’: {‘wkid’: 4326}}

174245191

TRUE VALUE

TRUE VALUE

3

174262691

GATEWAY TRUE VALUE HARDWARE

HARDWARE-RETAIL

44413005

525104

INFOGROUP

2

ENUMCLAW

98022

WA

{‘x’: -121.9876155, ‘y’: 47.2019940000001, ‘spatialReference’: {‘wkid’: 4326}}

174262691

TRUE VALUE

TRUE VALUE

4

174471722

TWEEDY & POPP HARDWARE

HARDWARE-RETAIL

44413005

525104

INFOGROUP

2

SEATTLE

98103

WA

{‘x’: -122.3357134, ‘y’: 47.6612959300001, ‘spatialReference’: {‘wkid’: 4326}}

174471722

TWEEDY & POPP HARDWARE

local_brand

drop_by_id(drop_dataframe, source_id_column='location_id', drop_id_column='location_id')

Drop values from the parent dataframe based on unique identifiers found in another dataframe. This is a common task when removing brand locations from a dataframe of all locations to create a dataframe of only competitors.

Parameters
  • drop_dataframe (DataFrame) – Required Pandas DataFrame with a unique identifier column. Values in this column will be used to identify and remove values from the dataframe.

  • source_id_column (str) – Optional string for the column in the original dataframe with values to be used for identifying rows to either drop or retain. Default is ‘location_id’.

  • drop_id_column (str) – Optional string for the column in the drop dataframe with values to be used for identifying rows to drop or retain. Default is ‘location_id’.

Return type

DataFrame

Returns

Pandas DataFrame with rows removed based on common identifier values.
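
Conceptually, drop_by_id is an anti-join on an identifier column. The sketch below shows the idea with plain lists of dictionaries standing in for DataFrames. The standalone function and the sample identifiers are hypothetical, and the real method operates on the parent DataFrame through the accessor.

```python
def drop_by_id(rows, drop_rows, source_id_column="location_id",
               drop_id_column="location_id"):
    """Return the rows whose identifier does not appear in drop_rows."""
    # collect the identifiers to remove once, so each lookup is cheap
    drop_ids = {row[drop_id_column] for row in drop_rows}
    return [row for row in rows if row[source_id_column] not in drop_ids]


# hypothetical sample records
all_locations = [
    {"location_id": "001", "brand_name": "ACE HARDWARE"},
    {"location_id": "002", "brand_name": "TRUE VALUE"},
    {"location_id": "003", "brand_name": "MCLENDON HARDWARE"},
]
brand_locations = [{"location_id": "001", "brand_name": "ACE HARDWARE"}]

competitors = drop_by_id(all_locations, brand_locations)
```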

from arcgis.gis import GIS
from modeling import Country

# connect to a Web GIS
gis = GIS('https://path.to.arcgis.enterprise.com/portal',
          username='batman', password='P3nnyw0rth!')

# create a country object instance
cntry = Country('USA', source=gis)

# create an area of interest
aoi_df = cntry.cbsas.get('Seattle')

# use this area of interest to get brand locations
brand_df = aoi_df.mdl.business.get_by_name('Ace Hardware')

# get the top NAICS codes
top_codes = brand_df.mdl.business.get_top_codes()

# truncate the top code retrieved to widen the scope of codes
# retrieved - a broader category
top_code = top_codes.iloc[0]
naics_code = top_code[:-4]

# use this truncated code to retrieve competitors
naics_df = aoi_df.mdl.business.get_by_code(naics_code)

# now, remove the brand locations from the retrieved dataframe
# to retain just the competition
comp_df = naics_df.mdl.business.drop_by_id(brand_df)
get_by_code(category_code, code_type='NAICS', name_column='CONAME', id_column='LOCNUM', local_threshold=0)

Search for businesses based on business category code. In North America, this typically is either the NAICS or SIC code.

Parameters
  • category_code (Union[str, list]) – Required - Business category code, such as 4568843, input as a string. This does not have to be a complete code. The tool matches any category code beginning with the partial code provided.

  • code_type (str) – Optional - The column in the business listing data to search for the input business code. In the United States, this is either NAICS or SIC. The default is NAICS.

  • name_column (str) – Optional - Name of the column with business names to be searched. Default is ‘CONAME’.

  • id_column (str) – Optional - Name of the column with the value uniquely identifying each business location. Default is ‘LOCNUM’.

  • local_threshold (int) – Optional - Number of locations, counted only within the study area, used to categorize each business location either as a major brand, keeping the name, or as a local brand, with ‘local_brand’ in a new column.

Return type

DataFrame

Returns

Spatially Enabled pd.DataFrame
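
Because partial codes match from the beginning, shortening a code widens the search. A minimal sketch of that prefix matching, assuming codes are stored as strings. The function name and every sample record other than the 44413005 code are hypothetical.

```python
def filter_by_code_prefix(rows, category_code, code_column="NAICS"):
    """Keep rows whose code starts with the (possibly partial) category code."""
    prefix = str(category_code)
    return [row for row in rows if str(row[code_column]).startswith(prefix)]


# hypothetical listings; only the first code appears in the sample output
listings = [
    {"CONAME": "MCLENDON HARDWARE INC", "NAICS": "44413005"},
    {"CONAME": "EXAMPLE PAINT STORE", "NAICS": "44412001"},
    {"CONAME": "EXAMPLE GROCERY", "NAICS": "44511003"},
]

hardware = filter_by_code_prefix(listings, "44413")
broader = filter_by_code_prefix(listings, "4441")
```

Here the five-digit prefix keeps one listing, while the four-digit prefix also keeps the second listing.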

from modeling import Country

# start by creating a country object instance
usa = Country('USA', source='local')

# get a geography to work with from locally installed data
aoi_df = usa.cbsas.get('seattle')

# get all the businesses for a NAICS code category
naics_big_df = aoi_df.mdl.business.get_by_code('4441', local_threshold=2)
get_by_name(business_name, name_column='CONAME', id_column='LOCNUM', local_threshold=0)

Search business listings for a specific business name string.

Parameters
  • business_name (str) – String business name to search for.

  • name_column (str) – Optional - Name of the column with business names to be searched. Default is ‘CONAME’

  • id_column (str) – Optional - Name of the column with the value uniquely identifying each business location. Default is ‘LOCNUM’.

  • local_threshold (int) – Number of locations, counted only within the study area, used to categorize each business location either as a major brand, keeping the name, or as a local brand, with ‘local_brand’ in a new column. This enables considering local brands in a market collectively to quantitatively evaluate the power of “buying local.”

Return type

DataFrame

Returns

Spatially Enabled DataFrame of businesses

from modeling import Country

# start by creating a country object instance
usa = Country('USA', source='local')

# get a geography to work with from locally installed data
aoi_df = usa.cbsas.get('seattle')

# get all Ace Hardware locations in Seattle
brand_df = aoi_df.mdl.business.get_by_name('Ace Hardware')
get_competition(brand_businesses, code_column='NAICS', name_column='CONAME', id_column='LOCNUM', local_threshold=0)

Get competitors from previously retrieved business listings.

Parameters
  • brand_businesses (DataFrame) – Previously retrieved business listings.

  • name_column (str) – Optional - Name of the column with business names to be searched. Default is ‘CONAME’

  • code_column (str) – Optional - The column in the data to search for business category codes. Default is ‘NAICS’

  • id_column (str) – Optional - Name of the column with the value uniquely identifying each business location. Default is ‘LOCNUM’.

  • local_threshold (int) – Number of locations, counted only within the study area, used to categorize each business location either as a major brand, keeping the name, or as a local brand, with ‘local_brand’ in a new column.

Return type

DataFrame

Returns

Spatially Enabled DataFrame

from modeling import Country

# start by creating a country object instance
usa = Country('USA', source='local')

# get a geography to work with from locally installed data
aoi_df = usa.cbsas.get('seattle')

# get all competitors for Ace Hardware in Seattle
comp_df = aoi_df.mdl.business.get_competition('Ace Hardware')
get_top_codes(code_type='NAICS', threshold=0.5)

Get the industry identifier codes used to identify MOST of the records in a business DataFrame. This is useful for getting the identifier values to retrieve other business locations to identify competitors.

Parameters
  • code_type (str) – Optional string identifying the industry codes being used. Must be either NAICS or SIC. Default is ‘NAICS’.

  • threshold (float) – Optional float determining what percentage to use as threshold cutoff from input records to select codes from. Default is 0.5. This means the top 50%, or half, of the rows in the DataFrame will be sampled to return the industry code values identifying the locations.

Return type

Series

Returns

Pandas Series of code values.
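
One plausible reading of the threshold is cumulative coverage: take the most frequent codes until together they account for the requested share of rows. The sketch below implements that reading only. It is an assumption about the behavior, not the package's algorithm, and the function name and sample codes are hypothetical.

```python
from collections import Counter


def top_codes(codes, threshold=0.5):
    """Return the most frequent codes that together cover `threshold` of rows.

    Assumed interpretation of the documented threshold, for illustration.
    """
    counts = Counter(codes)
    total = len(codes)
    selected, covered = [], 0
    for code, count in counts.most_common():
        # stop once the selected codes already cover the requested share
        if covered / total >= threshold:
            break
        selected.append(code)
        covered += count
    return selected


# hypothetical codes for ten business locations
codes = ["44413005"] * 6 + ["44413006"] * 3 + ["45299901"]
result = top_codes(codes)
wider = top_codes(codes, threshold=0.9)
```

In this sample the most frequent code covers six of ten rows, so the default threshold returns it alone, while a threshold of 0.9 also pulls in the second code.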

from arcgis.gis import GIS
from modeling import Country

# connect to a Web GIS
gis = GIS('https://path.to.arcgis.enterprise.com/portal',
          username='batman', password='P3nnyw0rth!')

# create a country object instance
cntry = Country('USA', source=gis)

# create an area of interest
aoi_df = cntry.cbsas.get('Minneapolis')

# use this area of interest to get brand locations
brand_df = aoi_df.mdl.business.get_by_name('Ulta Beauty')

# get the top NAICS codes
top_codes = brand_df.mdl.business.get_top_codes()

# truncate the top code retrieved to widen the scope of codes
# retrieved - a broader category
top_code = top_codes.iloc[0]
naics_code = top_code[:-2]

# use this truncated code to retrieve competitors
naics_df = aoi_df.mdl.business.get_by_code(naics_code)

# now, remove the brand locations from the retrieved dataframe
# to retain just the competition
comp_df = naics_df.mdl.business.drop_by_id(brand_df)

Proximity

class modeling.Proximity(mdl)

Provides access to proximity calculation functions.

get_nearest(destination_dataframe, source=None, single_row_per_origin=True, origin_id_column='LOCNUM', destination_id_column='LOCNUM', destination_count=4, near_prefix=None, destination_columns_to_keep=None)

Get the n nearest locations (the default is four), based on drive distance, between two Spatially Enabled DataFrames. If the origins are polygons, the centroids will be used as the start locations. This is useful for getting the nearest store brand locations to every origin block group in a metropolitan area, along with the nearest competition locations to every block group in the same metropolitan area.

Parameters
  • destination_dataframe (DataFrame) – Destination points in one of the supported input formats.

  • source (Union[str, Path, Country, GIS, None]) – Either the path to the network dataset, the Country object associated with the Business Analyst source being used, or a GIS object instance.

  • single_row_per_origin (Optional[bool]) – Optional - Whether or not to pivot the results to return only one row for each origin location. Default is True.

  • origin_id_column (Optional[str]) – Optional - Column in the origin points Spatially Enabled Dataframe uniquely identifying each feature. Default is ‘LOCNUM’.

  • destination_id_column (Optional[str]) – Optional - Column in the destination points Spatially Enabled Dataframe uniquely identifying each feature. Default is ‘LOCNUM’.

  • destination_count (Optional[int]) – Optional - Integer number of destinations to search for from every origin point. Default is 4.

  • near_prefix (Optional[str]) – String prefix to prepend onto near column names in the output.

  • destination_columns_to_keep (Union[str, list, None]) – List of columns to keep in the output. Commonly, if businesses, this includes the column with the business names.

Return type

DataFrame

Returns

Spatially Enabled DataFrame with a row for each origin id, and metrics for each of the n nearest destinations.
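
The single-row-per-origin output can be pictured as ranking the destinations for an origin and spreading the top n across prefixed columns. The sketch below does this with straight-line distance standing in for the drive-distance network solve. The function name, the `xy` key and the output column names are all hypothetical, so the real output schema may differ.

```python
import heapq
import math


def nearest_row(origin, destinations, destination_count=4, near_prefix=None):
    """Build one output row for an origin with its n nearest destinations.

    Straight-line distance replaces the network solve, and the output
    column names are hypothetical.
    """
    prefix = f"{near_prefix}_" if near_prefix else ""
    nearest = heapq.nsmallest(
        destination_count,
        destinations,
        key=lambda dest: math.dist(origin["xy"], dest["xy"]),
    )
    row = {"origin_id": origin["ID"]}
    for rank, dest in enumerate(nearest, start=1):
        row[f"{prefix}destination_id_{rank:02d}"] = dest["LOCNUM"]
        row[f"{prefix}distance_{rank:02d}"] = math.dist(origin["xy"], dest["xy"])
    return row


# hypothetical origin (for example a block group centroid) and stores
block_group = {"ID": "bg-1", "xy": (0.0, 0.0)}
stores = [
    {"LOCNUM": "s1", "xy": (3.0, 4.0)},
    {"LOCNUM": "s2", "xy": (1.0, 0.0)},
    {"LOCNUM": "s3", "xy": (6.0, 8.0)},
]
row = nearest_row(block_group, stores, destination_count=2, near_prefix="brand")
```

Calling the function a second time with a different prefix, such as 'comp', would add a second set of columns without colliding with the first, which is the purpose of near_prefix.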

from modeling import Country

brand_name = 'ace hardware'

# create a country object to work with
usa = Country('USA')

# get a metropolitan area, a CBSA, to use as the study area
aoi_df = usa.cbsas.get('seattle')

# get the current year key variables to use for enrichment
evars = usa.enrich_variables
key_vars = evars[
    (evars.name.str.endswith('CY'))
    & (evars.data_collection.str.lower().str.contains('key'))
].reset_index(drop=True)

# get the block groups and enrich them with the ~20 key variables
bg_df = aoi_df.mdl.level(0).get().mdl.enrich(key_vars)

# get the store brand locations and competition locations
biz_df = aoi_df.mdl.business.get_by_name(brand_name)
comp_df = aoi_df.mdl.business.get_competition(biz_df)

# get the nearest three brand locations to every block group
bg_near_biz = bg_df.mdl.proximity.get_nearest(biz_df,
    origin_id_column='ID', destination_count=3, near_prefix='brand')

# get the nearest six competition locations to every block group
bg_near_df = bg_near_biz.mdl.proximity.get_nearest(comp_df,
    origin_id_column='ID', near_prefix='comp', destination_count=6,
    destination_columns_to_keep=['brand_name', 'brand_name_category'])

Country

The country object is the foundational building block for working with demographic data. This is due to data collection, aggregation and dissemination methods used in Business Analyst. Succinctly, this is how the data is organized.

class modeling.Country(name, source=None, year=None)

Country objects are instantiated by providing the three letter country identifier and optionally also specifying the source. If the source is not explicitly specified Country will use local resources if the environment is part of an ArcGIS Pro installation with Business Analyst enabled and local data installed. If this is not the case, Country will then attempt to use the active GIS object instance if available. Also, if a GIS object is explicitly passed in, this will be used.

Parameters
  • name (str) – Three letter country identifier.

  • source (Union[str, GIS, None]) – Optional - Either ‘local’ or a GIS object instance referencing ArcGIS Online or an ArcGIS Enterprise instance with enrichment configured. If not explicitly specified, Country will first attempt to use locally installed data with ArcGIS Pro and Business Analyst. If this is not available, it will look for an active GIS. If there is no active GIS, a GIS object with permissions to perform enrichment must be explicitly provided.

  • year (Optional[int]) – Optional and only applicable if using local data. In cases where models have been developed against a specific vintage (year) of data, this affords the ability to enrich data for this specific year to support these models.
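
The order in which a source is chosen can be summarized as a small decision function. This is a sketch of the documented order only. The function and its `local_available` and `active_gis` arguments are hypothetical stand-ins for checks the package performs internally.

```python
def resolve_source(source=None, local_available=False, active_gis=None):
    """Pick the data source in the order described for Country.

    Hypothetical helper: `local_available` stands for "ArcGIS Pro with
    Business Analyst and local data installed", and `active_gis` stands
    for an already connected GIS object.
    """
    if source is not None:
        return source          # an explicitly provided source is used as-is
    if local_available:
        return "local"         # otherwise prefer locally installed data
    if active_gis is not None:
        return active_gis      # then fall back to the active GIS
    raise ValueError("No local data or active GIS; provide a GIS explicitly.")


chosen = resolve_source(local_available=True, active_gis="active-gis")
```

Because local data is preferred when nothing is specified, pass the GIS object explicitly when a Web GIS should be used on a machine that also has local data.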

from arcgis.modeling import Country

# instantiate a country
usa = Country('USA', source='local')

# get the seattle CBSA as a study area
aoi_df = usa.cbsas.get('seattle')

# use the Modeling DataFrame accessor to retrieve the block groups in seattle
bg_df = aoi_df.mdl.block_groups.get()

# get the available enrich variables as a DataFrame
e_vars = usa.enrich_variables

# filter the variables to just the current year key variables
key_vars = e_vars[(e_vars.data_collection.str.startswith('Key')) &
                  (e_vars.name.str.endswith('CY'))]

# enrich the data through the Modeling DataFrame accessor
e_df = bg_df.mdl.enrich(key_vars)
add_enrich_aliases(feature_class)

Add human readable aliases to an enriched feature class.

Note

This function requires ArcPy, so it must be run in a Python environment on a Windows operating system with ArcGIS Pro installed.

Parameters

feature_class (Union[Path, str]) – Path to the enriched feature class.

Return type

Path

Returns

Path to feature class with aliases added.

from pathlib import Path

from modeling import Country

out_fc_pth = Path(r'C:/path/to/geodatabase.gdb/block_groups')

# create a country object
cntry = Country('USA')

# get the current year key variables
e_vars = cntry.enrich_variables
key_vars = e_vars[
    (e_vars.data_collection.str.startswith('Key'))
    & (e_vars.name.str.endswith('CY'))
]

# get the block groups for the area of interest
bg_df = cntry.cbsas.get('seattle').mdl.block_groups.get()

# enrich the block groups with the key variables
enrich_df = bg_df.mdl.enrich(key_vars)

# save to a feature class
enrich_fc = enrich_df.spatial.to_featureclass(out_fc_pth)

# finally, add enrich aliases to the output feature class
cntry.add_enrich_aliases(enrich_fc)
enrich(data, enrich_variables)

Enrich a Spatially Enabled DataFrame using enrichment variables defined as a Python list, NumPy array, or Pandas Series of enrich names. A filtered enrich variables Pandas DataFrame can also be used.

Parameters
  • data (DataFrame) – Spatially Enabled DataFrame with geography_levels to be enriched.

  • enrich_variables (Union[list, array, Series, DataFrame]) – Optional iterable of enrich variables to use for enriching data. Filtered output from Country.enrich_variables can also be used.

Return type

DataFrame

Returns

Spatially Enabled DataFrame with enriched data added.

property enrich_variables

DataFrame of all the available enrichment variables.

from arcgis import GIS
from modeling import Country

# connect to an Enterprise GIS instance with Business Analyst installed
gis = GIS('https://mydomain.com/portal', username='geowizard', password='Y3ll0wBr!ck$')

# create a country to work with, using the GIS as the source
cntry = Country('USA', source=gis)

# get the available enrich variables as a DataFrame
e_vars = cntry.enrich_variables

# filter the variables to just the current year key variables
key_vars = e_vars[(e_vars.data_collection.str.startswith('Key')) &
                  (e_vars.name.str.endswith('CY'))]

The key_vars table retrieved using the code sample above will look similar to the following.

|    | name | alias | data_collection | enrich_name | enrich_field_name | description | vintage | units |
|----|------|-------|-----------------|-------------|-------------------|-------------|---------|-------|
| 0  | TOTPOP_CY | 2020 Total Population | KeyUSFacts | KeyUSFacts.TOTPOP_CY | KeyUSFacts_TOTPOP_CY | 2020 Total Population (Esri) | 2020 | count |
| 1  | GQPOP_CY | 2020 Group Quarters Population | KeyUSFacts | KeyUSFacts.GQPOP_CY | KeyUSFacts_GQPOP_CY | 2020 Group Quarters Population (Esri) | 2020 | count |
| 2  | DIVINDX_CY | 2020 Diversity Index | KeyUSFacts | KeyUSFacts.DIVINDX_CY | KeyUSFacts_DIVINDX_CY | 2020 Diversity Index (Esri) | 2020 | count |
| 3  | TOTHH_CY | 2020 Total Households | KeyUSFacts | KeyUSFacts.TOTHH_CY | KeyUSFacts_TOTHH_CY | 2020 Total Households (Esri) | 2020 | count |
| 4  | AVGHHSZ_CY | 2020 Average Household Size | KeyUSFacts | KeyUSFacts.AVGHHSZ_CY | KeyUSFacts_AVGHHSZ_CY | 2020 Average Household Size (Esri) | 2020 | count |
| 5  | MEDHINC_CY | 2020 Median Household Income | KeyUSFacts | KeyUSFacts.MEDHINC_CY | KeyUSFacts_MEDHINC_CY | 2020 Median Household Income (Esri) | 2020 | currency |
| 6  | AVGHINC_CY | 2020 Average Household Income | KeyUSFacts | KeyUSFacts.AVGHINC_CY | KeyUSFacts_AVGHINC_CY | 2020 Average Household Income (Esri) | 2020 | currency |
| 7  | PCI_CY | 2020 Per Capita Income | KeyUSFacts | KeyUSFacts.PCI_CY | KeyUSFacts_PCI_CY | 2020 Per Capita Income (Esri) | 2020 | currency |
| 8  | TOTHU_CY | 2020 Total Housing Units | KeyUSFacts | KeyUSFacts.TOTHU_CY | KeyUSFacts_TOTHU_CY | 2020 Total Housing Units (Esri) | 2020 | count |
| 9  | OWNER_CY | 2020 Owner Occupied HUs | KeyUSFacts | KeyUSFacts.OWNER_CY | KeyUSFacts_OWNER_CY | 2020 Owner Occupied Housing Units (Esri) | 2020 | count |
| 10 | RENTER_CY | 2020 Renter Occupied HUs | KeyUSFacts | KeyUSFacts.RENTER_CY | KeyUSFacts_RENTER_CY | 2020 Renter Occupied Housing Units (Esri) | 2020 | count |
| 11 | VACANT_CY | 2020 Vacant Housing Units | KeyUSFacts | KeyUSFacts.VACANT_CY | KeyUSFacts_VACANT_CY | 2020 Vacant Housing Units (Esri) | 2020 | count |
| 12 | MEDVAL_CY | 2020 Median Home Value | KeyUSFacts | KeyUSFacts.MEDVAL_CY | KeyUSFacts_MEDVAL_CY | 2020 Median Home Value (Esri) | 2020 | currency |
| 13 | AVGVAL_CY | 2020 Average Home Value | KeyUSFacts | KeyUSFacts.AVGVAL_CY | KeyUSFacts_AVGVAL_CY | 2020 Average Home Value (Esri) | 2020 | currency |
| 14 | POPGRW10CY | 2010-2020 Growth Rate: Population | KeyUSFacts | KeyUSFacts.POPGRW10CY | KeyUSFacts_POPGRW10CY | 2010-2020 Population: Annual Growth Rate (Esri) | 2020 | pct |
| 15 | HHGRW10CY | 2010-2020 Growth Rate: Households | KeyUSFacts | KeyUSFacts.HHGRW10CY | KeyUSFacts_HHGRW10CY | 2010-2020 Households: Annual Growth Rate (Esri) | 2020 | pct |
| 16 | FAMGRW10CY | 2010-2020 Growth Rate: Families | KeyUSFacts | KeyUSFacts.FAMGRW10CY | KeyUSFacts_FAMGRW10CY | 2010-2020 Families: Annual Growth Rate (Esri) | 2020 | pct |
| 17 | DPOP_CY | 2020 Total Daytime Population | KeyUSFacts | KeyUSFacts.DPOP_CY | KeyUSFacts_DPOP_CY | 2020 Total Daytime Population (Esri) | 2020 | count |
| 18 | DPOPWRK_CY | 2020 Daytime Pop: Workers | KeyUSFacts | KeyUSFacts.DPOPWRK_CY | KeyUSFacts_DPOPWRK_CY | 2020 Daytime Population: Workers (Esri) | 2020 | count |
| 19 | DPOPRES_CY | 2020 Daytime Pop: Residents | KeyUSFacts | KeyUSFacts.DPOPRES_CY | KeyUSFacts_DPOPRES_CY | 2020 Daytime Population: Residents (Esri) | 2020 | count |
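
The filter that produces key_vars combines two boolean masks, one on data_collection and one on name. A minimal sketch on a small hand-built frame shows how the masks interact. The frame below is a stand-in for Country.enrich_variables with only the two columns the filter needs, and its last two rows are hypothetical.

```python
import pandas as pd

# Stand-in for Country.enrich_variables; the last two rows are hypothetical
# and exist only to show what each mask excludes.
e_vars = pd.DataFrame({
    "name": ["TOTPOP_CY", "POPGRW10CY", "TOTPOP_FY", "EXAMPLE_CY"],
    "data_collection": ["KeyUSFacts", "KeyUSFacts", "KeyUSFacts", "Other"],
})

is_key = e_vars["data_collection"].str.startswith("Key")
is_current_year = e_vars["name"].str.endswith("CY")

# both conditions must hold for a variable to be kept
key_vars = e_vars[is_key & is_current_year]
key_names = key_vars["name"].tolist()
```

Matching on 'CY' rather than '_CY' is what keeps names such as POPGRW10CY, which carry no underscore before the suffix.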

property geography_levels

DataFrame of available geography levels.

get_enrich_variables_dataframe_from_variable_list(enrich_variables, drop_duplicates=True)

Get a dataframe of enrich variables associated with the list of variables passed in. This is especially useful when you need aliases (human readable names), or want to enrich more data using previously enriched data as a template.

Parameters
  • enrich_variables (Union[list, tuple, ndarray, Series]) – Iterable (normally a list) of variables correlating to enrichment variables. These variable names can be simply the name, the name prefixed by the collection separated by a dot, or the output from enrichment in ArcGIS Pro with the field name modified to fit field naming and length constraints.

  • drop_duplicates (bool) – Optional boolean (default True) indicating whether to drop duplicates. Since the same variables appear in multiple data collections, multiple instances of the same variable can be found. Dropping duplicates removes redundant matches.

Return type

DataFrame

Returns

Pandas DataFrame of enrich variables with the different available aliases.
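
The three accepted name forms correspond to the name, enrich_name and enrich_field_name columns of the enrich variables table. The sketch below matches input names against those three forms. It is an illustration rather than the package's implementation: the function name is hypothetical, the case-insensitive comparison is an assumption, and field names truncated by ArcGIS Pro are not handled.

```python
def match_enrich_variables(variables, catalog, drop_duplicates=True):
    """Match names against the name, enrich_name, or enrich_field_name forms."""
    # assumption: compare case-insensitively
    wanted = {str(var).lower() for var in variables}
    matches, seen = [], set()
    for entry in catalog:
        forms = {
            entry["name"].lower(),
            entry["enrich_name"].lower(),
            entry["enrich_field_name"].lower(),
        }
        if not forms & wanted:
            continue  # none of the three forms was requested
        if drop_duplicates and entry["name"] in seen:
            continue  # same variable already found in another collection
        seen.add(entry["name"])
        matches.append(entry)
    return matches


catalog = [
    {"name": "TOTPOP_CY", "enrich_name": "KeyUSFacts.TOTPOP_CY",
     "enrich_field_name": "KeyUSFacts_TOTPOP_CY"},
    {"name": "MEDHINC_CY", "enrich_name": "KeyUSFacts.MEDHINC_CY",
     "enrich_field_name": "KeyUSFacts_MEDHINC_CY"},
]
found = match_enrich_variables(
    ["totpop_cy", "KeyUSFacts_MEDHINC_CY", "OBJECTID"], catalog
)
```

Columns that are not enrichment variables, such as OBJECTID in a feature class field list, simply produce no match.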

from pathlib import Path

import arcpy
from modeling import Country

# path to previously enriched data
enriched_fc_pth = Path(r'C:/path/to/geodatabase.gdb/enriched_data')
new_fc_pth = Path(r'C:/path/to/geodatabase.gdb/block_groups_pdx')

# get a list of column names from previously enriched data
attr_lst = [c.name for c in arcpy.ListFields(str(enriched_fc_pth))]

# get a country to work in
cntry = Country('USA', source='local')

# get dataframe of variables used for previously enriched data
enrich_vars = cntry.get_enrich_variables_dataframe_from_variable_list(attr_lst)

# enrich block groups in new area of interest using the same variables
enrich_df = cntry.cbsas.get('portland-vancouver').mdl.block_groups.get().mdl.enrich(enrich_vars)

# save the new data and add aliases
enrich_df.spatial.to_featureclass(new_fc_pth)
cntry.add_enrich_aliases(new_fc_pth)
level(geography_index)

Get an available geography_level in the country.

Parameters

geography_index (Union[str, int]) – Either the geo_name of the geography level or the index of the geography level. These can be discovered using the Country.geography_levels property.

Return type

DataFrame

Returns

Spatially Enabled DataFrame of the requested geography_level with the Modeling accessor properties initialized.
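
Accepting either a name or an index amounts to a lookup that branches on the argument type. A minimal sketch, assuming the levels are held in an ordered list with a geo_name key. The function name and the sample level names are hypothetical.

```python
def find_level(levels, geography_index):
    """Return a level by integer position or by its geo_name string."""
    if isinstance(geography_index, int):
        return levels[geography_index]
    for level in levels:
        if level["geo_name"] == geography_index:
            return level
    raise KeyError(f"Unknown geography level: {geography_index!r}")


# hypothetical levels, ordered from most to least granular
levels = [
    {"geo_name": "block_groups"},
    {"geo_name": "tracts"},
    {"geo_name": "cbsas"},
]

most_granular = find_level(levels, 0)
metro = find_level(levels, "cbsas")
```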

property levels

Dataframe of available geography levels. (Alias of geography_levels.)

verify_can_enrich()

Check whether the Country instance can perform enrichment based on permissions. Only relevant if the source is a GIS instance.

verify_can_perform_network_analysis(network_function=None)

Check whether the Country instance can perform transportation network analysis based on permissions. Only relevant if the source is a GIS instance.

Parameters

network_function (Optional[str]) – Optional string describing specific network function to check for. Valid values include ‘closestfacility’, ‘locationallocation’, ‘optimizedrouting’, ‘origindestinationcostmatrix’, ‘routing’, ‘servicearea’, or ‘vehiclerouting’.

Return type

bool

Returns

Boolean indicating if the country instance, based on permissions, has network analysis privileges.
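
The check can be thought of as validating the requested function name and then looking for a matching privilege. The sketch below uses the valid values listed above. The privilege string format ('networkanalysis' for blanket access, 'networkanalysis:<function>' for a single function) is an assumption made for illustration, as is the function name.

```python
VALID_NETWORK_FUNCTIONS = {
    "closestfacility", "locationallocation", "optimizedrouting",
    "origindestinationcostmatrix", "routing", "servicearea", "vehiclerouting",
}

# assumed privilege prefix; the real privilege strings may differ
BASE = "networkanalysis"


def can_perform_network_analysis(privileges, network_function=None):
    """Check a collection of privilege strings for network analysis access."""
    if network_function is None:
        # any network analysis privilege at all is enough
        return any(p == BASE or p.startswith(BASE + ":") for p in privileges)
    if network_function not in VALID_NETWORK_FUNCTIONS:
        raise ValueError(f"Unknown network function: {network_function!r}")
    return BASE in privileges or f"{BASE}:{network_function}" in privileges


user_privileges = {"networkanalysis:routing"}
any_access = can_perform_network_analysis(user_privileges)
can_route = can_perform_network_analysis(user_privileges, "routing")
can_service_area = can_perform_network_analysis(user_privileges, "servicearea")
```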