ModelingAccessor (`mdl`)

The ModelingAccessor object (df.mdl), a Pandas DataFrame accessor, is likely to be one of the most frequently used objects in this package. It is rarely, if ever, created directly; instead, it is accessed as a property of a Spatially Enabled DataFrame.

from modeling import Country

brand_name = 'ace hardware'

# start by creating a country object instance
usa = Country('USA')

# get a geography to work with from locally installed data
aoi_df = usa.cbsas.get('seattle')

# use the Modeling accessor to get block groups in the AOI
bg_df = aoi_df.mdl.block_groups.get()

# get the brand business locations
biz_df = aoi_df.mdl.business.get_by_name(brand_name)

# get the competition locations
comp_df = aoi_df.mdl.business.get_competition(biz_df, local_threshold=3)

# get current year key variables for enrichment
e_vars = usa.enrich_variables
key_vars = e_vars[
    (e_vars.data_collection.str.startswith('Key'))
    & (e_vars.name.str.endswith('CY'))
]

# use the Modeling accessor to now enrich the block groups
enrich_df = bg_df.mdl.enrich(key_vars)

# get the drive distance and drive time to nearest three brand store locations for each block group
bg_near_biz_df = enrich_df.mdl.proximity.get_nearest(
    biz_df,
    origin_id_column='ID',
    destination_count=3,
    near_prefix='brand'
)

# now, do the same for the nearest six competitor locations
bg_near_biz_comp_df = bg_near_biz_df.mdl.proximity.get_nearest(
    comp_df,
    origin_id_column='ID',
    near_prefix='comp',
    destination_count=6,
    destination_columns_to_keep=['brand_name', 'brand_name_category']
)

class modeling.ModelingAccessor(obj)

The ModelingAccessor is a Pandas DataFrame accessor, a standalone namespace for accessing geographic modeling functionality. If the DataFrame was created using a Country object, the Modeling (mdl) namespace is automatically available. If the DataFrame was not created using a Country object, you must import arcgis.modeling.Modeling to make this functionality available.
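
The reason an import is enough to make an accessor namespace available can be sketched with the public pandas accessor registration decorator. This is only an illustration of the general mechanism: the accessor name mdl_demo and the DemoAccessor class are hypothetical and are not part of this package.

```python
import pandas as pd


# NOTE: 'mdl_demo' and DemoAccessor are hypothetical names used only for
# illustration; they are not this package's accessor or implementation.
@pd.api.extensions.register_dataframe_accessor("mdl_demo")
class DemoAccessor:
    def __init__(self, obj: pd.DataFrame):
        # pandas passes in the DataFrame the accessor is invoked from
        self._obj = obj

    def row_count(self) -> int:
        """Return the number of rows in the parent DataFrame."""
        return len(self._obj)


# once the class definition has been imported (executed), every DataFrame
# exposes the namespace as a property
df = pd.DataFrame({"ID": ["a", "b"], "value": [1, 2]})
print(df.mdl_demo.row_count())
```

Registration happens as a side effect of executing the decorated class definition, which is why importing the module that defines an accessor is sufficient.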

enrich(enrich_variables=None, country=None)

Enrich the DataFrame using the provided enrich variable list.

Parameters

enrich_variables (Union[list, array, Series, DataFrame, None]) – List of data variables for enrichment. This can optionally be a filtered subset of the DataFrame returned by the enrich_variables property of a Country object instance.
country (Optional[Country]) – Optional Country object instance. This must be included if the parent dataframe was not created using this package’s standard geography methods, or if the enrichment variables are not defined by passing in an enrich variables dataframe created using this package’s introspection methods.

Return type

DataFrame

Returns

pd.DataFrame with enriched data.

from pathlib import Path

from arcgis import GeoAccessor
from modeling import Country, ModelingAccessor
import pandas as pd

# get a path to the trade area data
prj_pth = Path(__file__).parent
gdb_pth = prj_pth/'data.gdb'
fc_pth = gdb_pth/'trade_areas'

# load the trade areas into a Spatially Enabled DataFrame
ta_df = pd.DataFrame.spatial.from_featureclass(fc_pth)

# create a country object instance
usa = Country('USA', source='local')

# get all the available enrichment variables
e_vars = usa.enrich_variables

# filter to just the current year key variables
key_vars = e_vars[(e_vars.data_collection.str.startswith('Key')) &
                  (e_vars.name.str.endswith('CY'))]

# enrich the Spatially Enabled DataFrame
ta_df = ta_df.mdl.enrich(key_vars)

get_nearest(destination_dataframe, source=None, single_row_per_origin=True, origin_id_column='LOCNUM', destination_id_column='LOCNUM', destination_count=4, near_prefix=None, destination_columns_to_keep=None)

Create a closest-destination DataFrame from a destination Spatially Enabled DataFrame relative to the parent Spatially Enabled DataFrame. Setting single_row_per_origin to False keeps each origin and destination pair in a discrete row instead of collapsing the results to a single row per origin, which is mainly useful when the geometry is needed for visualization.

Parameters

destination_dataframe (DataFrame) – Destination points in one of the supported input formats.
source (Union[str, Path, Country, GIS, None]) – Optional - Either the path to the network dataset, the Country object associated with the Business Analyst source being used, or a GIS object instance. If invoked from a dataframe created for a country’s standard geography levels using the mdl accessor, get_nearest will use the parent country properties to ascertain how to perform the network solve.
single_row_per_origin (bool) – Optional - Whether or not to pivot the results to return only one row for each origin location. Default is True.
origin_id_column (str) – Optional - Column in the origin points Spatially Enabled Dataframe uniquely identifying each feature. Default is ‘LOCNUM’.
destination_id_column (str) – Optional - Column in the destination points Spatially Enabled DataFrame uniquely identifying each feature. Default is ‘LOCNUM’.
destination_count (int) – Optional - Integer number of destinations to search for from every origin point. Default is 4.
near_prefix (Optional[str]) – String prefix to prepend onto near column names in the output.
destination_columns_to_keep (Union[str, list, None]) – List of columns to keep in the output. Commonly, if businesses, this includes the column with the business names.

Return type

DataFrame

Returns

Spatially Enabled DataFrame with a row for each origin id, and metrics for each of the n nearest destinations.
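
The effect of single_row_per_origin and near_prefix can be pictured as a long-to-wide pivot. The following is a minimal sketch in plain Python under stated assumptions: the function name, the input row layout, and the output column naming pattern are all hypothetical, and the actual output columns produced by get_nearest may differ.

```python
from collections import defaultdict


def pivot_nearest(rows, near_prefix="near"):
    """Collapse long (origin, rank, destination) rows to one dict per origin.

    The column naming pattern used here is an assumption for illustration.
    """
    wide = defaultdict(dict)
    for row in rows:
        rank = row["rank"]
        record = wide[row["origin_id"]]
        record[f"{near_prefix}_destination_id_{rank:02d}"] = row["destination_id"]
        record[f"{near_prefix}_distance_{rank:02d}"] = row["distance"]
    return [{"origin_id": oid, **cols} for oid, cols in wide.items()]


# made-up long-format results: one row per origin and destination pair
long_rows = [
    {"origin_id": "bg1", "rank": 1, "destination_id": "s7", "distance": 1.2},
    {"origin_id": "bg1", "rank": 2, "destination_id": "s3", "distance": 2.9},
    {"origin_id": "bg2", "rank": 1, "destination_id": "s3", "distance": 0.8},
]

wide_rows = pivot_nearest(long_rows, near_prefix="brand")
for record in wide_rows:
    print(record)
```

The long form keeps one geometry per origin and destination pair, which suits mapping; the wide form yields one feature vector per origin, which suits modeling.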

level(geographic_level)

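Retrieve the geography level corresponding to an index in the DataFrame returned by the Country.geography_levels property. Calling get() on the returned object retrieves a Spatially Enabled DataFrame of the geometries at that level falling within the parent DataFrame. This is most useful when retrieving the lowest, most granular, level of geography within a country.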

Parameters: geographic_level (int) – Integer referencing the index of the geographic level desired.
Return type: GeographyLevel
Returns: GeographyLevel object instance

from dm import Country

# create an instance of the country object
cntry = Country('USA')

# the get function returns a dataframe with the 'mdl' property
metro_df = cntry.cbsas.get('seattle')

# level returns a GeographyLevel object enabling getting all geography_levels
# falling within the parent dataframe
lvl_df = metro_df.mdl.level(0).get()

project(output_spatial_reference=4326)

Project to a new spatial reference, applying an applicable transformation if necessary.

Parameters: output_spatial_reference (Union[SpatialReference, int]) – Optional - The output spatial reference. Default is 4326 (WGS84).
Returns: Spatially Enabled DataFrame projected to the new spatial reference.

Business

class modeling.Business(mdl)

This is a way to search for businesses of your own brand for analysis and, more importantly, for competitor locations, which makes it possible to model the effects of competition. The Business object is accessed as a property of the ModelingAccessor (df.mdl.business).

from modeling import Country

# start by creating a country object instance
usa = Country('USA')

# get a geography to work with from locally installed data
aoi_df = usa.cbsas.get('Seattle')

# get all Ace Hardware locations
brnd_df = aoi_df.mdl.business.get_by_name('Ace Hardware')

# get all competitors for Ace Hardware in Seattle using the
# template of the brand dataframe
comp_df = aoi_df.mdl.business.get_competition(brnd_df)

# ...or get competitors by using the same search term
comp_df = aoi_df.mdl.business.get_competition('Ace Hardware')

calculate_brand_name_category(local_threshold=0, inplace=False)

For the output of any Business.get* function, calculate a column named ‘brand_name_category’. This function is frequently used to recalculate the category identifying unique local retailers and to group them collectively into a ‘local_brand’. This is useful in markets with a distinct preference for local retailers, such as specialty coffee shops in many urban markets. While this is performed automatically for the ‘get_by_code’ and ‘get_competition’ methods, this function enables you to recalculate it if you need to adjust some of the brand name outputs.

Parameters

local_threshold (int) – Integer count below which a brand name will be considered a local brand.
inplace (bool) – Boolean indicating if the dataframe should be modified in place, or a new one created and returned. The default is False to not inadvertently modify the original dataframe.

Return type

Optional[DataFrame]

Returns

Pandas Spatially Enabled DataFrame of store locations with the updated column if inplace is False. Otherwise, returns None.

from arcgis.gis import GIS
from modeling import Country

# connect to a Web GIS
gis = GIS('https://path.to.arcgis.enterprise.com/portal',
          username='batman', password='P3nnyw0rth!')

# create a country object instance
cntry = Country('USA', source=gis)

# create an area of interest
aoi_df = cntry.cbsas.get('Seattle')

# use this area of interest to get brand locations
brand_df = aoi_df.mdl.business.get_by_name('Ace Hardware')

# get competitors and categorize all brands with less than
# three locations as a local brand
comp_df = aoi_df.mdl.business.get_competition(brand_df, local_threshold=3)

# with hardware stores, each True Value has a unique name,
# so it helps to rename these to be correctly recognized
# as a brand of stores
replace_lst = [
    ('TRUE VALUE|TRUE VL', 'TRUE VALUE'),
    ('MC LENDON|MCLENDON', 'MCLENDON HARDWARE')
]
for repl in replace_lst:
    brand_filter = comp_df.brand_name.str.contains(repl[0], regex=True)
    comp_df.loc[brand_filter, 'brand_name'] = repl[1]

# now, with the True Values renamed, we need to recalculate which
# locations are actually local brands
comp_df.mdl.business.calculate_brand_name_category(3, inplace=True)

The output of comp_df.head() from the above sample looks similar to the following.

	LOCNUM	CONAME	NAICSDESC	NAICS	SIC	SOURCE	FRNCOD	CITY	ZIP	STATE	SHAPE	location_id	brand_name	brand_name_category
0	002890986	MC LENDON HARDWARE	HARDWARE-RETAIL	44413005	525104	INFOGROUP		SUMNER	98390	WA	{‘x’: -122.242365, ‘y’: 47.2046040000001, ‘spatialReference’: {‘wkid’: 4326}}	002890986	MCLENDON HARDWARE	MCLENDON HARDWARE
1	006128854	MCLENDON HARDWARE INC	HARDWARE-RETAIL	44413005	525104	INFOGROUP		RENTON	98057	WA	{‘x’: -122.2140195, ‘y’: 47.477943, ‘spatialReference’: {‘wkid’: 4326}}	006128854	MCLENDON HARDWARE	MCLENDON HARDWARE
2	174245191	DUVALL TRUE VALUE HARDWARE	HARDWARE-RETAIL	44413005	525104	INFOGROUP	2	DUVALL	98019	WA	{‘x’: -121.9853835, ‘y’: 47.738907, ‘spatialReference’: {‘wkid’: 4326}}	174245191	TRUE VALUE	TRUE VALUE
3	174262691	GATEWAY TRUE VALUE HARDWARE	HARDWARE-RETAIL	44413005	525104	INFOGROUP	2	ENUMCLAW	98022	WA	{‘x’: -121.9876155, ‘y’: 47.2019940000001, ‘spatialReference’: {‘wkid’: 4326}}	174262691	TRUE VALUE	TRUE VALUE
4	174471722	TWEEDY & POPP HARDWARE	HARDWARE-RETAIL	44413005	525104	INFOGROUP	2	SEATTLE	98103	WA	{‘x’: -122.3357134, ‘y’: 47.6612959300001, ‘spatialReference’: {‘wkid’: 4326}}	174471722	TWEEDY & POPP HARDWARE	local_brand
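
The categorization that local_threshold controls can be sketched in plain Python. This is a minimal sketch that assumes a brand is treated as local when its location count in the study area is strictly below the threshold, as the parameter description suggests; the function name is hypothetical and this is not the package's implementation.

```python
from collections import Counter


def categorize_brand_names(brand_names, local_threshold=0):
    """Replace brand names occurring fewer than local_threshold times with 'local_brand'."""
    counts = Counter(brand_names)
    return [
        "local_brand" if counts[name] < local_threshold else name
        for name in brand_names
    ]


# illustrative brand names after the renaming step shown above
names = [
    "TRUE VALUE", "TRUE VALUE", "TRUE VALUE",
    "MCLENDON HARDWARE", "MCLENDON HARDWARE", "MCLENDON HARDWARE",
    "TWEEDY & POPP HARDWARE",
]

categories = categorize_brand_names(names, local_threshold=3)
print(categories)
```

With the default threshold of 0 no count can fall below the threshold, so every brand name is kept as is.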

drop_by_id(drop_dataframe, source_id_column='location_id', drop_id_column='location_id')

Drop values from the parent dataframe based on unique identifiers found in another dataframe. This is a common task when removing brand locations from a dataframe of all locations to create a dataframe of only competitors.

Parameters

drop_dataframe (DataFrame) – Required Pandas DataFrame with a unique identifier column. Values in this column will be used to identify and remove values from the dataframe.
source_id_column (str) – Optional string for the column in the original dataframe with values to be used for identifying rows to either drop or retain. Default is ‘location_id’.
drop_id_column (str) – Optional string for the column in the drop dataframe with values to be used for identifying rows to drop or retain. Default is ‘location_id’.

Return type

DataFrame

Returns

Pandas DataFrame with rows removed based on common identifier values.

from arcgis.gis import GIS
from modeling import Country

# connect to a Web GIS
gis = GIS('https://path.to.arcgis.enterprise.com/portal',
          username='batman', password='P3nnyw0rth!')

# create a country object instance
cntry = Country('USA', source=gis)

# create an area of interest
aoi_df = cntry.cbsas.get('Seattle')

# use this area of interest to get brand locations
brand_df = aoi_df.mdl.business.get_by_name('Ace Hardware')

# get the top NAICS codes
top_codes = brand_df.mdl.business.get_top_codes()

# truncate the top code retrieved to widen the scope of codes
# retrieved - a broader category
top_code = top_codes.iloc[0]
naics_code = top_code[:-4]

# use this truncated code to retrieve competitors
naics_df = aoi_df.mdl.business.get_by_code(naics_code)

# now, remove the brand locations from the retrieved dataframe
# to retain just the competition
comp_df = naics_df.mdl.business.drop_by_id(brand_df)
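
Conceptually, dropping by identifier is an anti-join on the identifier columns. A minimal pandas sketch of that idea follows; the function name drop_by_id_sketch and the sample identifiers are made up for illustration, and the package's own implementation may differ.

```python
import pandas as pd


def drop_by_id_sketch(source_df, drop_df,
                      source_id_column="location_id",
                      drop_id_column="location_id"):
    """Return rows of source_df whose identifier does not appear in drop_df."""
    drop_ids = drop_df[drop_id_column]
    return source_df[~source_df[source_id_column].isin(drop_ids)]


# made-up listings: two brand locations and one competitor
all_df = pd.DataFrame({
    "location_id": ["001", "002", "003"],
    "brand_name": ["ACE HARDWARE", "TRUE VALUE", "ACE HARDWARE"],
})
brand_df = all_df[all_df["brand_name"] == "ACE HARDWARE"]

# remove the brand locations to leave only the competition
comp_df = drop_by_id_sketch(all_df, brand_df)
print(comp_df)
```

The sketch returns a filtered copy and leaves the source DataFrame unchanged.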

get_by_code(category_code, code_type='NAICS', name_column='CONAME', id_column='LOCNUM', local_threshold=0)

Search for businesses based on business category code. In North America, this typically is either the NAICS or SIC code.

Parameters

category_code (Union[str, list]) – Required Business category code, such as 4568843, input as a string. This does not have to be a complete code. The tool will match any category code beginning with the partial code provided.
code_type (str) – Optional The column in the business listing data to search for the input business code. In the United States, this is either NAICS or SIC. The default is NAICS.
name_column (str) – Optional Name of the column with business names to be searched. Default is ‘CONAME’
id_column (str) – Optional Name of the column with the value uniquely identifying each business location. Default is ‘LOCNUM’.
local_threshold (int) – Optional Number of locations to consider, albeit only in the study area, to categorize each business location as either a major brand, and keep the name, or as a local brand with ‘local_brand’ in a new column.

Return type

DataFrame

Returns

Spatially Enabled pd.DataFrame

from modeling import Country

# start by creating a country object instance
usa = Country('USA', source='local')

# get a geography to work with from locally installed data
aoi_df = usa.cbsas.get('seattle')

# get all the businesses for a NAICS code category
naics_big_df = aoi_df.mdl.business.get_by_code('4441', local_threshold=2)
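
The partial code matching described for category_code behaves like a prefix match on the code column. A minimal sketch in plain Python, assuming codes are compared as strings; the function name and the sample records are hypothetical.

```python
def filter_by_code_prefix(records, category_code, code_column="NAICS"):
    """Keep records whose code starts with the (possibly partial) category code."""
    prefix = str(category_code)
    return [rec for rec in records if str(rec[code_column]).startswith(prefix)]


# made-up business listings; only the codes matter for this illustration
listings = [
    {"CONAME": "MCLENDON HARDWARE INC", "NAICS": "44413005"},
    {"CONAME": "EXAMPLE LUMBER CO", "NAICS": "44419011"},
    {"CONAME": "EXAMPLE CAFE", "NAICS": "72251505"},
]

# a shorter prefix widens the category, a longer prefix narrows it
broad = filter_by_code_prefix(listings, "4441")
narrow = filter_by_code_prefix(listings, "444130")
print(len(broad), len(narrow))
```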

get_by_name(business_name, name_column='CONAME', id_column='LOCNUM', local_threshold=0)

Search business listings for a specific business name string.

Parameters

business_name (str) – String business name to search for.
name_column (str) – Optional - Name of the column with business names to be searched. Default is ‘CONAME’
id_column (str) – Optional - Name of the column with the value uniquely identifying each business location. Default is ‘LOCNUM’.
local_threshold (int) – Number of locations to consider, albeit only in the study area, to categorize each business location as either a major brand, and keep the name, or as a local brand with ‘local_brand’ in a new column. This enables considering local brands in a market collectively to quantitatively evaluate the power of “buying local.”

Return type

DataFrame

Returns

Spatially Enabled DataFrame of businesses

from modeling import Country

# start by creating a country object instance
usa = Country('USA', source='local')

# get a geography to work with from locally installed data
aoi_df = usa.cbsas.get('seattle')

# get all Ace Hardware locations in Seattle
brand_df = aoi_df.mdl.business.get_by_name('Ace Hardware')

get_competition(brand_businesses, code_column='NAICS', name_column='CONAME', id_column='LOCNUM', local_threshold=0)

Get competitors from previously retrieved business listings.

Parameters

brand_businesses (DataFrame) – Previously retrieved business listings.
name_column (str) – Optional - Name of the column with business names to be searched. Default is ‘CONAME’
code_column (str) – Optional - The column in the data to search for business category codes. Default is ‘NAICS’
id_column (str) – Optional - Name of the column with the value uniquely identifying each business location. Default is ‘LOCNUM’.
local_threshold (int) – Number of locations to consider, albeit only in the study area, to categorize each business location as either a major brand, and keep the name, or as a local brand with ‘local_brand’ in a new column.

Return type

DataFrame

Returns

Spatially Enabled DataFrame

from modeling import Country

# start by creating a country object instance
usa = Country('USA', source='local')

# get a geography to work with from locally installed data
aoi_df = usa.cbsas.get('seattle')

# get all competitors for Ace Hardware in Seattle
comp_df = aoi_df.mdl.business.get_competition('Ace Hardware')

get_top_codes(code_type='NAICS', threshold=0.5)

Get the industry identifier codes used to identify MOST of the records in a business DataFrame. This is useful for getting the identifier values to retrieve other business locations to identify competitors.

Parameters

code_type (str) – Optional string identifying the industry codes being used. Must be either NAICS or SIC. Default is ‘NAICS’.
threshold (float) – Optional float determining what percentage to use as threshold cutoff from input records to select codes from. Default is 0.5. This means the top 50%, or half, of the rows in the DataFrame will be sampled to return the industry code values identifying the locations.

Return type

Series

Returns

Pandas Series of code values.

from arcgis.gis import GIS
from modeling import Country

# connect to a Web GIS
gis = GIS('https://path.to.arcgis.enterprise.com/portal',
          username='batman', password='P3nnyw0rth!')

# create a country object instance
cntry = Country('USA', source=gis)

# create an area of interest
aoi_df = cntry.cbsas.get('Minneapolis')

# use this area of interest to get brand locations
brand_df = aoi_df.mdl.business.get_by_name('Ulta Beauty')

# get the top NAICS codes
top_codes = brand_df.mdl.business.get_top_codes()

# truncate the top code retrieved to widen the scope of codes
# retrieved - a broader category
top_code = top_codes.iloc[0]
naics_code = top_code[:-2]

# use this truncated code to retrieve competitors
naics_df = aoi_df.mdl.business.get_by_code(naics_code)

# now, remove the brand locations from the retrieved dataframe
# to retain just the competition
comp_df = naics_df.mdl.business.drop_by_id(brand_df)
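
One plausible reading of the threshold parameter is a cumulative share cutoff: take codes in descending order of frequency until they cover at least the threshold fraction of rows. The sketch below implements that reading in plain Python. It is an assumption about the behavior, not the package's implementation, and the function name and sample codes are made up.

```python
from collections import Counter


def top_codes_sketch(codes, threshold=0.5):
    """Return the most frequent codes that together cover `threshold` of the rows."""
    counts = Counter(codes)
    total = len(codes)
    selected, covered = [], 0
    for code, count in counts.most_common():
        selected.append(code)
        covered += count
        if covered / total >= threshold:
            break
    return selected


# made-up code column: ten locations spread over three codes
codes = ["44413005"] * 6 + ["44411001"] * 3 + ["45299999"]

half = top_codes_sketch(codes)            # 6 of 10 rows already exceed 0.5
most = top_codes_sketch(codes, 0.8)       # needs the second code to reach 0.8
print(half, most)
```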

Proximity

class modeling.Proximity(mdl)

Provides access to proximity calculation functions.

get_nearest(destination_dataframe, source=None, single_row_per_origin=True, origin_id_column='LOCNUM', destination_id_column='LOCNUM', destination_count=4, near_prefix=None, destination_columns_to_keep=None)

Get nearest enables getting the n (default is four) nearest locations based on drive distance between two Spatially Enabled DataFrames. If the origins are polygons, the centroids will be used as the start locations. This is useful for getting the nearest store brand locations to every origin block group in a metropolitan area along with the nearest competition locations to every block group in the same metropolitan area.

Parameters

destination_dataframe (DataFrame) – Destination points in one of the supported input formats.
source (Union[str, Path, Country, GIS, None]) – Either the path to the network dataset, the Country object associated with the Business Analyst source being used, or a GIS object instance.
single_row_per_origin (Optional[bool]) – Optional - Whether or not to pivot the results to return only one row for each origin location. Default is True.
origin_id_column (Optional[str]) – Optional - Column in the origin points Spatially Enabled Dataframe uniquely identifying each feature. Default is ‘LOCNUM’.
destination_id_column (Optional[str]) – Optional - Column in the destination points Spatially Enabled DataFrame uniquely identifying each feature. Default is ‘LOCNUM’.
destination_count (Optional[int]) – Optional - Integer number of destinations to search for from every origin point. Default is 4.
near_prefix (Optional[str]) – String prefix to prepend onto near column names in the output.
destination_columns_to_keep (Union[str, list, None]) – List of columns to keep in the output. Commonly, if businesses, this includes the column with the business names.

Return type

DataFrame

Returns

Spatially Enabled DataFrame with a row for each origin id, and metrics for each of the n nearest destinations.

from modeling import Country

brand_name = 'ace hardware'

# create a country object to work with
usa = Country('USA')

# get a metropolitan area, a CBSA, to use as the study area
aoi_df = usa.cbsas.get('seattle')

# get the current year key variables to use for enrichment
evars = usa.enrich_variables
key_vars = evars[
    (evars.name.str.endswith('CY'))
    & (evars.data_collection.str.lower().str.contains('key'))
].reset_index(drop=True)

# get the block groups and enrich them with the ~20 key variables
bg_df = aoi_df.mdl.level(0).get().mdl.enrich(key_vars)

# get the store brand locations and competition locations
biz_df = aoi_df.mdl.business.get_by_name(brand_name)
comp_df = aoi_df.mdl.business.get_competition(biz_df)

# get the nearest three brand locations to every block group
bg_near_biz = bg_df.mdl.proximity.get_nearest(biz_df,
    origin_id_column='ID', destination_count=3, near_prefix='brand')

# get the nearest six competition locations to every block group
bg_near_df = bg_near_biz.mdl.proximity.get_nearest(comp_df,
    origin_id_column='ID', near_prefix='comp', destination_count=6,
    destination_columns_to_keep=['brand_name', 'brand_name_category'])
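
The selection of the n nearest destinations per origin can be sketched without a network dataset by substituting straight-line distance for drive distance. That substitution is an assumption made only to keep the example self-contained; get_nearest itself relies on a network solve. The function name and coordinates are made up.

```python
import heapq
import math


def nearest_destinations(origin_xy, destinations, destination_count=4):
    """Return (destination_id, distance) pairs for the closest destinations.

    Uses straight-line distance as a stand-in for drive distance.
    """
    ox, oy = origin_xy
    distances = [
        (dest_id, math.hypot(x - ox, y - oy))
        for dest_id, (x, y) in destinations.items()
    ]
    return heapq.nsmallest(destination_count, distances, key=lambda pair: pair[1])


# made-up planar coordinates for one origin centroid and four stores
stores = {"a": (3, 4), "b": (1, 0), "c": (0, 2), "d": (6, 8)}
nearest_three = nearest_destinations((0, 0), stores, destination_count=3)
print(nearest_three)
```

Results come back ordered from nearest to farthest, so the position in the list corresponds to the rank of each destination.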

Country

The country object is the foundational building block for working with demographic data. This is due to data collection, aggregation and dissemination methods used in Business Analyst. Succinctly, this is how the data is organized.

class modeling.Country(name, source=None, year=None)

Country objects are instantiated by providing the three letter country identifier and optionally also specifying the source. If the source is not explicitly specified Country will use local resources if the environment is part of an ArcGIS Pro installation with Business Analyst enabled and local data installed. If this is not the case, Country will then attempt to use the active GIS object instance if available. Also, if a GIS object is explicitly passed in, this will be used.

Parameters

name (str) – Three-letter country identifier.
source (Union[str, GIS, None]) – Optional ‘local’ or a GIS object instance referencing an ArcGIS Enterprise instance with enrichment configured or ArcGIS Online. If not explicitly specified, will attempt to use locally installed data with ArcGIS Pro and Business Analyst first. If this is not available, will look for an active GIS. If there is no active GIS, a GIS object with permissions to perform enrichment will need to be explicitly provided.
year (Optional[int]) – Optional and only applicable if using local data. In cases where models have been developed against a specific vintage (year) of data, this affords the ability to enrich data for this specific year to support these models.
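
The fallback order described for source can be summarized as a small decision function. This is a sketch of the documented order only; the function and its local_available and active_gis arguments are hypothetical stand-ins for however the package detects local data and an active GIS.

```python
def resolve_source(source=None, local_available=False, active_gis=None):
    """Pick a data source following the documented fallback order."""
    # 1. an explicitly provided source always wins
    if source is not None:
        return source
    # 2. otherwise prefer locally installed Business Analyst data
    if local_available:
        return "local"
    # 3. otherwise fall back to an active GIS, if there is one
    if active_gis is not None:
        return active_gis
    # 4. nothing to work with
    raise ValueError("No source available; provide 'local' or a GIS object instance.")


# a plain string stands in for a GIS object instance in this illustration
print(resolve_source(local_available=True, active_gis="gis-placeholder"))
```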

from arcgis.modeling import Country

# instantiate a country
usa = Country('USA', source='local')

# get the seattle CBSA as a study area
aoi_df = usa.cbsas.get('seattle')

# use the Modeling DataFrame accessor to retrieve the block groups in seattle
bg_df = aoi_df.mdl.block_groups.get()

# get the available enrich variables as a DataFrame
e_vars = usa.enrich_variables

# filter the variables to just the current year key variables
key_vars = e_vars[(e_vars.data_collection.str.startswith('Key')) &
                  (e_vars.name.str.endswith('CY'))]

# enrich the data through the Modeling DataFrame accessor
e_df = bg_df.mdl.enrich(key_vars)

add_enrich_aliases(feature_class)

Add human readable aliases to an enriched feature class.

Note

This function requires ArcPy, so it must run in a Python environment on a Windows operating system with ArcGIS Pro installed and ArcPy available in that environment.

Parameters: feature_class (Union[Path, str]) – Path to the enriched feature class.
Return type: Path
Returns: Path to feature class with aliases added.

from pathlib import Path

from modeling import Country

out_fc_pth = Path(r'C:/path/to/geodatabase.gdb/block_groups')

# create a country object
cntry = Country('USA')

# get the current year key variables
e_vars = cntry.enrich_variables
key_vars = e_vars[
    (e_vars.data_collection.str.startswith('Key'))
    & (e_vars.name.str.endswith('CY'))
]

# get the block groups for the area of interest
bg_df = cntry.cbsas.get('seattle').mdl.block_groups.get()

# enrich the block groups with the key variables
enrich_df = bg_df.mdl.enrich(key_vars)

# save to a feature class
enrich_fc = enrich_df.spatial.to_featureclass(out_fc_pth)

# finally, add enrich aliases to the output feature class
cntry.add_enrich_aliases(enrich_fc)

enrich(data, enrich_variables)

Enrich a Spatially Enabled DataFrame using enrichment variables defined as a Python list, NumPy array, or Pandas Series of enrich names. A filtered enrich variables Pandas DataFrame can also be used.

Parameters

data (DataFrame) – Spatially Enabled DataFrame with geography_levels to be enriched.
enrich_variables (Union[list, array, Series, DataFrame]) – Iterable of enrich variables to use for enriching data. Filtered output from Country.enrich_variables can also be used.

Return type

DataFrame

Returns

Spatially Enabled DataFrame with enriched data added.

property enrich_variables

DataFrame of all the available enrichment variables.

from arcgis import GIS
from modeling import Country

# connect to an Enterprise GIS instance with Business Analyst installed
gis = GIS('https://mydomain.com/portal', username='geowizard', password='Y3ll0wBr!ck$')

# create a country to work with
cntry = Country('USA', source=gis)

# get the available enrich variables as a DataFrame
e_vars = cntry.enrich_variables

# filter the variables to just the current year key variables
key_vars = e_vars[(e_vars.data_collection.str.startswith('Key')) &
                  (e_vars.name.str.endswith('CY'))]

The key_vars table retrieved using the code sample above will look similar to the following.

	name	alias	data_collection	enrich_name	enrich_field_name	description	vintage	units
0	TOTPOP_CY	2020 Total Population	KeyUSFacts	KeyUSFacts.TOTPOP_CY	KeyUSFacts_TOTPOP_CY	2020 Total Population (Esri)	2020	count
1	GQPOP_CY	2020 Group Quarters Population	KeyUSFacts	KeyUSFacts.GQPOP_CY	KeyUSFacts_GQPOP_CY	2020 Group Quarters Population (Esri)	2020	count
2	DIVINDX_CY	2020 Diversity Index	KeyUSFacts	KeyUSFacts.DIVINDX_CY	KeyUSFacts_DIVINDX_CY	2020 Diversity Index (Esri)	2020	count
3	TOTHH_CY	2020 Total Households	KeyUSFacts	KeyUSFacts.TOTHH_CY	KeyUSFacts_TOTHH_CY	2020 Total Households (Esri)	2020	count
4	AVGHHSZ_CY	2020 Average Household Size	KeyUSFacts	KeyUSFacts.AVGHHSZ_CY	KeyUSFacts_AVGHHSZ_CY	2020 Average Household Size (Esri)	2020	count
5	MEDHINC_CY	2020 Median Household Income	KeyUSFacts	KeyUSFacts.MEDHINC_CY	KeyUSFacts_MEDHINC_CY	2020 Median Household Income (Esri)	2020	currency
6	AVGHINC_CY	2020 Average Household Income	KeyUSFacts	KeyUSFacts.AVGHINC_CY	KeyUSFacts_AVGHINC_CY	2020 Average Household Income (Esri)	2020	currency
7	PCI_CY	2020 Per Capita Income	KeyUSFacts	KeyUSFacts.PCI_CY	KeyUSFacts_PCI_CY	2020 Per Capita Income (Esri)	2020	currency
8	TOTHU_CY	2020 Total Housing Units	KeyUSFacts	KeyUSFacts.TOTHU_CY	KeyUSFacts_TOTHU_CY	2020 Total Housing Units (Esri)	2020	count
9	OWNER_CY	2020 Owner Occupied HUs	KeyUSFacts	KeyUSFacts.OWNER_CY	KeyUSFacts_OWNER_CY	2020 Owner Occupied Housing Units (Esri)	2020	count
10	RENTER_CY	2020 Renter Occupied HUs	KeyUSFacts	KeyUSFacts.RENTER_CY	KeyUSFacts_RENTER_CY	2020 Renter Occupied Housing Units (Esri)	2020	count
11	VACANT_CY	2020 Vacant Housing Units	KeyUSFacts	KeyUSFacts.VACANT_CY	KeyUSFacts_VACANT_CY	2020 Vacant Housing Units (Esri)	2020	count
12	MEDVAL_CY	2020 Median Home Value	KeyUSFacts	KeyUSFacts.MEDVAL_CY	KeyUSFacts_MEDVAL_CY	2020 Median Home Value (Esri)	2020	currency
13	AVGVAL_CY	2020 Average Home Value	KeyUSFacts	KeyUSFacts.AVGVAL_CY	KeyUSFacts_AVGVAL_CY	2020 Average Home Value (Esri)	2020	currency
14	POPGRW10CY	2010-2020 Growth Rate: Population	KeyUSFacts	KeyUSFacts.POPGRW10CY	KeyUSFacts_POPGRW10CY	2010-2020 Population: Annual Growth Rate (Esri)	2020	pct
15	HHGRW10CY	2010-2020 Growth Rate: Households	KeyUSFacts	KeyUSFacts.HHGRW10CY	KeyUSFacts_HHGRW10CY	2010-2020 Households: Annual Growth Rate (Esri)	2020	pct
16	FAMGRW10CY	2010-2020 Growth Rate: Families	KeyUSFacts	KeyUSFacts.FAMGRW10CY	KeyUSFacts_FAMGRW10CY	2010-2020 Families: Annual Growth Rate (Esri)	2020	pct
17	DPOP_CY	2020 Total Daytime Population	KeyUSFacts	KeyUSFacts.DPOP_CY	KeyUSFacts_DPOP_CY	2020 Total Daytime Population (Esri)	2020	count
18	DPOPWRK_CY	2020 Daytime Pop: Workers	KeyUSFacts	KeyUSFacts.DPOPWRK_CY	KeyUSFacts_DPOPWRK_CY	2020 Daytime Population: Workers (Esri)	2020	count
19	DPOPRES_CY	2020 Daytime Pop: Residents	KeyUSFacts	KeyUSFacts.DPOPRES_CY	KeyUSFacts_DPOPRES_CY	2020 Daytime Population: Residents (Esri)	2020	count

property geography_levels: DataFrame of available geography levels.

get_enrich_variables_dataframe_from_variable_list(enrich_variables, drop_duplicates=True)

Get a DataFrame of enrich variables associated with the list of variables passed in. This is especially useful when you need aliases (human readable names), or want to enrich more data using previously enriched data as a template.

Parameters

enrich_variables (Union[list, tuple, ndarray, Series]) – Iterable (normally a list) of variables correlating to enrichment variables. These variable names can be simply the name, the name prefixed by the collection separated by a dot, or the output from enrichment in ArcGIS Pro with the field name modified to fit field naming and length constraints.
drop_duplicates (bool) – Optional boolean (default True) indicating whether to drop duplicates. Since the same variables appear in multiple data collections, multiple instances of the same variable can be found. Dropping duplicates removes redundant matches.

Return type

DataFrame

Returns

Pandas DataFrame of enrich variables with the different available aliases.

from pathlib import Path

import arcpy
from modeling import Country

# path to previously enriched data
enriched_fc_pth = Path(r'C:/path/to/geodatabase.gdb/enriched_data')
new_fc_pth = Path(r'C:/path/to/geodatabase.gdb/block_groups_pdx')

# get a list of column names from previously enriched data
attr_lst = [c.name for c in arcpy.ListFields(str(enriched_fc_pth))]

# get a country to work in
cntry = Country('USA', source='local')

# get dataframe of variables used for previously enriched data
enrich_vars = cntry.get_enrich_variables_dataframe_from_variable_list(attr_lst)

# enrich block groups in new area of interest using the same variables
enrich_df = cntry.cbsas.get('portland-vancouver').mdl.block_groups.get().mdl.enrich(enrich_vars)

# save the new data and add aliases
enrich_df.spatial.to_featureclass(new_fc_pth)
cntry.add_enrich_aliases(new_fc_pth)
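
The three accepted name forms (the bare name, the collection and name separated by a dot, and the field name form) correspond to the name, enrich_name, and enrich_field_name columns of the enrich variables table. A minimal matching sketch in plain Python follows. The function name and the case-insensitive comparison are assumptions, and the sketch does not handle field names truncated to fit length constraints.

```python
def match_enrich_variables(variable_names, catalog):
    """Return catalog rows matching any of the three accepted name forms."""
    wanted = {str(name).lower() for name in variable_names}
    forms = ("name", "enrich_name", "enrich_field_name")
    return [
        row for row in catalog
        if any(row[form].lower() in wanted for form in forms)
    ]


# a few rows in the shape of the enrich variables table
catalog = [
    {"name": "TOTPOP_CY", "enrich_name": "KeyUSFacts.TOTPOP_CY",
     "enrich_field_name": "KeyUSFacts_TOTPOP_CY"},
    {"name": "MEDHINC_CY", "enrich_name": "KeyUSFacts.MEDHINC_CY",
     "enrich_field_name": "KeyUSFacts_MEDHINC_CY"},
    {"name": "PCI_CY", "enrich_name": "KeyUSFacts.PCI_CY",
     "enrich_field_name": "KeyUSFacts_PCI_CY"},
]

# column names as they might come from a previously enriched feature class
columns = ["OBJECTID", "TOTPOP_CY", "KeyUSFacts.MEDHINC_CY", "keyusfacts_pci_cy"]
matched = match_enrich_variables(columns, catalog)
print([row["name"] for row in matched])
```

Columns with no counterpart in the catalog, such as OBJECTID, are ignored.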

level(geography_index)

Get an available geography_level in the country.

Parameters: geography_index (Union[str, int]) – Either the geography level geo_name or the index of the geography level. This can be discovered using the Country.geography_levels property.
Return type: DataFrame
Returns: Spatially Enabled DataFrame of the requested geography_level with the Modeling accessor properties initialized.

property levels: DataFrame of available geography levels. (Alias of geography_levels.)

verify_can_enrich(): Whether the country enrich instance can enrich based on permissions. Only relevant if the source is a GIS instance.

verify_can_perform_network_analysis(network_function=None)

Whether the country enrich instance can perform transportation network analysis based on permissions. Only relevant if the source is a GIS instance.

Parameters: network_function (Optional[str]) – Optional string describing specific network function to check for. Valid values include ‘closestfacility’, ‘locationallocation’, ‘optimizedrouting’, ‘origindestinationcostmatrix’, ‘routing’, ‘servicearea’, or ‘vehiclerouting’.
Return type: bool
Returns: Boolean indicating if the country instance, based on permissions, has network analysis privileges.
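
Since network_function accepts only a fixed set of strings, callers may want to validate the value before passing it in. The helper below is a hypothetical convenience built from the valid values listed above, not part of this package; whether the package itself normalizes case or whitespace is not stated, so that behavior is an assumption of the sketch.

```python
# valid values as listed for the network_function parameter
VALID_NETWORK_FUNCTIONS = (
    "closestfacility",
    "locationallocation",
    "optimizedrouting",
    "origindestinationcostmatrix",
    "routing",
    "servicearea",
    "vehiclerouting",
)


def normalize_network_function(network_function=None):
    """Return a validated, lower-cased network function name, or None."""
    if network_function is None:
        return None
    normalized = network_function.strip().lower()
    if normalized not in VALID_NETWORK_FUNCTIONS:
        raise ValueError(
            f"'{network_function}' is not one of: {', '.join(VALID_NETWORK_FUNCTIONS)}"
        )
    return normalized


print(normalize_network_function(" ClosestFacility "))
```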
