Preprocessing¶
Business Analyst exposes a large amount of data, and integrating it into data pipelines for subsequent analysis is fairly straightforward. These examples demonstrate how enrich can be wrapped in a scikit-learn transformer and used as a preprocessor.
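As a rough illustration of what "used as a preprocessor" means in a scikit-learn pipeline, the sketch below chains a stand-in enrichment step with a scaler. It does not call Business Analyst: `LOOKUP`, `fake_enrich`, the column names, and all values are invented for the example, and the lookup table only imitates the kind of table an enrichment call might return.

```python
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Invented lookup table standing in for data an enrichment call would return.
LOOKUP = pd.DataFrame(
    {"total_population": [1000.0, 3000.0], "median_income": [50000.0, 70000.0]},
    index=["98501", "98502"],
)


def fake_enrich(frame):
    """Hypothetical stand-in for enrichment: look up each identifier locally."""
    return LOOKUP.loc[frame["zip_code"].tolist()]


# The enrichment step runs first, so later steps only ever see enriched columns.
pipeline = Pipeline(
    [
        ("enrich", FunctionTransformer(fake_enrich)),
        ("scale", StandardScaler()),
    ]
)

areas = pd.DataFrame({"zip_code": ["98501", "98502"]})
features = pipeline.fit_transform(areas)  # NumPy array, one row per input area
```

The output of the final step is a plain NumPy array, which is the situation the ArrayToDataFrame helper documented below is meant to address.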
- class ba_examples.preprocessing.ArrayToDataFrame(columns_template, index=None)¶
Bases: BaseEstimator, TransformerMixin
Helper to convert the output np.ndarray back into a Pandas DataFrame.
- fit(X)¶
Fit method, which just sets properties.
- Parameters:
X (ndarray) – np.ndarray to be converted into a Pandas DataFrame.
- transform(X)¶
Convert the np.ndarray into a Pandas DataFrame.
- Parameters:
X (ndarray) – np.ndarray to be converted into a Pandas DataFrame.
- Returns:
Data from the np.ndarray in the columns from the Pandas DataFrame.
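The behavior described above can be sketched roughly as follows. This is not the source of ArrayToDataFrame: the class name `ArrayToDataFrameSketch` is hypothetical, and accepting either a DataFrame or a list of names for `columns_template` is an assumption made for the example.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.preprocessing import StandardScaler


class ArrayToDataFrameSketch(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for ArrayToDataFrame, not the library's code."""

    def __init__(self, columns_template, index=None):
        self.columns_template = columns_template
        self.index = index

    def fit(self, X, y=None):
        # Nothing is learned from the data; fit only satisfies the transformer contract.
        return self

    def transform(self, X):
        # Assumption: the template is a DataFrame (use its columns) or a list of names.
        columns = getattr(self.columns_template, "columns", self.columns_template)
        return pd.DataFrame(X, columns=list(columns), index=self.index)


# Invented values, only to show the round trip.
template = pd.DataFrame({"population": [100.0, 300.0], "income": [40.0, 80.0]})
scaled = StandardScaler().fit_transform(template)  # column labels are lost here
restored = ArrayToDataFrameSketch(template, index=template.index).fit_transform(scaled)
```

Here `restored` carries the same column labels and index as `template`, with the scaled values in place of the originals.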
- class ba_examples.preprocessing.EnrichBase¶
Bases: BaseEstimator, TransformerMixin
The arcgis.geoenrichment.Country.enrich method provides access to a large amount of data for analysis. This object streamlines access to that method within a scikit-learn Pipeline by wrapping it in a transformer, specifically a preprocessor, and serves as the base for other transformers performing more specific tasks.
- property country¶
arcgis.geoenrichment.Country object instance being used.
- property enrich_var_aliases¶
List of enrich aliases, so you can understand what the variables are.
- property enrich_variables¶
Pandas data frame of variables being used for enrichment.
- fit(X)¶
Nothing is fitted here, since this is only a preprocessor.
- property return_geometry¶
Whether to return the geometry when enriching.
- class ba_examples.preprocessing.EnrichPolygon(country, enrich_variables, return_geometry=True)¶
Bases: EnrichBase
The arcgis.geoenrichment.Country.enrich method wrapped in a preprocessor for enriching input areas delineated with arcgis.geometry.Polygon geometries. Inherits from EnrichBase.
- class ba_examples.preprocessing.EnrichStandardGeography(country, enrich_variables, standard_geography_level=<class 'str'>, return_geometry=True)¶
Bases: EnrichBase
The arcgis.geoenrichment.Country.enrich method wrapped in a preprocessor for enriching a list of standard geographies identified by their unique identifiers. A common example is postal or ZIP codes.
- Parameters:
country (Country) – Country to be used for enrichment.
enrich_variables (Union[List[str], DataFrame]) – A list of enrich variable names or a filtered data frame of enrich variables to be used.
standard_geography_level – Standard geography level to use for enrichment.
return_geometry (bool) – Whether to return the geometries.
- class ba_examples.preprocessing.KeepOnlyEnrichColumns(country, id_column=None, keep_geometry=True)¶
Bases: BaseEstimator, TransformerMixin
Remove any non-enrich variable columns from a Pandas data frame.
- Parameters:
country (Country) – arcgis.geoenrichment.Country object used for original enrichment.
id_column (Optional[str]) – Column with unique identifiers. This will become the output index. If no column is specified, the existing index will be used.
keep_geometry (bool) – Whether to keep the geometry, if applicable.
- fit(X)¶
Sets properties based on the input parameters and data.
- Parameters:
X (DataFrame) – Pandas data frame created from the arcgis.geoenrichment.Country.enrich method.
- Returns:
Pandas DataFrame pruned to just retain columns from enrichment.
- transform(X)¶
- Parameters:
X (DataFrame) – Pandas data frame output from the arcgis.geoenrichment.Country.enrich method.
- Returns:
Pandas data frame with only enrich columns, the identifier column as the index, and the geometry column, if applicable.
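The column pruning described above might look something like the sketch below. It is not the implementation of KeepOnlyEnrichColumns: to stay runnable without a Business Analyst data source, the hypothetical `KeepColumnsSketch` takes an explicit list of enrich column names instead of a Country object, and the geometry column name `SHAPE` is an assumption, as are all column names and values in the sample frame.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class KeepColumnsSketch(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in: keeps listed columns, not the library's code."""

    def __init__(self, enrich_columns, id_column=None, keep_geometry=True,
                 geometry_column="SHAPE"):
        self.enrich_columns = enrich_columns
        self.id_column = id_column
        self.keep_geometry = keep_geometry
        self.geometry_column = geometry_column  # assumed geometry column name

    def fit(self, X, y=None):
        # Record which input columns to retain, preserving their input order.
        wanted = set(self.enrich_columns)
        keep = [column for column in X.columns if column in wanted]
        if self.keep_geometry and self.geometry_column in X.columns:
            keep.append(self.geometry_column)
        self.keep_columns_ = keep
        return self

    def transform(self, X):
        # Promote the identifier column to the index when one is given.
        out = X.set_index(self.id_column) if self.id_column is not None else X
        return out[self.keep_columns_]


# Invented frame imitating enrichment output mixed with bookkeeping columns.
enriched = pd.DataFrame(
    {
        "zip_code": ["98501", "98502"],
        "aggregation_method": ["x", "x"],
        "total_population": [1000, 2000],
        "median_income": [50000, 60000],
        "SHAPE": [None, None],
    }
)
pruned = KeepColumnsSketch(
    ["total_population", "median_income"], id_column="zip_code", keep_geometry=False
).fit_transform(enriched)
```

With `keep_geometry=False`, `pruned` holds only the two listed columns and is indexed by `zip_code`; leaving `keep_geometry` at its default would also retain the `SHAPE` column.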