Preprocessing¶
Business Analyst exposes a large catalog of data, and enrichment makes it fairly straightforward to pull that data into pipelines for subsequent analysis. These examples demonstrate how enrichment can be wrapped in a scikit-learn Transformer and used as a preprocessor.
- class ba_examples.preprocessing.ArrayToDataFrame(columns_template, index=None)¶
Bases: BaseEstimator, TransformerMixin
Helper to convert the output np.ndarray back into a Pandas DataFrame.
- Parameters:
- fit(X)¶
Fit method, which just sets properties.
- Parameters:
X (ndarray) – np.ndarray to be converted into a Pandas DataFrame.
- transform(X)¶
Convert the np.ndarray into a Pandas DataFrame.
- Parameters:
X (ndarray) – np.ndarray to be converted into a Pandas DataFrame.
- Returns:
Pandas DataFrame containing the data from the np.ndarray, arranged under the columns taken from the template data frame.
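The reason for such a helper is that many scikit-learn transformers return a plain np.ndarray, which drops column names and the index. The following is a minimal sketch of the idea, not the package's implementation: ArrayToDataFrameSketch is a hypothetical stand-in that mirrors the documented signature, and it assumes columns_template may be either a data frame or a list of column names.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class ArrayToDataFrameSketch(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in mirroring the documented signature."""

    def __init__(self, columns_template, index=None):
        # scikit-learn convention: store constructor arguments unchanged
        self.columns_template = columns_template
        self.index = index

    def fit(self, X, y=None):
        # nothing to learn; return self so the step can be chained
        return self

    def transform(self, X):
        # assumption: the template is a DataFrame or a list of column names
        if isinstance(self.columns_template, pd.DataFrame):
            columns = list(self.columns_template.columns)
        else:
            columns = list(self.columns_template)
        return pd.DataFrame(X, columns=columns, index=self.index)


# small illustrative input; the values carry no real-world meaning
source_df = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]},
    index=["x", "y", "z"],
)

pipe = Pipeline([
    ("scale", StandardScaler()),  # returns an np.ndarray by default
    ("to_df", ArrayToDataFrameSketch(source_df, index=source_df.index)),
])

# scaled values, with the original column names and index restored
scaled_df = pipe.fit_transform(source_df)
```

Placing the conversion as the last pipeline step keeps downstream code working with labeled columns rather than positional array indices.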
- class ba_examples.preprocessing.EnrichBase¶
Bases: BaseEstimator, TransformerMixin
The arcgis.geoenrichment.Country.enrich method provides access to a large catalog of data for analysis. This object streamlines access to that method as part of a scikit-learn Pipeline by wrapping the functionality in a Transformer, specifically a preprocessor, and serves as the base for other transformers performing more specific tasks.
- property country¶
arcgis.geoenrichment.Country object instance being used.
- property enrich_var_aliases¶
List of enrich aliases, so you can understand what the variables are.
- property enrich_variables¶
Pandas data frame of variables being used for enrichment.
- fit(X)¶
Nothing happens here, since this only builds a preprocessor.
- property return_geometry¶
Whether to return the geometry when enriching.
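The base-class pattern described for EnrichBase can be sketched roughly as follows: a base transformer stores the shared settings and provides a no-op fit, while subclasses supply transform. This is an illustration under stated assumptions, not the package's code. EnrichBaseSketch, EnrichIdsSketch, and fake_enrich are hypothetical names, and the injected enrich_func callable stands in for Country.enrich so the example runs without ArcGIS.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class EnrichBaseSketch(BaseEstimator, TransformerMixin):
    """Hypothetical base: holds shared settings, leaves transform to subclasses."""

    def __init__(self, enrich_func, enrich_variables, return_geometry=True):
        # enrich_func is an assumption standing in for Country.enrich
        self.enrich_func = enrich_func
        self.enrich_variables = enrich_variables
        self.return_geometry = return_geometry

    def fit(self, X, y=None):
        # a preprocessor has nothing to learn; return self for chaining
        return self


class EnrichIdsSketch(EnrichBaseSketch):
    """Hypothetical subclass: enrich a list of geography identifiers."""

    def transform(self, X):
        return self.enrich_func(list(X), self.enrich_variables, self.return_geometry)


def fake_enrich(ids, variables, return_geometry):
    # placeholder only: fills every variable with 0, not real enrichment data
    data = {var: [0] * len(ids) for var in variables}
    return pd.DataFrame(data, index=pd.Index(ids, name="id"))


step = EnrichIdsSketch(fake_enrich, ["var_a", "var_b"], return_geometry=False)
enriched = step.fit_transform(["98501", "98502"])
```

Because fit returns self and does no work, a transformer like this can sit at the start of a Pipeline without requiring training data.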
- class ba_examples.preprocessing.EnrichPolygon(country, enrich_variables, return_geometry=True)¶
Bases: EnrichBase
The arcgis.geoenrichment.Country.enrich method wrapped in a preprocessor for enriching input areas delineated with arcgis.geometry.Polygon geometries. Inherits from EnrichBase.
- Parameters:
- class ba_examples.preprocessing.EnrichStandardGeography(country, enrich_variables, standard_geography_level=<class 'str'>, return_geometry=True)¶
Bases: EnrichBase
The arcgis.geoenrichment.Country.enrich method wrapped in a preprocessor for enriching a list of standard geographies identified by their unique identifiers. A common example is postal or ZIP codes.
- Parameters:
country (Country) – Country to be used for enrichment.
enrich_variables (Union[List[str], DataFrame]) – A list of enrich variable names or a filtered data frame of enrich variables to be used.
standard_geography_level – Standard geography level to use for enrichment.
return_geometry (bool) – Whether to return the geometries (shapes).
- class ba_examples.preprocessing.KeepOnlyEnrichColumns(country, id_column=None, keep_geometry=True)¶
Bases: BaseEstimator, TransformerMixin
Remove any non-enrich variable columns from a Pandas data frame.
- Parameters:
country (Country) – arcgis.geoenrichment.Country object used for the original enrichment.
id_column (Optional[str]) – Column with unique identifiers. This will become the output index. If no column is specified, the existing index will be used.
keep_geometry (bool) – Whether to keep the geometry, if applicable.
- fit(X)¶
Sets properties based on the input parameters and data.
- Parameters:
X (DataFrame) – Pandas data frame created from the arcgis.geoenrichment.Country.enrich method.
- Returns:
Pandas DataFrame pruned to just retain columns from enrichment.
- transform(X)¶
- Parameters:
X (DataFrame) – Pandas data frame output from the arcgis.geoenrichment.Country.enrich method.
- Returns:
Pandas data frame with only enrich columns, the identifier column as the index, and the geometry column, if applicable.
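The pruning behavior described above can be approximated with plain pandas operations. The sketch below is hypothetical and is not the package's implementation: KeepColumnsSketch takes an explicit enrich_columns list in place of the country argument (the documented class presumably looks the names up from the Country object), and the geometry column name "SHAPE" is an assumption exposed as a parameter.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class KeepColumnsSketch(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in: keep only listed columns, plus optional geometry."""

    def __init__(self, enrich_columns, id_column=None, keep_geometry=True,
                 geometry_column="SHAPE"):
        # enrich_columns replaces the documented country argument (assumption)
        self.enrich_columns = enrich_columns
        self.id_column = id_column
        self.keep_geometry = keep_geometry
        self.geometry_column = geometry_column

    def fit(self, X, y=None):
        # stateless in this sketch; return self for chaining
        return self

    def transform(self, X):
        # promote the identifier column to the index when one is given
        out = X.set_index(self.id_column) if self.id_column is not None else X
        # retain only the enrich columns actually present in the input
        keep = [col for col in self.enrich_columns if col in out.columns]
        if self.keep_geometry and self.geometry_column in out.columns:
            keep.append(self.geometry_column)
        return out[keep]


# illustrative input; the geometry values are placeholder strings
raw = pd.DataFrame({
    "id": ["98501", "98502"],
    "source_country": ["US", "US"],
    "var_a": [1, 2],
    "SHAPE": ["geom-1", "geom-2"],
})

pruned = KeepColumnsSketch(["var_a", "var_b"], id_column="id").fit_transform(raw)
```

Here the non-enrich column source_country is dropped, the id column becomes the index, and the geometry column is retained because keep_geometry defaults to True.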