medicaid_utils.common_utils package
Submodules
medicaid_utils.common_utils.dataframe_utils module
- medicaid_utils.common_utils.dataframe_utils.convert_ddcols_to_datetime(df: DataFrame, lst_col: List[str]) -> DataFrame
Convert the specified columns of a dataframe to datetime type.
Parameters:
df (pandas_df): dataframe
lst_col (list(str)): list of column names
Return type: DataFrame
- medicaid_utils.common_utils.dataframe_utils.copy_ddcols(df: DataFrame, lst_col: List[str], lst_new_names: List[str]) -> DataFrame
- medicaid_utils.common_utils.dataframe_utils.export(df: DataFrame, pq_engine: str, output_filename: str, pq_location: str, _csv_location: str, lst_datetime_col: List[str], is_dask: bool = True, n_rows: int = -1, do_csv: bool = True, df_schema: DataFrame | None = None, logger_name: str = 'Dataframe utils', rewrite: bool = False, do_parquet: bool = True) -> None
Exports a Dask DataFrame to parquet and/or CSV.
- medicaid_utils.common_utils.dataframe_utils.fix_index(df: DataFrame, index_name: str, drop_column: bool = True) -> DataFrame
- medicaid_utils.common_utils.dataframe_utils.get_first_day_gap(df: DataFrame, index_col: str, time_col: str, start_date_col: str, threshold: int) -> DataFrame
- medicaid_utils.common_utils.dataframe_utils.get_reduced_column_names(multiidx_df_columns: MultiIndex, combine_levels: bool = False) -> List[str]
- medicaid_utils.common_utils.dataframe_utils.prepare_dtypes_for_csv(df_temp: DataFrame, df_schema: DataFrame) -> DataFrame
- medicaid_utils.common_utils.dataframe_utils.safe_convert_int_to_str(df: DataFrame, lst_col: List[str]) -> DataFrame
medicaid_utils.common_utils.links module
medicaid_utils.common_utils.recipes module
medicaid_utils.common_utils.stats_utils module
- medicaid_utils.common_utils.stats_utils.compute_contingency_table(pdf: DataFrame, lst_states: List[str], lst_metrics: List[str], lst_count_metrics: List[str], output_fname: str, pop_col_name: str = 'gt_50pc_hrsa_fqhc', dct_labels: Dict[int, str] | None = None, state_col_name: str = 'STATE_CD') -> pd.io.formats.style.Styler
- medicaid_utils.common_utils.stats_utils.compute_descriptives(pdf: DataFrame, lst_states: List[str], lst_metrics: List[str], output_fname: str, state_col_name: str = 'STATE_CD') -> DataFrame
- medicaid_utils.common_utils.stats_utils.compute_missing_stats(df: DataFrame, output_fname: str, state_col_name: str = 'STATE_CD') -> DataFrame
- medicaid_utils.common_utils.stats_utils.compute_t_stats(pdf: DataFrame, lst_states: List[str], lst_metrics: List[str], output_fname: str, pop_col_name: str = 'gt_50pc_hrsa_fqhc', dct_labels: Dict[int, str] | None = None, state_col_name: str = 'STATE_CD') -> pd.io.formats.style.Styler
- medicaid_utils.common_utils.stats_utils.cramers_corrected_stat(confusion_matrix: DataFrame) -> float
Calculate the Cramér's V statistic for categorical-categorical association. Uses the bias correction from Wicher Bergsma, Journal of the Korean Statistical Society 42 (2013): 323-328.
- medicaid_utils.common_utils.stats_utils.get_cont_table_statewise(pdf_included: DataFrame, lst_metrics: List[str], pop_col_name: str, lst_count_metrics: List[str], dct_labels: Dict[int, str], lst_st: List[str], state_col_name: str) -> DataFrame
- medicaid_utils.common_utils.stats_utils.get_contingency_table(pdf_dataset: DataFrame, lst_categorical_metrics: List[str], pop_col_name: str, lst_numeric_col_to_binarize: List[str], dct_labels: Dict[int, str]) -> Tuple[DataFrame, DataFrame]
- medicaid_utils.common_utils.stats_utils.get_covar_plots(pdf: DataFrame, lst_covar: List[str], lst_hist_covar: List[str], cut_outliers: bool = False) -> Any
- medicaid_utils.common_utils.stats_utils.get_descriptives(pdf: DataFrame, lst_st: List[str], lst_col: List[str], state_col_name: str) -> DataFrame
- medicaid_utils.common_utils.stats_utils.get_missingness_stats(df: DataFrame, outputfname: str) -> DataFrame
- medicaid_utils.common_utils.stats_utils.get_ranksum_table(pdf_dataset: DataFrame, lst_metrics: List[str], pop_col_name: str, dct_labels: Dict[int, str]) -> Tuple[DataFrame, DataFrame]
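To illustrate the statistic that cramers_corrected_stat is described as computing, the following is a minimal standalone sketch of bias-corrected Cramér's V. It is not the package's implementation: the function name cramers_v_corrected is hypothetical, it takes a plain nested list of counts rather than a DataFrame, and it assumes every row and column has a nonzero total and that the table holds more than one observation.

```python
import math
from typing import Sequence


def cramers_v_corrected(table: Sequence[Sequence[float]]) -> float:
    """Bias-corrected Cramér's V for an r x k contingency table of counts.

    Hypothetical helper for illustration; assumes no empty rows or columns.
    """
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]
    n = sum(row_totals)

    # Pearson chi-squared statistic, without continuity correction.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Bias correction: shrink phi^2 and the table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Two variables with identical distributions in every row: no association.
independent = [[10, 10], [10, 10]]
print(cramers_v_corrected(independent))
```

Clipping the corrected phi-squared at zero keeps the result non-negative for tables with little or no association, where the uncorrected statistic would be biased upward.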
medicaid_utils.common_utils.usps_address module
This script shows an example of using requests and the USPS Address Information API. To use it, you must first register to obtain a USERID, which must then be set in the environment variable USPS_USERID. For information on the API, see https://www.usps.com/business/web-tools-apis/address-information-api.htm
- class medicaid_utils.common_utils.usps_address.AddressStandardizationWebTool(street, city, state, name=None, suite=None, zip5=None, zip4=None, userid='')
Bases: USPSShippingAPI
Object to get a standardized USPS Address.
- api = 'Verify'
- class medicaid_utils.common_utils.usps_address.USPSAddress(name='', suite='', street='', city='', state='', zip5='', zip4='')
Bases: object
Representation of a United States Postal Service address.
- class medicaid_utils.common_utils.usps_address.USPSShippingAPI(api, userid='')
Bases: object
Representation of the USPS Shipping API: https://www.usps.com/business/web-tools-apis/address-information-api.htm
- url = 'https://production.shippingapis.com/ShippingAPI.dll'
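The classes above document only their constructors and the url and api attributes, so the request they send is not shown here. As a rough sketch of how a 'Verify' request against that URL might be assembled, the snippet below builds the query URL without making a network call. The function name build_verify_url is hypothetical, and the XML element names (AddressValidateRequest, Address1, Address2, and so on) and their order are assumptions about the USPS request format that should be checked against the USPS documentation; none of this is taken from the package's source.

```python
import os
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# URL taken from the USPSShippingAPI.url attribute documented above.
USPS_URL = "https://production.shippingapis.com/ShippingAPI.dll"


def build_verify_url(street, city, state, suite="", zip5="", zip4="", userid=None):
    """Build the GET URL for an address verification request (hypothetical helper)."""
    if userid is None:
        # The module docstring says the ID lives in USPS_USERID.
        userid = os.environ.get("USPS_USERID", "")

    root = ET.Element("AddressValidateRequest", USERID=userid)
    address = ET.SubElement(root, "Address", ID="0")
    # Assumed layout: Address1 holds the suite or unit, Address2 the street line.
    fields = [
        ("Address1", suite),
        ("Address2", street),
        ("City", city),
        ("State", state),
        ("Zip5", zip5),
        ("Zip4", zip4),
    ]
    for tag, value in fields:
        ET.SubElement(address, tag).text = value

    xml_payload = ET.tostring(root, encoding="unicode")
    # 'Verify' matches the api attribute of AddressStandardizationWebTool.
    return USPS_URL + "?" + urlencode({"API": "Verify", "XML": xml_payload})


url = build_verify_url("1 Main St", "Chicago", "IL", userid="TESTID")
print(url)
```

The resulting URL could then be fetched with requests.get(url) and the XML response parsed with the same ElementTree module; that step is left out because the response structure is not described in this page.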