medicaid_utils.filters.claims package
Submodules
medicaid_utils.filters.claims.dx_and_proc module
This module provides functions that add diagnosis- and procedure-code-based indicator flags to claims.
- medicaid_utils.filters.claims.dx_and_proc.flag_diagnoses_and_procedures(dct_diag_codes: dict, dct_proc_codes: dict, df_claims: DataFrame, cms_format: str = 'MAX', lst_claim_diag_col: List[str] = None, dct_column_values: dict | None = None) DataFrame
Flags claims based on diagnosis and procedure codes.
- Parameters:
dct_diag_codes (dict) –
Dictionary of diagnosis codes. Should be in the format
{condition_name: {['incl' / 'excl']: {[9 / 10]: list of codes}}}
Eg:
{'oud_nqf': {'incl': {9: ['3040','3055']}}}
dct_proc_codes (dict) –
Dictionary of procedure codes. Should be in the format
{procedure_name: {procedure_system_code: list of codes} }
Eg:
{'methadone': {6: 'HZ81ZZZ,HZ84ZZZ,HZ85ZZZ,HZ86ZZZ,HZ91ZZZ,HZ94ZZZ,HZ95ZZZ,HZ96ZZZ'.split(",")}}
df_claims (dd.DataFrame) – Claims dataframe
cms_format ({'MAX', 'TAF'}) – CMS file format.
lst_claim_diag_col (List[str], optional) – List of diagnosis column names
dct_column_values (dict) –
Dictionary of column names and numerical values that should be used to flag conditions and procedures. Should be in the format
{condn_procedure_name: {column_name: list of numerical values} }
Eg:
{'dx_delivery': {'RCPNT_DLVRY_CD': [1]} }
- Return type:
dd.DataFrame
- Raises:
ValueError – If non-alphanumeric characters are present in the ICD/CPT codes in dct_diag_codes or dct_proc_codes
Examples
>>> import pandas as pd
>>> import dask.dataframe as dd
>>> from medicaid_utils.filters.claims.dx_and_proc import flag_diagnoses_and_procedures
>>> pdf = pd.DataFrame({
...     'MSIS_ID': ['A', 'B'],
...     'DIAG_CD_1': ['3040', '250'],
...     'DIAG_CD_2': ['', '3055'],
...     'service_date': pd.to_datetime(['2020-01-01', '2020-02-01']),
... }).set_index('MSIS_ID')
>>> ddf = dd.from_pandas(pdf, npartitions=1)
>>> dct_diag = {'oud': {'incl': {9: ['3040', '3055']}}}
>>> result = flag_diagnoses_and_procedures(dct_diag, {}, ddf, cms_format='MAX')
>>> result.compute()['diag_oud'].tolist()
[1, 1]
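The nested dictionary formats described above are easy to get subtly wrong, for instance when a comma-separated code string contains stray spaces. The following is a minimal sketch, not part of medicaid_utils, that builds the three dictionaries and checks the codes for non-alphanumeric characters before they are passed to flag_diagnoses_and_procedures. The helper names parse_codes and find_invalid_codes are hypothetical, and whether the library itself strips whitespace is not assumed here.

```python
def parse_codes(csv_codes: str) -> list:
    """Split a comma-separated code string, dropping whitespace and empty entries."""
    return [code.strip() for code in csv_codes.split(",") if code.strip()]


def find_invalid_codes(codes) -> list:
    """Return the codes that contain non-alphanumeric characters."""
    return [code for code in codes if not code.isalnum()]


# Diagnosis codes: {condition_name: {'incl' / 'excl': {9 / 10: list of codes}}}
dct_diag_codes = {"oud_nqf": {"incl": {9: ["3040", "3055"]}}}

# Procedure codes: {procedure_name: {procedure_system_code: list of codes}}
dct_proc_codes = {
    "methadone": {6: parse_codes("HZ81ZZZ,HZ84ZZZ, HZ85ZZZ, HZ86ZZZ")}
}

# Column values: {condn_procedure_name: {column_name: list of numerical values}}
dct_column_values = {"dx_delivery": {"RCPNT_DLVRY_CD": [1]}}

# A naive split keeps the leading space, which is a non-alphanumeric character
naive_codes = "HZ81ZZZ,HZ84ZZZ, HZ85ZZZ".split(",")
print(find_invalid_codes(naive_codes))  # [' HZ85ZZZ']
print(find_invalid_codes(dct_proc_codes["methadone"][6]))  # []
```

Running a check like this up front surfaces malformed codes before any Dask computation is triggered.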
- medicaid_utils.filters.claims.dx_and_proc.get_patient_ids_with_conditions(dct_diag_codes: dict, dct_proc_codes: dict, logger_name: str = '/home/runner/work/medicaid-utils/medicaid-utils/medicaid_utils/filters/claims/dx_and_proc.py', cms_format: str = 'MAX', dct_column_values: dict | None = None, **dct_claims: DataFrame) Tuple[DataFrame, Dict[str, DataFrame]]
Gets patient IDs with conditions denoted by the provided diagnosis or procedure codes.
- Parameters:
dct_diag_codes (dict) –
Dictionary of diagnosis codes. Should be in the format
{condition_name: {['incl' / 'excl']: {[9 / 10]: list of codes}}}
Eg:
{'oud_nqf': {'incl': {9: ['3040','3055']}}}
dct_proc_codes (dict) –
Dictionary of procedure codes. Should be in the format
{procedure_name: {procedure_system_code: list of codes} }
Eg:
{'methadone': {6: 'HZ81ZZZ,HZ84ZZZ,HZ85ZZZ,HZ86ZZZ,HZ91ZZZ,HZ94ZZZ,HZ95ZZZ,HZ96ZZZ'.split(",")}}
logger_name (str) – Logger name
cms_format ({'MAX', 'TAF'}) – CMS file format.
dct_column_values (dict) –
Dictionary of column names and values that should be used to flag conditions and procedures. Should be in the format
{condn_procedure_name: {column_name: list of values} }
Eg:
{'dx_delivery': {'RCPNT_DLVRY_CD': [1]} }
**dct_claims (dict) –
Keyword arguments of claim dataframes. Should be in the format
{file_type: dask.dataframe}
- Return type:
Tuple(pd.DataFrame, dict)
- Raises:
IndexError – If the input claim datasets do not have the same index name
Examples
>>> import pandas as pd
>>> import dask.dataframe as dd
>>> from medicaid_utils.filters.claims.dx_and_proc import get_patient_ids_with_conditions
>>> pdf = pd.DataFrame({
...     'MSIS_ID': ['A', 'B', 'A'],
...     'DIAG_CD_1': ['3040', '250', '3055'],
...     'service_date': pd.to_datetime(['2020-01-01', '2020-02-01', '2020-03-01']),
... }).set_index('MSIS_ID')
>>> ddf = dd.from_pandas(pdf, npartitions=1)
>>> dct_diag = {'oud': {'incl': {9: ['3040', '3055']}}}
>>> pdf_ids, dct_stats = get_patient_ids_with_conditions(
...     dct_diag, {}, cms_format='MAX', ip=ddf)
>>> 'ip_diag_oud' in pdf_ids.columns
True
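As a rough mental model only, the patient-level output can be thought of as a roll-up of claim-level flags within each file type. The sketch below is plain Python and is not the library's implementation. The function rollup_flags is hypothetical, the any-claim (maximum) aggregation is an assumption, and the ip_diag_oud column name simply follows the naming seen in the example above. Whether the actual function keeps patients with no flagged claims is not stated in this reference.

```python
from collections import defaultdict


def rollup_flags(file_type: str, claims: list, flag_cols: list) -> dict:
    """Collapse claim-level 0/1 flags to one record per patient id.

    A patient gets 1 for a flag if any of their claims has that flag set.
    Output keys are prefixed with the file type, e.g. 'ip_diag_oud'.
    """
    patients = defaultdict(dict)
    for claim in claims:
        record = patients[claim["MSIS_ID"]]
        for col in flag_cols:
            key = f"{file_type}_{col}"
            # Keep the highest flag value seen so far for this patient
            record[key] = max(record.get(key, 0), claim[col])
    return dict(patients)


# Claim-level flags, as they might look after diagnosis flagging (assumed shape)
ip_claims = [
    {"MSIS_ID": "A", "diag_oud": 1},
    {"MSIS_ID": "B", "diag_oud": 0},
    {"MSIS_ID": "A", "diag_oud": 1},
]

patient_flags = rollup_flags("ip", ip_claims, ["diag_oud"])
print(patient_flags)  # {'A': {'ip_diag_oud': 1}, 'B': {'ip_diag_oud': 0}}
```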
medicaid_utils.filters.claims.rx module
This module provides functions that add NDC-code-based indicator flags to claims.
- medicaid_utils.filters.claims.rx.flag_prescriptions(dct_ndc_codes: dict, df_claims: DataFrame, ignore_missing_days_supply: bool = False) DataFrame
Flags claims based on NDC codes.
- Parameters:
dct_ndc_codes (dict) –
Dictionary of NDC codes. Should be in the format
{condition_name: list of codes}
Eg:
{'buprenorphine': ['00378451905', '00378451993', '00378617005', '00378617077']}
df_claims (dd.DataFrame) – Claims dataframe
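ignore_missing_days_supply (bool, default=False) – If False, claims with missing, negative, or zero days of supply are always flagged as 0. If True, the days of supply is ignored and such claims are flagged on their NDC code alone.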
- Return type:
dd.DataFrame
Examples
>>> import pandas as pd
>>> import dask.dataframe as dd
>>> from medicaid_utils.filters.claims.rx import flag_prescriptions
>>> pdf = pd.DataFrame({
...     'MSIS_ID': ['A', 'B', 'C'],
...     'NDC': ['00378451905', '99999999999', '00378617005'],
...     'DAYS_SUPPLY': ['30', '10', '0'],
... }).set_index('MSIS_ID')
>>> ddf = dd.from_pandas(pdf, npartitions=1)
>>> dct_ndc = {'buprenorphine': ['00378451905', '00378617005']}
>>> result = flag_prescriptions(dct_ndc, ddf)
>>> result.compute()['rx_buprenorphine'].tolist()
[1, 0, 0]
When ignore_missing_days_supply is True, claims with zero or missing days of supply are still flagged:
>>> result2 = flag_prescriptions(dct_ndc, ddf, ignore_missing_days_supply=True)
>>> result2.compute()['rx_buprenorphine'].tolist()
[1, 0, 1]
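The interaction between the NDC match and the days-of-supply check can be sketched per claim in plain Python. This is only an illustration of the behaviour shown in the examples above, not the library's code. The function flag_ndc is hypothetical, and treating unparseable values as missing is an assumption.

```python
def flag_ndc(ndc, days_supply, ndc_codes, ignore_missing_days_supply=False) -> int:
    """Return 1 if the claim's NDC is in ndc_codes, else 0.

    Unless ignore_missing_days_supply is True, claims whose days of supply
    is missing, negative, or zero are always returned as 0.
    """
    if ndc not in ndc_codes:
        return 0
    if ignore_missing_days_supply:
        return 1
    try:
        supply = float(days_supply)
    except (TypeError, ValueError):
        return 0  # missing or unparseable days of supply (assumed handling)
    return int(supply > 0)


ndc_codes = {"00378451905", "00378617005"}
# (NDC, DAYS_SUPPLY) pairs mirroring claims A, B, and C in the example above
claims = [("00378451905", "30"), ("99999999999", "10"), ("00378617005", "0")]

default_flags = [flag_ndc(ndc, days, ndc_codes) for ndc, days in claims]
relaxed_flags = [
    flag_ndc(ndc, days, ndc_codes, ignore_missing_days_supply=True)
    for ndc, days in claims
]
print(default_flags)  # [1, 0, 0]
print(relaxed_flags)  # [1, 0, 1]
```

Claim C matches on NDC but has zero days of supply, so it is the only claim whose flag changes between the two settings.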