NHID HealthCost Analysis Methodology
Background
Below is a description of the methodology for calculating the estimated costs for health care services reported in the New Hampshire Insurance Department (NHID) HealthCost website. The estimates are based on the median amounts paid (by both the insurance carrier and the patient) using claims data from the New Hampshire Comprehensive Health Information System (NHCHIS) database. The cost amount is often referred to as the “allowed rate” of payment to health care providers.
It has been well documented in the published literature that there is substantial variation in the cost of health care, even when provided by the same provider. There are many factors that contribute to the variation, and the NHID uses several tools to help address these issues when reporting “costs” to the patient. When the patient is insured, the cost to the patient for covered services is based on a contract between the provider and the insurance company. When the patient is not insured, or the services are not covered under the patient’s health plan, the cost is based on charges minus any discount the provider offers uninsured patients.
The methodology used in HealthCost is consistent across payers and providers by treatment type. The criteria for including or excluding observations are based on statistical measures and calculations that are applied in the same way from one provider to another and from one payer to another.
Included Costs
The focus is on the total cost and the difference in total costs to the patient between providers who provide a similar service. The cost to the patient is a combined total that does not distinguish between what is paid to the hospital (or clinic, ambulatory surgery center, or any other facility), and each physician or multiple physicians who treat the patient. The “lead provider” associated with the costs is considered to be the most easily recognized entity that care is received from. In many cases, it is a hospital, even though the patient actually receives treatment from several different physicians who could also be considered the provider of care. The overall cost is determined by several variables, including: the treatment provided, the contract between the insurer and the lead provider, contracts with all other providers the patient is treated by, the volume of primary and incidental services provided (both those necessary and unnecessary), the typical illness burden of patients treated by the provider(s), and how efficient the providers are.
Calculation of Cost Estimate
The median treatment cost based on patient experience is reported instead of the average. Consistent with the purpose of HealthCost, the median is a better measure of central tendency when predicting the cost liability to the patient and health plan. The median is influenced less than the average by outlier observations that may skew the results. The median also makes determining actual contract terms for payments between the insurer and the provider more difficult.
In this example, both insurance carriers would have the same median cost reported in HealthCost:
Reimbursement Contract Rates

| Proportion of Patients at Make Believe Hospital | Insurance Company A | Insurance Company B |
| --- | --- | --- |
| 40% | $90 | $90 |
| 50% | $100 | $100 |
| 10% | $110 | $500 |
| Median = | $100 | $100 |
| Average = | $97 | $136 |
| Total Annual Payments (1000 Visits) = | $97,000 | $136,000 |
Based on the median, the reimbursement contracts appear identical. The average would be a more accurate representation of the “value” of the contract to the insurer and the provider. However, $100 is a better estimate of the total cost for most patients, regardless of which insurance carrier they are covered by.
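The arithmetic in the table can be checked with a short sketch. It expands each carrier's rate mix into 1000 individual payments and computes the median, average, and total. The function and variable names are illustrative only and are not part of the HealthCost system.

```python
from statistics import mean, median

# (rate in dollars, percent of patients) pairs taken from the table above.
COMPANY_A = [(90, 40), (100, 50), (110, 10)]
COMPANY_B = [(90, 40), (100, 50), (500, 10)]

def simulate_payments(rate_mix, visits=1000):
    """Expand a rate mix into one payment amount per visit."""
    payments = []
    for rate, percent in rate_mix:
        payments.extend([rate] * (visits * percent // 100))
    return payments

def summarize(rate_mix):
    """Return the median, average, and total payment for a rate mix."""
    payments = simulate_payments(rate_mix)
    return {
        "median": median(payments),
        "average": mean(payments),
        "total": sum(payments),
    }

for name, mix in (("Insurance Company A", COMPANY_A), ("Insurance Company B", COMPANY_B)):
    print(name, summarize(mix))
```

Both carriers share a median of $100 because the middle of each distribution falls in the 50 percent of patients paid at $100; only the average and total reflect the $500 rate.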
Variability
Whenever rates are reported, the NHID will include information on the variability of the rate. If the historical data show low variability, then this is indicated as Precision of the Cost Estimate = “HIGH.” Likewise, if the data show extensive variation, the estimate will indicate the precision level is “LOW.” When the precision level is low, the experience of an individual patient is more likely to be different than what is reported.
The measure of variation in the rate is based on two values: the coefficient of variation for charges, including all payers, and the difference between the median charge for the insurance company product line and the overall median for all insurance companies and product lines at the provider identified. These values, both percentages, are summed together and translated into an ordinal scale. Like most ordinal scales, the distinction between the values at neighboring points on the scale is not necessarily the same. For instance, the range within Very Low and Low might be much less than that in Medium and High. The scale is determined based on how the variability compares to other reported insurance carrier line of business (LOB) calculations within the health care service selected. The break points are the 75th, 50th, and 25th percentiles.
When variability in the data is high (Precision of the Cost Estimate = “VERY LOW”) and there are fewer than four patients in the analysis, then the output for that payer product line is not reported.
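The precision calculation described above might be sketched as follows. Several details are assumptions made for illustration, because the text does not specify them: the population standard deviation is used for the coefficient of variation, the median difference is taken as an absolute percentage of the overall median, and the four labels are mapped to the quartiles in the order shown. All function names are hypothetical.

```python
from statistics import mean, median, pstdev, quantiles

def variability_score(all_payer_charges, product_line_charges):
    """Sum of two percentages: the coefficient of variation across all payers
    and the gap between the product-line median and the overall median."""
    cv_pct = 100 * pstdev(all_payer_charges) / mean(all_payer_charges)
    overall_median = median(all_payer_charges)
    # Assumption: the difference is absolute and relative to the overall median.
    diff_pct = 100 * abs(median(product_line_charges) - overall_median) / overall_median
    return cv_pct + diff_pct

def precision_label(score, peer_scores):
    """Map a score to an ordinal label using the 25th, 50th, and 75th
    percentiles of peer scores for the same health care service."""
    q25, q50, q75 = quantiles(peer_scores, n=4)
    if score <= q25:
        return "HIGH"       # least variable quartile
    if score <= q50:
        return "MEDIUM"
    if score <= q75:
        return "LOW"
    return "VERY LOW"       # most variable quartile

def is_reported(label, n_patients, min_patients=4):
    """Suppress output when precision is VERY LOW and fewer than four patients remain."""
    return not (label == "VERY LOW" and n_patients < min_patients)
```

For example, `precision_label(variability_score(charges, line_charges), peer_scores)` would give the label for one payer product line, and `is_reported` applies the suppression rule from the last paragraph.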
Risk Adjustment
Risk adjustment is used in HealthCost by adding a column called Patient Complexity. Risk adjustment provides a relative measure for the difference in the illness burden of patients in the analysis and treated by the selected providers. Risk adjustment can be used to explain why the historical costs at one provider may exceed that at another provider. Risk adjustment considers more than the diagnoses for the visit of interest. Instead, all of the diagnoses throughout the period of the analysis are considered so that the effect of multiple comorbidities can be considered in evaluating how one patient population differs from another. Examples of the conditions checked for in a patient’s history are: congestive heart failure, epilepsy, primary pulmonary hypertension, diabetes, and cancer. Patient populations that average more comorbidities or have the most severe forms of disease are expected to need greater health care resources than a less complex patient population.
The application of risk adjustment is specific for the patients with the identified condition. For example, Hospital A attracts a very “average” patient population when all treatments are considered, but Hospital A attracts very complex patients for normal vaginal deliveries. When viewing the cost rates for deliveries, the Patient Complexity at Hospital A would be described as “HIGH.”
The risk adjustment calculation is a relative index measure, where 1.00 is the midpoint, and values above or below are a calculated difference in expected resource consumption. For the HealthCost website, the index measure is translated to an ordinal scale based on the index value when compared to other reported insurance carrier LOB calculations within the health care service selected. The breakdown is based on percentiles, using the 90th, 75th, 25th, and 10th separation points. Like most ordinal scales, the distinction between the values at neighboring points on the scale is not necessarily the same.
The rates provided in HealthCost are not risk adjusted. They are the actual calculated rates based on the NHCHIS data and the HealthCost algorithms. The risk adjustment field is included to offer a possible explanation of why the costs shown may differ from those of another provider.
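As a rough illustration of the translation from index to ordinal scale, the sketch below places an index value among its peers using the 10th, 25th, 75th, and 90th percentiles. The five label names are assumptions (the text only confirms that “HIGH” is one of them), and the function name is hypothetical.

```python
from statistics import quantiles

def complexity_label(index_value, peer_index_values):
    """Translate a relative risk index into an ordinal Patient Complexity label."""
    # quantiles(..., n=20) returns 19 cut points at 5% steps:
    # position 1 is the 10th percentile, 4 the 25th, 14 the 75th, 17 the 90th.
    cuts = quantiles(peer_index_values, n=20)
    p10, p25, p75, p90 = cuts[1], cuts[4], cuts[14], cuts[17]
    if index_value < p10:
        return "VERY LOW"
    if index_value < p25:
        return "LOW"
    if index_value <= p75:
        return "AVERAGE"
    if index_value <= p90:
        return "HIGH"
    return "VERY HIGH"

# Made-up peer index values spread evenly from 0.80 to 1.20.
peers = [0.80 + 0.02 * i for i in range(21)]
print(complexity_label(1.00, peers))
```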
Outliers
A process exists to remove outliers. Outliers are data values that do not represent the typical experience for a particular service at a particular provider location, and they can exist for several reasons. In some cases the historical claims experience is incomplete. These circumstances may exist when the providers have not billed for all services, or the insurance carrier has not processed all of the claims submitted for the visit. Alternatively, human error may result in a particular service that is coded incorrectly. An extreme example might be a service related to a kidney transplant that is coded as a kidney stone removal. In this example the cost for the kidney stone removal would appear to be excessive. Because the median is calculated instead of the average, outliers have a small effect on the estimated costs reported in HealthCost, but they can have a substantial impact in the formula used to assess the variability in the rates.
Removal of the outliers takes place at two points. First, a ceiling and a floor for total charges are established across all providers. The ceiling is the 95th percentile of charges, the value below which 95 percent of all charges fall. The floor is the 1st percentile. Observations above the ceiling or below the floor are removed.
The second point where outliers are removed is after analyzing a specific provider’s experience. Patients with total charges below the 1st percentile or above the 95th percentile for that provider are removed from the analysis. The percentiles are calculated using standard statistical conventions, so if the observation values do not vary much from each other, it is unlikely any will be removed.
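The two-stage trimming can be sketched as below. The percentile helper uses linear interpolation, which is one common convention; the text does not say which convention HealthCost uses, so treat that choice and the record layout (`provider`, `charge`) as assumptions.

```python
from collections import defaultdict

def percentile(values, pct):
    """Linear-interpolation percentile (an assumed convention)."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    rank = (len(ordered) - 1) * pct / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)

def trim(records, floor_pct=1, ceiling_pct=95):
    """Keep records whose charge lies between the floor and the ceiling."""
    if not records:
        return []
    charges = [r["charge"] for r in records]
    floor = percentile(charges, floor_pct)
    ceiling = percentile(charges, ceiling_pct)
    return [r for r in records if floor <= r["charge"] <= ceiling]

def remove_outliers(records):
    """Stage 1: trim across all providers. Stage 2: trim within each provider."""
    stage_one = trim(records)
    by_provider = defaultdict(list)
    for record in stage_one:
        by_provider[record["provider"]].append(record)
    kept = []
    for provider_records in by_provider.values():
        kept.extend(trim(provider_records))
    return kept
```

Because the bounds are inclusive, a set of identical charges passes through untouched, which matches the remark that little is removed when observations do not vary much.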
Outpatient Procedures
Records are selected based on the American Medical Association’s Current Procedural Terminology (CPT) code. Since many of the codes are quite specific, a record count by CPT code is performed among codes that are for a similar service (e.g. all CPT codes for mammograms), and the frequency distribution is evaluated to identify the most common procedures within the health care service. The CPT code descriptions are then reviewed to determine which procedure is simplest and most easily recognized by a layperson. A combination of frequency, simplicity, and consumer familiarity is used to determine which procedure code is selected to identify visits. When available, clinical insight is also considered.
Once the procedure code is selected, all other procedures, services, supplies, and other items billed on the same day are added together to compile a visit. This includes procedures performed by different providers. If a visit includes any code that is known to change the visit cost dramatically but is performed only some of the time, then that patient’s entire visit is excluded from the analysis.
Individual patient records are summarized for the day of service so that total charges and total amounts paid by the insurance company and patient can be reported.
A lead provider is assigned to the visit as the one entity responsible for all of the treatment costs. This is necessary for comparison purposes, and is most often the facility where the procedure took place. If there is no facility, then it will be a physician’s office or clinic.
Prior to separating the data by payer (and payer insurance product), a statistical analysis of the data takes place. The number of observations, mean, median, mode, coefficient of variation, skewness, kurtosis, extreme observation values, and graphical distributions (stem leaf plot, boxplot, and normal probability plot) of the data are evaluated. The data are reviewed to determine whether the median can be a useful estimate of cost. Although the median is reported, the evaluation of variability is around the mean. Data that are not considered acceptable usually show one of the following: a distribution that is not normal, a bimodal distribution, unusual skew, a high kurtosis value, a mean that is substantially less than the median, or a very high degree of variation.
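A minimal sketch of this kind of screening is shown below. It computes a few of the listed statistics with moment-based formulas and raises simple warning flags. The thresholds are placeholders chosen for illustration; the text does not state the cutoffs HealthCost applies, and the sample charges are made up.

```python
from statistics import mean, median, pstdev

def describe(charges):
    """Summary statistics used to judge whether the median is a useful estimate."""
    n = len(charges)
    mu = mean(charges)
    sigma = pstdev(charges)
    if sigma == 0:
        skewness, excess_kurtosis = 0.0, 0.0
    else:
        skewness = sum((x - mu) ** 3 for x in charges) / (n * sigma ** 3)
        excess_kurtosis = sum((x - mu) ** 4 for x in charges) / (n * sigma ** 4) - 3
    return {
        "n": n,
        "mean": mu,
        "median": median(charges),
        "cv": sigma / mu,
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
    }

def warning_flags(stats, max_cv=1.0, max_abs_skew=2.0, max_kurtosis=7.0):
    """Return reasons the data may be unsuitable (thresholds are hypothetical)."""
    flags = []
    if stats["mean"] < stats["median"]:
        flags.append("mean below median")
    if abs(stats["skewness"]) > max_abs_skew:
        flags.append("unusually skewed")
    if stats["excess_kurtosis"] > max_kurtosis:
        flags.append("high kurtosis")
    if stats["cv"] > max_cv:
        flags.append("high variation")
    return flags

# Made-up two-peak sample: 90 low charges and 131 high charges.
sample = [325] * 90 + [703] * 131
print(warning_flags(describe(sample)))
```

Note that this sketch flags any mean below the median, whereas the text refers to a mean that is substantially lower; a production check would need an explicit tolerance.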
The following is a real example of a diagnostic mammogram procedure that at this time we do not feel meets the criteria for inclusion in the HealthCost website (CPT 76091).
The summary statistics for one provider are:
N= 221
mean= $546
median= $703
mode= $703
A graphical representation of the data looks like:
The numbers on the left represent the various charges for the procedures, and the numbers on the right the frequency. The frequency is also represented by the number of asterisks across from left to right. The first warning that there is a problem for HealthCost is that the mean is less than the median. Usually when looking at health care cost data, the distribution is skewed to the right, or positive. That means there are high cost outliers that pull the average charge up, even when the charges for most of the procedures are much lower. When the median exceeds the mean, this is a sign the distribution is not typical of what we would expect when looking at the data. If the distribution is not what we expect, then our assumptions in the model may not be correct.
The second major issue is that the distribution is bimodal in nature. A bimodal distribution typically indicates that the distribution is in fact the sum of two different distributions, each with a single notable peak. However, it can be difficult to find the differentiating factor between the samples in one distribution and those in the other. It may be the patient’s age, prior medical history, or any factor that may influence clinical judgement and the services provided. The summary statistics do not show the multiple charge distributions for this procedure code. Therefore, we cannot make a reliable prediction whether a patient will be faced with a procedure that has a charge close to $775, or less than $325.
After an initial statistical analysis, the data falling above the 95th percentile and below the 1st percentile are removed from the analysis.
After excluding any extreme observations, the statistical analysis is performed again, and the same measures are checked to see if there are problems with the data distribution. Since the median is the primary calculation of interest, removing outliers normally has a minimal impact on the reported figures. The median charge and median allowed amount are then calculated for each payer.
An additional review of the output takes place to determine if the results are reasonable. Unless it can be explained, major differences in charge amounts between payers for the same service would be considered an issue. We assume patients will not face different charges due to which insurance company they are covered by. Major deviations from the expected costs would also undermine the use of the payment data. Such deviations may include small insurance companies with dramatically lower payment rates, or unlikely differences between managed care and indemnity lines of business within the same insurer. Usually the smallest insurance companies have the least favorable contracts, and managed care insurance products have the deepest discounts.
The following is an example of how the data are selected to report on outpatient bilateral mammograms:
Inpatient mammograms are removed.
Patient records with a bilateral mammogram CPT code of 76092 (mammogram "screening") are selected. Then, anything else the provider(s) performed during the visit is bundled into the analysis.
All patients who had a bilateral mammogram 76091 ("mammogram diagnostic") or G0202 ("diagnostic digitization") on the same day are removed from the analysis. They are expected to cost more, inflate the results, and create comparability issues.
Patients with total charges above the 95th percentile or below the 1st percentile (across all providers) are removed.
Patients with total charges below the 1st percentile or above the 95th percentile for their provider organization are removed from the analysis.
Results are reviewed and the median calculations are checked for reasonableness.
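The selection, bundling, and exclusion steps listed above can be sketched as follows. The CPT codes are the ones named in the list; the record layout (`patient`, `date`, `cpt`, `provider`, `is_facility`, `inpatient`, `charge`, `paid_insurer`, `paid_patient`) and the sample data are hypothetical. The lead-provider rule follows the text (facility first), with the biller of the screening code as an assumed fallback when no facility is present. Percentile trimming is left out of this sketch.

```python
from collections import defaultdict

SCREENING = "76092"               # screening mammogram code used to select visits
DIAGNOSTIC = {"76091", "G0202"}   # same-day codes that remove a visit

def build_visits(claim_lines):
    """Bundle outpatient claim lines into one visit per patient per day."""
    by_day = defaultdict(list)
    for line in claim_lines:
        if line["inpatient"]:      # inpatient mammograms are removed
            continue
        by_day[(line["patient"], line["date"])].append(line)

    visits = []
    for (patient, day), lines in by_day.items():
        codes = {line["cpt"] for line in lines}
        if SCREENING not in codes or codes & DIAGNOSTIC:
            continue
        facilities = [l["provider"] for l in lines if l["is_facility"]]
        index_billers = [l["provider"] for l in lines if l["cpt"] == SCREENING]
        visits.append({
            "patient": patient,
            "date": day,
            "lead_provider": (facilities or index_billers)[0],
            "total_charge": sum(l["charge"] for l in lines),
            "total_paid": sum(l["paid_insurer"] + l["paid_patient"] for l in lines),
        })
    return visits

def line(patient, cpt, provider, is_facility, charge, insurer, patient_paid, inpatient=False):
    """Helper that builds one made-up claim line for the example."""
    return {"patient": patient, "date": "2006-03-01", "cpt": cpt, "provider": provider,
            "is_facility": is_facility, "inpatient": inpatient, "charge": charge,
            "paid_insurer": insurer, "paid_patient": patient_paid}

sample_lines = [
    line("p1", "76092", "Make Believe Hospital", True, 200, 120, 20),
    line("p1", "OTHER", "Dr. Example", False, 50, 30, 5),
    line("p2", "76092", "Make Believe Hospital", True, 200, 120, 20),
    line("p2", "76091", "Make Believe Hospital", True, 300, 200, 40),
    line("p3", "76092", "Make Believe Hospital", True, 200, 120, 20, inpatient=True),
]
print(build_visits(sample_lines))
```

Only patient p1 survives: p2 is dropped for the same-day diagnostic code and p3 for being an inpatient record.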
Inpatient Admissions
The primary differences between inpatient and outpatient analyses are the criteria for selecting patients and the bundling of claim records over several days.
For admissions that are considered “medical” in nature, rather than surgical, patients are selected based on the primary diagnosis instead of a procedure code. The primary diagnosis codes are chosen using the Medicare Diagnosis Related Groups (DRG) clustering methodology, but not the actual DRG assignment.
For surgical admissions, the patients will be selected based on a procedure code.
Building an admission is performed by combining all claims that take place within a day of one another, when there is at least one claim on any of the days with a primary diagnosis code included within the DRG diagnosis assignment. Bundling also considers the admission date and discharge date when those fields are available.
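One way to picture the admission-building rule is the sketch below. It chains a patient's claims whenever consecutive service dates are no more than one day apart, then keeps only chains that contain at least one claim with a qualifying primary diagnosis. The field names and diagnosis labels are placeholders, and the use of admission and discharge dates mentioned above is not modeled here.

```python
from collections import defaultdict
from datetime import date, timedelta

def build_admissions(claims, target_diagnoses, max_gap=timedelta(days=1)):
    """Group claims into admissions by date proximity and qualifying diagnosis."""
    by_patient = defaultdict(list)
    for claim in claims:
        by_patient[claim["patient"]].append(claim)

    admissions = []
    for patient, patient_claims in by_patient.items():
        patient_claims.sort(key=lambda c: c["date"])
        runs = []
        for claim in patient_claims:
            # Extend the current run if this claim is within a day of the last one.
            if runs and claim["date"] - runs[-1][-1]["date"] <= max_gap:
                runs[-1].append(claim)
            else:
                runs.append([claim])
        for run in runs:
            if any(c["primary_dx"] in target_diagnoses for c in run):
                admissions.append({
                    "patient": patient,
                    "first_day": run[0]["date"],
                    "last_day": run[-1]["date"],
                    "n_claims": len(run),
                    "total_charge": sum(c["charge"] for c in run),
                })
    return admissions

# Made-up claims: three consecutive days, then an unrelated claim a week later.
sample_claims = [
    {"patient": "p1", "date": date(2006, 1, 1), "primary_dx": "OTHER", "charge": 1000},
    {"patient": "p1", "date": date(2006, 1, 2), "primary_dx": "TARGET-DX", "charge": 5000},
    {"patient": "p1", "date": date(2006, 1, 3), "primary_dx": "OTHER", "charge": 500},
    {"patient": "p1", "date": date(2006, 1, 10), "primary_dx": "OTHER", "charge": 200},
]
print(build_admissions(sample_claims, {"TARGET-DX"}))
```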
Although a bundling of claim records takes place on both the inpatient and outpatient basis, we do not consider this to be an “episode” of care. The more common definition of an episode of care includes all visits and treatment for a particular condition, often over several weeks or months. HealthCost focuses solely on a single health care visit or admission, not the whole episode of care.
Updated 6.4.2007