In an era where every dollar matters, hospitals are turning to data analysis to revolutionize their operations and uncover avenues for substantial cost savings. The insights gathered from this analysis will allow hospitals identify inefficiencies, streamline processes and optimize resource utilization and allow data-driven decisions that can lead to strategic adjustments in staffing schedules and facility usage, refining of treatment protocols, reduce unnecessary procedures and negotiate better procurement deals resulting in immediate cost reductions.
SQL is the analytical backbone that empowers hospitals to dive deep into patient data and allows for rapid and precise analysis of large datasets through meticulous SQL queries utilizing SQL server tool and identifying patterns and insights that will allow the hospitals the transformative power of data analysis, making every dollar count in the pursuit if efficient health care delivery, ultimately saving millions on operational costs.
The dataset used was acquired from https://www.kaggle.com/code/iabhishekofficial/prediction-on-hospital-readmission/data?select=diabetic_data.csv
As a data analyst the dataset I analyzed was for Diabetes patients hospital readmission and care where the objective was to understand the patients that are getting re-admitted again within a certain period. Hospital readmission rates for certain conditions like diabetes care are considered indicator of hospital quality and adversely affects the care of costs. A program was introduced to reduce readmissions that to improve quality of care for patients and reduce health care spending. It was found that American hospitals spent $41 billion on diabetic patients who got readmitted within 30days of discharge. To understand the predictors of readmissions and save millions of dollars and to improve quality of care a data from 130 US hospitals for 10years was collected and analyzed here. This dataset had 101,766 rows and 50 columns and presented prediction of hospital readmission some of the use cases I have explored here are as follows:
1 -The deans of hospitals was curious to find what the distribution of time spent in the hospital looked like for patients and believed that most patients stayed less than 7 days and those that exceeded the 7 days stay were having other health complications.
Although SQL is not a visualization tool it can give good representation of frequency studies. Using the following query i was able to build a histogram.
The query was constructed to separate the time spent in hospital in buckets and the frequency of number of days was counted and aggregated using GROUP BY for time in hospital and ORDER BY number of frequency which formed the bar.
It was evident that most patients stayed less than 7 days with 3 days being the max duration of hospital stay. Longer stays can be attributed to other underlying conditions that need more interventions.
2- Explored the relationship between number of lab procedures and time spent in the hospital. Specifically, they asked for a few, average, and many procedure group.
The director of data analysis group wanted to know if there was any relation between number of lab procedures and time spent in hospital. As this was large data I decided to break the number of lab procedures data as 0-25 tests as 'few' into 26-55 tests as 'average' and over 55 tests as 'many' and ran the query in SQL server to get the following result:
It can be observed that patients that had average days of over 8-13 days hospital
had 'many' test those that stayed for 4 days had 'average' and fewer than 4 days stay had 'few' number of lab procedures performed. From this result it can be attributed that number of lab procedures has direct co-correlation with time in hospital however, there can be presence of any other health conditions that needed enhanced intervention which will need to be further analyzed.
3-To make sure the hospital followed all compliance, ethical and risk procedures the Nurse Director wanted to understand it the hospital is subconsciously treating races differently for which I wrote this following query which break down the population by race and number of lab procedures. Is there a subconscious bias in treating races differently. Show the average number of lab procedures broken down by race. This required to form a join between two tables 'health' and 'patient' as below:
The following result was collected from the query where it was cleared that there was no subconscious bias in treating patients from different races differently as it can be observed that the number of lab procedures ordered and performed was almost same for all races
4- Hospital director requested a list of medial specialties that have an average number of procedure count above 2.5 and total procedure count above 50. To accomplish this task following query was wrote that drilled down the medical specialty having an average of more than 50 procedures and aggregated it and further filtering the result using HAVING clause as having clause can filter the results of GROUP BY query while WHERE clause specially filters on aggregated data.
It showed that 6 medical specialties with Thoracic surgery was the doing most procedures and out of 6 specialties 5 groups were performing over average 2.5 procedures with Cardiology having the maximum count of 5352 surgeries and Radiology being the least at 53.
5- Provide a list of all patients who had an emergency but left the hospital faster than the average. I wrote a CTE (common table expression) as this query involved calculating avg_time based on certain conditions as it creates a temporary result set which can be used in the main query for as many times the query will run, it improves the readability and maintainability of the code.
In this query it was observed that the avg_time in hospital was 4.39 and 33,684 patients out of 101,766
attributed to 33%. This shows that 33% of the patient population does leave the hospital is less than avg_time in the hospital. Which can be a indication of quality of care and patient management.
6- What is the readmission rate for patients? The hospital director was very keen on identifying the insights pertaining to readmissions so I wrote a query that showed patients based on gender and number of emergencies, avg_medications and total time in hospital by forming a Join between two tables and aggregating using GROUP BY clause.
It was observed that 5205 male patients which was 5% of the male patient population were readmitted in less than 30days of having spent 4.7 days in hospital and having average medication in the range of 16 medications and 6152 female patients which was 6% of the female patient population that had a total time in hospital of 4.8 were readmitted in less than 30 days and they showed the avg medications of over 17. This gives a good insight that points to good health care quality and management of the hospital as this readmittance points to possibility of many underlying conditions being present as indicated by number of emergency and average medications.
7- Was there a potential disparity in healthcare utilization among different races? This query was build using aggregates SUM function and rounding off to one digit after decimal.
This query gives us good insight into the healthcare utilization as it is observed that most of the services are being utilized by Caucasian patients with 76099 out of 101,766 which attributes to 75 % of the patient population utilizing all healthcare services. African American patients are at 19% of patient population with Asian's being the least utilizing the healthcare services at 0.63% which is less than 1% and Hispanic at 2% of the patient population coming to the hospital.
Diabetes a prevalent health condition globally affecting millions of individuals presents increasing incidents and contributes to its status of leading cause of hospitalization and mortality as for individuals with diabetes are at risk of developing cardio-vascular diseases, kidney problems, neuropathy, infections and much more which needs hospitalization and medical intervention.
The analysis further gave more insights about patients suffering from diabetes with a query that was build using window function and OVER() clause and partition by for in analysis.
This identifies the patient population to be taking diabetes medication and separated by gender and it shows females are at higher incidence of being diabetic at 41% followed by 35% male patient population. This makes it more relevant to identify markers for diabetic care and follow them for improving quality healthcare.
Impact of HbA1c data was collected and analyzed for identifying diabetes care and patient readmittance incidences following this query it provided further insights
This analysis gave a deep insight that can be very useful for the hospital management as it showed that patients that were not collecting A1c results were readmitted back in less than 30 days for both male as well as female patients also patients that were readmitted after 30 days which potentially indicated other underlying complications were also not tracking their A1c results so does patients who were not readmitted but spent average time in hospital but did not track A1c were the maximum number of patients that were utilizing the hospital services.
Finally a query to study the gender demographic it was observed that more female patients needed health care services compared to their male counterparts and had more incidents of hospitalization and needed more health care services in comparison to male counterparts which can be attributed to physiological differences of genders and can be a factor that can be included in improving care quality by including preventative measures like A1c testing and overall health care management geared towards females as they make 54% of patient population.
In conclusion here are some recommendations that can help the hospital board to improve healthcare quality and patient satisfaction and lower operational cost
Optimize resource allocation to ensure efficient operations with optimizing staffing levels to manage the influx of patients
Engage in community outreach programs to educate the community about the effects of diseases like diabetes and heart health to begin with to improve the preventative care initiatives to reduce the burden on hospital
Staff training and development along with ongoing training for healthcare professionals to enhance quality of healthcare.
Invest in Health Information Technology (HIT) to adopt and enhance electronic health records (EHRs) and other health information technologies to streamline processes, reduce errors and improve communication among healthcare providers.
I truly appreciate you taking time to read this article, Thank you!! Please feel free to share a feedback, comment and connect with me to to get better insights. I am currently seeking opportunities in the field of Data analytics and will appreciate your comments and feedback and connection.
Comments