Hospital Quality Reports Analysis
Process and analyze XML files from the QB-Referenzdatenbank containing structured quality reports of German hospitals (2023)
File Content Overview
These XML files contain structured quality reports from German hospitals, managed by IQTIG/BQS.
They include hospital information, services, quality indicators, personnel data, and more.
Tag Structure Analysis
<hospital> (parent)
    <info>
    <services>
    <quality_indicators>
Analytical Opportunities
- Performance benchmarking
- Regional hospital comparisons
- Trend analysis over time
- Quality indicator correlations
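As a rough illustration of the benchmarking and regional comparison ideas, the sketch below groups one indicator by city and compares each hospital with the overall mean. The records, hospital names, and the readmission_rate field are hypothetical placeholders, not values from the QB-Referenzdatenbank; in practice the rows would come from the parsed XML files.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records for illustration only; real rows would be
# extracted from the XML reports.
records = [
    {"hospital": "Hospital A", "city": "Berlin", "readmission_rate": 8.2},
    {"hospital": "Hospital B", "city": "Berlin", "readmission_rate": 9.0},
    {"hospital": "Hospital C", "city": "Hamburg", "readmission_rate": 7.5},
    {"hospital": "Hospital D", "city": "Hamburg", "readmission_rate": 8.1},
]

# Regional comparison: mean indicator value per city.
by_city = defaultdict(list)
for row in records:
    by_city[row["city"]].append(row["readmission_rate"])
regional_mean = {city: mean(values) for city, values in by_city.items()}

# Benchmarking: each hospital's deviation from the overall mean.
overall_mean = mean(row["readmission_rate"] for row in records)
benchmark = {
    row["hospital"]: row["readmission_rate"] - overall_mean for row in records
}

for city, value in sorted(regional_mean.items()):
    print(f"{city}: {value:.2f}")
for hospital, delta in benchmark.items():
    print(f"{hospital}: {delta:+.2f} vs. overall mean")
```

The same grouping pattern could be applied to any numeric indicator once it has been extracted per hospital.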
analysis_script.py
# XML analysis script for hospital quality reports
import os
import xml.etree.ElementTree as ET

import pandas as pd

# Directory paths
input_dir = "/Users/alikhosravi/Desktop/XLMNEW/xml_2023"
output_dir = "/Users/alikhosravi/Desktop/XLMNEW/Result/2. Structure/XML file structure"
os.makedirs(output_dir, exist_ok=True)

# Initialize data structures
tag_inventory = {}
hospital_data = []


def count_tag_paths(elem, parent_path=""):
    """Record the full tag path of elem and of every descendant."""
    # ElementTree has no ancestor axis, so the path is carried down the recursion.
    tag_path = f"{parent_path} > {elem.tag}" if parent_path else elem.tag
    tag_inventory[tag_path] = tag_inventory.get(tag_path, 0) + 1
    for child in elem:
        count_tag_paths(child, tag_path)


# Process each XML file
for filename in os.listdir(input_dir):
    if filename.endswith(".xml"):
        filepath = os.path.join(input_dir, filename)
        tree = ET.parse(filepath)
        root = tree.getroot()

        # Extract hospital information; findtext returns None if an element is missing
        hospital = {
            'name': root.findtext('hospital/info/name'),
            'ik_code': root.findtext('hospital/info/ik_code'),
            'location': root.findtext('hospital/info/address/city'),
        }
        hospital_data.append(hospital)

        # Build tag inventory
        count_tag_paths(root)

# Save results (to_excel requires an Excel writer such as openpyxl)
pd.DataFrame(hospital_data).to_excel(os.path.join(output_dir, 'hospital_info.xlsx'))
pd.DataFrame(list(tag_inventory.items()), columns=['Tag_Path', 'Count']).to_csv(
    os.path.join(output_dir, 'tag_inventory.csv'), index=False)
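ElementTree elements keep no reference to their parent, and its limited XPath support has no ancestor axis, so one way to build a tag path inventory is to carry the path down a recursive walk. The following is a minimal sketch of that idea on a small inline document. The helper name iter_tag_paths and the inline XML are illustrative assumptions, not part of the original script.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def iter_tag_paths(elem, prefix=""):
    """Yield a ' > '-joined tag path for elem and each of its descendants."""
    path = f"{prefix} > {elem.tag}" if prefix else elem.tag
    yield path
    for child in elem:
        yield from iter_tag_paths(child, path)


# Small inline document so the sketch runs without any files on disk.
xml_text = """
<hospital_report>
  <hospital>
    <info><name>Example</name></info>
    <services>
      <department name="Cardiology" beds="120"/>
      <department name="Neurology" beds="85"/>
    </services>
  </hospital>
</hospital_report>
""".strip()

root = ET.fromstring(xml_text)
tag_inventory = Counter(iter_tag_paths(root))

for path, count in tag_inventory.items():
    print(f"{count:>3}  {path}")
```

A Counter behaves like the dict-with-get pattern used in the script, but it makes the counting intent explicit.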
sample_hospital.xml
<?xml version="1.0" encoding="UTF-8"?>
<hospital_report>
    <hospital>
        <info>
            <name>Charité - Universitätsmedizin Berlin</name>
            <ik_code>260500499</ik_code>
            <address>
                <street>Charitéplatz 1</street>
                <city>Berlin</city>
                <zip>10117</zip>
            </address>
        </info>
        <services>
            <department name="Cardiology" beds="120"/>
            <department name="Neurology" beds="85"/>
        </services>
        <quality_indicators>
            <indicator name="Patient Satisfaction" value="92.5"/>
            <indicator name="Readmission Rate" value="8.2"/>
        </quality_indicators>
    </hospital>
</hospital_report>
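To show how the sample structure could be read, here is a short sketch that parses a trimmed copy of sample_hospital.xml from a string and pulls out the hospital name, the beds per department, and the indicator values. It assumes the element and attribute names shown in the sample; real reports may use a different schema.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document; element names follow the sample above.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<hospital_report>
  <hospital>
    <info>
      <name>Charité - Universitätsmedizin Berlin</name>
      <ik_code>260500499</ik_code>
      <address><city>Berlin</city></address>
    </info>
    <services>
      <department name="Cardiology" beds="120"/>
      <department name="Neurology" beds="85"/>
    </services>
    <quality_indicators>
      <indicator name="Patient Satisfaction" value="92.5"/>
      <indicator name="Readmission Rate" value="8.2"/>
    </quality_indicators>
  </hospital>
</hospital_report>
"""

# Pass bytes so the parser honours the encoding declaration.
root = ET.fromstring(SAMPLE.encode("utf-8"))
hospital = root.find("hospital")

name = hospital.findtext("info/name")
city = hospital.findtext("info/address/city")

# Attributes arrive as strings, so convert them explicitly.
beds = {
    dept.get("name"): int(dept.get("beds"))
    for dept in hospital.iterfind("services/department")
}
indicators = {
    ind.get("name"): float(ind.get("value"))
    for ind in hospital.iterfind("quality_indicators/indicator")
}
total_beds = sum(beds.values())

print(name, city, total_beds)
print(indicators)
```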
Data Categories in XML Files
Hospital Information
- Name and address
- IK code
- Ownership
Services & Departments
- Department types
- Bed counts
- Special treatments
Quality Indicators
- Treatment outcomes
- Performance metrics
- Guideline compliance
Output Reports
hospital_info.xlsx
Excel spreadsheet with hospital information
Saved to: /Result/2. Structure/
tag_inventory.csv
Complete XML tag hierarchy inventory
Saved to: /Result/2. Structure/
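For readers without pandas, the tag inventory report could also be written with the standard library. The sketch below writes a file with the same Tag_Path and Count columns to a temporary directory and reads it back. The inventory values are illustrative, and the temporary directory stands in for the real output folder.

```python
import csv
import tempfile
from pathlib import Path

# Illustrative inventory; the real one is built while parsing the XML files.
tag_inventory = {
    "hospital_report": 1,
    "hospital_report > hospital": 1,
    "hospital_report > hospital > services > department": 2,
}

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "tag_inventory.csv"

    # newline="" lets the csv module control line endings itself.
    with out_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["Tag_Path", "Count"])
        writer.writerows(tag_inventory.items())

    # Read the file back to confirm the layout.
    with out_path.open(newline="", encoding="utf-8") as fh:
        rows = list(csv.DictReader(fh))

for row in rows:
    print(row["Count"], row["Tag_Path"])
```

Values read back through csv.DictReader are strings, so counts need an int() conversion before any arithmetic.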