This project is a web application that allows users to assess their risk of heart disease by entering various health metrics. The app uses a machine learning model to predict whether a user is at risk of heart disease based on the input data.
Heart disease describes a range of conditions that affect the heart, including blood vessel diseases such as coronary artery disease, heart rhythm problems (arrhythmias), and heart defects present at birth (congenital heart defects). This application estimates a user's risk of heart disease from the health metrics they enter, using a trained machine learning model.
- User-friendly interface for inputting health metrics.
- Machine learning model to predict heart disease risk.
- Informative sections about heart disease and how to reduce the risk.
- Responsive design and visually appealing UI.
- Clone the repository:
git clone https://github.com/your-username/heart-disease-risk-assessment.git
- Navigate to the project directory:
cd heart-disease-risk-assessment
- Create a virtual environment:
python -m venv venv
- Activate the virtual environment:
- On Windows:
venv\Scripts\activate
- On macOS and Linux:
source venv/bin/activate
- Install the required packages:
pip install -r requirements.txt
- Start the Streamlit app:
streamlit run app.py
- Open your web browser and navigate to the provided local URL (usually http://localhost:8501).
The following input parameters are required to predict the risk of heart disease:
- Age
- Sex (Male/Female)
- Chest Pain Type (typical angina/atypical angina/non-anginal/asymptomatic)
- Resting Blood Pressure (mm Hg)
- Cholesterol (mg/dL)
- Fasting Blood Sugar > 120 mg/dL (True/False)
- Resting ECG (normal/lv hypertrophy/st-t abnormality)
- Maximum Heart Rate Achieved
- Exercise Induced Angina (True/False)
- ST Depression Induced by Exercise
- Slope of Peak Exercise ST Segment (upsloping/flat/downsloping)
- Number of Major Vessels Colored by Fluoroscopy
- Thalassemia (normal/fixed defect/reversable defect)
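As a rough illustration of how these thirteen inputs might be gathered and checked before they reach the model, here is a minimal sketch. The class name `PatientInputs`, its field names, and the `validate` helper are hypothetical and are not taken from `app.py`; only the categorical options come from the list above.

```python
from dataclasses import dataclass, fields, replace


@dataclass
class PatientInputs:
    """Hypothetical container for the 13 inputs listed above."""
    age: int
    sex: str
    chest_pain_type: str
    resting_bp: float
    cholesterol: float
    fasting_bs_over_120: bool
    resting_ecg: str
    max_heart_rate: float
    exercise_angina: bool
    st_depression: float
    st_slope: str
    major_vessels: int
    thal: str


# Allowed values for the categorical fields, copied from the list above.
ALLOWED = {
    "sex": ("Male", "Female"),
    "chest_pain_type": ("typical angina", "atypical angina", "non-anginal", "asymptomatic"),
    "resting_ecg": ("normal", "lv hypertrophy", "st-t abnormality"),
    "st_slope": ("upsloping", "flat", "downsloping"),
    "thal": ("normal", "fixed defect", "reversable defect"),
}


def validate(inputs: PatientInputs) -> list:
    """Return a list of problems; an empty list means every categorical field is valid."""
    problems = []
    for name, allowed in ALLOWED.items():
        value = getattr(inputs, name)
        if value not in allowed:
            problems.append(f"{name}: {value!r} is not one of {allowed}")
    return problems


# Made-up example values, for illustration only.
example = PatientInputs(
    age=54, sex="Male", chest_pain_type="atypical angina", resting_bp=130.0,
    cholesterol=246.0, fasting_bs_over_120=False, resting_ecg="normal",
    max_heart_rate=150.0, exercise_angina=False, st_depression=1.0,
    st_slope="flat", major_vessels=0, thal="normal",
)

print(len(fields(PatientInputs)), "input fields")
print(validate(example))                          # no problems expected
print(validate(replace(example, sex="Unknown")))  # one problem reported
```

Keeping the allowed values in one place means the form widgets and the encoding step can share the same option lists, assuming the app encodes categories by name.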
Based on the input parameters, the application predicts whether the user is at risk of heart disease and displays the result in a highlighted box with an accompanying image:
- At Risk: the box shows a message indicating that the user is at risk.
- Not at Risk: the box shows a message indicating that no risk was predicted.
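The mapping from model output to displayed result could look something like the sketch below. It assumes the model returns `1` for "at risk" and `0` otherwise. The function name `describe_prediction` and the message wording are hypothetical, and the Streamlit rendering of the box and image is left out.

```python
def describe_prediction(label: int) -> dict:
    """Map a binary model output to display text (assumes 1 means at risk)."""
    if label not in (0, 1):
        raise ValueError(f"expected 0 or 1, got {label!r}")
    if label == 1:
        return {
            "status": "At Risk",
            "message": "The model predicts a risk of heart disease.",
        }
    return {
        "status": "Not at Risk",
        "message": "The model predicts no risk of heart disease.",
    }


result = describe_prediction(1)
print(result["status"], "-", result["message"])
```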
The machine learning model was trained using the UCI Heart Disease dataset. Various algorithms were tested, including Logistic Regression, Decision Tree, Random Forest, and Support Vector Machine (SVM).
- The data was cleaned, missing values were filled, and categorical variables were encoded.
- The dataset was split into training and testing sets.
- SMOTE was applied to balance the dataset.
- The data was scaled using StandardScaler.
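The order of these steps matters, and the sketch below illustrates one common arrangement: oversample only the training split, then compute the scaling statistics from the training data and reuse them on the test data. This is an assumption about the workflow, not a description of the project's training code. To stay dependency-free it uses hand-written stand-ins: `smote_like` is a simplified illustration of the SMOTE interpolation idea (not the imbalanced-learn implementation), and `fit_scaler`/`scale` mimic what `StandardScaler` computes. All names and the toy data are hypothetical.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours (needs >= 2 minority rows)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` among the other minority rows.
        others = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        neighbour = rng.choice(others)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def fit_scaler(rows):
    """Column means and population standard deviations."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0  # constant column: divide by 1
        for c, m in zip(cols, means)
    ]
    return means, stds


def scale(rows, means, stds):
    """Apply previously fitted statistics to any set of rows."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Toy training split: 6 majority rows (label 0) and 2 minority rows (label 1).
X_train = [[50.0, 120.0], [52.0, 130.0], [47.0, 125.0], [60.0, 140.0],
           [55.0, 135.0], [49.0, 128.0], [65.0, 160.0], [70.0, 170.0]]
y_train = [0, 0, 0, 0, 0, 0, 1, 1]
X_test = [[58.0, 150.0]]

# 1. Balance the training split only; the test split is left untouched.
minority = [x for x, y in zip(X_train, y_train) if y == 1]
n_needed = y_train.count(0) - y_train.count(1)
new_rows = smote_like(minority, n_needed, k=1)
X_bal = X_train + new_rows
y_bal = y_train + [1] * n_needed

# 2. Fit the scaler on training data, then reuse the same statistics for the test data.
means, stds = fit_scaler(X_bal)
X_bal_scaled = scale(X_bal, means, stds)
X_test_scaled = scale(X_test, means, stds)

print("class counts after balancing:", y_bal.count(0), y_bal.count(1))
```

Oversampling before the split, or fitting the scaler on all rows, would let information from the test set leak into training and tend to inflate the reported accuracy.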
Several models were trained and evaluated:
- Logistic Regression achieved an accuracy of 87%.
- Decision Tree achieved an accuracy of 97%.
- Random Forest achieved an accuracy of 85%.
- Support Vector Machine (SVM) achieved an accuracy of 89%.
The Decision Tree model was ultimately selected for deployment because it had the highest accuracy of the four models (97%).
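A comparison like the one above can be sketched with scikit-learn as follows. This is a minimal illustration, not the project's training script: it uses synthetic stand-in data from `make_classification` rather than the UCI Heart Disease dataset, the hyperparameters are assumed defaults, and the accuracies it prints will not match the figures reported here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with 13 features; NOT the UCI Heart Disease dataset.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The four model families named above, with assumed (mostly default) settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(),
}

# Fit each model on the training split and score it on the held-out test split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

best = max(scores, key=scores.get)
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.2%}")
print("Highest test accuracy:", best)
```

Scoring every model on the same held-out split keeps the comparison consistent. Accuracy from a single split can vary with the random seed, so cross-validation is a common way to check that a ranking is stable.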