
Breast Cancer Detection: This project uses machine learning techniques to classify breast cancer as malignant or benign based on features extracted from breast mass biopsies. Models used include SVM, Decision Tree, Naive Bayes, and K-Nearest Neighbors.


maryamesh/Breast-Cancer-detection-using-Machine-Learning


Breast Cancer Detection Using ML Algorithms

This project uses machine learning techniques to detect breast cancer based on various features extracted from breast mass biopsies. The dataset contains features computed from a digitized image of a fine needle aspirate (FNA) of a breast mass.

Project Overview

The goal of this project is to build and evaluate several classification models to predict whether a breast mass is malignant or benign. The models used in this project include:

  • Decision Tree Classifier (CART)
  • Support Vector Machine (SVM)
  • Gaussian Naive Bayes (NB)
  • K-Nearest Neighbors (KNN)

Dataset

The dataset used in this project is the Breast Cancer Wisconsin (Diagnostic) Data Set. It contains 569 instances, each described by 30 numeric features computed from a digitized image of a breast mass FNA.

Features

  • radius_mean
  • perimeter_mean
  • concave_points_mean
  • ... (additional features in the dataset)

Target

  • diagnosis: Binary variable indicating whether the tumor is malignant (M) or benign (B). This has been converted to 1 for malignant and 0 for benign for the purpose of modeling.

Files

  • data.csv: The dataset file.
  • newbreast-cancer-prediction-using-machine-learning.ipynb: The main Jupyter notebook, which covers data preprocessing, model training, evaluation, and visualization.

Dependencies

  • Python 3.x
  • NumPy
  • Pandas
  • Matplotlib
  • Seaborn
  • scikit-learn

You can install the necessary dependencies using:

`pip install numpy pandas matplotlib seaborn scikit-learn`

Data Preprocessing

Load the dataset and inspect the first few rows. Convert the diagnosis column to binary values (1 for malignant, 0 for benign). Drop unnecessary columns and set the index.
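The preprocessing steps above could look roughly like the sketch below. It is not the notebook's actual code: the inline sample, the `id` index column, and the empty trailing `Unnamed: 32` column are assumptions about how data.csv is laid out, and `preprocess` is a hypothetical helper name.

```python
import io

import pandas as pd

# Hypothetical three-row sample standing in for data.csv (values are
# illustrative). With the real file you would call pd.read_csv("data.csv").
SAMPLE_CSV = """id,diagnosis,radius_mean,perimeter_mean,Unnamed: 32
101,M,17.99,122.8,
102,B,13.54,87.46,
103,M,20.57,132.9,
"""


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Encode the target, drop empty columns, and index rows by id."""
    out = df.copy()
    # Malignant -> 1, benign -> 0, as described in the Target section.
    out["diagnosis"] = out["diagnosis"].map({"M": 1, "B": 0})
    # Assumption: "unnecessary columns" are those that contain no data at all.
    out = out.dropna(axis=1, how="all")
    # Assumption: the identifier column is named "id" and becomes the index.
    if "id" in out.columns:
        out = out.set_index("id")
    return out


df = preprocess(pd.read_csv(io.StringIO(SAMPLE_CSV)))
print(df.head())
```

Keeping the steps in one function makes it easy to apply the same transformation to any new file with the same layout.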

Exploratory Data Analysis (EDA)

Visualize the correlation matrix using a heatmap to understand the relationships between features.

Model Training and Evaluation

The following steps outline the process of training and evaluating the models:

  1. Split the dataset into training and test sets.
  2. Train multiple models using cross-validation to find the best performing model.
  3. Evaluate the models on the test set and compare their performance using accuracy, confusion matrix, and classification report.
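The three steps above can be sketched as follows. This is a minimal illustration, not the notebook's code: it loads scikit-learn's bundled copy of the Wisconsin Diagnostic data instead of data.csv, and the 80/20 split, 10-fold stratified cross-validation, default hyperparameters, and `random_state` values are all assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Stand-in for data.csv: scikit-learn ships the same 569-instance dataset.
data = load_breast_cancer()
X = data.data
# scikit-learn encodes malignant as 0 and benign as 1; flip the labels so
# that 1 = malignant and 0 = benign, matching this README.
y = 1 - data.target

# Step 1: hold out a stratified test set (assumed 80/20 split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "CART": DecisionTreeClassifier(random_state=42),
    "SVM": SVC(),
    "NB": GaussianNB(),
    "KNN": KNeighborsClassifier(),
}

# Step 2: compare the models by cross-validated accuracy on the training set.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
cv_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy")
    cv_scores[name] = scores.mean()
    print(f"{name}: CV accuracy {scores.mean():.3f} (std {scores.std():.3f})")

# Step 3: refit each model on the full training set and score it on the test set.
test_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    test_scores[name] = accuracy_score(y_test, y_pred)
    print(f"\n{name}: test accuracy {test_scores[name]:.3f}")
    print(confusion_matrix(y_test, y_pred))
    print(classification_report(y_test, y_pred, target_names=["benign", "malignant"]))
```

SVM and KNN are sensitive to feature scale, so wrapping them in a `Pipeline` with `StandardScaler` is a common refinement; whether the notebook does so is not stated here.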

Contributing

Contributions are welcome! Please fork this repository and submit a pull request for any feature requests, bug fixes, or improvements.

License

This project is licensed under the MIT License.

Acknowledgments

The dataset is publicly available at the UCI Machine Learning Repository. Thanks to the open-source community for providing tools and libraries used in this project.
