Challenge
The objective was to build an automated system capable of processing and classifying large-scale chest image datasets collected from multiple hospitals for breast cancer treatment analysis.

Investigated imaging datasets from multiple hospitals to understand inconsistencies in file naming, metadata availability, and image quality.
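
A minimal sketch of how a file-naming audit of this kind might look. The two naming conventions below are hypothetical, invented purely for illustration; the actual hospital naming schemes are not documented here. The helper names parse_filename and audit are likewise assumptions.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical naming conventions (illustration only, not the real hospital schemes):
#   "<patient>_<view>.<ext>"      e.g. 0012_Front.JPG
#   "IMG-<patient>-<view>.<ext>"  e.g. IMG-7-side.png
PATTERNS = [
    re.compile(r"^(?P<patient>\d+)_(?P<view>[A-Za-z]+)\.(?P<ext>jpe?g|png)$", re.IGNORECASE),
    re.compile(r"^IMG-(?P<patient>\d+)-(?P<view>[A-Za-z]+)\.(?P<ext>jpe?g|png)$", re.IGNORECASE),
]


def parse_filename(name: str) -> Optional[Dict[str, str]]:
    """Return normalized fields for a known naming pattern, or None if unrecognized."""
    for pattern in PATTERNS:
        match = pattern.match(name)
        if match:
            return {
                # Strip leading zeros so "0012" and "12" refer to the same patient.
                "patient": match.group("patient").lstrip("0") or "0",
                "view": match.group("view").lower(),
                "ext": match.group("ext").lower(),
            }
    return None


def audit(names: List[str]) -> Tuple[List[Dict[str, str]], List[str]]:
    """Split file names into parsed records and names needing manual review."""
    parsed, unparsed = [], []
    for name in names:
        fields = parse_filename(name)
        if fields is None:
            unparsed.append(name)
        else:
            parsed.append(fields)
    return parsed, unparsed
```

Collecting the unrecognized names separately keeps the inconsistencies visible instead of silently dropping files.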

Designed modular Python pipelines responsible for preprocessing, metadata parsing, classification, and dataset restructuring.
Trained and optimized a YOLOv8 classification model, experimenting with preprocessing and augmentation techniques to improve generalization.
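
The modular structure described above can be sketched as a chain of small, independent stages. This is an illustrative outline under stated assumptions, not the project's actual code: the Record type, the stage functions, and the folder layout are all hypothetical, and the classify stage is a fixed placeholder rule standing in for inference with the trained model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Record:
    """One image moving through the pipeline (hypothetical structure)."""
    path: str
    metadata: Dict[str, str] = field(default_factory=dict)
    label: str = ""


Stage = Callable[[Record], Record]


def parse_metadata(record: Record) -> Record:
    # Assumption: the first path component names the source hospital.
    record.metadata["hospital"] = record.path.split("/")[0]
    return record


def classify(record: Record) -> Record:
    # Placeholder rule standing in for model inference.
    record.label = "frontal" if "front" in record.path.lower() else "other"
    return record


def restructure(record: Record) -> Record:
    # Compute the target location as <label>/<hospital>/<filename>; no files are moved here.
    filename = record.path.split("/")[-1]
    record.metadata["target_path"] = f"{record.label}/{record.metadata['hospital']}/{filename}"
    return record


def run_pipeline(records: List[Record], stages: List[Stage]) -> List[Record]:
    """Apply each stage in order to every record."""
    processed = []
    for record in records:
        for stage in stages:
            record = stage(record)
        processed.append(record)
    return processed
```

Because each stage takes and returns a Record, a stage can be replaced or tested in isolation, for example swapping the placeholder classify for a function that wraps the trained classifier.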

Evaluated model performance using precision, recall, and F1-score, followed by manual verification across datasets from different hospitals.
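
For reference, the three metrics can be computed per class from true and predicted labels as in the sketch below. The function name per_class_metrics and the example labels are hypothetical; the project's actual evaluation code and class names are not shown here.

```python
from typing import Dict, Sequence


def per_class_metrics(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """Compute precision, recall, and F1-score for each class label."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    results = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
        fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
        fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives
        # Fall back to 0.0 when a denominator is zero.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
        results[label] = {"precision": precision, "recall": recall, "f1": f1}
    return results
```

Running the same function separately on each hospital's subset gives a quick view of whether performance holds across sources before manual verification.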

This project produced a reliable machine learning pipeline for automated medical image classification and dataset organization. By standardizing datasets from multiple hospitals and automating processing workflows, the system provides a scalable foundation for future BCCT research.

The project demonstrated how modern technologies such as Python, YOLOv8, and PyTorch can be combined to build Internship – DCPT (Medical Image Classification Pipeline for Breast Cancer Research), a scalable and modular system offering high precision and user-friendliness.