Open Source | Using Machine Learning To Detect Coffee Bean Defects

Coffee Defect Detection Collaboration

Back in 2023, Ivan Barinskiy, a Computer Science student at the University of York, mentioned that he was considering implementing a machine learning model to detect defects in roasted coffee beans. He reached out to me to see if I would be willing to contribute to his third-year dissertation alongside Nottingham-based coffee roaster Vibe With Coffee.

After talking through the details, I started saving every coffee bean defect that I could get my hands on, focusing specifically on Quakers, Under-roasted coffee beans, Burnt coffee beans, Insect Damaged coffee beans (such as coffee borer beetle damage) and Mould Damaged coffee beans - basically any defect that can be identified visually.

Roasted coffee bean batch contours used for defect detection

Coffee Beans Evaluated

Beans from a wide range of coffee varieties were sent for evaluation:

- Wush Wush
- Caturra
- SL-28
- Ethiopian Heirloom
- Catuai
- Red Bourbon
- Typica
- Guji & Yirgacheffe Landraces

Chart showing coffee variety counts used in the roasted coffee defect dataset

The samples also covered many different processing methods:

- Carbonic Maceration
- Fruit Infused
- Honey
- Natural
- Anaerobic Natural
- Washed

Chart showing coffee processing method counts used in the roasted coffee defect dataset

If you are interested in how coffee variety can influence flavour and visual appearance, you may also enjoy our guide to Pink Bourbon coffee.

Machine Learning Method

Ivan's implementation method involved training several image classification algorithms to determine which one worked best.

In his own words, Ivan describes them as the following:

  • KNN-based Classifier: The images' colour histograms were compared using the Canberra distance metric (KNN stands for K-Nearest-Neighbours).
  • MobileNet V2: This model was trained from scratch specifically for this task.
  • Pre-trained MobileNet: This model had already been trained on a large set of general images (the ImageNet dataset) before being fine-tuned for coffee beans.
  • 50-layer ResNet: Another model pre-trained on the ImageNet dataset, known for its depth and ability to handle complex image recognition tasks.
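
To make the KNN-based approach in the list above more concrete, here is a minimal sketch of comparing colour histograms with the Canberra distance and taking a majority vote among the nearest neighbours. It is not Ivan's implementation: the function names, the 4-bins-per-channel histogram, the value of k and the toy pixel data are all assumptions chosen for illustration, and it uses only the Python standard library.

```python
from collections import Counter


def colour_histogram(pixels, bins_per_channel=4):
    """Normalised RGB histogram from an iterable of (r, g, b) tuples in 0-255.

    The bin count is an illustrative assumption, not a value from the study.
    """
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each 0-255 channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return [value / count for value in hist]


def canberra_distance(a, b):
    """Sum of |x - y| / (|x| + |y|), skipping terms where both values are zero."""
    total = 0.0
    for x, y in zip(a, b):
        denominator = abs(x) + abs(y)
        if denominator > 0:
            total += abs(x - y) / denominator
    return total


def knn_predict(train_histograms, train_labels, query_histogram, k=3):
    """Label the query by majority vote among its k nearest training histograms."""
    distances = sorted(
        (canberra_distance(hist, query_histogram), label)
        for hist, label in zip(train_histograms, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical toy "images": flat lists of pixels, dark for burnt, pale for quaker.
    training_images = [
        ([(40, 25, 20)] * 90 + [(90, 60, 40)] * 10, "burnt"),
        ([(35, 20, 15)] * 100, "burnt"),
        ([(190, 150, 110)] * 100, "quaker"),
        ([(200, 160, 120)] * 80 + [(150, 110, 80)] * 20, "quaker"),
    ]
    histograms = [colour_histogram(pixels) for pixels, _ in training_images]
    labels = [label for _, label in training_images]

    query = colour_histogram([(45, 28, 22)] * 95 + [(100, 70, 50)] * 5)
    print(knn_predict(histograms, labels, query, k=3))
```

One property worth knowing about the Canberra distance is that each bin contributes at most 1 to the total, so a bin that is empty in one image and barely populated in the other counts as much as a large difference in a well-populated bin. A real pipeline would read pixels from image files and tune the bin count and k on held-out data.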

Initial Results

Here are the initial results from Ivan's trial:

Initial machine learning model results for roasted coffee defect detection

These results showed that the Pre-trained MobileNet and ResNet 50 were the most effective at detecting defects, with the highest overall accuracy of 95%. After several training iterations, Ivan's MobileNet and ResNet 50 models produced an incredibly accurate set of results.

ResNet 50 model results showing 95 percent accuracy for roasted coffee defect detection
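
For readers unfamiliar with the metric, overall accuracy is simply the share of beans whose predicted label matches the true label. The sketch below shows that arithmetic, plus a per-class breakdown. The labels and predictions are made up purely for illustration and are not taken from Ivan's results, and the "normal" class name is an assumption.

```python
from collections import defaultdict


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("no samples to score")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def per_class_recall(true_labels, predicted_labels):
    """For each true class, the fraction of its samples that were predicted correctly."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for t, p in zip(true_labels, predicted_labels):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return {label: hits[label] / totals[label] for label in totals}


if __name__ == "__main__":
    # Made-up example: 6 normal beans and 4 quakers, with one mistake in each class.
    truth = ["normal"] * 6 + ["quaker"] * 4
    predicted = ["normal"] * 5 + ["quaker"] + ["quaker"] * 3 + ["normal"]

    print(overall_accuracy(truth, predicted))
    print(per_class_recall(truth, predicted))
```

The per-class view matters for defect detection because defects are usually rarer than good beans, so a single overall figure can hide weaker performance on a small defect class.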

Full Publication and Dataset

The full publication of Ivan's work can be found here:
Click here for the full publication

Along with the full data set, which can be found here:

Barinskiy, I. (2024). Identifying defects in roasted coffee beans using image classification algorithms [Data set]. https://github.com/dont-text-me/RoastDefectsDataset

You can find the abridged data here:
https://bean-classifier-demo.vercel.app/

And finally, you can find the full open source publication including the relevant download links here: https://github.com/dont-text-me/PRBX-work

Don't forget to check out our current coffee offerings to get your hands on some of the best beans out there (Defect free ;) )

Work With Us

Looking to collaborate with us on a project like this one?
Drop us an enquiry here!
