Coffee beans laid out on raised African beds, waiting to be sorted

Open Source | Using Machine Learning To Detect Coffee Bean Defects

Back in 2023, Ivan Barinskiy, a Computer Science student at the University of York, mentioned that he was considering implementing a machine learning model to detect roasted coffee bean defects. He reached out to see if I would be willing to contribute to his third-year dissertation alongside Nottingham-based coffee roaster Vibe With Coffee.

After talking through the details, I started saving every coffee bean defect that I could get my hands on, focusing specifically on Quakers (Sovda have a great explanation of Quakers here), under-roasted coffee beans, burnt coffee beans, insect-damaged coffee beans (such as coffee borer beetle damage) and mould-damaged coffee beans - basically any defect that can be identified visually.

Beans from many different varieties were sent for evaluation:

- Wush Wush
- Caturra
- SL-28
- Ethiopian Heirloom
- Catuai
- Red Bourbon
- Typica
- Guji & Yirgacheffe Landraces

They also covered many different processing methods:

- Carbonic Maceration
- Fruit Infused
- Honey
- Natural
- Anaerobic Natural
- Washed

Ivan's implementation method involved training several image classification algorithms to determine which one worked best.

In his own words, Ivan describes them as the following:

KNN-based Classifier: The images' colour histograms were compared using the Canberra distance metric (KNN stands for K-Nearest-Neighbours).

MobileNet V2: This model was trained from scratch specifically for this task.

Pre-trained MobileNet: This model had already been trained on a large set of general images (the ImageNet dataset) before being fine-tuned for coffee beans.

50-layer ResNet: Another model pre-trained on the ImageNet dataset, known for its depth and ability to handle complex image recognition tasks.
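To make the KNN-based classifier described above a little more concrete, here is a minimal sketch in plain Python. It is not Ivan's implementation: the number of histogram bins, the value of k, the RGB colour space, the function names and the toy pixel values are all assumptions made for illustration. It builds a colour histogram per image, compares histograms with the Canberra distance, and lets the k nearest training images vote on the label.

```python
from collections import Counter


def colour_histogram(pixels, bins=4):
    """Normalised per-channel histogram from (r, g, b) tuples in the 0-255 range.

    The bin count is an assumption; the original work may use a different layout.
    """
    hist = [0.0] * (bins * 3)
    count = 0
    for r, g, b in pixels:
        # value * bins // 256 maps 0-255 onto bin indices 0..bins-1
        hist[r * bins // 256] += 1
        hist[bins + g * bins // 256] += 1
        hist[2 * bins + b * bins // 256] += 1
        count += 1
    return [h / count for h in hist] if count else hist


def canberra(u, v):
    """Canberra distance: sum of |u_i - v_i| / (|u_i| + |v_i|), skipping 0/0 terms."""
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def knn_predict(train, query_hist, k=3):
    """train is a list of (histogram, label) pairs; returns the majority label
    among the k training histograms closest to query_hist."""
    nearest = sorted(train, key=lambda item: canberra(item[0], query_hist))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy usage with made-up pixel colours (hypothetical, not from the data set)
dark_bean = [(40, 25, 15)] * 10      # stands in for a burnt bean
pale_bean = [(200, 170, 120)] * 10   # stands in for a Quaker
train = [
    (colour_histogram(dark_bean), "burnt"),
    (colour_histogram(pale_bean), "quaker"),
]
query = colour_histogram([(210, 180, 125)] * 10)
print(knn_predict(train, query, k=1))
```

In practice the histograms would be computed from real bean photos, and a library routine such as SciPy's `scipy.spatial.distance.canberra` could replace the hand-written distance function.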

Here are the initial results from Ivan's trial:

These results demonstrated that the pre-trained MobileNet and ResNet 50 were the most effective at detecting defects, with the highest overall accuracy of 95%. After some further rounds of training, these two models enabled Ivan to produce an incredibly accurate set of results.
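For readers unfamiliar with the term, overall accuracy is simply the share of beans that a model labels correctly. The sketch below shows how that figure, along with a per-class breakdown, could be computed from a list of true and predicted labels. The function names and the four example labels are made up for illustration and are not taken from Ivan's results.

```python
from collections import defaultdict


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true label."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def per_class_accuracy(true_labels, predicted_labels):
    """For each true class, the fraction of its beans that were labelled correctly."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for t, p in zip(true_labels, predicted_labels):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical labels, for illustration only
true_labels = ["normal", "normal", "quaker", "burnt"]
predicted = ["normal", "quaker", "quaker", "burnt"]
print(overall_accuracy(true_labels, predicted))
print(per_class_accuracy(true_labels, predicted))
```

A per-class view is useful alongside the headline number, because a model can score well overall while still missing most examples of a rare defect.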

The full publication of Ivan's work can be found here:
Click here for the full publication

The full data set can be found here:

Barinskiy, I. (2024). Identifying defects in roasted coffee beans using image classification algorithms [Data set]. https://github.com/dont-text-me/RoastDefectsDataset

You can find the abridged data here:
https://bean-classifier-demo.vercel.app/

And finally, you can find the full open source publication including the relevant download links here: https://github.com/dont-text-me/PRBX-work

Don't forget to check out our current coffee offerings to get your hands on some of the best beans out there (defect-free ;) )

Looking to collaborate with us on a project like this one?
Drop us an enquiry here!
