Coffee beans laid out on raised African beds, waiting to be sorted

Open Source | Using Machine Learning To Detect Coffee Bean Defects

Back in 2023, Ivan Barinskiy, a Computer Science student at the University of York, mentioned that he was considering implementing a machine learning model to detect defects in roasted coffee beans. He reached out to see if I would be willing to contribute to his third-year dissertation alongside Nottingham-based coffee roaster Vibe With Coffee.

After talking through the details, I started saving every coffee bean defect that I could get my hands on, focusing specifically on Quakers (Sovda have a great explanation on Quakers here), Under-roasted coffee beans, Burnt coffee beans, Insect-Damaged coffee beans (such as coffee borer beetle damage) and Mould-Damaged coffee beans. Basically, any defect that can be identified visually.

 

Lots of coffee beans were sent for evaluation:

- Wush Wush 
- Caturra
- SL-28
- Ethiopian Heirloom
- Catuai
- Red Bourbon
- Typica
- Guji & Yirgacheffe Landraces

And included many different processing methods:

- Carbonic Maceration
- Fruit Infused
- Honey
- Natural
- Anaerobic Natural
- Washed


Ivan's implementation method involved training several image classification algorithms to determine which one worked best.

In his own words, Ivan describes them as the following:

  • KNN-based Classifier: The images' colour histograms were compared using the Canberra distance metric (KNN stands for K-Nearest-Neighbours).
  • MobileNet V2: This model was trained from scratch specifically for this task.
  • Pre-trained MobileNet: This model had already been trained on a large set of general images (the ImageNet dataset) before being fine-tuned for coffee beans.
  • 50-layer ResNet: Another model pre-trained on the ImageNet dataset, known for its depth and ability to handle complex image recognition tasks.
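
To make the first approach in the list more concrete, here is a minimal sketch of a KNN classifier that compares colour histograms with the Canberra distance. It is not Ivan's code: the bin count, the value of k, the helper names and the synthetic "bean" images are all assumptions chosen purely for illustration, and a real run would load labelled bean photos instead.

```python
import numpy as np
from collections import Counter


def colour_histogram(image, bins=8):
    """Concatenate per-channel histograms of an H x W x 3 uint8 image, scaled to sum to 1."""
    channels = [
        np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
        for c in range(3)
    ]
    hist = np.concatenate(channels).astype(float)
    return hist / hist.sum()


def canberra(u, v):
    """Canberra distance: sum of |u - v| / (|u| + |v|), skipping terms where both are zero."""
    num = np.abs(u - v)
    den = np.abs(u) + np.abs(v)
    mask = den > 0
    return float(np.sum(num[mask] / den[mask]))


def knn_predict(query_hist, train_hists, train_labels, k=3):
    """Return the majority label among the k training histograms closest to the query."""
    distances = [canberra(query_hist, h) for h in train_hists]
    nearest = np.argsort(distances)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Assumption: synthetic stand-ins for bean photos (dark "burnt", pale "quaker").
rng = np.random.default_rng(0)


def fake_bean(low, high):
    """Hypothetical helper: a 32 x 32 RGB patch with pixel values in [low, high)."""
    return rng.integers(low, high, size=(32, 32, 3), dtype=np.uint8)


train_images = [fake_bean(10, 60) for _ in range(5)] + [fake_bean(150, 220) for _ in range(5)]
train_labels = ["burnt"] * 5 + ["quaker"] * 5
train_hists = [colour_histogram(img) for img in train_images]

query = fake_bean(140, 210)
print(knn_predict(colour_histogram(query), train_hists, train_labels))
```

The Canberra distance divides each bin's difference by the combined size of that bin in both histograms, so small bins count as much as large ones. That is one plausible reason it suits colour histograms, where a few rare colours can be what gives a defect away.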

    Here are the initial results from Ivan's trial:

    These demonstrated that the pre-trained MobileNet and ResNet 50 were the most effective at detecting defects, with the highest overall accuracy of 95%. After some further training iterations, these two models enabled Ivan to produce an incredibly accurate set of results.

     

    The full publication of Ivan's work can be found here:
    Click here for the full publication

    Along with the full data set, which can be found here:

    Barinskiy, I. (2024). Identifying defects in roasted coffee beans using image classification algorithms [Data set]. https://github.com/dont-text-me/RoastDefectsDataset

    You can find the abridged data here: 
    https://bean-classifier-demo.vercel.app/

     

    And finally, you can find the full open source publication including the relevant download links here: https://github.com/dont-text-me/PRBX-work

     

    Don't forget to check out our current coffee offerings to get your hands on some of the best beans out there (Defect free ;) ) 
