Computer Vision for Measuring Boxes: a comparison of Model-based Bayesian Inference and Convolutional Neural Networks

Elliot Munro

Graduate Thesis/Dissertation

Open access

Master of Science - MSc, University of Otago

2021

Handle:

https://hdl.handle.net/10523/10753

Keywords: New Zealand; Computer-Vision; dimensioning; MCMC; Bayesian-Inference; CNN

Abstract

This thesis compares two approaches to measuring the dimensions of boxes using computer vision (CV). The first is model-based Bayesian inference (MBI), which uses a geometric cuboid model as well as a model of the camera-conveyor system. The second uses Convolutional Neural Networks (CNNs). The methods were compared on statistical scoring rules applied to posterior probability density function estimates, training and testing times, and robustness to added noise and to rounding of the cuboid edges. CNN training data was generated by photo-realistic rendering of computer-generated 3D CAD models, and consisted of 6,000 training images, 2,000 validation images, and 2,000 testing images spread across five textures. Training was performed using Keras. Markov chain Monte Carlo (MCMC) sampling was implemented in Python 3 and used the same test data as the CNN models. The CNN and MBI approaches were also briefly compared on images of real boxes. It was found that Corner Detection (CD) performance was the main limiting factor for MBI, which formed tight posterior estimates when complete CD occurred. Because of poor mixing, MCMC sampling runs sometimes became stuck in local minima, producing overly tight estimates and poor scores. With full CD and good mixing, MCMC scores tended to outperform CNN scores. MCMC had a testing time approximately 1000 times longer than that of the CNN (70 s vs 70 ms). However, the CNN required significant time (strongly reduced by a GPU) and data to train the model before use. The robustness of the techniques was measured by systematically adding Gaussian noise to the images and by rounding the edges of the boxes. Above a threshold noise variance of 100 (images used a 0 to 255 RGB colour scale), CD failed to detect corners, breaking the MBI and causing poor performance compared to the CNN, which degraded more gracefully. Likewise, above an edge rounding threshold of 0.15 (0 = cuboid and 1 = sphere), CD failed, breaking the MBI and causing poor performance compared to the CNN, which again degraded more gracefully. A single-image-input (SII) CNN model was more robust to added noise than a two-image-input (TII) CNN model. There was negligible difference between SII and TII with respect to box edge rounding.
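
The noise-robustness test mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's own code: the function name add_gaussian_noise, the use of NumPy, the rounding and clipping back to the 0 to 255 range, and the flat grey placeholder image are all assumptions made for illustration. It only shows what adding zero-mean Gaussian noise of a given variance (for example the reported threshold of 100) to an 8-bit RGB image might look like.

```python
import numpy as np


def add_gaussian_noise(image, variance, rng=None):
    """Return a copy of an 8-bit image with zero-mean Gaussian noise added.

    `variance` is expressed on the 0 to 255 intensity scale, so a variance
    of 100 corresponds to a standard deviation of 10 intensity levels.
    (Hypothetical helper; the thesis does not specify its implementation.)
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = np.sqrt(variance)
    # Work in floating point so the noise is not truncated before clipping.
    noisy = image.astype(np.float64) + rng.normal(0.0, sigma, size=image.shape)
    # Round and clip back into the valid 8-bit range (an assumed convention).
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)


# Placeholder input: a flat mid-grey RGB image standing in for a rendered box.
placeholder = np.full((64, 64, 3), 128, dtype=np.uint8)

# Sweep the noise variance through values around the reported threshold of 100.
noisy_images = {
    variance: add_gaussian_noise(placeholder, variance, np.random.default_rng(0))
    for variance in (25, 100, 400)
}
```

In a robustness sweep of this kind, each noisy image would then be passed to both the corner detector used by MBI and the CNN, so that the variance at which each method begins to fail can be compared.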

Files and links (1)

MSc_Thesis.pdf

Details

Record Identifier: 9926478402601891
Title: Computer Vision for Measuring Boxes: a comparison of Model-based Bayesian Inference and Convolutional Neural Networks
Creators: Elliot Munro
Contributors: Tim Molteno (Advisor / Supervisor)
Academic Unit: Physics
Publisher: University of Otago
Degree Awarded: Master of Science - MSc
Project Type: Thesis - Masters
Awarding Institution: University of Otago
Date published ; e-published: 2021
Wikidata ID: Q112956191
Language: English
Resource Type ; Subtype: Graduate Thesis/Dissertation
Format: application/pdf