Riedl's Blog

Building an Object Detection Model with TFLite Model Maker

Published on 20 Aug 2022

I’ve been building a training pipeline for an object detection model using TFLite Model Maker. The code itself is relatively simple, but I ran into issues with the data format, GPU support, and package version conflicts. The first issue was the data format.

The docs say the DataLoader is compatible with various image formats, but when I used PNG images I got the following error:

ValueError: Image format not JPEG

After some searching, I came across this thread. The solution proposed there was simply to convert the images to JPEG. This was easy enough for me, since I already had a build script that stores all of the training data. I’m not entirely sure why the data setup I was using did not support the PNG format.
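For anyone without an existing build script, the check and conversion can be sketched roughly as follows. This is a minimal sketch under a few assumptions: the function names and the quality setting are made up for illustration, the directory is assumed to hold only image files, and the conversion step assumes the Pillow package is installed. The check looks only at the first three bytes of each file, which is the usual JPEG signature.

```python
from pathlib import Path

JPEG_MAGIC = b"\xff\xd8\xff"  # JPEG files start with these three bytes


def find_non_jpeg(image_dir):
    """Return the files in image_dir whose content does not start like a JPEG."""
    bad = []
    for path in sorted(Path(image_dir).iterdir()):
        if not path.is_file():
            continue
        with open(path, "rb") as f:
            if f.read(3) != JPEG_MAGIC:
                bad.append(path)
    return bad


def convert_to_jpeg(src, quality=95):
    """Write a JPEG copy of src next to it and return the new path.

    src is expected to have a non-.jpg suffix such as .png.
    """
    from PIL import Image  # Pillow, imported lazily so the check needs no extras

    dest = Path(src).with_suffix(".jpg")
    with Image.open(src) as img:
        # JPEG has no alpha channel, so drop it by converting to RGB
        img.convert("RGB").save(dest, "JPEG", quality=quality)
    return dest
```

Running find_non_jpeg on the image folder and passing each result to convert_to_jpeg would leave a .jpg copy beside every PNG. If the annotations reference image file names, they would need to point to the new .jpg files as well.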

After resolving the data formatting issue, the next problem was missing GPU support on my deep learning rig. When I ran the training script, I got the following warnings and training fell back to the CPU.

2022-08-20 20:19:46.000113: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-08-20 20:19:46.000145: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory
2022-08-20 20:19:46.000173: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublasLt.so.11'; dlerror: libcublasLt.so.11: cannot open shared object file: No such file or directory
2022-08-20 20:19:46.000198: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory
2022-08-20 20:19:46.000224: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcurand.so.10'; dlerror: libcurand.so.10: cannot open shared object file: No such file or directory
2022-08-20 20:19:46.000248: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory
2022-08-20 20:19:46.000273: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory
2022-08-20 20:19:46.000299: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
2022-08-20 20:19:46.000306: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...

I did some searching, and the obvious answer was that the CUDA backend was not properly installed. I confirmed this by importing TensorFlow on its own, which produced the similar warning below.

2022-08-20 20:23:52.731299: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
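The same check can be made without importing TensorFlow at all, by asking the dynamic loader to open each library named in the warnings. The snippet below is a rough sketch using only Python's standard ctypes module; the function name is hypothetical, and the library list is copied from the log output above, so it would need adjusting for a different CUDA version.

```python
import ctypes

# Library names copied from the warnings above
CUDA_LIBS = [
    "libcudart.so.11.0",
    "libcublas.so.11",
    "libcublasLt.so.11",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.11",
    "libcusparse.so.11",
    "libcudnn.so.8",
]


def missing_libraries(names):
    """Return the subset of names that the dynamic loader cannot open."""
    missing = []
    for name in names:
        try:
            ctypes.CDLL(name)  # raises OSError if the library cannot be loaded
        except OSError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_libraries(CUDA_LIBS):
        print("cannot load:", name)
```

An empty result suggests the loader can find every library in the list. Anything printed here should match the libraries TensorFlow complains about.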

The solution is to install CUDA before installing TFLite Model Maker. I took the steps below from the TensorFlow installation page.

conda create --name tf-lite python=3.8
conda activate tf-lite
pip install --upgrade pip
conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/
pip install --use-deprecated=legacy-resolver tflite-model-maker
pip install pycocotools
pip install hydra-core --upgrade
pip install tensorflow==2.8.0

You do not have to install hydra-core if you do not plan to use Hydra, but I use it in my pipeline.
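One thing to keep in mind is that the export line in the commands above only affects the current shell session, so a new terminal starts without it. Below is a small sketch for checking whether the conda environment's lib directory is on the loader path. The function name is made up for illustration, and it relies only on the CONDA_PREFIX and LD_LIBRARY_PATH variables that appear in those commands.

```python
import os


def conda_lib_on_loader_path(environ=os.environ):
    """True if $CONDA_PREFIX/lib is listed in $LD_LIBRARY_PATH."""
    prefix = environ.get("CONDA_PREFIX")
    if not prefix:
        return False  # no conda environment is active
    wanted = os.path.normpath(os.path.join(prefix, "lib"))
    entries = environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    # normpath strips the trailing slash used in the export line
    return any(os.path.normpath(e) == wanted for e in entries if e)


if __name__ == "__main__":
    print("conda lib dir on LD_LIBRARY_PATH:", conda_lib_on_loader_path())
```

If this prints False in a fresh terminal, re-running the export line after activating the environment should be enough.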

The final issue showed up when I tried to export the trained model. I got the following error during the export process:

...line 3015, in Pack
    mean = builder.EndVector()
TypeError: EndVector() missing 1 required positional argument: 'vectorNumElems'

I found this discussion while researching the error. The solution one user found was to update TFLite Model Maker to the nightly build. The root of the issue is a conflict with the flatbuffers package, and updating to the nightly build resolved the issue for me. The upgrade can be done with the command below.

pip install tflite-model-maker-nightly
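Since the problem comes down to package versions, it can help to print what is actually installed before and after the upgrade. Here is a minimal sketch using the standard importlib.metadata module (available from Python 3.8). The function name is hypothetical, and tflite-support is included only on the assumption that it is the dependency involved in the export step, so adjust the list to your environment.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    names = [
        "flatbuffers",
        "tensorflow",
        "tflite-model-maker",
        "tflite-model-maker-nightly",
        "tflite-support",  # assumption: adjust to the packages you care about
    ]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing the flatbuffers line between a working and a failing environment is a quick way to see whether the version conflict is still present.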

After resolving all these issues, I was able to successfully train and export an object detection model using TFLite Model Maker. Hope this helps out someone else who encounters similar problems.