Java developers can quickly implement image classification or object detection using pre-trained machine learning models.
Interest in machine learning has grown steadily over recent years. In particular, enterprises now use machine learning for image recognition in a wide variety of use cases. Applications span the automotive industry, healthcare, security, retail, automated product tracking in warehouses, farming and agriculture, food recognition, and even real-time translation of text you point your phone’s camera at. Thanks to machine learning and visual recognition, machines can help detect cancer and COVID-19 in MRIs and CT scans.
Today, many of these solutions are developed primarily in Python using open-source and proprietary ML toolkits, each with its own APIs. Despite Java’s popularity in enterprises, there is no standard for developing machine learning applications in Java. JSR-381 was developed to address this gap by offering Java application developers a set of standard, flexible, and Java-friendly APIs for Visual Recognition (VisRec) applications such as image classification and object detection.
JSR-381 has several implementations that rely on machine learning platforms such as TensorFlow, MXNet, and DeepNetts. One of these implementations is based on Deep Java Library (DJL), an open-source library developed by Amazon for building machine learning applications in Java. DJL offers hooks to popular machine learning frameworks such as TensorFlow, MXNet, and PyTorch, and it bundles the requisite image processing routines, making it a flexible and simple choice for JSR-381 users.
In this article, we demonstrate how Java developers can use the JSR-381 VisRec API to implement image classification or object detection with DJL’s pre-trained models in fewer than 10 lines of code. We also walk through two examples that show how to put pre-trained machine learning models to work in less than 10 minutes. Let’s get started!
Recognizing handwritten digits using a pre-trained model
A useful application, and the ‘hello world’ example of visual recognition, is recognizing handwritten digits. The task is seemingly easy for a human. Thanks to the processing capability and cooperation of the visual and pattern matching subsystems in our brains, we can usually discern the correct digit even in a sloppily handwritten document. However, this seemingly straightforward task is remarkably complex for a machine because of the many possible variations in how a digit can be written.
This is a good use case for machine learning, specifically visual recognition. The JSR-381 repository has a great example that uses the JSR-381 VisRec API to recognize handwritten digits. The example uses a model pre-trained on the MNIST handwritten digit dataset, a publicly available database of more than 60,000 images. Predicting what an image represents is called image classification. Our example looks at a new image and estimates, for each possible digit, the probability that the image shows that digit.
For this task, the VisRec API provides an ImageClassifier interface whose generic parameter specifies the Java class used for input images. The interface declares a classify() method that performs image classification and returns a Map of class probabilities for all possible image classes. By convention in the VisRec API, each model provides a static builder() method that returns a corresponding builder object, which lets the developer configure all relevant settings, such as imageHeight and imageWidth.
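To make the shape of this design concrete, here is a minimal, self-contained sketch. It does not use the JSR-381 classes: SketchImageClassifier, SketchDigitClassifier, its Builder, and the mostLikely() helper are hypothetical stand-ins written for illustration, and the classifier returns placeholder scores instead of running a model. It only mirrors the pattern described above: a classifier interface with a generic input type, a static builder() with imageHeight and imageWidth settings, and a classify() method that returns a Map of class probabilities.

```java
import java.awt.image.BufferedImage;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative stand-ins only: these types are NOT the JSR-381 classes.
final class VisRecShapeSketch {

    /** Hypothetical interface: the generic parameter T is the input image class. */
    interface SketchImageClassifier<T> {
        Map<String, Float> classify(T image);
    }

    /** Hypothetical model exposing a static builder(), following the convention above. */
    static final class SketchDigitClassifier implements SketchImageClassifier<BufferedImage> {
        private final int imageHeight;
        private final int imageWidth;

        private SketchDigitClassifier(int imageHeight, int imageWidth) {
            this.imageHeight = imageHeight;
            this.imageWidth = imageWidth;
        }

        static Builder builder() {
            return new Builder();
        }

        @Override
        public Map<String, Float> classify(BufferedImage image) {
            if (image.getHeight() != imageHeight || image.getWidth() != imageWidth) {
                throw new IllegalArgumentException(
                        "expected a " + imageWidth + "x" + imageHeight + " image");
            }
            // Placeholder scores: a real implementation would run a pre-trained model here.
            Map<String, Float> probabilities = new LinkedHashMap<>();
            for (int digit = 0; digit <= 9; digit++) {
                probabilities.put(String.valueOf(digit), 0.1f);
            }
            return probabilities;
        }

        /** Fluent builder collecting the settings before the classifier is created. */
        static final class Builder {
            private int imageHeight = 28; // MNIST images are 28x28 pixels
            private int imageWidth = 28;

            Builder imageHeight(int height) {
                this.imageHeight = height;
                return this;
            }

            Builder imageWidth(int width) {
                this.imageWidth = width;
                return this;
            }

            SketchDigitClassifier build() {
                return new SketchDigitClassifier(imageHeight, imageWidth);
            }
        }
    }

    /** Returns the label with the highest probability, or null for an empty map. */
    static String mostLikely(Map<String, Float> probabilities) {
        String best = null;
        float bestScore = Float.NEGATIVE_INFINITY;
        for (Map.Entry<String, Float> entry : probabilities.entrySet()) {
            if (entry.getValue() > bestScore) {
                bestScore = entry.getValue();
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        SketchImageClassifier<BufferedImage> classifier = SketchDigitClassifier.builder()
                .imageHeight(28)
                .imageWidth(28)
                .build();
        BufferedImage blank = new BufferedImage(28, 28, BufferedImage.TYPE_BYTE_GRAY);
        Map<String, Float> result = classifier.classify(blank);
        System.out.println(result.size() + " classes, top label: " + mostLikely(result));
    }
}
```

Because classify() returns every class with its probability rather than a single answer, the caller decides how to interpret the result. The mostLikely() helper picks the top label, but an application could just as well apply a confidence threshold or show the top few candidates.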
By using pre-trained models and simple configuration, Java developers can quickly implement sophisticated machine learning solutions for image processing. This approach significantly lowers the barrier to entry for incorporating machine learning into Java applications, making powerful tools accessible for a wide range of use cases.