MatchError in VisionEncoderDecoder.getModelOutput due to missing ONNX case #14546


Closed

krumpi36 opened this issue Apr 1, 2025 · 2 comments
krumpi36 commented Apr 1, 2025

Is there an existing issue for this?

  • I have searched the existing issues and did not find a match.

Who can help?

No response

What are you working on?

I am using the VisionEncoderDecoderForImageCaptioning annotator in my Java application. The issue surfaced after upgrading to a recent version of Spark NLP and attempting to use the newer image captioning models.

I am trying to use the model image_captioning_vit_gpt2_en. This model utilizes the ONNX backend (encoder_model.onnx, decoder_model.onnx) instead of the previous TensorFlow format (vision_encoder_decoder_tensorflow). The error occurs during the model loading/initialization phase when the ONNX engine is detected and used.

Current Behavior

When initializing a VisionEncoderDecoderForImageCaptioning model that relies on the ONNX engine, the application crashes with a scala.MatchError. The error message indicates that the string "onnx" was not handled by a match statement in the com.johnsnowlabs.ml.ai.VisionEncoderDecoder.getModelOutput method.

The specific error is:


java.lang.ExceptionInInitializerError
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:481)
	at java.base/java.util.Optional.orElseGet(Optional.java:364)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
	...
	at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
Caused by: java.lang.IllegalStateException: Unable to load model
	at cz.xxx.xxx.sparknlp.AbstractSparkAnalyzer.load(AbstractSparkAnalyzer.java:378)
	at cz.xxx.xxx.sparknlp.AbstractSparkAnalyzer.<init>(AbstractSparkAnalyzer.java:81)
	at cz.xxx.xxx.sparknlp.ImageAnalyzer.<init>(ImageAnalyzer.java:78)
	at cz.xxx.xxx.sparknlp.image.ImageCaptionGeneration.<init>(ImageCaptionGeneration.java:40)
	at cz.xxx.xxx.sparknlp.images.ImageCaptionGenerationTest.<clinit>(ImageCaptionGenerationTest.java:52)
	... 60 more
Caused by: scala.MatchError: onnx (of class java.lang.String)
	at com.johnsnowlabs.ml.ai.VisionEncoderDecoder.getModelOutput(VisionEncoderDecoder.scala:428)
	at com.johnsnowlabs.ml.ai.util.Generation.Generate.$anonfun$beamSearch$7(Generate.scala:228)
	at scala.util.control.Breaks.breakable(Breaks.scala:42)
	at com.johnsnowlabs.ml.ai.util.Generation.Generate.beamSearch(Generate.scala:216)
	at com.johnsnowlabs.ml.ai.util.Generation.Generate.beamSearch$(Generate.scala:184)
	at com.johnsnowlabs.ml.ai.VisionEncoderDecoder.beamSearch(VisionEncoderDecoder.scala:41)
	at com.johnsnowlabs.ml.ai.util.Generation.Generate.generate(Generate.scala:153)
	at com.johnsnowlabs.ml.ai.util.Generation.Generate.generate$(Generate.scala:85)
	at com.johnsnowlabs.ml.ai.VisionEncoderDecoder.generate(VisionEncoderDecoder.scala:41)
	at com.johnsnowlabs.ml.ai.VisionEncoderDecoder.$anonfun$generateFromImage$1(VisionEncoderDecoder.scala:321)
	at com.johnsnowlabs.ml.ai.VisionEncoderDecoder.$anonfun$generateFromImage$1$adapted(VisionEncoderDecoder.scala:271)
	at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
	at scala.collection.Iterator.toStream(Iterator.scala:1417)
	at scala.collection.Iterator.toStream$(Iterator.scala:1416)
	at scala.collection.AbstractIterator.toStream(Iterator.scala:1431)
	at scala.collection.TraversableOnce.toSeq(TraversableOnce.scala:354)
	at scala.collection.TraversableOnce.toSeq$(TraversableOnce.scala:354)
	at scala.collection.AbstractIterator.toSeq(Iterator.scala:1431)
	at com.johnsnowlabs.ml.ai.VisionEncoderDecoder.generateFromImage(VisionEncoderDecoder.scala:271)
	at com.johnsnowlabs.ml.ai.VisionEncoderDecoder.sessionWarmup(VisionEncoderDecoder.scala:90)
	at com.johnsnowlabs.ml.ai.VisionEncoderDecoder.<init>(VisionEncoderDecoder.scala:93)
	at com.johnsnowlabs.nlp.annotators.cv.VisionEncoderDecoderForImageCaptioning.setModelIfNotSet(VisionEncoderDecoderForImageCaptioning.scala:270)
	at com.johnsnowlabs.nlp.annotators.cv.ReadVisionEncoderDecoderDLModel.readModel(VisionEncoderDecoderForImageCaptioning.scala:461)
	at com.johnsnowlabs.nlp.annotators.cv.ReadVisionEncoderDecoderDLModel.readModel$(VisionEncoderDecoderForImageCaptioning.scala:427)
	at com.johnsnowlabs.nlp.annotators.cv.VisionEncoderDecoderForImageCaptioning$.readModel(VisionEncoderDecoderForImageCaptioning.scala:649)
	at com.johnsnowlabs.nlp.annotators.cv.ReadVisionEncoderDecoderDLModel.$anonfun$$init$$1(VisionEncoderDecoderForImageCaptioning.scala:479)
	at com.johnsnowlabs.nlp.annotators.cv.ReadVisionEncoderDecoderDLModel.$anonfun$$init$$1$adapted(VisionEncoderDecoderForImageCaptioning.scala:479)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable.$anonfun$onRead$1(ParamsAndFeaturesReadable.scala:50)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable.$anonfun$onRead$1$adapted(ParamsAndFeaturesReadable.scala:49)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable.onRead(ParamsAndFeaturesReadable.scala:49)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable.$anonfun$read$1(ParamsAndFeaturesReadable.scala:61)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable.$anonfun$read$1$adapted(ParamsAndFeaturesReadable.scala:61)
	at com.johnsnowlabs.nlp.FeaturesReader.load(ParamsAndFeaturesReadable.scala:38)
	at com.johnsnowlabs.nlp.FeaturesReader.load(ParamsAndFeaturesReadable.scala:24)
	at com.johnsnowlabs.nlp.pretrained.ResourceDownloader$.downloadModel(ResourceDownloader.scala:508)
	at com.johnsnowlabs.nlp.pretrained.ResourceDownloader$.downloadModel(ResourceDownloader.scala:500)
	at com.johnsnowlabs.nlp.HasPretrained.pretrained(HasPretrained.scala:44)
	at com.johnsnowlabs.nlp.HasPretrained.pretrained$(HasPretrained.scala:41)
	at com.johnsnowlabs.nlp.annotators.cv.VisionEncoderDecoderForImageCaptioning$.com$johnsnowlabs$nlp$annotators$cv$ReadablePretrainedVisionEncoderDecoderModel$$super$pretrained(VisionEncoderDecoderForImageCaptioning.scala:649)
	at com.johnsnowlabs.nlp.annotators.cv.ReadablePretrainedVisionEncoderDecoderModel.pretrained(VisionEncoderDecoderForImageCaptioning.scala:413)
	at com.johnsnowlabs.nlp.annotators.cv.ReadablePretrainedVisionEncoderDecoderModel.pretrained$(VisionEncoderDecoderForImageCaptioning.scala:409)
	at com.johnsnowlabs.nlp.annotators.cv.VisionEncoderDecoderForImageCaptioning$.pretrained(VisionEncoderDecoderForImageCaptioning.scala:649)
	at com.johnsnowlabs.nlp.annotators.cv.VisionEncoderDecoderForImageCaptioning$.pretrained(VisionEncoderDecoderForImageCaptioning.scala:649)
	at com.johnsnowlabs.nlp.HasPretrained.pretrained(HasPretrained.scala:52)
	at com.johnsnowlabs.nlp.HasPretrained.pretrained$(HasPretrained.scala:51)
	at com.johnsnowlabs.nlp.annotators.cv.VisionEncoderDecoderForImageCaptioning$.com$johnsnowlabs$nlp$annotators$cv$ReadablePretrainedVisionEncoderDecoderModel$$super$pretrained(VisionEncoderDecoderForImageCaptioning.scala:649)
	at com.johnsnowlabs.nlp.annotators.cv.ReadablePretrainedVisionEncoderDecoderModel.pretrained(VisionEncoderDecoderForImageCaptioning.scala:401)
	at com.johnsnowlabs.nlp.annotators.cv.ReadablePretrainedVisionEncoderDecoderModel.pretrained$(VisionEncoderDecoderForImageCaptioning.scala:401)
	at com.johnsnowlabs.nlp.annotators.cv.VisionEncoderDecoderForImageCaptioning$.pretrained(VisionEncoderDecoderForImageCaptioning.scala:649)
	at com.johnsnowlabs.nlp.annotators.cv.VisionEncoderDecoderForImageCaptioning.pretrained(VisionEncoderDecoderForImageCaptioning.scala)
	at cz.xxx.xxx.sparknlp.image.ImageCaptionGeneration.initPipelineStages(ImageCaptionGeneration.java:63)
	at cz.xxx.xxx.sparknlp.AbstractSparkAnalyzer.train(AbstractSparkAnalyzer.java:419)
	at cz.xxx.xxx.sparknlp.AbstractSparkAnalyzer.trainAndSaveModel(AbstractSparkAnalyzer.java:392)
	at cz.xxx.xxx.sparknlp.AbstractSparkAnalyzer.load(AbstractSparkAnalyzer.java:367)
	... 64 more
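The failure mode can be illustrated outside Spark NLP. The following is a minimal, self-contained Java sketch (hypothetical names, for illustration only, not Spark NLP source) of a dispatch on the detected engine string that, like the non-exhaustive Scala match above, has no branch for "onnx":

```java
// Stand-alone illustration only -- not Spark NLP source. A dispatch on the
// detected engine string without a branch for "onnx" fails at runtime,
// analogous to the non-exhaustive Scala match that throws scala.MatchError.
public class EngineDispatchDemo {
    static String getModelOutput(String detectedEngine) {
        switch (detectedEngine) {
            case "tensorflow": return "tensorflow-decoder-output";
            case "openvino":   return "openvino-decoder-output";
            default:
                // Scala's match with no matching case throws scala.MatchError;
                // modeled here with an unchecked exception.
                throw new IllegalStateException(
                        "scala.MatchError: " + detectedEngine + " (of class java.lang.String)");
        }
    }

    public static void main(String[] args) {
        System.out.println(getModelOutput("tensorflow")); // handled engine: works
        try {
            getModelOutput("onnx"); // unhandled engine: throws, mirroring the crash
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```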

Expected Behavior

The VisionEncoderDecoderForImageCaptioning model using the ONNX backend (like image_captioning_vit_gpt2_en) should load and perform inference without crashing. The internal com.johnsnowlabs.ml.ai.VisionEncoderDecoder.getModelOutput method should correctly handle the case where detectedEngine is "onnx".

Steps To Reproduce

The issue occurs simply by attempting to load an ONNX-based VisionEncoderDecoderForImageCaptioning model using .pretrained(). The error happens during this call because the internal code is missing logic to handle the ONNX engine.

Minimal code example demonstrating the failure:

import com.johnsnowlabs.nlp.annotator._

val imageClassifier = VisionEncoderDecoderForImageCaptioning
  .pretrained()

Spark NLP version and Apache Spark

Spark NLP Version: 5.5.3 (Scala 2.12 artifact)
Apache Spark Version: 3.5.5

Type of Spark Application

Java Application

Java Version

openjdk 17.0.12 2024-07-16 OpenJDK Runtime Environment Temurin-17.0.12+7 (build 17.0.12+7) OpenJDK 64-Bit Server VM Temurin-17.0.12+7 (build 17.0.12+7, mixed mode, sharing)

Java Home Directory

C:\Progs\Java\jdk-17.0.12+7\

Setup and installation

Maven

Operating System and Version

Windows 11 Pro 24H2 26100.3624

Link to your project (if available)

No response

Additional Information

This issue seems directly linked to the transition of certain models (like image_captioning_vit_gpt2_en) from TensorFlow to ONNX backends. The existing code in VisionEncoderDecoder was not prepared to handle the "onnx" engine type in the overridden getModelOutput method.

The likely fix is to replace the duplicated Openvino.name case with the missing ONNX.name case in the match statement, dispatching to the overloaded getModelOutput method in the same way as the TensorFlow case.

Suggested code change (conceptual):

  // In com.johnsnowlabs.ml.ai.VisionEncoderDecoder
  override def getModelOutput(/*...*/): Array[Array[Float]] = {
    detectedEngine match {
      case Openvino.name =>
        getDecoderOutputsOv(decoderInputIds, ovInferRequest.get)

      // Previously a duplicated Openvino.name case; replace it with ONNX
      case ONNX.name =>
        getModelOutput(decoderInputIds, decoderEncoderStateTensors, session, ovInferRequest)

      case TensorFlow.name =>
        getModelOutput(decoderInputIds, decoderEncoderStateTensors, session, ovInferRequest)
    }
  }
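As a sanity check, here is a self-contained Java sketch of the corrected dispatch (hypothetical names; the real Scala method takes tensors and sessions, not strings). With the "onnx" branch in place, every supported engine string resolves instead of falling through:

```java
// Stand-alone sketch of the corrected dispatch -- hypothetical names only,
// not Spark NLP source. With the "onnx" branch added, every supported
// engine string resolves instead of raising a MatchError-style failure.
public class FixedDispatchSketch {
    static String getModelOutput(String detectedEngine) {
        switch (detectedEngine) {
            case "openvino":   return "openvino-decoder-output";
            case "onnx":       return "onnx-decoder-output"; // previously missing branch
            case "tensorflow": return "tensorflow-decoder-output";
            default:
                throw new IllegalArgumentException("Unknown engine: " + detectedEngine);
        }
    }

    public static void main(String[] args) {
        for (String engine : new String[] {"openvino", "onnx", "tensorflow"}) {
            System.out.println(engine + " -> " + getModelOutput(engine));
        }
    }
}
```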
@DevinTDHa (Member)

Hi @krumpi36, thanks a lot for reporting in this much detail.

You pinpointed the exact issue and we are taking a look.

@ahmedlone127 (Contributor)

Hey @krumpi36, the error has been fixed.
