Transcription API File Extension Issue #3557

@juandaco

Bug description
Spring AI overrides the file name with audio.webm when calling the OpenAI Transcription API. See code in here. The newer OpenAI transcription models, gpt-4o-transcribe and gpt-4o-mini-transcribe, do not accept an *.mp3 file sent under a *.webm file name: they fail with an "unsupported" error, whereas whisper-1 accepts the request.

Environment
Java 21, Spring Boot 3.5.0, Spring AI 1.0 GA

Steps to reproduce
This problem can be reproduced with curl against the OpenAI Transcription API.

  1. Take an MP3 file that contains speech.
  2. Rename the file to audio.webm.
  3. Send a request with curl, providing the API key and making sure the selected model is gpt-4o-transcribe or gpt-4o-mini-transcribe (not whisper-1).
  4. Observe the returned error.

The file passed to curl must be the MP3 file renamed to audio.webm:

curl https://api.openai.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F file="@/path/to/file/audio.webm" \
  -F model="gpt-4o-transcribe"
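
For anyone who prefers to reproduce this from Java without Spring AI, the curl request above can be approximated with the JDK's built-in HTTP client. This is only a sketch: the class name `TranscriptionRepro`, the helper `multipartBody`, the boundary string, and the `application/octet-stream` part type are assumptions made for illustration, not anything taken from Spring AI. Passing `audio.webm` as the second argument should mimic the failing case; omitting it sends the file's real name.

```java
import java.io.ByteArrayOutputStream;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public final class TranscriptionRepro {

    // Hypothetical helper: builds a multipart/form-data body with a "model" field
    // and a "file" part that carries the given file name.
    static byte[] multipartBody(String boundary, String model, String filename, byte[] audio) {
        String head = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"model\"\r\n\r\n"
                + model + "\r\n"
                + "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"file\"; filename=\"" + filename + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n\r\n"; // assumed part type
        byte[] headBytes = head.getBytes(StandardCharsets.UTF_8);
        byte[] tailBytes = ("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(headBytes, 0, headBytes.length);
        out.write(audio, 0, audio.length);
        out.write(tailBytes, 0, tailBytes.length);
        return out.toByteArray();
    }

    // Usage: java TranscriptionRepro.java /path/to/speech.mp3 [fileNameToSend]
    public static void main(String[] args) throws Exception {
        Path mp3 = Path.of(args[0]);
        // Default to the real file name; pass "audio.webm" to mimic the reported behavior.
        String filename = args.length > 1 ? args[1] : mp3.getFileName().toString();
        String boundary = "----repro" + System.nanoTime();

        byte[] body = multipartBody(boundary, "gpt-4o-transcribe", filename, Files.readAllBytes(mp3));
        HttpRequest request = HttpRequest.newBuilder(URI.create("https://api.openai.com/v1/audio/transcriptions"))
                .header("Authorization", "Bearer " + System.getenv("OPENAI_API_KEY"))
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();

        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Running it twice against the same MP3, once with and once without the `audio.webm` argument, isolates the file name as the only difference between the two requests.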

Expected behavior
Spring AI should not silently change the file name, which causes the request to fail. It should send the original file name, or at least the correct file extension.
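
One way the expected behavior could look, as a minimal sketch: keep the caller's file name whenever it carries an extension and fall back to a default only when it does not. The class `AudioFilenames` and its `resolve` method are hypothetical names chosen for this illustration; this is not Spring AI's actual code or a proposed patch.

```java
public final class AudioFilenames {

    private AudioFilenames() {
    }

    /**
     * Returns originalName when it is non-blank and has a non-empty extension;
     * otherwise returns fallback (for example "audio.webm").
     */
    public static String resolve(String originalName, String fallback) {
        if (originalName == null || originalName.isBlank()) {
            return fallback;
        }
        int dot = originalName.lastIndexOf('.');
        // Require at least one character before and after the last dot.
        boolean hasExtension = dot > 0 && dot < originalName.length() - 1;
        return hasExtension ? originalName : fallback;
    }
}
```

With this rule, `resolve("speech.mp3", "audio.webm")` keeps `speech.mp3`, while a missing or extension-less name still falls back to `audio.webm`.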
