Description
Even though only OpenAI and Azure OpenAI currently offer transcription support, they are each represented by distinct transcription model types: `OpenAiAudioTranscriptionModel` and `AzureOpenAiAudioTranscriptionModel`. There is no common interface between the two, which means you must explicitly inject whichever one you are working with rather than injecting a common interface. They both implement `Model<AudioTranscriptionPrompt, AudioTranscriptionResponse>`, but it would be handier to have a common interface like this that they each could implement:
```java
public interface TranscriptionModel extends Model<AudioTranscriptionPrompt, AudioTranscriptionResponse> {
    AudioTranscriptionResponse call(AudioTranscriptionPrompt transcriptionPrompt);
}
```
Similar to `ChatModel`, there may be an opportunity for some additional default convenience methods as well: perhaps one that accepts a `Resource` and returns a `String`, and another that accepts a `Resource` and `AudioTranscriptionOptions` and produces a `String`.
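To illustrate what those default convenience methods might look like, here is a minimal, self-contained sketch. Note that `AudioTranscriptionPrompt`, `AudioTranscriptionResponse`, `AudioTranscriptionOptions`, and `Resource` are stubbed out below as simple records purely for illustration — the real Spring AI and Spring Framework types have richer APIs, and the method names `transcribe(...)` are hypothetical suggestions, not existing API:

```java
public class TranscriptionSketch {

    // --- Minimal stand-ins for the real Spring AI / Spring types (illustration only) ---
    record Resource(String description) {}
    record AudioTranscriptionOptions(String language) {}
    record AudioTranscriptionPrompt(Resource audio, AudioTranscriptionOptions options) {}
    record AudioTranscriptionResponse(String text) {}

    // Proposed common interface with hypothetical default convenience methods.
    interface TranscriptionModel {
        AudioTranscriptionResponse call(AudioTranscriptionPrompt transcriptionPrompt);

        // Convenience: Resource -> String
        default String transcribe(Resource audio) {
            return transcribe(audio, null);
        }

        // Convenience: Resource + options -> String
        default String transcribe(Resource audio, AudioTranscriptionOptions options) {
            return call(new AudioTranscriptionPrompt(audio, options)).text();
        }
    }

    public static void main(String[] args) {
        // A fake model (lambda on the single abstract method) that echoes the resource name.
        TranscriptionModel model = prompt ->
                new AudioTranscriptionResponse("transcript of " + prompt.audio().description());

        // Callers use the short form instead of building a prompt by hand.
        System.out.println(model.transcribe(new Resource("speech.wav")));
        // → transcript of speech.wav
    }
}
```

Because the interface is functional over `call(...)`, both existing model classes could adopt it without any behavioral change, and callers would gain the shorter `transcribe(...)` forms for free.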