Support for specialized transcript notation styles (e.g. GAT 2 and/or Jefferson) #2732
zackbatist
started this conversation in
Ideas
Replies: 2 comments 2 replies
-
I just found out about GailBot, a project in progress meant to facilitate the generation of Jefferson-style annotated transcripts. It's not open source, but I submitted a license request.
-
How did it work out? Any news on GAT 2, or were you able to get a license for GailBot? A friend (a social science researcher) had an injury and so needs a quite similar solution.
-
First off, just want to express my gratitude to the developers and the community for supporting this project. It's made my work a lot easier. That being said, I think one way it could be even better is by implementing support for detailed transcript notation styles such as GAT 2 or the Jefferson system. These systems are really important in research involving transcribed interviews, especially conversation analysis, and I imagine some aspects of this can be easily automated using LLMs.
Here's a basic summary of some notations from the GAT 2 system that I frequently use in my work. I also included additional examples and information about implementation at the end of this post.
I know that not all of these things can be reliably covered (non-verbal vocal actions such as coughing or laughter), but I imagine some can be, including calculating the duration of pauses, detecting loud or shouted speech, or detecting speaker overlap.
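To make the pause part concrete, here is a minimal sketch. It assumes word-level timestamps are already available as `(text, start, end)` tuples in seconds; that input format, the function name `annotate_pauses`, and both thresholds are my own assumptions for illustration, not output from any particular tool. Very short gaps become a micropause `(.)`, longer ones become a measured pause such as `(0.5)`.

```python
def annotate_pauses(words, micro=0.2, min_gap=0.1):
    """Insert pause markers between words.

    words   : list of (text, start, end) tuples, in spoken order (assumed format)
    micro   : gaps of at least this many seconds are written as measured pauses
    min_gap : gaps shorter than this are ignored entirely (assumed threshold)
    """
    out = []
    prev_end = None
    for text, start, end in words:
        if prev_end is not None:
            gap = start - prev_end
            if gap >= micro:
                # Measured pause, rounded to a tenth of a second
                out.append(f"({gap:.1f})")
            elif gap >= min_gap:
                # Micropause
                out.append("(.)")
        out.append(text)
        prev_end = end
    return " ".join(out)


words = [("so", 0.0, 0.5), ("I", 1.0, 1.25), ("went", 1.4, 1.75), ("home", 1.75, 2.0)]
print(annotate_pauses(words))
```

The thresholds would need tuning against real recordings, since timestamp accuracy varies a lot between tools.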
I did a fairly comprehensive search for resources on using LLMs to support this kind of annotation, but it has turned up nothing so far. So aside from simply serving as a feature request, maybe others can chime in with strategies they have used to modify outputs to add details like the ones I describe here.
Overlaps and simultaneous speech
Opening square brackets are inserted at exactly the point in speaking where the overlap starts, and closing square brackets where it ends. In both Jefferson and GAT, the respective brackets are aligned with each other within the text. Note that the exact alignment is difficult to represent in markdown, so it appears unaligned here.
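A rough sketch of how the bracket placement might be automated, assuming diarized word-level timestamps for two turns. The `Word` class and `bracket_overlap` function are hypothetical names of my own. As a simplification, the other speaker's turn is treated as one continuous span, and the visual alignment of brackets across lines is not handled.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def bracket_overlap(turn, other):
    """Return the text of `turn` with [ ] around words that overlap `other` in time."""
    if not other:
        return " ".join(w.text for w in turn)
    # Simplification: treat the other turn as one span from first to last word
    lo = min(w.start for w in other)
    hi = max(w.end for w in other)
    out = []
    inside = False
    for w in turn:
        overlaps = w.start < hi and w.end > lo
        if overlaps and not inside:
            out.append("[" + w.text)   # overlap starts at this word
            inside = True
        elif not overlaps and inside:
            out[-1] += "]"             # overlap ended with the previous word
            out.append(w.text)
            inside = False
        else:
            out.append(w.text)
    if inside:
        out[-1] += "]"                 # overlap runs to the end of the turn
    return " ".join(out)


a = [Word("I", 0.0, 0.2), Word("think", 0.2, 0.5), Word("so", 0.5, 0.8), Word("too", 0.8, 1.1)]
b = [Word("yeah", 0.6, 0.9), Word("right", 1.1, 1.4)]
print("A:", bracket_overlap(a, b))
print("B:", bracket_overlap(b, a))
```

This only brackets whole words, whereas the conventions place brackets at the exact point in speaking, which can fall mid-word.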
Laughter
With "ha-ha laughter", the laughter syllables are transcribed phonetically, along with their approximate number, e.g. HA HA HA HA. Overlaid laughter is represented through annotation conventions such as curly brackets (as in the following example).
Non-verbal vocal actions and events
Non-verbal vocal actions and events are denoted with double round brackets, (( )). If the non-verbal action cannot be attributed to any one speaker, the notation is entered as a new line in the transcript with its own timestamp.
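A small sketch of that layout rule, assuming the events have already been detected somehow (the detection itself is the hard part and is not shown). The tuple formats, the `[mm:ss.s]` timestamp style, and the function names `stamp` and `render` are assumptions of mine. For simplicity, events that do have a speaker are also placed on their own line, prefixed with that speaker.

```python
def stamp(seconds):
    """Format seconds as [mm:ss.s] (an assumed timestamp style)."""
    tenths = round(seconds * 10)
    minutes, rem = divmod(tenths, 600)
    return f"[{minutes:02d}:{rem / 10:04.1f}]"


def render(utterances, events):
    """Merge utterances and non-verbal events into one time-ordered transcript.

    utterances : list of (time, speaker, text)
    events     : list of (time, speaker_or_None, description)
    """
    rows = [(t, f"{stamp(t)} {spk}: {text}") for t, spk, text in utterances]
    for t, spk, desc in events:
        # Unattributed events get a line with a timestamp but no speaker label
        who = f"{spk}: " if spk else ""
        rows.append((t, f"{stamp(t)} {who}(({desc}))"))
    return "\n".join(line for _, line in sorted(rows, key=lambda r: r[0]))


utterances = [(0.0, "A", "hello"), (4.0, "B", "hi")]
events = [(2.5, None, "door slams"), (5.0, "B", "coughs")]
print(render(utterances, events))
```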
Intelligibility
Unintelligible or unclear speech is denoted with "unclear" placed within round brackets, (unclear). GAT 2 has suggestions for marking uncertainties or alternatives in speech; however, adding in assumptions may lead to bias.
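One possible way to approximate this, sketched under the assumption that the transcription step exposes some per-word confidence score between 0 and 1. The `(text, confidence)` input format, the `mask_unclear` name, and the 0.5 threshold are all hypothetical. Low-confidence words are replaced rather than guessed at, and runs of them collapse into a single marker.

```python
def mask_unclear(words, threshold=0.5):
    """Replace low-confidence words with a single (unclear) marker.

    words     : list of (text, confidence) pairs, confidence in [0, 1] (assumed format)
    threshold : words below this confidence are treated as unclear (assumed value)
    """
    out = []
    for text, confidence in words:
        if confidence < threshold:
            # Collapse consecutive unclear words into one marker
            if not out or out[-1] != "(unclear)":
                out.append("(unclear)")
        else:
            out.append(text)
    return " ".join(out)


words = [("we", 0.95), ("went", 0.9), ("tu", 0.2), ("thuh", 0.3), ("store", 0.88)]
print(mask_unclear(words))
```

Low model confidence is not the same thing as a human transcriber judging a passage unintelligible, so the output would still need manual review.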