Note about purely decorative "video only" / "audio only" content in 1.2.1 #4398


Open
patrickhlauke opened this issue May 16, 2025 · 8 comments

Comments

@patrickhlauke
Member

For https://www.w3.org/WAI/WCAG22/Understanding/audio-only-and-video-only-prerecorded.html: if a page uses a decorative video (like an animated background) or audio (some muzak, a drone sound, etc.) only to set the mood, it would likely be exempt from needing a media alternative, as it conveys no "information" that would require "equivalent information" to be delivered.

Thoughts? Happy to do a PR if we have agreement...

@TestPartners

Sounds good to me.

@bruce-usab
Contributor

Flagged on backlog call 5/16 and group agreed that this would be a helpful addition.

@JulietteZenyth

So this is something we come across fairly often in the e-commerce world, usually in the form of some kind of looping background video. Here are some of the things we've had to consider:

  • If it plays automatically for more than 5 seconds (video) or 3 seconds (audio), you'll need a control to pause/stop it. How do you name that control in a logical way for people who can't see or hear the autoplaying content? "Pause video" or "Pause audio" may lead users who can't perceive the content to think they've missed something important.
  • If it's not automatically playing, you'll need to provide controls to play the content. Same as the first point, how do you name those controls?

You can't just hide the controls from AT. Plenty of sighted keyboard users will need/want to stop the looping video, so hiding the control via tabindex=-1 and aria-hidden would deprive them of access to it.

So, despite it adding a bit of extra content to parse, in these specific cases we encourage labeling the decorative content as decorative, something as simple as "Decorative looping background video". Then, for the control to pause/stop the looping content we name it something like "Pause decorative video".

This feels like it strikes the middle ground between providing some context about what the controls are for without creating too much additional cognitive load or navigation.
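
As a rough illustration of the labeling approach described above, here is a minimal TypeScript sketch that builds the markup as a string. The helper names (`escapeAttr`, `decorativeVideoMarkup`), the use of `aria-label` on the `<video>` element, and the exact attribute set are assumptions made for illustration; the comment itself only specifies the two labels.

```typescript
// Hypothetical helper: escapes a value for use inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: builds markup for a looping decorative background video
// plus a pause control that stays reachable by keyboard and assistive technology.
// Exposing the label via aria-label is an assumption; the labels are from the comment.
function decorativeVideoMarkup(src: string): string {
  const videoLabel = "Decorative looping background video";
  const pauseLabel = "Pause decorative video";
  return [
    `<video src="${escapeAttr(src)}" aria-label="${videoLabel}" autoplay muted loop playsinline></video>`,
    // A plain <button> is focusable by default, so the control is not hidden
    // from sighted keyboard users or from screen readers.
    `<button type="button">${pauseLabel}</button>`,
  ].join("\n");
}
```

The sketch only produces static markup; wiring the button to actually pause the video is left out.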

@mbgower
Contributor

mbgower commented May 22, 2025

First, to state the obvious, there is not a decorative exception in 1.2.1 like this one for 1.1.1:

If non-text content is pure decoration, is used only for visual formatting, or is not presented to users, then it is implemented in a way that it can be ignored by assistive technology.

We have only two things to work with in 1.2.1, a media alternative and an "equivalent" wording. In the case of a media alternative exception, it still needs a label:

except when the audio or video is a media alternative for text and is clearly labeled as such

In the case of "presents equivalent information", even if we could agree that a 'purely decorative' pass exists for time-based media, I'm not aware of a technical equivalent to render it so it can be ignored by AT. And obviously there are considerations of distraction with time-based media that make it more important to name/contextualize the experience, as pointed out by @JulietteZenyth. It's possible to have a generic "pause" button, but it gets odd, especially where there might be multiple videos on the page.

The relevant failure technique also suggests that even where the video-only or audio-only file is trivial, it still needs a name:
F30: Failure of Success Criterion 1.1.1 and 1.2.1 due to using text alternatives that are not alternatives (e.g., filenames or placeholder text)

So, I do not believe we are comparing apples and apples. Do we have examples of the kind of problem we're trying to solve with this?

@patrickhlauke
Member Author

patrickhlauke commented May 22, 2025

In the case of "presents equivalent information", even if we could agree that a 'purely decorative' pass exists for time-based media, I'm not aware of a technical equivalent to render it so it can be ignored by AT.

to be clear, I'm not saying "it can be ignored by AT" or somehow trying to just hide it away completely. I am saying: "is it exempt from requiring a transcript or similar long form alternative, when its content is not conveying information and is just there for setting the mood". nothing more, nothing less.

I'm also not saying that it doesn't need a short label or name per 1.1.1. sure, give it a short generic label. again, I'm asking more about "does it need a full transcript?"

as for examples: a site that uses a background animation of geometric shapes softly moving/animating, or a set of stock-photo-type images (smiling business people shaking hands, a diverse set of people sitting in a meeting room listening intently to a presentation by somebody in front of a whiteboard) that gently transition and fade into each other. beyond a short label to identify it, neither needs a lengthy transcript of what happens in it (in David Attenborough voice: "and now the small green triangle dances majestically to the right side of the screen"). for audio, a small bit of mood music (a bit of ambient drones, a few notes on an instrument every now and again drenched in washy reverb).

@NickBromley

First, to state the obvious, there is not a decorative exception in 1.2.1 like this one for 1.1.1:

If non-text content is pure decoration, is used only for visual formatting, or is not presented to users, then it is implemented in a way that it can be ignored by assistive technology.

I've always taken this 1.1.1 decorative exception to also apply to time-based media, given that time-based media is a subset of non-text content, and 1.2 is about additional requirements for time-based media.

@mbgower
Contributor

mbgower commented May 22, 2025

given that time-based media is a subset of non-text content, and 1.2 is about additional requirements for time-based media.

Just to avoid confusion, the 1.2 Time-based Media guideline is a peer of 1.1 Text Alternatives, not a subset. Similarly, the 1.1.1 SC is a peer of any of the time-based media SCs. That said, 1.1.1 lists a specific situation for time-based media, and provides techniques to meet that:

If non-text content is time-based media, then text alternatives at least provide descriptive identification of the non-text content. (Refer to Guideline 1.2 for additional requirements for media.)

That does not, IMO, mean we can apply the decorative exception to time-based media. I'll also mention the little-used Sensory situation:

If non-text content is primarily intended to create a specific sensory experience, then text alternatives at least provide descriptive identification of the non-text content.

Until I looked at that, I had always thought of the 1.1.1 situations as being exclusive, meaning that only one situation is expected to apply to any specific case. But the definition of "specific sensory experience" reads "a sensory experience that is not purely decorative and does not primarily convey important information or perform a function." It then says "Examples include a performance of a flute solo, works of visual art etc." A flute solo is also obviously time-based. So I'm a little flummoxed. Still, I don't think the outcome for what one provides is obviously different, and importantly that definition specifies that to meet this situation the content cannot be "purely decorative". I would apply the same rationale to the time-based media situation.

I think the three of us seem to be on the same page for 1.1.1 requirements in regard to naming/describing music of little consequence without a transcript.

@mbgower
Contributor

mbgower commented May 22, 2025

To help us ensure we're aligned with WCAG, I want to break down what is actually asked for in 1.2 in regard to audio equivalence. There are really only two SCs that introduce distinct requirements:

1.2.1

An alternative for time-based media is provided that presents equivalent information for prerecorded audio-only content.

The sole general technique, G158, calls out "dialogue and sounds (both natural and artificial)" but makes no mention of music, and its test seems to focus exclusively on dialogue (although I almost feel there's some odd wording glitch in step 2 that maybe is trying to cover sounds?).

So there really doesn't seem to be any need to do anything for background music to meet 1.2.1. The equivalent can be provided with the 1.1.1 requirement.

1.2.2

The caption definition includes "non-speech audio information needed to understand the media content"
Note 1 states:

Captions are similar to dialogue-only subtitles except captions convey not only the content of spoken dialogue, but also equivalents for non-dialogue audio information needed to understand the program content, including sound effects, music, laughter, speaker identification and location.

So, music is specified, but not referenced or tested in any sufficient techniques. It's important to point out we do have language which lets us assess whether it is meaningful: "needed to understand the program content".

In practice, one generally sees five methods actually used in captions: 1) a musical note symbol, such as ♫, that indicates there is music playing; 2) an indicator of 'mood', such as "somber piano"; 3) a brief specific descriptor, such as "Canadian anthem"; 4) a work, credited by composer or performer; 5) specific lyrics, at which point we are captioning language, not just music.
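
To make the five methods above concrete, here is a small TypeScript sketch that models each one as a variant and formats it as caption text. The type name `MusicCaption`, the function `formatMusicCaption`, and the bracket and note conventions are assumptions chosen for illustration; they are not taken from WCAG or from any captioning standard.

```typescript
// Hypothetical model of the five ways music tends to appear in captions.
type MusicCaption =
  | { kind: "symbol" }                                // 1) just a musical note
  | { kind: "mood"; mood: string }                    // 2) e.g. "somber piano"
  | { kind: "descriptor"; text: string }              // 3) e.g. "Canadian anthem"
  | { kind: "work"; title: string; credit: string }   // 4) credited work
  | { kind: "lyrics"; line: string };                 // 5) actual lyrics

// Formats a caption line. The bracket/note styling is an assumed house style.
function formatMusicCaption(caption: MusicCaption): string {
  switch (caption.kind) {
    case "symbol":
      return "♫";
    case "mood":
      return `[${caption.mood}]`;
    case "descriptor":
      return `[${caption.text}]`;
    case "work":
      return `[♫ "${caption.title}" by ${caption.credit}]`;
    case "lyrics":
      return `♫ ${caption.line} ♫`;
  }
}
```

For example, `formatMusicCaption({ kind: "mood", mood: "somber piano" })` yields `[somber piano]`. Music that is not meaningful would simply produce no caption at all.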

It's also worth pointing out that there are lots of situations where music is playing and nothing is indicated in the captions because it is not meaningful.

This guidance elaborates.


So I agree that one could just provide a descriptive name for a background music experience without dialogue in an audio-only experience to meet 1.1.1, and 1.2.1 would just be marked as N/A (meets).
I haven't dug as much into a video-only file, but it seems likely to also work with this approach.
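
The reading above can be summarized as a small decision sketch in TypeScript. This encodes only this comment's interpretation, not normative WCAG text; the names `PrerecordedMedia`, `Outcome`, and `assess` are hypothetical, and the video-only branch follows the "seems likely" hedge above rather than a settled conclusion.

```typescript
// Hypothetical description of a prerecorded audio-only or video-only asset.
interface PrerecordedMedia {
  kind: "audio-only" | "video-only";
  // false for mood-setting content such as background music without dialogue
  conveysInformation: boolean;
}

// Hypothetical summary of what would be provided under this reading.
interface Outcome {
  descriptiveName: "required";                    // 1.1.1: always name the content
  sc121: "alternative required" | "N/A (meets)";  // 1.2.1 under this interpretation
}

function assess(media: PrerecordedMedia): Outcome {
  return {
    // A descriptive name is needed either way to meet 1.1.1.
    descriptiveName: "required",
    // Only content that conveys information needs an equivalent alternative.
    sc121: media.conveysInformation ? "alternative required" : "N/A (meets)",
  };
}
```

Under these assumptions, background music with no dialogue gets a descriptive name and 1.2.1 is marked N/A, while a narrated audio clip still needs a transcript or similar alternative.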

I think it would be less likely for this to work with synchronized media in 1.2.2. I think in a track with zero dialogue, there would probably be an indication of music using one of the techniques above. I know that's not what was being asked for in this issue, but I wanted to chase out the question.


6 participants