Commit 2ab4489

Adding some clarity to the algorithm's wording.
- being more specific with regards to whitespace and &nbsp;, as per dd8's comment at act-rules#2101 (review)
- misc. other edits.
1 parent 2dc429f commit 2ab4489

1 file changed: +7 -5 lines changed

pages/glossary/label-in-name-algorithm.md

Lines changed: 7 additions & 5 deletions
@@ -20,18 +20,20 @@ Sub-algorithm to tokenize a string:
  - For b) Use the Unicode classes Letter, Mark, and "Number, Decimal Digit [Nd]". (This will exclude hyphens, punctuation, emoji, and more.)
  - Remove all characters that are within parentheses (AKA round brackets).
  - Ignore square brackets and braces.
- - Split the string into a list of strings, using a whitespace regular expression as the separator.
+ - Split the string into a list of strings, using a greedy [whitespace][] regular expression as the separator.
  - This 'split' operation must:
- - Effectively remove leading and trailing whitespace as a pre-processing step.
- - If the string was all whitespace before this operation: result in an empty list.
+ - Effectively remove leading and trailing [whitespace][].
+ - If the input string contains nothing but [whitespace][] before this operation: return an empty list.
+ - A consequence of using the ACT definition of [whitespace][] here is that all kinds of whitespace are covered. That includes the Unicode code point U+00A0 - the "No-Break Space" - which can be represented by the HTML named character reference `&nbsp;`.

  Then do the check: is the tokenized 'label' a sublist of the tokenized 'name'?
  - This 'sublist' check has these properties:
- - It checks whether elements are consecutive or not. i.e. it checks for a substring, in the computer science sense of the term. Not a subsequence.
+ - It checks whether elements are consecutive or not. That is: it checks for a substring, in the computer science sense of the term. Not a subsequence.
  - An empty list is a sublist of any list.

  If the answer is "yes" (that is: the tokenized 'label' is a sublist of the tokenized 'name'), then this algorithm returns "is contained". Otherwise, it returns "is not contained".

  [accessible name]: #accessible-name 'Definition of accessible name'
+ [element]: https://dom.spec.whatwg.org/#element
  [visible inner text]: #visible-inner-text 'Definition of Visible inner text'
- [element]: https://dom.spec.whatwg.org/#element
+ [whitespace][]: #whitespace 'Definition of whitespace'
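
As a rough illustration of the steps visible in this diff, the tokenization and the 'sublist' check could be sketched in Python as below. This is a sketch under stated assumptions, not the ACT reference implementation. The names `tokenize`, `is_sublist`, and `label_in_name` are hypothetical. `str.isspace()` stands in for the ACT definition of whitespace and only approximates it. Because the "a)" part of the filtering step lies outside the diff, the sketch assumes that characters outside the listed Unicode classes are simply deleted, that parenthesized text is removed before filtering, and that parentheses are not nested. Steps not shown in the diff (for example, any case handling) are not modeled.

```python
import re
import unicodedata


def tokenize(text: str) -> list[str]:
    """Tokenize a string following the steps visible in the diff (a sketch)."""
    # Remove all characters within parentheses, including the parentheses.
    # Assumption: parentheses are not nested.
    text = re.sub(r"\([^()]*\)", "", text)

    kept = []
    for ch in text:
        category = unicodedata.category(ch)
        # Keep whitespace (so the split below still works) plus the classes
        # Letter (L*), Mark (M*), and "Number, Decimal Digit" (Nd).
        # Square brackets, braces, hyphens, punctuation, and emoji fall
        # outside these classes; assumption: they are deleted outright.
        if ch.isspace() or category[0] in ("L", "M") or category == "Nd":
            kept.append(ch)

    # str.split() with no argument treats each run of whitespace as one
    # separator (the "greedy" behaviour), ignores leading and trailing
    # whitespace, and returns [] for an all-whitespace string.
    # U+00A0 (No-Break Space) counts as whitespace here.
    return "".join(kept).split()


def is_sublist(needle: list[str], haystack: list[str]) -> bool:
    """True if needle occurs in haystack as consecutive elements."""
    if not needle:
        return True  # an empty list is a sublist of any list
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


def label_in_name(label: str, name: str) -> str:
    """Return "is contained" or "is not contained"."""
    if is_sublist(tokenize(label), tokenize(name)):
        return "is contained"
    return "is not contained"


print(tokenize("  next\u00a0page (opens in new tab) "))  # ['next', 'page']
print(label_in_name("next page", "go to next page"))     # is contained
print(label_in_name("next page", "next results page"))   # is not contained
```

The last call shows why the check is for a substring and not a subsequence: both tokens appear in the name in the right order, but they are not consecutive.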
