Commit e0eb371

Author: Linus Arver
add broken link checker
Unfortunately, we cannot use the much prettier-looking https://github.com/tcort/markdown-link-check tool because it cannot handle relative links to other Markdown files. For related discussion, see tcort/markdown-link-check#215. So instead we use [htmltest](https://github.com/wjdp/htmltest).

This isn't included in CI yet because we're using it to check for broken links locally as part of updating the "Legacy Snapshot" folder. Later, when things stabilize, we can consider adding this to CI (although adding it to CI might not be a great idea because of expected flakiness, unless we add additional instrumentation around it).
1 parent 905fee9 commit e0eb371

3 files changed

+71
-0
lines changed

site/.htmltest.yml

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+IgnoreDirs:
+- /_print/
+IgnoreURLs:
+# Ignore assets and other things generated by our theme.
+- ^/favicons/
+- ^/scss/main
+- ^/js/
+# Ignore all "print view" links.
+- /_print/
+# Ignore other stuff...
+- /index.xml$
+# Ignore paths that refer to the toplevel "/docs/..." and "/community/..."
+# pages. These links are only resolvable at runtime as they refer to folder
+# names that have been normalized (the names are not what they look like in the
+# site/content/en/... path).
+- ^/docs/
+- ^/community/
+# Ignore "Last modified..." links because they break when referring to local
+# commits that have not yet been pushed up to GitHub.
+- ^https://github.com/kubernetes-sigs/prow/commit/
+# Ignore links that are known to be broken. This is useful if we know that a
+# link is broken but do not know how to update it.
+- ^broken:.*
+
+# Ignore github upstream docs because they give 403 even though the page exists.
+# Sadly there is no way to tell this tool to treat 403 as "OK". But, it appears
+# that these URLs can be checked with curl and its "--compressed" flag. So, we
+# have to write a script that checks all such URLs separately with curl.
+- ^https://developer.github.com/.*
+- ^https://docs.github.com/.*
+- ^https://help.github.com/.*
+
+# Ignore known-valid paths that fail for reasons unknown.
+- ^https://prow.k8s.io/badge.svg\?jobs=.*
+IgnoreDirectoryMissingTrailingSlash: true
+IgnoreSSLVerify: true
+IgnoreExternalBrokenLinks: true
+IgnoreAltMissing: true
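
The config comment above says the ignored GitHub docs URLs would need a separate script that checks them with curl and its "--compressed" flag. That script is not part of this commit. A minimal sketch of what it might look like follows; the function names (`is_github_docs_url`, `http_status`, `check_github_docs_urls`) are hypothetical, and treating only a final status of 200 as "OK" is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of this commit.

# Succeeds if the URL is on one of the GitHub docs hosts that
# .htmltest.yml ignores.
is_github_docs_url() {
  case "$1" in
    https://developer.github.com/*|https://docs.github.com/*|https://help.github.com/*)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Prints the final HTTP status code for a URL, following redirects.
http_status() {
  curl --compressed --silent --location --output /dev/null \
    --write-out '%{http_code}' "$1"
}

# Reads URLs from stdin (one per line), skips everything that is not a
# GitHub docs URL, and reports each remaining URL whose status is not 200.
# Returns non-zero if any check failed.
check_github_docs_urls() {
  local failed=0 url status
  while IFS= read -r url; do
    is_github_docs_url "${url}" || continue
    status="$(http_status "${url}")" || status="curl-error"
    if [ "${status}" != "200" ]; then
      echo "BROKEN (${status}): ${url}"
      failed=1
    fi
  done
  return "${failed}"
}
```

One possible way to feed it, assuming the rendered site lives in ./public: `grep -rhoE 'https://(developer|docs|help)\.github\.com/[^"]*' ./public | sort -u | check_github_docs_urls`.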

site/Makefile

Lines changed: 3 additions & 0 deletions
@@ -22,6 +22,9 @@ update-theme: init-theme
 	git submodule update --remote
 	npm update
 
+check-broken-links:
+	find ./public -name "*.html" -print0 | sort -z | xargs -0 ./check-broken-links.sh
+
 .PHONY: \
 	init-theme \
 	update-theme
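
The new recipe passes file names NUL-delimited (`-print0`, `sort -z`, `xargs -0`), so paths containing spaces reach the script intact and in a stable order. A small sketch of the same pipeline is below, with `echo` standing in for ./check-broken-links.sh; the function name `list_html_sorted` is hypothetical, and `-n 1` is added here only so that each path prints on its own line.

```shell
#!/usr/bin/env bash
# Hypothetical demonstration of the recipe's pipeline, not part of this commit.

# Prints one "would check: <path>" line per *.html file under the
# directory given as $1, in sorted order.
list_html_sorted() {
  find "$1" -name "*.html" -print0 | sort -z | xargs -0 -n 1 echo "would check:"
}
```

Calling `list_html_sorted ./public` after building the site previews which files `make check-broken-links` would hand to the script.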

site/check-broken-links.sh

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+#!/usr/bin/env bash
+# Copyright 2022 The Kubernetes Authors.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Check each HTML file given as an argument for broken links with htmltest.
+
+set -euo pipefail
+
+SCRIPT_ROOT="$(cd "$(dirname "$0")" && pwd)"
+
+for file; do
+  if [[ "${file}" =~ /_print/index.html ]]; then
+    echo "skipping file ${file}"
+    continue
+  fi
+  echo "checking file ${file}"
+  htmltest -c "${SCRIPT_ROOT}/.htmltest.yml" "${file}" | sed 's/^/ /'
+  echo
+done
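
Because the script sets `set -euo pipefail`, a non-zero exit from htmltest would end the loop at the first failing file, assuming htmltest exits non-zero when it finds broken links. If a full report across all files is preferred, a variant along the following lines could be used. This is only a sketch: the function name `check_all` and the `CHECKER` override variable are hypothetical and not part of this commit.

```shell
#!/usr/bin/env bash
# Hypothetical variant of check-broken-links.sh that keeps going after a
# failure and prints a summary at the end.

check_all() {
  local failures=0 file output
  for file; do
    case "${file}" in
      */_print/index.html*)
        echo "skipping file ${file}"
        continue
        ;;
    esac
    echo "checking file ${file}"
    # CHECKER is an assumed override hook; it defaults to htmltest.
    if output="$("${CHECKER:-htmltest}" -c .htmltest.yml "${file}" 2>&1)"; then
      :
    else
      failures=$((failures + 1))
    fi
    printf '%s\n' "${output}" | sed 's/^/  /'
    echo
  done
  echo "${failures} file(s) with broken links"
  [ "${failures}" -eq 0 ]
}
```

It would be invoked the same way as the original script, for example `check_all "$@"` as the last line of the file, so the Makefile recipe would not need to change.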
