🔗 Documentation Link Check Workflow Guide

🔍 Automated Documentation Link Validation for General Projects

Standalone workflow for checking documentation links and ensuring all references are valid


📋 Overview

The Link Check Workflow is a standalone reusable workflow that validates all documentation links in your repository. It ensures that all internal and external links are working correctly, preventing broken references in your documentation.

Key Features

  • 🔍 Comprehensive Link Checking - Validates all markdown files and documentation using Lychee
  • 🌐 External Link Validation - Checks external URLs for accessibility with retry logic
  • 📁 Internal Link Validation - Verifies internal file references and anchors
  • ⚡ Fast Processing - Efficient scanning with configurable timeouts and parallel processing
  • 🎯 Flexible Path Patterns - Customizable file and directory patterns
  • 📊 Detailed Reporting - Clear error messages and link status with verbose output
  • 🛡️ Smart Filtering - Excludes private links and mailto addresses by default
  • 🔄 Retry Logic - Automatic retry for failed links with configurable attempts

🚀 Quick Start

Basic Usage

name: Check Documentation Links

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  link-check:
    uses: N3b3x/hf-general-ci-tools/.github/workflows/docs-link-check.yml@v1

Advanced Configuration

name: Advanced Link Check

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  link-check:
    uses: N3b3x/hf-general-ci-tools/.github/workflows/docs-link-check.yml@v1
    with:
      paths: "docs/** *.md **/docs/**"
      fail_on_errors: true
      timeout: "15"
      retry: "5"
      exclude_private: true
      exclude_mail: true
      verbose: true

Using TOML Configuration File

For advanced configuration, you can use a lychee.toml file:

name: Link Check with TOML Config

on:
  push:
    branches: [main]

jobs:
  link-check:
    uses: N3b3x/hf-general-ci-tools/.github/workflows/docs-link-check.yml@v1
    with:
      config_file: "lychee.toml"
      paths: "docs/** *.md"
      verbose: true

Creating a TOML Config File:

Create a custom configuration file:

# Create lychee.toml in your repository root
touch lychee.toml

Then customize it for your needs. The TOML file allows you to:

  • Set default timeouts and retry counts
  • Exclude specific domains or patterns
  • Configure HTTP headers and redirects
  • Enable caching for faster subsequent runs

📖 Input Parameters

📁 File Selection

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| checkout_recursive | boolean | false | Checkout submodules recursively (for projects with docs in submodules) |
| paths | string | "docs/** *.md **/docs/**" | Space-separated paths to check for broken links |

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| timeout | string | "10" | Timeout in seconds for each link check |
| retry | string | "3" | Number of retries for failed links |
| exclude_private | boolean | true | Exclude private/internal links |
| exclude_mail | boolean | true | Exclude mailto links |
| config_file | string | "" | Path to lychee.toml config file (optional) |

📊 Output Control

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| fail_on_errors | boolean | true | Fail the workflow if broken links are found |
| verbose | boolean | false | Enable verbose output |

Path Patterns

The paths parameter supports glob patterns:

  • docs/** - All files in docs directory and subdirectories
  • *.md - All markdown files in repository root
  • **/docs/** - All files in any docs directory
  • README.md - Specific file
  • docs/** *.md - Multiple patterns (space-separated)

Important: Paths must be space-separated, not comma-separated, as required by lycheeverse/lychee-action@v2.
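
To illustrate why commas do not work, the sketch below splits a paths value on whitespace the way a shell would, with glob expansion switched off so the patterns stay intact. It only demonstrates the splitting rule: split_paths is a hypothetical helper, and the assumption that lychee-action tokenizes its arguments in a comparable way is not confirmed by this guide.

```shell
# Hypothetical helper: split a space-separated `paths` value into
# individual glob patterns, one per line, without expanding the globs.
split_paths() {
  set -f                    # turn off filename expansion while splitting
  for pattern in $1; do     # unquoted on purpose: split on whitespace
    printf '%s\n' "$pattern"
  done
  set +f                    # restore filename expansion
}

# Three separate patterns
split_paths "docs/** *.md **/docs/**"

# One single pattern, because there is no whitespace to split on
split_paths "docs/**,*.md"
```

The comma-separated value comes out as one pattern, `docs/**,*.md`, which is unlikely to match any file.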


🔧 Usage Examples

name: Basic Link Check

on:
  push:
    branches: [main]

jobs:
  link-check:
    uses: N3b3x/hf-general-ci-tools/.github/workflows/docs-link-check.yml@v1

Custom Paths

name: Custom Link Check

on:
  push:
    branches: [main]

jobs:
  link-check:
    uses: N3b3x/hf-general-ci-tools/.github/workflows/docs-link-check.yml@v1
    with:
      paths: "docs/** README.md CONTRIBUTING.md"
      fail_on_errors: false

Documentation Only

name: Documentation Link Check

on:
  push:
    branches: [main]

jobs:
  link-check:
    uses: N3b3x/hf-general-ci-tools/.github/workflows/docs-link-check.yml@v1
    with:
      paths: "docs/**"

All Markdown Files

name: All Markdown Link Check

on:
  push:
    branches: [main]

jobs:
  link-check:
    uses: N3b3x/hf-general-ci-tools/.github/workflows/docs-link-check.yml@v1
    with:
      paths: "**/*.md"

With Submodule Support

name: Link Check with Submodules

on:
  push:
    branches: [main]

jobs:
  link-check:
    uses: N3b3x/hf-general-ci-tools/.github/workflows/docs-link-check.yml@v1
    with:
      checkout_recursive: true  # Enable submodule checkout
      paths: "docs/** *.md **/docs/**"
      fail_on_errors: true
      timeout: "15"

🛠️ Integration with Other Workflows

Combined with Documentation Workflow

name: Documentation Pipeline

on:
  push:
    branches: [main]

jobs:
  docs:
    uses: N3b3x/hf-general-ci-tools/.github/workflows/docs.yml@v1
    with:
      doxygen_config: Doxyfile
      run_link_check: true

  link-check:
    uses: N3b3x/hf-general-ci-tools/.github/workflows/docs-link-check.yml@v1
    with:
      paths: "docs/**"

Part of Full CI Pipeline

name: Full CI Pipeline

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  lint:
    uses: N3b3x/hf-general-ci-tools/.github/workflows/c-cpp-lint.yml@v1

  static:
    uses: N3b3x/hf-general-ci-tools/.github/workflows/c-cpp-static-analysis.yml@v1

  link-check:
    uses: N3b3x/hf-general-ci-tools/.github/workflows/docs-link-check.yml@v1

  docs:
    uses: N3b3x/hf-general-ci-tools/.github/workflows/docs.yml@v1

🔍 How It Works

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ Scan Files      │───▶│ Extract Links   │───▶│ Validate Links  │───▶│ Report Results  │
│ (Lychee)        │    │ (Markdown       │    │ (HTTP requests  │    │ (Success/Error  │
│                 │    │ parsing)        │    │ + retry logic)  │    │ messages)       │
└─────────────────┘    └─────────────────┘    └─────────────────┘    └─────────────────┘

The workflow checks four types of links:

  1. External Links - HTTP/HTTPS URLs
  2. Internal File Links - Relative file paths
  3. Anchor Links - Links to sections within files
  4. Image Links - Image file references
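
As a rough illustration of these categories, the sketch below sorts a link target into one of them by its shape alone. classify_link is a hypothetical helper written for this guide, not part of Lychee, and the list of image extensions is an assumption.

```shell
# Hypothetical helper: classify a link target by its shape (simplified).
classify_link() {
  case $1 in
    http://*|https://*)             echo "external" ;;
    mailto:*)                       echo "mail" ;;
    \#*)                            echo "anchor" ;;
    *.png|*.jpg|*.jpeg|*.gif|*.svg) echo "image" ;;
    *)                              echo "internal" ;;
  esac
}

classify_link "https://example.com/page"
classify_link "../missing-file.md"
classify_link "#non-existent-section"
```

The order of the cases matters: an image hosted on another site is reported as external because the URL scheme is tested first, and a target such as docs/file.md#section counts as an internal file link.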

Validation Methods

  • External Links: HTTP requests with configurable timeout and retry logic
  • Internal Links: File system existence checks
  • Anchor Links: Section header validation
  • Image Links: File existence and format validation
  • Private Links: Automatically excluded (configurable)
  • Mailto Links: Automatically excluded (configurable)
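
The internal and anchor checks above can be approximated locally with a few lines of shell. This is a minimal sketch, not Lychee's implementation: heading_to_anchor and check_internal_link are hypothetical helpers, the slug rule is a simplified version of GitHub-style anchors, and every line starting with # is treated as a heading (including comment lines inside code blocks).

```shell
# Hypothetical helper: turn a Markdown heading line into a simplified
# GitHub-style anchor (drop '#' markers, lowercase, strip punctuation,
# replace spaces with hyphens).
heading_to_anchor() {
  printf '%s\n' "$1" |
    sed -e 's/^#* *//' |
    tr '[:upper:]' '[:lower:]' |
    sed -e 's/[^a-z0-9 _-]//g' -e 's/ /-/g'
}

# Hypothetical helper: succeed if FILE exists and, when ANCHOR is
# non-empty, FILE contains a heading whose slug equals ANCHOR.
check_internal_link() {
  file=$1
  anchor=$2
  if [ ! -f "$file" ]; then
    return 1
  fi
  if [ -z "$anchor" ]; then
    return 0
  fi
  grep '^#' "$file" | while IFS= read -r line; do
    heading_to_anchor "$line"
  done | grep -qxF -- "$anchor"
}

# Usage: check_internal_link docs/file.md target-section && echo "ok"
```

A link such as docs/file.md#target-section would be split at the # first, with the file path passed as the first argument and the anchor as the second.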

⚙️ Configuration Options

Lychee Parameters

The workflow uses lycheeverse/lychee-action@v2 with these default settings:

  • Timeout: 10 seconds per link (configurable)
  • Retry: 3 attempts for failed links (configurable)
  • External Links: Enabled with retry logic
  • Private Links: Excluded by default
  • Mailto Links: Excluded by default
  • Anchor Checking: Enabled
  • SSL Validation: Enabled
  • Verbose Output: Configurable
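
To reproduce a run locally, it can help to see these settings as a lychee command line. The sketch below prints one for a given set of inputs. The mapping from workflow inputs to flags is an assumption (the workflow's internals are not shown in this guide), and lychee_command is a hypothetical helper; --timeout, --max-retries, --exclude-all-private and --verbose are options of the lychee CLI.

```shell
# Hypothetical helper: print a lychee command line for the given inputs.
# Arguments: timeout retry exclude_private verbose paths
lychee_command() {
  timeout=$1
  retry=$2
  exclude_private=$3
  verbose=$4
  paths=$5
  cmd="lychee --timeout $timeout --max-retries $retry"
  if [ "$exclude_private" = "true" ]; then
    cmd="$cmd --exclude-all-private"
  fi
  if [ "$verbose" = "true" ]; then
    cmd="$cmd --verbose"
  fi
  # Print only; nothing is executed and the globs are not expanded.
  printf '%s %s\n' "$cmd" "$paths"
}

# Defaults listed above
lychee_command 10 3 true false "docs/** *.md **/docs/**"
```

When running the printed command by hand, quote each glob pattern so that lychee, rather than the shell, expands it.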

Custom Configuration

For advanced configuration, you can use a lychee.toml file:

# lychee.toml uses flat, top-level keys (no [section] tables).
# Key names below follow lychee's example config; verify them against the
# lychee version installed by the workflow before relying on them.

# Input files come from the workflow's space-separated `paths` input, not from this file.
exclude_path = ["CHANGELOG.md"]

format = "detailed"
verbose = "info"

timeout = 10
max_retries = 3
exclude_all_private = true
include_mail = false

user_agent = "lychee/0.14.0"

🚨 Troubleshooting

Common Issues

Symptoms: Workflow fails with “broken links found”

Solutions:

# Check specific links manually
curl -I https://example.com/broken-link

# Verify internal file paths
ls -la docs/broken-file.md

# Check anchor references
grep -n "## Target Section" docs/file.md

Timeout Errors

Symptoms: “Timeout” or “Connection timeout” errors

Solutions:

# Check if external site is accessible
ping example.com

# Verify network connectivity
curl -I https://example.com

# Check if site requires authentication (look for a 401 or 403 status in the curl output above)
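
Before raising timeout or retry in the workflow, it can be useful to check by hand whether a slow site answers within a given budget. The sketch below is a generic retry loop, not Lychee's retry logic: retry is a hypothetical helper, and treating the retry input as a total number of attempts is an assumption.

```shell
# Hypothetical helper: run a command up to ATTEMPTS times and stop at
# the first success. Usage: retry ATTEMPTS command [args...]
retry() {
  attempts=$1
  shift
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $n of $attempts failed" >&2
    n=$((n + 1))
  done
  return 1
}

# Example (commented out so the sketch makes no network requests):
# 15 seconds per request and 5 attempts, mirroring timeout: "15" and retry: "5".
# retry 5 curl --silent --show-error --fail --max-time 15 --output /dev/null https://example.com
```

If the site only responds on a later attempt or close to the time limit, a larger timeout or retry value in the workflow inputs is a reasonable next step.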

False Positives

Symptoms: Valid links reported as broken

Solutions:

# Check link format
echo "https://example.com/path"

# Verify file encoding
file -i docs/file.md

# Check for special characters
grep -n "\[.*\](.*)" docs/file.md

Debug Mode

Enable verbose output for debugging:

jobs:
  link-check:
    uses: N3b3x/hf-general-ci-tools/.github/workflows/docs-link-check.yml@v1
    with:
      paths: "docs/**"
      verbose: true          # Enable verbose output
      fail_on_errors: false  # Don't fail on errors for debugging

📊 Output and Reporting

Success Output

✅ Link check completed successfully
📊 Checked 25 files
🔗 Validated 150 links
⏱️  Processing time: 45 seconds

Error Output

❌ Link check failed
📊 Checked 25 files
🔗 Found 3 broken links
⏱️  Processing time: 45 seconds

Broken links:
- docs/guide.md:5 → https://example.com/broken
- docs/api.md:12 → ../missing-file.md
- README.md:8 → #non-existent-section



