Lazy Dev

2026-01-22

docs-drift: Catching Documentation Drift Before It Breaks Your Users

Introduction

Documentation rot is a silent killer in software projects. You refactor your API, update dependencies, or change how a feature works—and suddenly, your README examples are broken. Users copy-paste code from your docs, hit errors, and lose trust in your project.

I built docs-drift to solve this problem: an open-source CLI tool and GitHub Action that validates code examples in Markdown files by actually running them. If a code block in your documentation breaks, your CI fails—simple as that.

The Problem: When Documentation Lies

Every developer has experienced this:

  1. You find a library that solves your problem
  2. You copy the example from the README
  3. It doesn't work
  4. You waste 30 minutes debugging
  5. You discover the API changed three months ago

This happens because:

  • Documentation updates lag behind code changes
  • Manual testing of docs is tedious and often skipped
  • No automated validation catches broken examples
  • Docs live in Markdown files that never get executed

The result: Users lose confidence, adoption drops, and GitHub issues pile up with "the example doesn't work" complaints.

The Solution: Executable Documentation

docs-drift treats code examples in your documentation as first-class test cases. It:

  1. Parses Markdown files to extract fenced code blocks
  2. Executes each code block in an isolated sandbox
  3. Reports which examples fail with clear error messages
  4. Exits with non-zero status to fail your CI pipeline

If your docs claim a code example works, docs-drift verifies it actually does.

Architecture Overview

How docs-drift Works

┌─────────────────┐
│  Markdown Files │
│  (README.md,    │
│   docs/*.md)    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Parser          │
│ Extract ```lang │
│ code blocks     │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Language Runner │
│ Node.js/Python  │
│ Sandboxed exec  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Result Reporter │
│ file:line:error │
└─────────────────┘

Technology Stack

  • Language: Go 1.21+ (single binary, no runtime dependencies)
  • Markdown Parsing: CommonMark-compliant regex-based parser
  • Execution Runtimes: Node.js (JavaScript), Python 3 (Python)
  • Git Integration: Native Git support for changed-only mode
  • Concurrency: Worker pool pattern for parallel execution
  • Configuration: YAML-based config with sensible defaults

Key Design Principles

  1. Local-first: No network calls, deterministic results
  2. Fast: Parallel execution, Git integration for incremental checks
  3. Simple: Single binary, minimal configuration
  4. Secure: Sandboxed execution with timeouts and restricted environments
  5. No SaaS: No telemetry, no paid tiers, fully open-source

Installation and Setup

Using Go

The fastest way to get started:

go install github.com/georg-nikola/docs-drift/cmd/docs-drift@latest

From Source

For development or custom builds:

git clone https://github.com/georg-nikola/docs-drift.git
cd docs-drift
go build -o docs-drift ./cmd/docs-drift

Creating Configuration

Initialize a config file in your project:

docs-drift init

This creates a docs-drift.yml:

version: 1

docs:
  paths:
    - README.md
    - docs/**/*.md

checks:
  code_blocks:
    enabled: true
    languages:
      - javascript
      - python
    timeout: 30s

Core Features

1. Basic Validation

Run validation on all configured documentation:

docs-drift check

Output when drift is detected:

❌ Docs Drift Detected

README.md
  Line 42: javascript code block
    → ReferenceError: undefinedVar is not defined

  Line 78: python code block
    → NameError: name 'undefined_var' is not defined

Summary: 2 failed, 5 passed
Exit code: 1
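
A report in this shape comes from grouping failed blocks by file and appending a summary line. Here is a minimal sketch in Python (the tool itself is written in Go, and its reporter may differ); `BlockResult` and `format_report` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockResult:
    path: str
    line: int                    # line of the opening fence
    language: str
    error: Optional[str] = None  # None means the block ran cleanly


def format_report(results):
    """Render failures grouped by file, followed by a summary line."""
    failed = [r for r in results if r.error is not None]
    lines = []
    if failed:
        lines.append("❌ Docs Drift Detected")
        by_file = {}
        for r in failed:
            by_file.setdefault(r.path, []).append(r)
        for path, items in by_file.items():
            lines += ["", path]
            for r in sorted(items, key=lambda r: r.line):
                lines.append(f"  Line {r.line}: {r.language} code block")
                lines.append(f"    → {r.error}")
        lines.append("")
    passed = len(results) - len(failed)
    lines.append(f"Summary: {len(failed)} failed, {passed} passed")
    return "\n".join(lines)
```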

2. Git Integration for Faster CI

Only validate files that changed in a pull request:

# Auto-detects base branch (main or master)
docs-drift check --changed-only

# Specify base branch explicitly
docs-drift check --changed-only --base origin/develop

# Check changes since specific commit
docs-drift check --changed-only --base abc123f
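
The post does not show how the changed-file list is computed, so the following is only one plausible approach: ask Git for the files that differ from the merge base and keep the Markdown ones. The function names are hypothetical, and the exact `git diff` options docs-drift uses are an assumption.

```python
import subprocess

MARKDOWN_SUFFIXES = (".md", ".mdx")


def filter_markdown(paths):
    """Keep only Markdown paths from a list of changed files."""
    return [p for p in paths if p.lower().endswith(MARKDOWN_SUFFIXES)]


def changed_markdown_files(base, repo_dir="."):
    """List Markdown files added, copied, modified, or renamed since the
    merge base with `base` (for example "origin/main")."""
    result = subprocess.run(
        ["git", "diff", "--name-only", "--diff-filter=ACMR", f"{base}...HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. outside a repository
    )
    return filter_markdown(result.stdout.splitlines())
```

The three-dot range compares against the merge base, which is why a shallow clone without the base commit cannot answer the question.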

Performance impact: In a repo with 50+ markdown files, checking only changed files reduced validation time from about 30 seconds to about 2 seconds, roughly a 15x speedup.

3. Parallel Execution

Speed up validation with concurrent execution:

# Use default workers (4)
docs-drift check --parallel

# Specify custom worker count
docs-drift check --parallel --workers 8

# Optimal CI performance: combine flags
docs-drift check --changed-only --parallel --workers 4

Benchmark results (50 code blocks, 4-core system):

  • Sequential: 12.5 seconds
  • Parallel (4 workers): 3.8 seconds
  • Speedup: 3.3x

4. Skipping Code Blocks

Some examples are intentionally broken or incomplete. Skip validation with a directive:

```javascript docs-drift:skip
// This example is intentionally incomplete
const example =
```
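
The directive sits in the fence's info string, after the language tag. A minimal sketch of how such an info string could be split, assuming the directive is a whitespace-separated token (`parse_info_string` is a hypothetical helper, not the tool's API):

```python
SKIP_DIRECTIVE = "docs-drift:skip"


def parse_info_string(info):
    """Split a fence info string into (language, skip).

    Returns (None, False) when the fence has no language tag.
    """
    tokens = info.strip().split()
    if not tokens:
        return None, False
    language = tokens[0].lower()
    skip = SKIP_DIRECTIVE in tokens[1:]
    return language, skip
```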

5. Verbose Mode

Debug validation issues with detailed output:

docs-drift check --verbose

Shows:

  • Files being scanned
  • Code blocks found with line numbers
  • Execution results for each block
  • Timing information

GitHub Action Integration

Basic Setup

Add to .github/workflows/docs-check.yml:

name: Documentation Validation

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  docs-drift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Required for --changed-only

      - uses: georg-nikola/docs-drift@<version>
        with:
          config: docs-drift.yml

Optimized CI for Pull Requests

Validate only changed documentation for faster PR checks:

name: PR Documentation Check

on:
  pull_request:
    branches: [main]

jobs:
  docs-drift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: georg-nikola/docs-drift@<version>
        with:
          config: docs-drift.yml
          changed-only: true
          base: origin/main
          parallel: true
          workers: 4

Action Inputs Reference

| Input | Description | Default |
|-------|-------------|---------|
| config | Path to config file | docs-drift.yml |
| version | docs-drift version | latest |
| verbose | Enable verbose output | false |
| changed-only | Only check changed files | false |
| base | Base branch for comparison | Auto-detect |
| parallel | Enable parallel execution | false |
| workers | Concurrent worker count | 4 |

Action Outputs

Use outputs for custom workflows:

- uses: georg-nikola/docs-drift@<version>
  id: docs-check
  with:
    config: docs-drift.yml

- name: Comment on PR
  if: steps.docs-check.outputs.drift-detected == 'true'
  run: |
    echo "Documentation drift detected!"
    echo "Failed blocks: ${{ steps.docs-check.outputs.failed-blocks }}"
    echo "Total checked: ${{ steps.docs-check.outputs.total-blocks }}"

Real-World Use Cases

1. API Documentation Validation

Ensure all API examples in your docs actually work:

## Authentication

```javascript
const client = new APIClient({
  apiKey: process.env.API_KEY
});

const response = await client.authenticate();
console.log(response.token); // Works or CI fails
```

2. Tutorial Step Validation

Verify multi-step tutorials stay in sync with your code:

## Getting Started Tutorial

Step 1: Install the package

```bash
npm install my-package
```

Step 2: Create a basic app

```javascript
const MyPackage = require('my-package');
const app = new MyPackage();
app.start(); // Validated automatically
```

3. Changelog Example Verification

Keep changelog examples honest:

## v2.0.0

Breaking change: Renamed `oldMethod()` to `newMethod()`

```javascript docs-drift:skip
// Old (this block marked as docs-drift:skip)
api.oldMethod();
```

```javascript
// New (this gets validated)
api.newMethod();
```

4. Multi-Language Documentation

Validate examples across the supported languages:

# docs-drift.yml
checks:
  code_blocks:
    languages:
      - javascript
      - python

Then in your docs:

## JavaScript Example

```javascript
console.log('Hello from Node.js');
```

## Python Example

```python
print('Hello from Python')
```

Configuration Deep Dive

Configuration File Structure

version: 1

docs:
  # Glob patterns for markdown files
  paths:
    - README.md
    - docs/**/*.md
    - guides/*.mdx
    - "!docs/drafts/**"  # Exclude patterns with !

checks:
  code_blocks:
    enabled: true

    # Languages to validate
    languages:
      - javascript
      - python

    # Execution timeout per code block
    timeout: 30s

Configuration Options

| Option | Type | Description | Required |
|--------|------|-------------|----------|
| version | int | Config version (must be 1) | Yes |
| docs.paths | list | Glob patterns for markdown files | Yes |
| checks.code_blocks.enabled | bool | Enable code validation | No (default: true) |
| checks.code_blocks.languages | list | Languages to validate | Yes |
| checks.code_blocks.timeout | duration | Execution timeout | No (default: 30s) |
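
The rules in this table translate into a small validation step. The sketch below works on an already-parsed config (a plain dict, as a YAML loader would return) and applies the required fields and defaults listed above. `validate_config` and `parse_duration` are hypothetical names, and the accepted duration units (ms, s, m) are an assumption; the real tool may accept more.

```python
import re

_DURATION = re.compile(r"^(\d+)(ms|s|m)$")
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0}


def parse_duration(text):
    """Convert strings such as "30s" or "2m" to seconds."""
    match = _DURATION.match(text)
    if not match:
        raise ValueError(f"invalid duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def validate_config(cfg):
    """Check required fields and fill in the documented defaults."""
    if cfg.get("version") != 1:
        raise ValueError("version must be 1")
    paths = cfg.get("docs", {}).get("paths")
    if not isinstance(paths, list) or not paths:
        raise ValueError("docs.paths must be a non-empty list")
    blocks = cfg.get("checks", {}).get("code_blocks", {})
    languages = blocks.get("languages")
    if not isinstance(languages, list) or not languages:
        raise ValueError("checks.code_blocks.languages must be a non-empty list")
    return {
        "paths": paths,
        "enabled": blocks.get("enabled", True),          # default: true
        "languages": [lang.lower() for lang in languages],
        "timeout_s": parse_duration(str(blocks.get("timeout", "30s"))),
    }
```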

Language Aliases

docs-drift supports common language aliases:

  • JavaScript: javascript, js
  • Python: python, py

All are case-insensitive.
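
Alias handling amounts to a case-insensitive lookup. A small sketch, using only the aliases listed above (`canonical_language` is a hypothetical helper):

```python
ALIASES = {
    "javascript": "javascript",
    "js": "javascript",
    "python": "python",
    "py": "python",
}


def canonical_language(tag):
    """Map a fence language tag to its canonical name, or None if unsupported."""
    return ALIASES.get(tag.strip().lower())
```

Returning `None` for unknown tags lets the caller skip blocks in languages that have no runner.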

Glob Pattern Examples

docs:
  paths:
    # Exact file
    - README.md

    # All markdown in docs directory (recursive)
    - docs/**/*.md

    # Specific subdirectory
    - guides/tutorials/*.mdx

    # Multiple extensions
    - "**/*.{md,mdx}"

    # Exclude drafts
    - "!docs/drafts/**"
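
To make the include and exclude semantics concrete, here is a sketch that translates these patterns to regular expressions and filters a list of paths. It is an illustration under assumptions, not the tool's matcher: it supports only `**`, `*`, `?`, and `{a,b}`, and treats a leading `!` as an exclusion that always wins.

```python
import re


def glob_to_regex(pattern):
    """Translate a glob with **, *, ? and {a,b} into a compiled regex."""
    out, i = [], 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")      # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")            # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")         # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")
            i += 1
        elif pattern[i] == "{" and "}" in pattern[i:]:
            end = pattern.index("}", i)
            alternatives = pattern[i + 1:end].split(",")
            out.append("(?:" + "|".join(re.escape(a) for a in alternatives) + ")")
            i = end + 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def select_paths(paths, patterns):
    """Keep paths matching any include pattern and no '!' exclude pattern."""
    includes = [glob_to_regex(p) for p in patterns if not p.startswith("!")]
    excludes = [glob_to_regex(p[1:]) for p in patterns if p.startswith("!")]
    return [
        p for p in paths
        if any(r.match(p) for r in includes)
        and not any(r.match(p) for r in excludes)
    ]
```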

Exit Codes Reference

docs-drift uses semantic exit codes for CI integration:

| Code | Meaning | CI Status |
|------|---------|-----------|
| 0 | No drift detected (all blocks passed) | ✅ Success |
| 1 | Drift detected (one or more blocks failed) | ❌ Failure |
| 2 | Runtime or configuration error | ❌ Failure |

Exit Code Examples

# Success case
docs-drift check
echo $?  # 0

# Drift detected
docs-drift check
echo $?  # 1

# Configuration error
docs-drift check --config missing.yml
echo $?  # 2
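
If you wrap docs-drift in your own tooling, the exit code is the only thing you need to branch on. A sketch of such a wrapper, assuming `docs-drift` is on your PATH; `interpret_exit_code` and `run_check` are hypothetical helpers, and the meanings come from the table above.

```python
import subprocess

EXIT_MEANINGS = {
    0: "no drift detected",
    1: "drift detected",
    2: "runtime or configuration error",
}


def interpret_exit_code(code):
    """Translate a docs-drift exit code into a human-readable meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_check(extra_args=()):
    """Run `docs-drift check` and return (exit_code, meaning)."""
    proc = subprocess.run(["docs-drift", "check", *extra_args])
    return proc.returncode, interpret_exit_code(proc.returncode)
```

Keeping drift (1) apart from errors (2) lets a script retry or alert on a broken setup without treating it as a documentation failure.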

Implementation Details

Security and Sandboxing

docs-drift executes code in isolated processes with security restrictions:

JavaScript Execution (Node.js):

# Actual execution
node --no-warnings /tmp/docs-drift-12345.js

Python Execution (Python 3):

# Isolated mode (-I) ignores PYTHON* environment variables and the user site-packages directory
python3 -I /tmp/docs-drift-67890.py

Security measures:

  • Restricted environment variables
  • Execution timeouts (configurable, default 30s)
  • Temporary file cleanup after execution
  • No network access enforced via environment settings
  • Separate process per code block (no shared state)
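
Several of these measures fit in a few lines. The sketch below runs one Python block in its own process with a timeout, a stripped-down environment, and temp-file cleanup. It is written in Python rather than the tool's Go, `run_python_block` is a hypothetical name, it assumes a POSIX-like system, and it does not attempt to block network access.

```python
import os
import subprocess
import sys
import tempfile


def run_python_block(source, timeout_s=30.0):
    """Run one code block in a separate process; return (ok, message)."""
    fd, path = tempfile.mkstemp(prefix="docs-drift-", suffix=".py")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(source)
        # Pass only PATH through, so secrets in the parent environment
        # are not visible to the example code.
        env = {"PATH": os.environ.get("PATH", "")}
        try:
            proc = subprocess.run(
                [sys.executable, "-I", path],  # -I: isolated mode
                env=env,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return False, f"timed out after {timeout_s}s"
        if proc.returncode != 0:
            stderr_lines = proc.stderr.strip().splitlines()
            # The last traceback line names the exception.
            return False, stderr_lines[-1] if stderr_lines else f"exit status {proc.returncode}"
        return True, ""
    finally:
        os.remove(path)  # temp file cleanup, even on failure
```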

Markdown Parsing

docs-drift uses a CommonMark-compliant parser:

Supported fence formats:

```javascript
// Three backticks
```

```js
// Language alias
```

~~~python
# Three tildes (CommonMark alternative)
~~~

```javascript docs-drift:skip
// Skip directive
```

**Parsing behavior**:
- Tracks exact line numbers for error reporting
- Ignores fences without language tags
- Handles nested blocks in list items
- Preserves whitespace in code content
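
To illustrate the behavior listed above, here is a line-based extractor sketch. It is not the tool's parser: it covers backtick and tilde fences, records the line of the opening fence, drops blocks without a language tag, and keeps code lines untouched. `CodeBlock` and `extract_blocks` are hypothetical names.

```python
import re
from dataclasses import dataclass

# Optional indentation, a run of 3+ backticks or tildes, then the info string.
FENCE_RE = re.compile(r"^\s*(`{3,}|~{3,})(.*)$")


@dataclass
class CodeBlock:
    language: str
    line: int   # 1-based line number of the opening fence
    code: str


def extract_blocks(markdown):
    """Return fenced code blocks that carry a language tag."""
    blocks = []
    fence = None            # (fence character, fence length) while inside a block
    info, start, body = "", 0, []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        match = FENCE_RE.match(line)
        if fence is None:
            if match:
                marker = match.group(1)
                fence = (marker[0], len(marker))
                info, start, body = match.group(2).strip(), lineno, []
            continue
        closes = (
            match is not None
            and match.group(1)[0] == fence[0]       # same fence character
            and len(match.group(1)) >= fence[1]     # at least as long
            and not match.group(2).strip()          # no info string
        )
        if closes:
            if info:  # fences without a language tag are ignored
                blocks.append(CodeBlock(info.split()[0].lower(), start, "\n".join(body)))
            fence = None
        else:
            body.append(line)
    return blocks
```

Requiring the closing fence to use the same character and at least the same length is what lets a tilde-fenced block contain backtick fences as content.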

### Parallel Execution Architecture

Worker pool pattern for concurrent validation:

```
┌─────────────┐
│ Main Thread │
└──────┬──────┘
       │
       ▼
┌──────────────────┐
│  Code Block      │
│  Work Queue      │
└────────┬─────────┘
         │
    ┌────┴────┬────────┬────────┐
    ▼         ▼        ▼        ▼
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│Worker 1│ │Worker 2│ │Worker 3│ │Worker 4│
└───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘
    │          │          │          │
    └──────────┴──────────┴──────────┘
                   │
                   ▼
            ┌──────────────┐
            │Result Channel│
            └──────────────┘
```

**Worker configuration**:
- Default: 4 workers (optimal for most systems)
- Configurable via `--workers` flag
- Minimum: 1 worker (sequential execution)
- Recommendation: Match CPU core count
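
The diagram maps onto a work queue, a fixed set of workers, and a results queue. The sketch below shows that shape in Python with threads; the real tool is written in Go and presumably uses goroutines and channels, so treat this as an analogy. `run_pool` and `run_one` are hypothetical names.

```python
import queue
import threading


def run_pool(jobs, run_one, workers=4):
    """Run run_one(job) for every job on a fixed pool; results keep job order."""
    work = queue.Queue()
    results = queue.Queue()
    for index, job in enumerate(jobs):
        work.put((index, job))

    def worker():
        while True:
            try:
                index, job = work.get_nowait()
            except queue.Empty:
                return  # queue drained, worker exits
            try:
                results.put((index, run_one(job)))
            except Exception as exc:  # report failures instead of killing the worker
                results.put((index, exc))

    threads = [threading.Thread(target=worker) for _ in range(max(1, workers))]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    ordered = [None] * len(jobs)
    while not results.empty():
        index, outcome = results.get()
        ordered[index] = outcome
    return ordered
```

Threads are enough here because each job would spend its time waiting on a child process, and collecting results by index keeps the report order stable regardless of which worker finishes first.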

### Performance Characteristics

**Sequential Mode** (default):
- Processes code blocks one at a time
- Predictable resource usage
- Best for small repos (<10 code blocks)

**Parallel Mode** (`--parallel`):
- Concurrent execution with worker pool
- 2-4x faster on multi-core systems
- Best for large repos (10+ code blocks)

**Changed-Only Mode** (`--changed-only`):
- Uses Git to detect modified files
- Roughly 4-14x faster in the benchmarks below, with the largest gains when few files changed
- Perfect for PR validation

**Optimal CI Configuration**:
```bash
# Maximum performance for pull request checks
docs-drift check --changed-only --parallel --workers 4
```

## Roadmap and Future Features

### Current Status: v0.2 (Production Ready)

Shipped features:
- ✅ CLI with init, check, version commands
- ✅ JavaScript and Python validation
- ✅ GitHub Action integration
- ✅ Git integration (--changed-only mode)
- ✅ Parallel execution (--parallel flag)
- ✅ Comprehensive test suite
- ✅ Benchmark suite

### Planned: v0.3

**Additional Language Runners**:
- Shell/Bash validation
- Ruby script execution
- PHP code validation
- Go code validation (via `go run`)

**Result Caching**:
- Cache validation results based on file hashes
- Skip re-validation of unchanged blocks
- Persistent cache across runs for faster local development

**Enhanced Error Reporting**:
- HTML report generation for CI artifacts
- JSON output format for tool integration
- Detailed statistics dashboard

### Future Ideas: v0.4+

- Watch mode for local development (`--watch` flag)
- Custom runner commands in config
- IDE integrations (VS Code extension)
- Docker support for complex environments
- Smart retry logic for flaky tests
- Code coverage tracking for documentation

## Performance Benchmarks

### Benchmark Methodology

Benchmarks run on:
- **System**: MacBook Pro M1, 8-core CPU, 16GB RAM
- **Test Data**: 50 code blocks across 10 markdown files
- **Iterations**: 10 runs averaged

### Sequential vs Parallel Performance

| Mode | Workers | Time | Speedup |
|------|---------|------|---------|
| Sequential | 1 | 12.5s | 1.0x |
| Parallel | 2 | 7.2s | 1.7x |
| Parallel | 4 | 3.8s | 3.3x |
| Parallel | 8 | 3.1s | 4.0x |
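
The speedup column is the sequential time divided by the parallel time. Dividing that by the worker count gives parallel efficiency, which shows how much each added worker contributes. A short calculation over the timings in the table:

```python
def speedup(sequential_s, parallel_s):
    """How many times faster the parallel run is."""
    return sequential_s / parallel_s


def efficiency(sequential_s, parallel_s, workers):
    """Speedup per worker; 1.0 would be perfect linear scaling."""
    return speedup(sequential_s, parallel_s) / workers


timings = {1: 12.5, 2: 7.2, 4: 3.8, 8: 3.1}  # workers -> seconds, from the table

for workers, seconds in timings.items():
    s = speedup(timings[1], seconds)
    e = efficiency(timings[1], seconds, workers)
    print(f"{workers} workers: {s:.1f}x speedup, {e:.0%} efficiency")
```

Efficiency falls from about 87% at 2 workers to 82% at 4 and about 50% at 8.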

**Key findings**:
- Near-linear scaling up to 4 workers (1.7x at 2 workers, 3.3x at 4)
- Diminishing returns beyond 4 workers (8 workers reach only 4.0x)
- Optimal for most systems: 4 workers

### Changed-Only Mode Performance

| Scenario | Files Changed | Sequential Time | Changed-Only Time | Speedup |
|----------|---------------|-----------------|-------------------|---------|
| Large repo | 2/50 files | 30s | 2.1s | 14.3x |
| Medium repo | 5/20 files | 15s | 3.8s | 3.9x |
| Small repo | 1/5 files | 5s | 1.2s | 4.2x |

**Key findings**:
- Speedup depends on the share of files left unchanged, so it is largest in big repos where a PR touches few files
- Most effective for large documentation repositories
- Perfect for PR validation workflows

### Running Benchmarks Yourself

```bash
cd docs-drift
go test -bench=. -benchmem ./...
```

## Best Practices

### 1. Start Small

Begin with critical documentation:

```yaml
# Initial config
docs:
  paths:
    - README.md
```

Expand gradually:

```yaml
# After success
docs:
  paths:
    - README.md
    - docs/getting-started.md
    - docs/api/**/*.md
```

### 2. Use Changed-Only in CI

For pull requests, validate only what changed:

```yaml
# .github/workflows/pr-check.yml
- uses: georg-nikola/docs-drift@<version>
  with:
    changed-only: true
    parallel: true
```

### 3. Skip Intentionally Broken Examples

Mark examples that shouldn't be validated:

~~~markdown
```javascript docs-drift:skip
// This is a broken example showing what NOT to do
const broken = undefined.property;
```
~~~

4. Set Appropriate Timeouts

Adjust based on your code complexity:

checks:
  code_blocks:
    timeout: 60s  # Increase for complex examples

5. Run Locally Before Pushing

Catch issues early:

# Before committing changes to docs
docs-drift check --verbose

6. Combine Flags for Optimal Performance

# Development: full check with verbose output
docs-drift check --verbose

# CI: fast incremental check
docs-drift check --changed-only --parallel --workers 4

Troubleshooting

Common Issues

Issue: "Config file not found"

# Solution: Initialize config
docs-drift init

Issue: "No code blocks found"

# Solution: Check your glob patterns
docs-drift check --verbose
# Look for "Files matched: 0"

Issue: "Timeout errors"

# Solution: Increase timeout in config
checks:
  code_blocks:
    timeout: 60s

Issue: "Changed-only mode not working"

# Solution: Ensure you're in a git repository
git status
# And that the clone is not shallow (full history is needed to diff against the base)
git fetch --unshallow

Debugging Tips

Enable verbose mode:

docs-drift check --verbose

Test specific files:

docs-drift check --config custom-config.yml --verbose

Check glob pattern matching:

# In verbose mode, you'll see which files matched
docs-drift check --verbose | grep "Processing file"

Validate configuration:

# docs-drift validates config on startup
docs-drift check
# Watch for "Config error" messages

Project Status and Adoption

GitHub Repository: https://github.com/georg-nikola/docs-drift

Success Metrics

The project tracks adoption metrics to evaluate community interest:

30-day target:

  • 50 GitHub stars
  • 10 repositories using docs-drift

60-day target:

  • 150 GitHub stars
  • 25 repositories using docs-drift

Contributing

Contributions are welcome. See the repository at https://github.com/georg-nikola/docs-drift.

Conclusion

Documentation drift is a real problem that affects user trust and adoption. docs-drift provides a simple, fast, and reliable solution by treating documentation as executable tests.

Key takeaways:

  1. Automated validation: Code examples in docs are tested automatically
  2. Fast CI integration: Changed-only and parallel modes optimize performance
  3. Simple setup: One config file, one command to run
  4. Open-source: No SaaS, no telemetry, fully transparent
  5. Production ready: v0.2 with comprehensive features and testing

Get started today:

# Install
go install github.com/georg-nikola/docs-drift/cmd/docs-drift@latest

# Initialize config
docs-drift init

# Validate your docs
docs-drift check

Your documentation will thank you. Your users will thank you. Your future self will thank you.

