2026-01-22
docs-drift: Catching Documentation Drift Before It Breaks Your Users
Introduction
Documentation rot is a silent killer in software projects. You refactor your API, update dependencies, or change how a feature works—and suddenly, your README examples are broken. Users copy-paste code from your docs, hit errors, and lose trust in your project.
I built docs-drift to solve this problem: an open-source CLI tool and GitHub Action that validates code examples in Markdown files by actually running them. If a code block in your documentation breaks, your CI fails—simple as that.
Keywords: documentation validation, CI/CD automation, code examples testing, markdown parsing, GitHub Actions, Go CLI tool, JavaScript validation, Python validation, documentation drift detection, open-source tooling
The Problem: When Documentation Lies
Every developer has experienced this:
- You find a library that solves your problem
- You copy the example from the README
- It doesn't work
- You waste 30 minutes debugging
- You discover the API changed three months ago
This happens because:
- Documentation updates lag behind code changes
- Manual testing of docs is tedious and often skipped
- No automated validation catches broken examples
- Docs live in Markdown files that never get executed
The result: Users lose confidence, adoption drops, and GitHub issues pile up with "the example doesn't work" complaints.
The Solution: Executable Documentation
docs-drift treats code examples in your documentation as first-class test cases. It:
- Parses Markdown files to extract fenced code blocks
- Executes each code block in an isolated sandbox
- Reports which examples fail with clear error messages
- Exits with non-zero status to fail your CI pipeline
If your docs claim a code example works, docs-drift verifies it actually does.
Architecture Overview
How docs-drift Works
┌─────────────────┐
│ Markdown Files  │
│ (README.md,     │
│  docs/*.md)     │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Parser          │
│ Extract ```lang │
│ code blocks     │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Language Runner │
│ Node.js/Python  │
│ Sandboxed exec  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Result Reporter │
│ file:line:error │
└─────────────────┘
Technology Stack
- Language: Go 1.21+ (single binary, no runtime dependencies)
- Markdown Parsing: CommonMark-compliant regex-based parser
- Execution Runtimes: Node.js (JavaScript), Python 3 (Python)
- Git Integration: Native Git support for changed-only mode
- Concurrency: Worker pool pattern for parallel execution
- Configuration: YAML-based config with sensible defaults
Key Design Principles
- Local-first: No network calls, deterministic results
- Fast: Parallel execution, Git integration for incremental checks
- Simple: Single binary, minimal configuration
- Secure: Sandboxed execution with timeouts and restricted environments
- No SaaS: No telemetry, no paid tiers, fully open-source
Installation and Setup
Using Go
The fastest way to get started:
go install github.com/georg-nikola/docs-drift/cmd/docs-drift@latest
From Source
For development or custom builds:
git clone https://github.com/georg-nikola/docs-drift.git
cd docs-drift
go build -o docs-drift ./cmd/docs-drift
Creating Configuration
Initialize a config file in your project:
docs-drift init
This creates a docs-drift.yml:
version: 1
docs:
  paths:
    - README.md
    - docs/**/*.md
checks:
  code_blocks:
    enabled: true
    languages:
      - javascript
      - python
    timeout: 30s
Core Features
1. Basic Validation
Run validation on all configured documentation:
docs-drift check
Output when drift is detected:
❌ Docs Drift Detected
README.md
Line 42: javascript code block
→ ReferenceError: undefinedVar is not defined
Line 78: python code block
→ NameError: name 'undefined_var' is not defined
Summary: 2 failed, 5 passed
Exit code: 1
2. Git Integration for Faster CI
Only validate files that changed in a pull request:
# Auto-detects base branch (main or master)
docs-drift check --changed-only
# Specify base branch explicitly
docs-drift check --changed-only --base origin/develop
# Check changes since specific commit
docs-drift check --changed-only --base abc123f
Performance impact: in a benchmark repo with 50 Markdown files, checking only the 2 changed files cut validation time from 30 seconds to about 2 seconds, roughly a 14x speedup.
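One way to picture changed-only mode is as a `git diff --name-only` against the base ref, filtered down to documentation files. The Python sketch below illustrates that idea under stated assumptions. It is not the tool's Go implementation, and the helper names `changed_files` and `filter_docs` are hypothetical.

```python
import subprocess
from pathlib import PurePosixPath

# Assumption: only these extensions count as documentation files.
DOC_SUFFIXES = {".md", ".mdx"}


def filter_docs(paths):
    """Keep only paths that look like Markdown documentation files."""
    return [p for p in paths if PurePosixPath(p).suffix.lower() in DOC_SUFFIXES]


def changed_files(base="origin/main"):
    """List files that differ between the merge base with `base` and HEAD.

    Must run inside a Git work tree that has enough history fetched.
    """
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line]
```

Calling `filter_docs(changed_files("origin/develop"))` would then yield the Markdown files a pull request touched. The three-dot range compares against the merge base, which is why a shallow clone without the base commit breaks this kind of check.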
3. Parallel Execution
Speed up validation with concurrent execution:
# Use default workers (4)
docs-drift check --parallel
# Specify custom worker count
docs-drift check --parallel --workers 8
# Optimal CI performance: combine flags
docs-drift check --changed-only --parallel --workers 4
Benchmark results (50 code blocks, 4-core system):
- Sequential: 12.5 seconds
- Parallel (4 workers): 3.8 seconds
- Speedup: 3.3x
4. Skipping Code Blocks
Some examples are intentionally broken or incomplete. Skip validation with a directive:
```javascript docs-drift:skip
// This example is intentionally incomplete
const example =
```
5. Verbose Mode
Debug validation issues with detailed output:
docs-drift check --verbose
Shows:
- Files being scanned
- Code blocks found with line numbers
- Execution results for each block
- Timing information
GitHub Action Integration
Basic Setup
Add to .github/workflows/docs-check.yml:
name: Documentation Validation
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  docs-drift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Required for --changed-only
      - uses: georg-nikola/docs-drift@<version>
        with:
          config: docs-drift.yml
Optimized CI for Pull Requests
Validate only changed documentation for faster PR checks:
name: PR Documentation Check
on:
  pull_request:
    branches: [main]
jobs:
  docs-drift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: georg-nikola/docs-drift@<version>
        with:
          config: docs-drift.yml
          changed-only: true
          base: origin/main
          parallel: true
          workers: 4
Action Inputs Reference
| Input | Description | Default |
|---|---|---|
| `config` | Path to config file | `docs-drift.yml` |
| `version` | docs-drift version | `latest` |
| `verbose` | Enable verbose output | `false` |
| `changed-only` | Only check changed files | `false` |
| `base` | Base branch for comparison | Auto-detect |
| `parallel` | Enable parallel execution | `false` |
| `workers` | Concurrent worker count | `4` |
Action Outputs
Use outputs for custom workflows:
- uses: georg-nikola/docs-drift@<version>
  id: docs-check
  with:
    config: docs-drift.yml
- name: Comment on PR
  # always() lets this step run even when the check step above has failed
  if: always() && steps.docs-check.outputs.drift-detected == 'true'
  run: |
    echo "Documentation drift detected!"
    echo "Failed blocks: ${{ steps.docs-check.outputs.failed-blocks }}"
    echo "Total checked: ${{ steps.docs-check.outputs.total-blocks }}"
Real-World Use Cases
### 1. API Documentation Validation
Ensure all API examples in your docs actually work:
~~~markdown
## Authentication
```javascript
const client = new APIClient({
  apiKey: process.env.API_KEY
});
const response = await client.authenticate();
console.log(response.token); // Works or CI fails
```
~~~
### 2. Tutorial Step Validation
Verify multi-step tutorials stay in sync with your code:
~~~markdown
## Getting Started Tutorial
Step 1: Install the package
```bash
npm install my-package
```
Step 2: Create a basic app
```javascript
const MyPackage = require('my-package');
const app = new MyPackage();
app.start(); // Validated automatically
```
~~~
### 3. Changelog Example Verification
Keep changelog examples honest by skipping the removed API and validating its replacement:
~~~markdown
## v2.0.0
Breaking change: Renamed `oldMethod()` to `newMethod()`
```javascript docs-drift:skip
// Old (this block is marked docs-drift:skip)
api.oldMethod();
```
```javascript
// New (this gets validated)
api.newMethod();
```
~~~
### 4. Multi-Language Documentation
Validate examples across the supported languages:
```yaml
# docs-drift.yml
checks:
  code_blocks:
    languages:
      - javascript
      - python
```
Then in your docs:
~~~markdown
## JavaScript Example
```javascript
console.log('Hello from Node.js');
```
## Python Example
```python
print('Hello from Python')
```
~~~
## Configuration Deep Dive
### Configuration File Structure
```yaml
version: 1
docs:
  # Glob patterns for markdown files
  paths:
    - README.md
    - docs/**/*.md
    - guides/*.mdx
    - "!docs/drafts/**" # Exclude patterns with !
checks:
  code_blocks:
    enabled: true
    # Languages to validate
    languages:
      - javascript
      - python
    # Execution timeout per code block
    timeout: 30s
```
Configuration Options
| Option | Type | Description | Required |
|---|---|---|---|
| `version` | int | Config version (must be 1) | Yes |
| `docs.paths` | list | Glob patterns for markdown files | Yes |
| `checks.code_blocks.enabled` | bool | Enable code validation | No (default: true) |
| `checks.code_blocks.languages` | list | Languages to validate | Yes |
| `checks.code_blocks.timeout` | duration | Execution timeout | No (default: 30s) |
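The rules in this table can be expressed as a small validation pass. The sketch below is a Python illustration, not the tool's own code: it assumes the YAML has already been parsed into a plain dict, the function names `parse_timeout` and `validate_config` are hypothetical, and only simple durations such as `30s` or `2m` are handled.

```python
import re

DEFAULT_TIMEOUT = "30s"
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0}


def parse_timeout(text):
    """Parse a simple duration such as '30s' or '2m' into seconds.

    Assumption: a single number followed by one unit; compound
    durations like '1m30s' are out of scope for this sketch.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m)", text.strip())
    if match is None:
        raise ValueError(f"invalid duration: {text!r}")
    return float(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def validate_config(cfg):
    """Return a list of problems found in an already-parsed config mapping."""
    errors = []
    if cfg.get("version") != 1:
        errors.append("version must be 1")
    if not cfg.get("docs", {}).get("paths"):
        errors.append("docs.paths is required")
    blocks = cfg.get("checks", {}).get("code_blocks", {})
    if not blocks.get("languages"):
        errors.append("checks.code_blocks.languages is required")
    try:
        parse_timeout(blocks.get("timeout", DEFAULT_TIMEOUT))
    except ValueError as exc:
        errors.append(str(exc))
    return errors
```

An empty list means the config passes. Collecting every problem before reporting, instead of stopping at the first, is a design choice that spares users repeated fix-and-rerun cycles.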
Language Aliases
docs-drift supports common language aliases:
- JavaScript: `javascript`, `js`
- Python: `python`, `py`
All are case-insensitive.
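Case-insensitive alias handling amounts to a lookup on the lowercased tag. A minimal Python sketch, assuming the canonical names are the long forms (the function name `normalize_language` is hypothetical):

```python
# Alias table taken from the list above; treating the long form as
# canonical is an assumption made for this sketch.
ALIASES = {
    "javascript": "javascript",
    "js": "javascript",
    "python": "python",
    "py": "python",
}


def normalize_language(tag):
    """Map a fence language tag to its canonical name, or None if unsupported."""
    return ALIASES.get(tag.strip().lower())
```

Returning `None` for unknown tags lets the caller skip unsupported languages without raising an error.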
Glob Pattern Examples
docs:
  paths:
    # Exact file
    - README.md
    # All markdown in docs directory (recursive)
    - docs/**/*.md
    # Specific subdirectory
    - guides/tutorials/*.mdx
    # Multiple extensions
    - "**/*.{md,mdx}"
    # Exclude drafts
    - "!docs/drafts/**"
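To make the pattern semantics concrete, here is a small Python sketch that translates these globs into regular expressions and applies include and exclude rules. It is an illustration only. The names `glob_to_regex` and `select_docs` are hypothetical, and it assumes that a `!` pattern always wins regardless of where it appears in the list, which may differ from the tool's actual behavior.

```python
import re


def glob_to_regex(pattern):
    """Translate a glob using `**`, `*`, `?` and `{a,b}` into a compiled regex."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:.*/)?")      # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")            # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")         # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append(r"[^/]")
            i += 1
        elif pattern[i] == "{":
            end = pattern.index("}", i)
            options = pattern[i + 1:end].split(",")
            out.append("(?:" + "|".join(re.escape(o) for o in options) + ")")
            i = end + 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def select_docs(paths, patterns):
    """Keep paths matching any include pattern and no `!`-prefixed exclude."""
    includes = [glob_to_regex(p) for p in patterns if not p.startswith("!")]
    excludes = [glob_to_regex(p[1:]) for p in patterns if p.startswith("!")]
    return [
        path for path in paths
        if any(rx.fullmatch(path) for rx in includes)
        and not any(rx.fullmatch(path) for rx in excludes)
    ]
```

With the patterns `README.md`, `docs/**/*.md` and `!docs/drafts/**`, a file such as `docs/drafts/wip.md` matches an include but is dropped by the exclude, while `docs/intro.md` is kept because `**/` may match zero directories.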
Exit Codes Reference
docs-drift uses semantic exit codes for CI integration:
| Code | Meaning | CI Status |
|---|---|---|
| 0 | No drift detected (all blocks passed) | ✅ Success |
| 1 | Drift detected (one or more blocks failed) | ❌ Failure |
| 2 | Runtime or configuration error | ❌ Failure |
Exit Code Examples
# Success case
docs-drift check
echo $? # 0
# Drift detected
docs-drift check
echo $? # 1
# Configuration error
docs-drift check --config missing.yml
echo $? # 2
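The mapping from run results to exit status can be sketched in a few lines of Python. This is an illustration of the exit-code table, not the tool's code: the `BlockResult` type and the `exit_code` and `summarize` helpers are hypothetical names, and the summary format simply mirrors the sample output shown earlier.

```python
from dataclasses import dataclass

EXIT_OK, EXIT_DRIFT, EXIT_ERROR = 0, 1, 2  # values from the exit-code table


@dataclass
class BlockResult:
    file: str
    line: int
    language: str
    error: str = ""  # empty string means the block passed


def exit_code(results, config_error=None):
    """Pick the process exit status for a finished run."""
    if config_error is not None:
        return EXIT_ERROR          # the run could not be carried out at all
    if any(r.error for r in results):
        return EXIT_DRIFT          # at least one block failed
    return EXIT_OK


def summarize(results):
    """Build the one-line summary of failed and passed blocks."""
    failed = sum(1 for r in results if r.error)
    return f"Summary: {failed} failed, {len(results) - failed} passed"
```

Keeping configuration errors (2) distinct from drift (1) lets a CI script tell "the docs are wrong" apart from "the check itself is misconfigured".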
Implementation Details
Security and Sandboxing
docs-drift executes code in isolated processes with security restrictions:
JavaScript Execution (Node.js):
# Actual execution
node --no-warnings /tmp/docs-drift-12345.js
Python Execution (Python 3):
# Isolated mode (-I) ignores PYTHON* environment variables and the user site-packages directory
python3 -I /tmp/docs-drift-67890.py
Security measures:
- Restricted environment variables
- Execution timeouts (configurable, default 30s)
- Temporary file cleanup after execution
- No network access enforced via environment settings
- Separate process per code block (no shared state)
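Several of these measures (a separate process, a timeout, a restricted environment, and temporary file cleanup) can be combined in one small function. The Python sketch below is a hedged illustration for POSIX systems, not the tool's Go runner. The name `run_python_block` is hypothetical, and passing only `PATH` to the child is an assumption about what "restricted environment variables" might mean.

```python
import os
import subprocess
import sys
import tempfile


def run_python_block(code, timeout=30.0):
    """Run `code` in a fresh interpreter and return (passed, message)."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".py", prefix="docs-drift-", delete=False
    ) as handle:
        handle.write(code)
        path = handle.name
    try:
        proc = subprocess.run(
            [sys.executable, "-I", path],               # -I: isolated mode
            env={"PATH": os.environ.get("PATH", "")},   # assumption: PATH only
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout}s"
    finally:
        os.unlink(path)                                 # temporary file cleanup
    if proc.returncode != 0:
        lines = proc.stderr.strip().splitlines()
        # The last stderr line of a traceback names the exception.
        return False, lines[-1] if lines else f"exit status {proc.returncode}"
    return True, ""
```

Note that process isolation of this kind limits shared state and runaway execution, but it is not an operating-system sandbox: the child can still read files and open sockets unless something else prevents it.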
Markdown Parsing
docs-drift uses a CommonMark-compliant parser:
Supported fence formats:
~~~~markdown
```javascript
// Three backticks
```
```js
// Language alias
```
~~~python
# Three tildes (CommonMark alternative)
~~~
```javascript docs-drift:skip
// Skip directive
```
~~~~
**Parsing behavior**:
- Tracks exact line numbers for error reporting
- Ignores fences without language tags
- Handles nested blocks in list items
- Preserves whitespace in code content
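The parsing behavior listed above can be sketched as a single pass over the lines of a file. The Python below is a simplified illustration rather than the tool's parser: `CodeBlock` and `extract_blocks` are hypothetical names, indentation inside list items is tolerated on the fence line but not stripped from the code, and other CommonMark corner cases are ignored.

```python
import re
from dataclasses import dataclass

# A fence line: optional indentation, 3+ backticks or tildes, then an info string.
FENCE = re.compile(r"^\s*(?P<fence>`{3,}|~{3,})(?P<info>.*)$")


@dataclass
class CodeBlock:
    language: str
    line: int    # 1-based line number of the opening fence
    code: str
    skip: bool   # True when the info string carries docs-drift:skip


def extract_blocks(markdown):
    """Return the language-tagged fenced code blocks found in `markdown`."""
    blocks = []
    current = None  # (fence, info, start line, collected lines) while inside a block
    for number, text in enumerate(markdown.splitlines(), start=1):
        match = FENCE.match(text)
        if current is None:
            if match:
                current = (match["fence"], match["info"].strip(), number, [])
            continue
        fence, info, start, lines = current
        closes = (
            match is not None
            and match["fence"][0] == fence[0]          # same fence character
            and len(match["fence"]) >= len(fence)      # at least as long
            and not match["info"].strip()              # closing fences carry no info
        )
        if not closes:
            lines.append(text)                         # whitespace kept as-is
            continue
        words = info.split()
        if words:                                      # untagged fences are ignored
            blocks.append(CodeBlock(
                language=words[0].lower(),
                line=start,
                code="\n".join(lines),
                skip="docs-drift:skip" in words[1:],
            ))
        current = None
    return blocks
```

Recording the line number of the opening fence is what makes a `file:line:error` style report possible later in the pipeline.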
### Parallel Execution Architecture
Worker pool pattern for concurrent validation:
```
             ┌─────────────┐
             │ Main Thread │
             └──────┬──────┘
                    │
                    ▼
           ┌──────────────────┐
           │ Code Block       │
           │ Work Queue       │
           └────────┬─────────┘
                    │
    ┌──────────┬────┴─────┬──────────┐
    ▼          ▼          ▼          ▼
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│Worker 1│ │Worker 2│ │Worker 3│ │Worker 4│
└───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘
    │          │          │          │
    └──────────┴────┬─────┴──────────┘
                    │
                    ▼
             ┌──────────────┐
             │Result Channel│
             └──────────────┘
```
**Worker configuration**:
- Default: 4 workers (optimal for most systems)
- Configurable via `--workers` flag
- Minimum: 1 worker (sequential execution)
- Recommendation: Match CPU core count
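The worker pool in the diagram maps onto a work queue, a fixed set of workers, and a result queue. The Python sketch below mirrors that shape with threads. It is an analogy to the Go design, not a port of it, and `check_blocks` and its `run_block` callback are hypothetical names.

```python
import queue
import threading


def check_blocks(blocks, run_block, workers=4):
    """Validate blocks with a fixed pool of worker threads.

    Results come back in the same order as `blocks`.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    work = queue.Queue()
    results = queue.Queue()
    for index, block in enumerate(blocks):
        work.put((index, block))

    def worker():
        while True:
            try:
                index, block = work.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            results.put((index, run_block(block)))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    ordered = [None] * len(blocks)
    while not results.empty():
        index, outcome = results.get()
        ordered[index] = outcome
    return ordered
```

Tagging each job with its index keeps the report order stable even though blocks finish in arbitrary order. In this sketch a `run_block` that raises leaves `None` in its slot, so a real runner would want to catch exceptions and turn them into failure results.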
### Performance Characteristics
**Sequential Mode** (default):
- Processes code blocks one at a time
- Predictable resource usage
- Best for small repos (<10 code blocks)
**Parallel Mode** (`--parallel`):
- Concurrent execution with worker pool
- 2-4x faster on multi-core systems
- Best for large repos (10+ code blocks)
**Changed-Only Mode** (`--changed-only`):
- Uses Git to detect modified files
- Roughly 4-14x faster in this article's benchmarks; the gain grows as the share of changed files shrinks
- Perfect for PR validation
**Optimal CI Configuration**:
```bash
# Maximum performance for pull request checks
docs-drift check --changed-only --parallel --workers 4
```
## Roadmap and Future Features
### Current Status: v0.2 (Production Ready)
Shipped features:
- ✅ CLI with init, check, version commands
- ✅ JavaScript and Python validation
- ✅ GitHub Action integration
- ✅ Git integration (--changed-only mode)
- ✅ Parallel execution (--parallel flag)
- ✅ Comprehensive test suite
- ✅ Benchmark suite
### Planned: v0.3
**Additional Language Runners**:
- Shell/Bash validation
- Ruby script execution
- PHP code validation
- Go code validation (via `go run`)
**Result Caching**:
- Cache validation results based on file hashes
- Skip re-validation of unchanged blocks
- Persistent cache across runs for faster local development
**Enhanced Error Reporting**:
- HTML report generation for CI artifacts
- JSON output format for tool integration
- Detailed statistics dashboard
### Future Ideas: v0.4+
- Watch mode for local development (`--watch` flag)
- Custom runner commands in config
- IDE integrations (VS Code extension)
- Docker support for complex environments
- Smart retry logic for flaky tests
- Code coverage tracking for documentation
## Performance Benchmarks
### Benchmark Methodology
Benchmarks run on:
- **System**: MacBook Pro M1, 8-core CPU, 16GB RAM
- **Test Data**: 50 code blocks across 10 markdown files
- **Iterations**: 10 runs averaged
### Sequential vs Parallel Performance
| Mode | Workers | Time | Speedup |
|------|---------|------|---------|
| Sequential | 1 | 12.5s | 1.0x |
| Parallel | 2 | 7.2s | 1.7x |
| Parallel | 4 | 3.8s | 3.3x |
| Parallel | 8 | 3.1s | 4.0x |
**Key findings**:
- Near-linear scaling up to 4 workers (3.3x with 4 workers)
- Diminishing returns beyond 4 workers: doubling to 8 only raises the speedup from 3.3x to 4.0x
- Optimal for most systems: 4 workers
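The speedup column follows directly from the timings in the table, and dividing by the worker count gives the parallel efficiency. A short Python check using only the numbers reported above:

```python
SEQUENTIAL_SECONDS = 12.5
PARALLEL_SECONDS = {2: 7.2, 4: 3.8, 8: 3.1}  # workers -> seconds, from the table


def speedup(workers):
    """Sequential time divided by parallel time."""
    return SEQUENTIAL_SECONDS / PARALLEL_SECONDS[workers]


def efficiency(workers):
    """Fraction of ideal linear scaling actually achieved."""
    return speedup(workers) / workers


for n in sorted(PARALLEL_SECONDS):
    print(f"{n} workers: {speedup(n):.1f}x speedup, {efficiency(n):.0%} efficiency")
# 2 workers: 1.7x speedup, 87% efficiency
# 4 workers: 3.3x speedup, 82% efficiency
# 8 workers: 4.0x speedup, 50% efficiency
```

Efficiency stays above 80% through 4 workers and drops to about 50% at 8, which is the quantitative form of the diminishing-returns observation.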
### Changed-Only Mode Performance
| Scenario | Files Changed | Sequential Time | Changed-Only Time | Speedup |
|----------|---------------|-----------------|-------------------|---------|
| Large repo | 2/50 files | 30s | 2.1s | 14.3x |
| Medium repo | 5/20 files | 15s | 3.8s | 3.9x |
| Small repo | 1/5 files | 5s | 1.2s | 4.2x |
**Key findings**:
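- Speedup tracks the share of files changed more than raw repo size: 14.3x when 2 of 50 files changed, but about 4x when 5 of 20 or 1 of 5 changed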
- Most effective for large documentation repositories
- Perfect for PR validation workflows
### Running Benchmarks Yourself
```bash
cd docs-drift
go test -bench=. -benchmem ./...
```
## Best Practices
### 1. Start Small
Begin with critical documentation:
```yaml
# Initial config
docs:
  paths:
    - README.md
```
Expand gradually:
```yaml
# After success
docs:
  paths:
    - README.md
    - docs/getting-started.md
    - docs/api/**/*.md
```
### 2. Use Changed-Only in CI
For pull requests, validate only what changed:
```yaml
# .github/workflows/pr-check.yml
- uses: georg-nikola/docs-drift@<version>
  with:
    changed-only: true
    parallel: true
```
### 3. Skip Intentionally Broken Examples
Mark examples that shouldn't be validated:
~~~markdown
```javascript docs-drift:skip
// This is a broken example showing what NOT to do
const broken = undefined.property;
```
~~~
4. Set Appropriate Timeouts
Adjust based on your code complexity:
checks:
  code_blocks:
    timeout: 60s # Increase for complex examples
5. Run Locally Before Pushing
Catch issues early:
# Before committing changes to docs
docs-drift check --verbose
6. Combine Flags for Optimal Performance
# Development: full check with verbose output
docs-drift check --verbose
# CI: fast incremental check
docs-drift check --changed-only --parallel --workers 4
Troubleshooting
Common Issues
Issue: "Config file not found"
# Solution: Initialize config
docs-drift init
Issue: "No code blocks found"
# Solution: Check your glob patterns
docs-drift check --verbose
# Look for "Files matched: 0"
Issue: "Timeout errors"
# Solution: Increase timeout in config
checks:
  code_blocks:
    timeout: 60s
Issue: "Changed-only mode not working"
# Solution: Ensure you're in a git repository
git status
# And that full history is available (a shallow clone may lack the base commit)
git fetch --unshallow
Debugging Tips
Enable verbose mode:
docs-drift check --verbose
Test specific files by pointing --config at a config that lists only those files:
docs-drift check --config custom-config.yml --verbose
Check glob pattern matching:
# In verbose mode, you'll see which files matched
docs-drift check --verbose | grep "Processing file"
Validate configuration:
# docs-drift validates config on startup
docs-drift check
# Watch for "Config error" messages
Project Status and Adoption
GitHub Repository
- Repository: github.com/georg-nikola/docs-drift
- License: MIT (fully open-source)
- Current Version: v0.2
- Status: Production ready, actively maintained
Success Metrics
The project tracks adoption metrics to evaluate community interest:
30-day target:
- 50 GitHub stars
- 10 repositories using docs-drift
60-day target:
- 150 GitHub stars
- 25 repositories using docs-drift
Contributing
Contributions welcome! Check out:
- Issues: github.com/georg-nikola/docs-drift/issues
- Contributing Guide: CONTRIBUTING.md
Conclusion
Documentation drift is a real problem that affects user trust and adoption. docs-drift provides a simple, fast, and reliable solution by treating documentation as executable tests.
Key takeaways:
- Automated validation: Code examples in docs are tested automatically
- Fast CI integration: Changed-only and parallel modes optimize performance
- Simple setup: One config file, one command to run
- Open-source: No SaaS, no telemetry, fully transparent
- Production ready: v0.2 with comprehensive features and testing
Get started today:
# Install
go install github.com/georg-nikola/docs-drift/cmd/docs-drift@latest
# Initialize config
docs-drift init
# Validate your docs
docs-drift check
Your documentation will thank you. Your users will thank you. Your future self will thank you.
Links:
- GitHub: georg-nikola/docs-drift
- GitHub Action: georg-nikola/docs-drift@<version>
- Issues: Report bugs or request features
Related Projects:
- mdbook-keeper - Markdown testing for Rust
- embedmd - Embed code snippets in markdown
- markdown-code-runner - Run code blocks from markdown