Writing Apache Regex Redirects for Bulk URL Changes

Problem Statement

Bulk URL migrations on Apache trigger redirect loops, TTFB degradation, and equity loss when regex patterns lack strict anchoring or are dropped into .htaccess instead of a virtual-host config. The symptoms are 500 Internal Server Error spikes, broken UTM parameters, and crawl-budget exhaustion from chained hops. The fix is anchored PCRE patterns, terminal flags that stop rule processing, loop-prevention conditions, and validation before production. This page sits under the Regex Redirect Rules section.

Anatomy of an Apache RewriteRule A labelled breakdown of an anchored RewriteRule showing the pattern, capture group, substitution backreference, and terminal flags. Anatomy of a RewriteRule ^old/(.*)$ /new/$1 [R=301,L] anchored pattern substitution + $1 terminal flags ^ and $ prevent partial matches; L stops processing to avoid loops $1 carries the captured tail from the pattern into the target
An anchored pattern feeds a captured tail into the substitution; the L flag halts processing so the request cannot loop.

When to Use This Approach

  • You have hundreds or thousands of path changes that follow predictable patterns rather than one-off pairs.
  • Source URLs share a common prefix or structure that a single capture group can transform.
  • You need query-string handling, conditional logic, or case-insensitive matching that mod_alias cannot express.
  • You control the virtual-host config and can avoid the per-request filesystem cost of .htaccess.
  • You are generating rules from a mapping file and want them anchored and loop-safe before deployment.

Step-by-Step Instructions

1. Pre-Process the Mapping CSV

Generate anchored rules from a two-column CSV so every pattern is escaped and bounded. Keep the mapping under version control alongside the rest of your CSV Mapping Workflows artefacts.

# Escape regex metacharacters in the source column before building rules
sed 's/[.[\*^$()+?{|]/\\&/g' source_paths.txt

# Generate anchored RewriteRule lines from a two-column CSV (source,destination)
sed -E 's|^([^,]+),(.+)$|RewriteRule ^\1$ \2 [R=301,L]|' mapping.csv \
  >> /etc/apache2/sites-available/migration.conf

2. Construct Anchored Rules With Loop Prevention

Anchor every pattern with ^ and $, append L to stop processing, and add a RewriteCond so a request already on the new path is skipped — the most common cause of 500 loops.

RewriteEngine On
RewriteBase /

# NC = case-insensitive, L = last, QSA = preserve query string
# Skip the rule if the request is already on the new path (loop guard)
RewriteCond %{REQUEST_URI} !^/new-pattern/
RewriteRule ^old-pattern/(.*)$ /new-pattern/$1 [R=301,L,NC,QSA]

3. Choose the Status Code and Flatten Hops

Use R=301 for permanent moves so PageRank transfers; reserve R=302 for staging or A/B windows. Map each source straight to its final destination so no rule points at another redirected source — flatten any survivors with Redirect Chain Elimination. When matching in a condition, remember the backreference is %1, not $1.

RewriteEngine On
# A capture from RewriteCond is referenced as %1 (RewriteRule captures use $1)
RewriteCond %{REQUEST_URI} ^/legacy/(.*)$
RewriteRule ^ /new/%1 [R=301,L]

4. Deploy in the vhost, Not .htaccess

Place rules in sites-available, not .htaccess, to avoid 15–30% TTFB degradation from per-request filesystem scans. Escape literal dots so a pattern does not match unintended TLDs, and validate before reload.

# Validate config, then smoke-test one path before going live
apachectl configtest \
  && curl -sI -o /dev/null -w '%{http_code}\n' https://target-domain.com/old-pattern/x

Worked Example

A publisher consolidates /old-pattern/2019/case-study into /new-pattern/2019/case-study across 3,200 archive URLs with one anchored rule. The vhost contains:

RewriteCond %{REQUEST_URI} !^/new-pattern/
RewriteRule ^old-pattern/(.*)$ /new-pattern/$1 [R=301,L,NC]

A request to a legacy archive path resolves in exactly one hop:

GET /old-pattern/2019/case-study HTTP/1.1
Host: target-domain.com

HTTP/1.1 301 Moved Permanently
Location: https://target-domain.com/new-pattern/2019/case-study

The RewriteCond %{REQUEST_URI} !^/new-pattern/ line is what makes this safe at scale: without it, a request that lands on /new-pattern/... would re-enter the rule and loop until Apache returns 500. With 3,200 sources collapsed into a single capture-group rule, evaluation stays cheap and there is one rule to audit instead of thousands.

Verification

Validate config syntax, confirm a single Location header, and watch live 301 volume. For staging dry-runs enable rewrite tracing (LogLevel alert rewrite:trace3 on Apache 2.4; the old RewriteLog directives are removed and will fail a 2.4 parse). To roll back without deleting rules, flip RewriteEngine On to Off and reload, or restore a timestamped backup of the vhost.

# Syntax check before any reload
apachectl configtest          # or: apache2ctl configtest

# Expect exactly one Location header (single hop)
curl -sI -L https://target-domain.com/old-pattern/2019/case-study \
  | grep -E 'HTTP/|Location:'

# Live 301 volume by path
tail -f /var/log/apache2/access.log \
  | grep '" 301 ' | awk '{print $7}' | sort | uniq -c | sort -nr

Validation checklist:

  • [L] to halt processing and prevent loops.
  • ^ and $ around (.*) to block partial matches.
  • vhost.conf, not .htaccess, to avoid TTFB degradation.
  • example\.com) so patterns do not match unintended TLDs.

FAQ

How do I preserve query strings with Apache regex redirects? Apache 2.4 appends the original query string automatically unless the substitution URL contains a literal ?. To be explicit, add [QSA]: RewriteRule ^/old/(.*)$ /new/$1 [R=301,L,QSA].

What is the performance impact of 10,000+ regex rules in one vhost? Rules evaluate sequentially per request — O(n). Convert linear evaluation into O(1) lookups with a RewriteMap backed by txt: (plain-text hash) or dbm: (binary hash), which sharply reduces CPU overhead and latency.

Why does configtest pass but the server returns 500 on redirect? apachectl configtest checks directive structure, not runtime logic. A 500 usually means an infinite loop from a missing RewriteCond exclusion, a conflicting mod_alias Redirect, or a backreference like $1 in a rule whose pattern has no capture group.

Related

← Back to Regex Redirect Rules