How to Fix Soft 404s That Block Indexing: Step-by-Step Guide

A soft 404 is a page that looks broken to Google but returns a 200 status code. It wastes crawl budget, blocks indexing, and confuses users. This guide walks you through detection using server logs, content quality triage, and the redirect versus 200 decision.

What Exactly Is a Soft 404 and Why Does It Block Indexing?

A soft 404 is a page that returns a 200 HTTP status code but delivers thin, empty, or error-like content. Google receives a 'successful' response, then determines that the page has no real value. It classifies the URL as a soft 404 and keeps it out of the index; URLs that were already indexed are eventually dropped. This is worse than a hard 404 because the 200 status invites Google to keep recrawling the URL, so you spend crawl budget and get nothing indexed in return. A common situation we see is an e-commerce site with 10,000 discontinued product pages: the CMS returns a 200 with a generic 'This product is no longer available' message. Google can treat all 10,000 as soft 404s, and in that scenario as much as 80% of the crawl budget goes to dead URLs. The fix requires analyzing server logs, assessing content quality, and deciding whether to redirect, return 410, or rebuild the page.

The Core Diagnostic: Server Logs vs. Google Search Console

Google Search Console will flag soft 404s, but it only shows a sample. For a complete picture, you must parse server access logs. Filter for URLs that returned 200 with a response body under 150 bytes, or pages whose body is a single line of text like 'Not found.' Use a log analyzer (e.g., Screaming Frog Log File Analyzer) to extract these. Cross-reference the results with an index status checker to confirm which URLs are actually indexed. A real edge case: a client had a custom 404 handler that returned 200 with a 1KB page containing a redirect meta tag. Google ignored the meta tag and classified the page as a soft 404. Logs showed 200, but the page was never indexed, and a size filter alone would have missed it. You need to look at the actual response body, not just the status code and byte count.
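The log filter described above can be sketched in Python. This is a minimal sketch, assuming the access log uses the common Combined Log Format; the function name, the regular expression, and the sample lines are illustrative, not taken from any client setup.

```python
import re

# Assumed layout (Combined Log Format):
# host ident user [time] "METHOD path PROTOCOL" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def small_200_paths(lines, max_bytes=150):
    """Return sorted unique paths that answered 200 with a body under max_bytes."""
    hits = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or match.group("status") != "200":
            continue
        size = match.group("size")
        if size == "-":  # body size was not logged, so it cannot be judged
            continue
        if int(size) < max_bytes:
            hits.add(match.group("path"))
    return sorted(hits)

# Hypothetical sample lines for illustration only.
sample = [
    '203.0.113.5 - - [10/Mar/2024:10:00:00 +0000] "GET /product/123 HTTP/1.1" 200 98 "-" "Googlebot/2.1"',
    '203.0.113.5 - - [10/Mar/2024:10:00:01 +0000] "GET /blog/guide HTTP/1.1" 200 48213 "-" "Googlebot/2.1"',
    '203.0.113.5 - - [10/Mar/2024:10:00:02 +0000] "GET /old-page HTTP/1.1" 404 120 "-" "Googlebot/2.1"',
]
print(small_200_paths(sample))
```

A size threshold is only a first pass: a page like the 1KB meta-redirect example would slip through, so the flagged list is a starting point for inspecting response bodies, not a final verdict.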

Soft 404 Fix Decision Matrix: Redirect, 410, or Rebuild

| Page Type | Content Status | Recommended Action | Implementation Detail | Risk / Failure Mode |
| --- | --- | --- | --- | --- |
| Product page (discontinued) | Zero inventory, no substitute | 410 Gone | Return 410 and remove the URL from the sitemap; let old inbound links resolve to the 410 | Redirecting to the homepage instead is itself treated by Google as a soft 404 |
| Category page (empty results) | No products match filters | 200 with helpful content | Show 'No results' but add curated alternatives or search suggestions | Thin content (under 200 words) still triggers soft 404 |
| Article page (deleted) | Content removed, no redirect | 301 redirect to related article or topic hub | Map old URL to semantically similar page; avoid redirecting to homepage | Redirecting everything to the homepage dilutes authority and can itself be treated as a soft 404 |
| Thin affiliate page | Low word count, no unique value | 200 with enriched content or noindex | Add 500+ words of original analysis, comparison table, user reviews | Publishing thin pages with noindex still wastes crawl budget |
| Event page (past date) | Event ended, no content | 301 redirect to archive or future events | Use dynamic redirect based on event date; update sitemap | Hardcoded redirects break if event is rescheduled |
| Dynamic filter URL (no results) | Filter combination yields zero products | 410 or redirect to parent category | Return 410 for combinations with no results, or block the filter parameters in robots.txt (not both: a blocked URL is never fetched, so Google never sees the 410) | Google may crawl near-infinite filter combinations; limit how many filter URLs are linked and crawlable |
Soft 404 Resolution Workflow

1. Gather Data

Export GSC soft 404 report. Parse server logs for 200 responses with body < 150 bytes.

2. Validate Each URL

Use an index checker to see the current status. Check whether the URL has backlinks or traffic.

3. Content Quality Audit

Review the page content. Is it helpful? Does it have at least 200 words of unique text? Are the meta title and description unique?

4. Decide: Keep, Redirect, or 410

If content is salvageable, enrich and keep 200. If page is useless, 410. If related content exists, 301.

5. Implement and Monitor

Update server config or CMS. Resubmit sitemap. Monitor GSC for 3 weeks for reindexing.
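Step 4 of the workflow can be expressed as a small decision function. This is a sketch under stated assumptions: the `UrlAudit` fields and the `decide` name are hypothetical, and the order of precedence (keep if salvageable, otherwise redirect if a related page exists, otherwise 410) is one reasonable reading of the step, not a rule the workflow spells out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class UrlAudit:
    url: str
    salvageable: bool            # content is worth enriching and keeping
    related_url: Optional[str]   # closest relevant live page, or None

def decide(audit: UrlAudit) -> Tuple[str, Optional[str]]:
    """Return (action, redirect_target) for one audited URL."""
    if audit.salvageable:
        return ("keep-200", None)          # enrich the page and keep serving 200
    if audit.related_url:
        return ("301", audit.related_url)  # pass visitors and links to related content
    return ("410", None)                   # nothing to keep, nothing to point to

# Hypothetical audit rows for illustration only.
audits = [
    UrlAudit("/category/empty", salvageable=True, related_url=None),
    UrlAudit("/blog/old-post", salvageable=False, related_url="/blog/topic-hub"),
    UrlAudit("/product/gone", salvageable=False, related_url=None),
]
for audit in audits:
    print(audit.url, decide(audit))
```

Keeping the decision in one function makes it easy to run over the full export from step 1 and to review the resulting action list before anything is changed on the server.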

Worked Example: Fixing 500 Soft 404s on a SaaS Blog

A SaaS client had 500 blog posts that were unpublished but still returning 200 with a single sentence: 'This content has been removed.' Google Search Console showed 498 soft 404 errors. Crawl budget was 15,000 pages/day; 40% went to these dead URLs.

Step 1: Exported all soft 404 URLs from GSC and cross-referenced with server logs. Filtered for URLs with response body under 200 bytes.

Step 2: Checked each URL against the index coverage report to confirm which were already deindexed.

Step 3: 120 posts had good backlinks (total 45 referring domains). Those got 301 redirects to the nearest relevant article. 380 posts had zero backlinks and zero traffic. Those got a 410 status code.

Step 4: Updated the CMS to return 410 for the 380 URLs. Implemented 301 redirects for the 120 URLs using a redirect map.

Result: Within 4 weeks, soft 404s dropped to 2. Crawl budget waste reduced from 6,000 to 150 URLs/day. Indexing of new content improved by 30% because Google now crawls fresh URLs.
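The figures in this example can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative.

```python
daily_crawl = 15_000                      # pages crawled per day
wasted_before = daily_crawl * 40 // 100   # 40% of the budget went to dead URLs
wasted_after = 150                        # dead URLs crawled per day after the fix

redirected, gone = 120, 380               # posts given a 301 vs. a 410
total_fixed = redirected + gone

share_after = wasted_after / daily_crawl          # fraction of budget still wasted
reduction = 1 - wasted_after / wasted_before      # relative drop in wasted crawls

print(wasted_before)            # 6000
print(total_fixed)              # 500
print(f"{share_after:.1%}")     # 1.0%
print(f"{reduction:.1%}")       # 97.5%
```

In other words, the share of daily crawl activity spent on dead URLs fell from 40% to 1%, a 97.5% reduction in wasted requests.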

Soft 404 Fix: Quick Diagnostic Checklist

1. Export the soft 404 report from Google Search Console (Coverage > Soft 404; the report now appears as Indexing > Pages)

2. Parse server access logs for 200 responses with body size under 150 bytes

3. Check each URL with a live index checker to see if it is still indexed

4. Review page content: is there any unique, helpful text beyond 200 words?

5. Check for duplicate meta titles, missing H1, or generic 'page not found' text

6. Identify URLs with external backlinks: these are candidates for 301 redirects

7. For URLs without backlinks or traffic: implement 410 status code

8. Update sitemap to exclude removed URLs; resubmit to GSC
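The sitemap cleanup in the last checklist item can be sketched with the Python standard library. This is a minimal sketch, assuming a standard single-file XML sitemap; the function name and the example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def drop_removed_urls(sitemap_xml, removed):
    """Return sitemap XML with every <url> whose <loc> is in `removed` deleted."""
    ET.register_namespace("", SITEMAP_NS)  # serialize without an ns0: prefix
    root = ET.fromstring(sitemap_xml)
    for url_el in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url_el.find(f"{{{SITEMAP_NS}}}loc")
        if loc is not None and (loc.text or "").strip() in removed:
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")

# Hypothetical sitemap and removal list for illustration only.
sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/blog/keep</loc></url>"
    "<url><loc>https://example.com/blog/removed-post</loc></url>"
    "</urlset>"
)
cleaned = drop_removed_urls(sitemap, {"https://example.com/blog/removed-post"})
print(cleaned)
```

The removal set would normally be the list of URLs that now return 410 or 301, so the sitemap only advertises pages that answer with a real 200.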

Edge Cases That Break Standard Fixes

Not all soft 404s are obvious. Some real operational failures we have seen:

1. JavaScript-rendered empty states: A React app returned 200 with a blank div. Google rendered it and found zero text. Soft 404. Fix: ensure server-side rendering or dynamic rendering returns meaningful content for empty states.

2. Pagination with no results: A site had pagination URLs like /page/5 that Google discovered via internal links, but the category had only 4 pages. Page 5 returned 200 with 'No more products.' This created hundreds of soft 404s. Fix: return 404 for pagination beyond the last page.

3. Mobile vs. desktop discrepancies: A mobile page returned 200 with content, but the desktop version returned 200 with a redirect meta tag. Googlebot desktop saw a soft 404. Fix: consistent response across user agents.
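The pagination fix in item 2 comes down to comparing the requested page number with the last valid page. The sketch below is illustrative: the function name and parameters are assumptions, and it assumes page 1 of an empty category should still answer 200 (ideally with helpful content) rather than 404.

```python
import math

def pagination_status(page, total_items, per_page):
    """Return the HTTP status a paginated listing URL should answer with."""
    if page < 1 or per_page < 1:
        return 404
    # An empty category still has a first page under the assumption above.
    last_page = max(1, math.ceil(total_items / per_page))
    return 200 if page <= last_page else 404

# Hypothetical category: 96 products, 24 per page, so 4 valid pages.
print(pagination_status(4, total_items=96, per_page=24))  # 200
print(pagination_status(5, total_items=96, per_page=24))  # 404
```

The same check belongs wherever the page number is read from the URL, so that stale internal links to /page/5 stop producing 200 responses once the category shrinks.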

For large sites with crawl budget issues, use this crawl budget waste guide to prioritize which soft 404s to fix first.

FAQ

How do I fix soft 404 errors that block indexing for an e-commerce site with thousands of discontinued products?

Start with server logs. Filter for 200 responses with body under 150 bytes. For products with backlinks, set 301 redirects to relevant alternatives. For products without backlinks or traffic, return 410 status code. Remove these URLs from your sitemap. Use GSC to monitor reindexing. Expect 3-6 weeks for full recovery.

Can I use Google Search Console API to bulk check soft 404 status for agency clients?

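Not in bulk. The Search Console API does not expose the index coverage report, so there is no 'soft404' issue filter to query; the 25,000-row limit applies to Search Analytics queries, not to indexing data. What the API does offer is the URL Inspection method, which returns the index status of one URL per request and is subject to a daily per-property quota. For agencies, export the soft 404 list from the GSC interface as CSV, cross-reference it with server logs, and automate the rest with a script that flags URLs with a low response body size and spot-checks priority URLs through URL Inspection. Set up weekly checks to catch new soft 404s early.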
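One way to script per-URL checks is the URL Inspection method of the Search Console API, which inspects a single URL per request. The sketch below only builds the request body and interprets a response; authentication and the HTTP call are left out. The helper names and the sample response are hypothetical, and the field names (`inspectionUrl`, `siteUrl`, `indexStatusResult.pageFetchState`) should be verified against the current API reference before relying on them.

```python
import json

# Endpoint for the URL Inspection method (POST, OAuth-authenticated).
INSPECT_ENDPOINT = "https://searchconsole.googleapis.com/v1/urlInspection/index:inspect"

def build_inspect_body(site_url, page_url):
    """Build the JSON body for one inspection request."""
    return {"inspectionUrl": page_url, "siteUrl": site_url}

def is_soft_404(response):
    """Return True if an inspection response reports a soft 404 fetch state."""
    status = response.get("inspectionResult", {}).get("indexStatusResult", {})
    return status.get("pageFetchState") == "SOFT_404"

# Hypothetical response fragment for illustration only.
sample_response = {
    "inspectionResult": {
        "indexStatusResult": {
            "verdict": "NEUTRAL",
            "pageFetchState": "SOFT_404",
        }
    }
}

body = build_inspect_body("https://example.com/", "https://example.com/product/123")
print(json.dumps(body))
print(is_soft_404(sample_response))
```

Because each request covers one URL and quotas apply, this suits spot checks on priority URLs (for example, those with backlinks) rather than a full-site sweep.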

What is the difference between a soft 404 and a 404 status code in terms of crawl budget?

A hard 404 (status code 404 or 410) tells Google the page is gone, so it drops the URL and recrawls it far less often, conserving budget. A soft 404 returns 200, so Google continues crawling, wastes resources, and may discover more thin pages. Hard 404s are efficient; soft 404s are a crawl budget sink. Always prefer a real 410 over a soft 404.

How do I find soft 404 errors in server logs without using paid tools?

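Use awk to filter access logs. Assuming the Combined Log Format, where field 9 is the status code and field 10 is the response size, run: awk '$9 == 200 && $10 ~ /^[0-9]+$/ && $10 < 150 {print $7}' access.log | sort -u to list URLs that returned 200 with a body under 150 bytes. (Matching on the status field is safer than grep ' 200 ', which also matches lines where 200 appears as the byte count.) Then check those URLs manually. This is a rough filter that catches the most obvious cases; combine it with GSC data for higher accuracy. Free log parsers like GoAccess can visualize the data.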

Should I redirect soft 404 pages to the homepage or to a related page?

Redirect to a closely related page, not the homepage (e.g., discontinued product to a similar product, deleted article to a category hub). Homepage redirects dilute link equity, and Google often treats them as soft 404s anyway. If no related page exists, use 410. For pages with backlinks, 301 to the most relevant topic.

My site uses React and returns a blank page for empty search results. Is this a soft 404?

Yes. If Google renders your page and sees no visible text, it flags it as a soft 404. Fix by either server-side rendering a message like 'No results found' with suggestions, or implementing dynamic rendering to serve static content to Googlebot. Ensure the response body contains at least 200 characters of relevant text.
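A quick way to test the 200-character guideline on the HTML your server sends is to strip the markup and count what is left. This is a rough sketch using only the standard library; the class and function names are illustrative, and it measures the raw server response, not what Google sees after executing JavaScript.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect text nodes, ignoring script, style, and similar containers."""
    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)

def visible_text_length(html):
    """Return the number of characters of visible text, whitespace-collapsed."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return len(" ".join(" ".join(parser.chunks).split()))

def looks_empty(html, min_chars=200):
    return visible_text_length(html) < min_chars

# Hypothetical empty React shell for illustration only.
shell = '<html><body><div id="root"></div><script>var msg = "x";</script></body></html>'
print(visible_text_length(shell), looks_empty(shell))
```

If the server-delivered HTML for an empty search result fails this check, that is a hint the empty state depends entirely on client-side rendering and deserves a rendered-page test as well.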

How long does it take for Google to reindex a page after fixing a soft 404?

Typically 2-6 weeks. Google recrawls the URL, sees the new 200 content (or 410), and updates the index. You can speed this up by submitting the URL via the GSC URL Inspection tool and requesting indexing. For large batches, wait for the regular crawl cycle. Monitor the Page indexing report (formerly Coverage) for changes.

Can duplicate content cause soft 404 errors?

Indirectly. If Google finds two pages with nearly identical thin content, it may pick one as canonical and treat the other as a soft 404, especially if the non-canonical version has no added value. Fix by consolidating duplicate pages via 301 redirects or adding unique content to each page. Use canonical tags correctly.

What tools can I use to automatically detect soft 404 pages across a large site?

Screaming Frog SEO Spider can crawl and flag pages with low word count (under 100 words) and 200 status. Combine with a log file analyzer for server-side detection. For ongoing monitoring, use a tool like Sitebulb or Lumar (formerly DeepCrawl). Check index status (https://checkurlindexstatus8.vercel.app) periodically to confirm fixes.

Is it better to use 410 Gone or 301 redirect for soft 404 pages with no backlinks?

Use 410 Gone. It tells Google the page is permanently removed and to stop crawling. 301 redirect is overkill for pages with no traffic or backlinks and can create redirect chains. 410 is cleaner and preserves crawl budget. Only use 301 if the page has valuable inbound links that you want to pass to another page.
