Crawling just means Google saw your URL. Indexing means it passed quality checks and is in the database. Most SEO tools conflate them. This article breaks down exactly where the gap lives and how to act on it.
Every SEO operator hits this wall: your log analyzer shows Googlebot hit 5,000 URLs, but Search Console reports only 2,100 indexed. The gap is not a bug. It is the difference between crawled and indexed. A crawl is a visit. An index is a stored, rated entry in Google's database. They are not the same step. They are not even the same process.
A common situation we see is an agency running a site migration and celebrating that all URLs were crawled within 48 hours. Then nothing ranks. They forgot to check the index coverage report. The URLs were discovered, but Google decided the content was thin, blocked by noindex, or trapped by a redirect chain. In practice, when you audit, always pull both statuses. Use a tool like the index status checker to see the live state for any URL. Then compare it against your crawl logs. The delta is your real problem surface.
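The comparison of crawl logs against index status can be sketched as a set difference. This is a minimal sketch, assuming a combined-format access log and a separately exported set of indexed paths; the sample log lines, the paths, and the `googlebot_paths` helper are all hypothetical. Matching on the user-agent string alone can be spoofed, so a production audit would also verify the requesting IP.

```python
import re

# Captures the request path and the user agent from a combined-format log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def googlebot_paths(log_lines):
    """Return the set of paths requested by a user agent containing 'Googlebot'."""
    paths = set()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and "Googlebot" in match.group("ua"):
            paths.add(match.group("path"))
    return paths

# Hypothetical sample data for illustration only.
sample_log = [
    '66.249.66.1 - - [10/Jan/2025:10:00:00 +0000] "GET /a HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/Jan/2025:10:00:05 +0000] "GET /b HTTP/1.1" 200 734 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [10/Jan/2025:10:00:09 +0000] "GET /c HTTP/1.1" 200 120 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64)"',
]
indexed = {"/a"}  # assumed export of indexed paths

crawled = googlebot_paths(sample_log)
crawled_not_indexed = sorted(crawled - indexed)  # the delta to investigate
print(crawled_not_indexed)
```

The resulting list is the problem surface: every path Googlebot fetched that never made it into the index.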
| Criterion | Crawled | Indexed | Verdict / Best Fit |
|---|---|---|---|
| Definition (what Google does) | HTTP request made to URL. Server returns a response. Bot reads response. | URL stored in Google's central index after quality and content checks. | Crawling is a prerequisite; indexing is the goal. |
| Signals checked (what matters) | robots.txt, redirects, server response time, 5xx errors. | Content depth, uniqueness, noindex tag, canonical, structured data quality. | Crawl checks access. Index checks value. |
| Tool to verify (where to look) | Log files, Googlebot crawl stats in Search Console, raw server logs. | URL Inspection Tool, index coverage report, site: search. | Logs for crawl. Report for index. |
| Common failure mode (what breaks) | Crawl budget waste: 3xx chains, slow pages, infinite parameter URLs. | Thin content, orphaned pages, duplicate content, soft 404s. | Fix crawl waste first, then content quality. |
| Business impact (ROI) | Higher crawl efficiency reduces server load and speeds up discovery. | Only indexed pages can rank. No index = no organic traffic. | Crawl is operational. Index is revenue. |
| Edge case example (real scenario) | A page returns 200 but has a noindex meta tag. Crawled but ignored for indexing. | A page has perfect content but is blocked by robots.txt. Google cannot crawl it at all. | Always check both layers. |
1. Discovery: Google finds the URL via sitemap, internal link, or external backlink. No action yet.
2. Access check: Googlebot reads robots.txt. If blocked, stop. No crawl possible.
3. Crawl: 5xx error? Redirect loop? Soft 404? The crawl fails or wastes budget.
4. Evaluation: Thin, duplicate, noindex? Google may crawl but refuse to index.
5. Indexing: The URL is stored in the database and is now eligible for ranking. Check the coverage report.
6. Ranking: Only indexed pages compete. Crawl alone gets you zero visitors.
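The stages above can be modeled as a series of gates a URL has to pass. The sketch below is a simplified illustration, not Google's actual logic: the `Page` record, the `pipeline_stage` function, and the sample URLs are hypothetical, and the 200-word threshold is only this article's rule of thumb for thin content. The robots.txt check uses Python's standard `urllib.robotparser`.

```python
from dataclasses import dataclass
from urllib.robotparser import RobotFileParser

@dataclass
class Page:
    url: str
    status: int       # HTTP status returned to the crawler
    noindex: bool     # True if a noindex directive is present
    word_count: int   # rough proxy for content depth

def pipeline_stage(page, robots_txt, min_words=200):
    """Return the stage at which a page stops in this simplified model."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    if not robots.can_fetch("Googlebot", page.url):
        return "blocked by robots.txt"
    if page.status >= 400:
        return "crawl failed"
    if page.noindex:
        return "crawled, excluded by noindex"
    if page.word_count < min_words:
        return "crawled, likely too thin to index"
    return "eligible for indexing"

# Hypothetical robots.txt and pages for illustration only.
ROBOTS_TXT = "User-agent: *\nDisallow: /private/"
pages = [
    Page("https://example.com/private/report", 200, False, 900),
    Page("https://example.com/broken", 503, False, 900),
    Page("https://example.com/staging-copy", 200, True, 900),
    Page("https://example.com/stub", 200, False, 40),
    Page("https://example.com/guide", 200, False, 900),
]
stages = [pipeline_stage(p, ROBOTS_TXT) for p in pages]
for page, stage in zip(pages, stages):
    print(page.url, "->", stage)
```

The order of the checks mirrors the order of the stages: a page blocked at the access check never reaches the content checks, which is why both layers need auditing.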
Let's say your site has 500 product pages. You submit them through a sitemap. After three weeks, Search Console shows 500 URLs discovered, 480 crawled, but only 120 indexed. Where is the gap?
Step 1: Check the index coverage report for the 360 URLs that were crawled but not indexed (480 crawled minus 120 indexed). It shows "Crawled - currently not indexed." That means Googlebot visited, but decided not to store them.
Step 2: Filter those 360 URLs through the index status checker. It reveals 200 have noindex tags accidentally inherited from staging templates. 80 have canonical tags pointing to the wrong variant. 40 have thin descriptions under 50 characters. 40 have 301 redirect loops to themselves.
Step 3: Fix the 200 noindex tags. Update canonicals. Rewrite descriptions. Break the redirect loops. Resubmit via URL Inspection Tool. After 10 days, indexed count moves from 120 to 310.
The crawl was fine. The index was broken. This is what the difference between crawled and indexed looks like on a real site.
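The numbers in this worked example can be checked with a few lines of arithmetic. The figures come straight from the scenario above; only the variable names are introduced here.

```python
discovered, crawled, indexed = 500, 480, 120

# Blockers found among the crawled-but-unindexed URLs in Step 2.
blockers = {
    "noindex inherited from staging": 200,
    "wrong canonical": 80,
    "thin description": 40,
    "redirect loop": 40,
}

never_crawled = discovered - crawled      # URLs Googlebot has not visited yet
crawled_not_indexed = crawled - indexed   # the gap this article is about
assert sum(blockers.values()) == crawled_not_indexed  # the buckets cover the gap

indexed_after = 310                        # indexed count after the Step 3 fixes
recovered = indexed_after - indexed

print(f"never crawled: {never_crawled}")
print(f"crawled but not indexed: {crawled_not_indexed}")
print(f"index rate before: {indexed / crawled:.0%}")
print(f"index rate after:  {indexed_after / crawled:.0%}")
print(f"recovered {recovered} of {crawled_not_indexed} URLs")
```

Tracking the index rate (indexed divided by crawled) rather than the raw crawl count keeps attention on the number that drives traffic.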
You will encounter URLs that are crawled but never indexed even after fixes.
For large sites, crawl budget waste is a silent killer. Use the crawl budget waste fix guide to audit and trim low-value URLs before they consume your daily crawl allowance.
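One frequent source of crawl budget waste is a single page fetched under many query-string variants. A minimal sketch of how to surface those from a list of crawled URLs follows; the `parameter_variants` helper, the threshold of 3, and the sample URLs are assumptions for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit

def parameter_variants(urls, threshold=3):
    """Count crawled URLs that carry a query string, grouped by path, and
    return the paths whose variant count meets the threshold."""
    counts = Counter(
        urlsplit(u).path for u in urls if urlsplit(u).query
    )
    return {path: n for path, n in counts.items() if n >= threshold}

# Hypothetical crawled URLs for illustration only.
crawled_urls = [
    "https://example.com/shoes?color=red",
    "https://example.com/shoes?color=blue",
    "https://example.com/shoes?sort=price",
    "https://example.com/hats?page=2",
    "https://example.com/about",
]
wasteful = parameter_variants(crawled_urls)
print(wasteful)
```

Paths that show up here are candidates for canonical tags, parameter cleanup, or robots.txt rules, depending on whether the variants carry distinct content.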
Day 1: Export all indexed and crawled URLs from Search Console coverage report.
Day 2: Cross-reference with log files to find crawled-but-not-indexed pages.
Day 3: Run each flagged URL through the index status checker to confirm status.
Day 4: Remove noindex tags. Fix canonicals. Rewrite thin content to 300+ words.
Day 5: Check robots.txt for accidental blocks on high-value pages.
Day 6: Request indexing via URL Inspection Tool for top 50 pages.
Day 7: Monitor coverage report daily. If no movement, check dynamic rendering or server errors.
Crawled means Googlebot made an HTTP request to your URL and read the response. Indexed means Google stored that URL in its search database after passing quality checks. A URL can be crawled hundreds of times but never indexed if it has noindex, thin content, or duplicate issues.
The 'Crawled - currently not indexed' status means Google visited the page but decided not to keep it. Common causes: content is too short, duplicate of another page, noindex tag present, or the page is a soft 404. Fix by adding unique content, removing noindex, or updating canonicals.
Use the Search Console index coverage report to see counts per status. For individual URLs, use the URL Inspection Tool or an external API-based checker like the index status checker tool. For bulk, export the report and filter by 'Crawled - currently not indexed'.
A crawled-but-not-indexed page cannot rank. Only indexed pages are eligible for ranking, so crawled-but-not-indexed pages are effectively invisible to users. You must resolve the indexing blocker first. Common fix: improve content depth and remove noindex directives.
The top three causes of a crawled-but-not-indexed URL: (1) noindex meta tag or X-Robots-Tag, (2) thin or auto-generated content under 200 words, (3) canonical tag pointing to a different URL. Less common: the page is blocked by robots.txt, so Googlebot cannot fetch its content even when it discovers the URL through other paths.
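The first and third causes can be detected directly from a page's HTML and response headers. This is a minimal sketch using Python's standard `html.parser`; the `IndexSignals` class and `index_blockers` function are hypothetical names, the headers are assumed to arrive as a plain dict keyed exactly `X-Robots-Tag`, and the canonical comparison is a plain string match with no URL normalization.

```python
from html.parser import HTMLParser

class IndexSignals(HTMLParser):
    """Collect meta robots directives and the canonical href from HTML."""
    def __init__(self):
        super().__init__()
        self.robots = ""
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and (a.get("name") or "").lower() in ("robots", "googlebot"):
            self.robots += " " + (a.get("content") or "").lower()
        elif tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.canonical = a.get("href")

def index_blockers(url, html, headers):
    """Return a list of indexing blockers found for this URL."""
    parser = IndexSignals()
    parser.feed(html)
    found = []
    x_robots = headers.get("X-Robots-Tag", "").lower()
    if "noindex" in parser.robots or "noindex" in x_robots:
        found.append("noindex")
    if parser.canonical and parser.canonical != url:
        found.append("canonical points elsewhere")
    return found

# Hypothetical page for illustration only.
html = (
    '<html><head><meta name="robots" content="noindex, follow">'
    '<link rel="canonical" href="https://example.com/p?v=2"></head></html>'
)
print(index_blockers("https://example.com/p", html, {}))
```

Checking the header as well as the markup matters because an X-Robots-Tag set at the server or CDN level leaves no trace in the HTML.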
Open the index coverage report in Search Console. Filter by 'Crawled - currently not indexed'. This list shows URLs Google visited but excluded. Cross-reference with your sitemap to see if these are pages you actually want indexed. If yes, fix content or tags and request indexing.
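The filter-and-cross-reference step can be sketched on an exported file. The column names `URL` and `Status`, the sample rows, and the `wanted_but_excluded` helper are assumptions; adjust them to match the headers in your actual export.

```python
import csv
import io

TARGET_STATUS = "Crawled - currently not indexed"

def wanted_but_excluded(export_csv, sitemap_urls):
    """Return exported URLs with the target status that also appear in the sitemap."""
    rows = csv.DictReader(io.StringIO(export_csv))
    excluded = {row["URL"] for row in rows if row["Status"] == TARGET_STATUS}
    return sorted(excluded & set(sitemap_urls))

# Hypothetical export and sitemap for illustration only.
export_csv = """URL,Status
https://example.com/p1,Crawled - currently not indexed
https://example.com/p2,Indexed
https://example.com/tag/old,Crawled - currently not indexed
"""
sitemap = ["https://example.com/p1", "https://example.com/p2"]

to_fix = wanted_but_excluded(export_csv, sitemap)
print(to_fix)
```

URLs that are excluded but absent from the sitemap, such as the tag page in this sample, are usually pages you did not want indexed anyway and can be left alone.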
Indexing typically happens 3-14 days after Googlebot's last visit. If the page has high-quality content and no blockers, it can index within a few days. If content is thin or the site has a limited crawl budget, it may take weeks or never index.
A hard 404 returns a 404 status code. Google stops immediately. A soft 404 returns a 200 status but shows a blank or 'not found' message. Google may crawl it, detect the mismatch, and mark it 'Crawled - currently not indexed'. Fix soft 404s by returning proper 404s or redirecting to relevant pages.
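A rough heuristic for spotting soft 404s in your own crawl data is to look for 200 responses whose visible text is nearly empty or reads like an error page. The sketch below is an assumption-laden illustration, not how Google detects them: the phrase list, the 100-character minimum, and the `looks_like_soft_404` name are all hypothetical, and the phrase match can produce false positives on long pages that merely mention "not found".

```python
import re

NOT_FOUND_PHRASES = ("not found", "no longer available", "page does not exist")

def looks_like_soft_404(status, html, min_chars=100):
    """Flag a 200 response whose visible text is near-empty or error-like."""
    if status != 200:
        return False  # a real 404 or 5xx is not a soft 404
    text = re.sub(r"<[^>]+>", " ", html)               # crude tag stripping
    text = re.sub(r"\s+", " ", text).strip().lower()   # collapse whitespace
    return len(text) < min_chars or any(p in text for p in NOT_FOUND_PHRASES)

# Hypothetical responses for illustration only.
error_page = "<html><body><h1>Page not found</h1></body></html>"
real_page = "<p>" + "Detailed product description. " * 20 + "</p>"

print(looks_like_soft_404(200, error_page))
print(looks_like_soft_404(404, error_page))
print(looks_like_soft_404(200, real_page))
```

Pages flagged this way should either return a real 404 status or be redirected to a relevant page.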
Internal linking does affect indexing. Pages with strong internal links are more likely to be indexed because Google sees them as important. Orphaned pages (no internal links) may be crawled but deprioritized for indexing. Add contextual links from high-authority pages to help indexing.
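Orphaned pages can be found by comparing the sitemap against the set of URLs that receive at least one internal link. This is a minimal sketch, assuming you already have a link map from a site crawl; the `orphaned_pages` helper and the sample data are hypothetical.

```python
def orphaned_pages(sitemap_urls, internal_links):
    """Return sitemap URLs that no other page links to.

    internal_links maps each source URL to the list of URLs it links to.
    """
    linked = set()
    for source, targets in internal_links.items():
        linked.update(t for t in targets if t != source)  # ignore self-links
    return sorted(set(sitemap_urls) - linked)

# Hypothetical site structure for illustration only.
sitemap = ["/", "/a", "/b", "/c"]
links = {
    "/": ["/a"],
    "/a": ["/", "/b"],
    "/c": ["/c"],  # links only to itself
}
orphans = orphaned_pages(sitemap, links)
print(orphans)
```

Each URL in the result is reachable only through the sitemap, which makes it a candidate for a contextual link from a stronger page.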
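A page's content cannot be indexed without being crawled at least once. The narrow exception is a URL blocked by robots.txt, which Google may still list without its content if other pages link to it. Google can also recrawl a previously indexed page without storing it again if nothing changed. For new pages, crawling always precedes indexing.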